Commit ea0ba05

2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr (#13763)
* Add model 2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr
* Add model 2023-04-20-distilbert_base_zero_shot_classifier_uncased_mnli_en
* Add model 2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_snli_tr
* Add model 2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_allnli_tr
* Update 2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr.md
* Update 2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_snli_tr.md

---------

Co-authored-by: ahmedlone127 <[email protected]>
1 parent bb9a155 commit ea0ba05

4 files changed, +115 -20 lines changed

docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_allnli_tr.md

Lines changed: 4 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ title: DistilBERTZero-Shot Classification Base - distilbert_base_zero_shot_class
 author: John Snow Labs
 name: distilbert_base_zero_shot_classifier_turkish_cased_allnli
 date: 2023-04-20
-tags: [zero_shot, distilbert, base, tr, turkish, cased, open_source, tensorflow]
+tags: [distilbert, zero_shot, turkish, tr, base, open_source, tensorflow]
 task: Zero-Shot Classification
 language: tr
 edition: Spark NLP 4.4.1
@@ -32,8 +32,8 @@ We used TFDistilBertForSequenceClassification to train this model and used Disti
 {:.btn-box}
 <button class="button button-orange" disabled>Live Demo</button>
 <button class="button button-orange" disabled>Open in Colab</button>
-[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_allnli_4.4.1_3.2_1681950583033.zip){:.button.button-orange}
-[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_allnli_tr_4.4.1_3.2_1681950583033.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_allnli_tr_4.4.1_3.2_1682016415236.zip){:.button.button-orange}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_allnli_tr_4.4.1_3.2_1682016415236.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
 
 ## How to use
 
@@ -63,7 +63,6 @@ document_assembler,
 tokenizer,
 zeroShotClassifier
 ])
-
 example = spark.createDataFrame([['Senaryo çok saçmaydı, beğendim diyemem.']]).toDF("text")
 result = pipeline.fit(example).transform(example)
 ```
@@ -84,9 +83,7 @@ val zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilb
 .setCandidateLabels(Array("olumsuz", "olumlu"))
 
 val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, zeroShotClassifier))
-
 val example = Seq("Senaryo çok saçmaydı, beğendim diyemem.").toDS.toDF("text")
-
 val result = pipeline.fit(example).transform(example)
 ```
 </div>
@@ -104,4 +101,4 @@ val result = pipeline.fit(example).transform(example)
 |Output Labels:|[multi_class]|
 |Language:|tr|
 |Size:|254.3 MB|
-|Case sensitive:|true|
+|Case sensitive:|true|
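
In the allnli file above, the removed Download link has no `_tr` language segment in its file name, while the Copy S3 URI beside it does. The added lines bring the two into agreement. A check along the following lines could flag that kind of mismatch before a model card is published. It is a minimal sketch: the naming pattern is inferred only from the URLs visible in this diff, reading the trailing 13-digit number as a Unix timestamp in milliseconds is an assumption, and `ModelArtifact`, `parse_artifact`, and `same_artifact` are hypothetical helper names, not part of Spark NLP.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple


class ModelArtifact(NamedTuple):
    name: str
    lang: str
    spark_nlp_version: str
    spark_version: str
    uploaded_at: datetime


# Naming pattern inferred from the URLs in this diff (an assumption, not a documented spec):
#   <name>_<lang>_<spark-nlp-version>_<spark-version>_<13-digit number>.zip
ARTIFACT_RE = re.compile(
    r"(?P<name>[a-z0-9_]+)_(?P<lang>[a-z]{2})"
    r"_(?P<nlp>\d+\.\d+\.\d+)_(?P<spark>\d+\.\d+)_(?P<millis>\d{13})\.zip"
)


def parse_artifact(url: str) -> ModelArtifact:
    """Parse the zip file name at the end of an https:// or s3:// model URL."""
    file_name = url.rsplit("/", 1)[-1]
    match = ARTIFACT_RE.fullmatch(file_name)
    if match is None:
        raise ValueError(f"unexpected artifact name: {file_name}")
    # Assumption: the trailing number is a Unix timestamp in milliseconds.
    uploaded_at = datetime.fromtimestamp(int(match["millis"]) / 1000, tz=timezone.utc)
    return ModelArtifact(match["name"], match["lang"], match["nlp"], match["spark"], uploaded_at)


def same_artifact(download_url: str, s3_uri: str) -> bool:
    """Return True when both links end in the same zip file name."""
    return download_url.rsplit("/", 1)[-1] == s3_uri.rsplit("/", 1)[-1]


# The four links from the allnli hunk, copied from the diff.
MODEL = "distilbert_base_zero_shot_classifier_turkish_cased_allnli"
HTTPS = "https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/"
S3 = "s3://auxdata.johnsnowlabs.com/public/models/"
OLD_DOWNLOAD = HTTPS + MODEL + "_4.4.1_3.2_1681950583033.zip"
OLD_S3 = S3 + MODEL + "_tr_4.4.1_3.2_1681950583033.zip"
NEW_DOWNLOAD = HTTPS + MODEL + "_tr_4.4.1_3.2_1682016415236.zip"
NEW_S3 = S3 + MODEL + "_tr_4.4.1_3.2_1682016415236.zip"

if __name__ == "__main__":
    print(same_artifact(OLD_DOWNLOAD, OLD_S3))  # the removed pair disagrees
    print(same_artifact(NEW_DOWNLOAD, NEW_S3))  # the added pair agrees
    print(parse_artifact(NEW_DOWNLOAD))
```

Comparing file names rather than whole URLs is deliberate, since the two links use different schemes and hosts for the same object.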

docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr.md

Lines changed: 2 additions & 7 deletions
@@ -32,8 +32,8 @@ We used TFDistilBertForSequenceClassification to train this model and used Disti
 {:.btn-box}
 <button class="button button-orange" disabled>Live Demo</button>
 <button class="button button-orange" disabled>Open in Colab</button>
-[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr_4.4.1_3.2_1681952299918.zip){:.button.button-orange}
-[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr_4.4.1_3.2_1681952299918.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr_4.4.1_3.2_1682014879417.zip){:.button.button-orange}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr_4.4.1_3.2_1682014879417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
 
 ## How to use
 
@@ -45,7 +45,6 @@ We used TFDistilBertForSequenceClassification to train this model and used Disti
 document_assembler = DocumentAssembler() \
 .setInputCol('text') \
 .setOutputCol('document')
-
 tokenizer = Tokenizer() \
 .setInputCols(['document']) \
 .setOutputCol('token')
@@ -63,10 +62,8 @@ document_assembler,
 tokenizer,
 zeroShotClassifier
 ])
-
 example = spark.createDataFrame([['Dolar yükselmeye devam ediyor.']]).toDF("text")
 result = pipeline.fit(example).transform(example)
-
 ```
 ```scala
 val document_assembler = DocumentAssembler()
@@ -85,9 +82,7 @@ val zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilb
 .setCandidateLabels(Array("ekonomi", "siyaset","spor"))
 
 val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, zeroShotClassifier))
-
 val example = Seq("Dolar yükselmeye devam ediyor.").toDS.toDF("text")
-
 val result = pipeline.fit(example).transform(example)
 ```
 </div>

docs/_posts/ahmedlone127/2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_snli_tr.md

Lines changed: 4 additions & 6 deletions
@@ -32,8 +32,8 @@ We used TFDistilBertForSequenceClassification to train this model and used Disti
 {:.btn-box}
 <button class="button button-orange" disabled>Live Demo</button>
 <button class="button button-orange" disabled>Open in Colab</button>
-[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_snli_tr_4.4.1_3.2_1681951486863.zip){:.button.button-orange}
-[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_snli_tr_4.4.1_3.2_1681951486863.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_snli_tr_4.4.1_3.2_1682015986268.zip){:.button.button-orange}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_snli_tr_4.4.1_3.2_1682015986268.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
 
 ## How to use
 
@@ -63,7 +63,6 @@ document_assembler,
 tokenizer,
 zeroShotClassifier
 ])
-
 example = spark.createDataFrame([['Senaryo çok saçmaydı, beğendim diyemem.']]).toDF("text")
 result = pipeline.fit(example).transform(example)
 ```
@@ -75,18 +74,17 @@ val document_assembler = DocumentAssembler()
 val tokenizer = Tokenizer()
 .setInputCols("document")
 .setOutputCol("token")
+val zeroShotClassifier =
 
-val zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilbert_base_zero_shot_classifier_turkish_cased_snli", "en")
+DistilBertForZeroShotClassification.pretrained("distilbert_base_zero_shot_classifier_turkish_cased_snli", "en")
 .setInputCols("document", "token")
 .setOutputCol("class")
 .setCaseSensitive(true)
 .setMaxSentenceLength(512)
 .setCandidateLabels(Array("olumsuz", "olumlu"))
 
 val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, zeroShotClassifier))
-
 val example = Seq("Senaryo çok saçmaydı, beğendim diyemem.").toDS.toDF("text")
-
 val result = pipeline.fit(example).transform(example)
 ```
 </div>
Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
+---
+layout: model
+title: DistilBERTZero-Shot Classification Base - MNLI(distilbert_base_zero_shot_classifier_uncased_mnli)
+author: John Snow Labs
+name: distilbert_base_zero_shot_classifier_uncased_mnli
+date: 2023-04-20
+tags: [zero_shot, en, mnli, distilbert, english, base, open_source, tensorflow]
+task: Zero-Shot Classification
+language: en
+edition: Spark NLP 4.4.1
+spark_version: [3.2, 3.0]
+supported: true
+engine: tensorflow
+annotator: DistilBertForZeroShotClassification
+article_header:
+type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+This model is intended to be used for zero-shot text classification, especially in English. It is fine-tuned on MNLI by using DistilBERT Base Uncased model.
+
+DistilBertForZeroShotClassification using a ModelForSequenceClassification trained on NLI (natural language inference) tasks. Equivalent of DistilBertForSequenceClassification models, but these models don’t require a hardcoded number of potential classes, they can be chosen at runtime. It usually means it’s slower but it is much more flexible.
+
+We used TFDistilBertForSequenceClassification to train this model and used DistilBertForZeroShotClassification annotator in Spark NLP 🚀 for prediction at scale!
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+<button class="button button-orange" disabled>Live Demo</button>
+<button class="button button-orange" disabled>Open in Colab</button>
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_uncased_mnli_en_4.4.1_3.2_1682015669457.zip){:.button.button-orange}
+[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_uncased_mnli_en_4.4.1_3.2_1682015669457.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
+
+## How to use
+
+
+
+<div class="tabs-box" markdown="1">
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+document_assembler = DocumentAssembler() \
+.setInputCol('text') \
+.setOutputCol('document')
+
+tokenizer = Tokenizer() \
+.setInputCols(['document']) \
+.setOutputCol('token')
+
+zeroShotClassifier = DistilBertForZeroShotClassification \
+.pretrained('distilbert_base_zero_shot_classifier_uncased_mnli', 'en') \
+.setInputCols(['token', 'document']) \
+.setOutputCol('class') \
+.setCaseSensitive(True) \
+.setMaxSentenceLength(512) \
+.setCandidateLabels(["urgent", "mobile", "travel", "movie", "music", "sport", "weather", "technology"])
+
+pipeline = Pipeline(stages=[
+document_assembler,
+tokenizer,
+zeroShotClassifier
+])
+
+example = spark.createDataFrame([['I have a problem with my iphone that needs to be resolved asap!!']]).toDF("text")
+result = pipeline.fit(example).transform(example)
+```
+```scala
+val document_assembler = DocumentAssembler()
+.setInputCol("text")
+.setOutputCol("document")
+
+val tokenizer = Tokenizer()
+.setInputCols("document")
+.setOutputCol("token")
+
+val zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilbert_base_zero_shot_classifier_uncased_mnli", "en")
+.setInputCols("document", "token")
+.setOutputCol("class")
+.setCaseSensitive(true)
+.setMaxSentenceLength(512)
+.setCandidateLabels(Array("urgent", "mobile", "travel", "movie", "music", "sport", "weather", "technology"))
+
+val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, zeroShotClassifier))
+val example = Seq("I have a problem with my iphone that needs to be resolved asap!!").toDS.toDF("text")
+val result = pipeline.fit(example).transform(example)
+```
+</div>
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|distilbert_base_zero_shot_classifier_uncased_mnli|
+|Compatibility:|Spark NLP 4.4.1+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[token, document]|
+|Output Labels:|[multi_class]|
+|Language:|en|
+|Size:|249.7 MB|
+|Case sensitive:|true|
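
The description in the added MNLI model card says the candidate classes are not fixed at training time and can be chosen at runtime. One common way to get that behaviour from an NLI model is to turn each candidate label into a hypothesis, score how strongly the input text entails it, and normalise the scores across labels. The sketch below shows only that reduction. The hypothesis template is hypothetical (the source does not state what wording Spark NLP uses internally), and `toy_entailment_score` is a word-overlap stand-in for a real NLI model, so the numbers it produces are illustrative.

```python
import re
from math import exp
from typing import Callable, Dict, List

# Hypothetical template; the wording used inside Spark NLP is not stated in the source.
HYPOTHESIS_TEMPLATE = "This text is about {}."


def toy_entailment_score(premise: str, hypothesis: str) -> float:
    """Stand-in for an NLI model: counts words shared by premise and hypothesis.

    A real pipeline would return the entailment logit of a model fine-tuned on NLI data.
    """
    premise_words = set(re.findall(r"\w+", premise.lower()))
    hypothesis_words = set(re.findall(r"\w+", hypothesis.lower()))
    return float(len(premise_words & hypothesis_words))


def zero_shot_classify(
    text: str,
    candidate_labels: List[str],
    score: Callable[[str, str], float] = toy_entailment_score,
) -> Dict[str, float]:
    """Score each runtime-supplied label as an NLI hypothesis, then softmax over labels."""
    logits = [score(text, HYPOTHESIS_TEMPLATE.format(label)) for label in candidate_labels]
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [exp(logit - peak) for logit in logits]
    total = sum(weights)
    return {label: weight / total for label, weight in zip(candidate_labels, weights)}


if __name__ == "__main__":
    scores = zero_shot_classify(
        "The weather forecast says rain tomorrow.",
        ["weather", "sport", "music"],
    )
    print(max(scores, key=scores.get), scores)
```

Because the label list is an argument rather than part of the model, the same scorer can be reused with a different label set on each call. That matches the trade-off the description mentions: one entailment pass per candidate label, so more flexibility at the cost of speed.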
