Commit 5bac010

Models hub (#13770)
* 2023-04-20-distilbert_base_uncased_mnli_en (#13761)
  * Add model 2023-04-20-distilbert_base_uncased_mnli_en
  * Add model 2023-04-20-distilbert_base_turkish_cased_allnli_tr
  * Add model 2023-04-20-distilbert_base_turkish_cased_snli_tr
  * Add model 2023-04-20-distilbert_base_turkish_cased_multinli_tr
  * Update and rename 2023-04-20-distilbert_base_turkish_cased_allnli_tr.md to 2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_allnli_tr.md
  * Update and rename 2023-04-20-distilbert_base_turkish_cased_multinli_tr.md to 2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr.md
  * Update and rename 2023-04-20-distilbert_base_turkish_cased_snli_tr.md to 2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_snli_tr.md
  * Update and rename 2023-04-20-distilbert_base_uncased_mnli_en.md to distilbert_base_zero_shot_classifier_turkish_cased_snli
  * Rename distilbert_base_zero_shot_classifier_turkish_cased_snli to distilbert_base_zero_shot_classifier_turkish_cased_snli_en.md
  * Update 2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_snli_tr.md
  * Update 2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr.md
  * Update 2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_allnli_tr.md
  ---------
  Co-authored-by: ahmedlone127 <[email protected]>
* 2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr (#13763)
  * Add model 2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr
  * Add model 2023-04-20-distilbert_base_zero_shot_classifier_uncased_mnli_en
  * Add model 2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_snli_tr
  * Add model 2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_allnli_tr
  * Update 2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr.md
  * Update 2023-04-20-distilbert_base_zero_shot_classifier_turkish_cased_snli_tr.md
  ---------
  Co-authored-by: ahmedlone127 <[email protected]>

---------

Co-authored-by: ahmedlone127 <[email protected]>
1 parent d7f91a4 commit 5bac010

5 files changed: +524 -0 lines changed
Lines changed: 104 additions & 0 deletions
@@ -0,0 +1,104 @@
---
layout: model
title: DistilBERT Zero-Shot Classification Base - distilbert_base_zero_shot_classifier_turkish_cased_allnli
author: John Snow Labs
name: distilbert_base_zero_shot_classifier_turkish_cased_allnli
date: 2023-04-20
tags: [distilbert, zero_shot, turkish, tr, base, open_source, tensorflow]
task: Zero-Shot Classification
language: tr
edition: Spark NLP 4.4.1
spark_version: [3.2, 3.0]
supported: true
engine: tensorflow
annotator: DistilBertForZeroShotClassification
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

This model is intended for zero-shot text classification, especially in Turkish. It is fine-tuned on MNLI using the DistilBERT Base Uncased model.

DistilBertForZeroShotClassification uses a ModelForSequenceClassification trained on NLI (natural language inference) tasks. It is the equivalent of DistilBertForSequenceClassification models, except that the number of potential classes does not have to be hardcoded: the candidate classes can be chosen at runtime. This usually makes it slower, but much more flexible.

We used TFDistilBertForSequenceClassification to train this model and the DistilBertForZeroShotClassification annotator in Spark NLP 🚀 for prediction at scale!
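
To make the NLI-based approach described above more concrete, here is a minimal sketch of how candidate labels chosen at runtime can be scored. It is not the annotator's actual implementation: the hypothesis template, the helper names (`build_pairs`, `rank_labels`) and the logits are all assumptions made for illustration, and a real NLI model would supply the entailment logits.

```python
import math

# Hypothetical hypothesis template; the template used internally by the
# annotator is not stated in this model card.
TEMPLATE = "Bu örnek {} ile ilgilidir."


def build_pairs(text, candidate_labels, template=TEMPLATE):
    """Pair the input text (premise) with one hypothesis per candidate label."""
    return [(text, template.format(label)) for label in candidate_labels]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(candidate_labels, entailment_logits):
    """Normalise per-label entailment logits across labels, best label first."""
    probs = softmax(entailment_logits)
    return sorted(zip(candidate_labels, probs), key=lambda p: p[1], reverse=True)


labels = ["olumsuz", "olumlu"]
pairs = build_pairs("Senaryo çok saçmaydı, beğendim diyemem.", labels)

# Made-up entailment logits standing in for the NLI model's output (one per pair).
fake_logits = [2.1, -0.7]
ranking = rank_labels(labels, fake_logits)

print(pairs)
print(ranking)
```

Because one premise/hypothesis pair is built per candidate label, the amount of work in this sketch grows with the number of labels, which is one plausible reading of the "slower but more flexible" trade-off.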

## Predicted Entities

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_allnli_tr_4.4.1_3.2_1682016415236.zip){:.button.button-orange}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_allnli_tr_4.4.1_3.2_1682016415236.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use

<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DistilBertForZeroShotClassification
from pyspark.ml import Pipeline

# Start (or reuse) a Spark session with Spark NLP available
spark = sparknlp.start()

# Turn the raw text column into document annotations
document_assembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

# Split each document into tokens
tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

# Load the Turkish model ('tr'); candidate labels are chosen at runtime
zeroShotClassifier = DistilBertForZeroShotClassification \
    .pretrained('distilbert_base_zero_shot_classifier_turkish_cased_allnli', 'tr') \
    .setInputCols(['token', 'document']) \
    .setOutputCol('class') \
    .setCaseSensitive(True) \
    .setMaxSentenceLength(512) \
    .setCandidateLabels(["olumsuz", "olumlu"])

pipeline = Pipeline(stages=[
    document_assembler,
    tokenizer,
    zeroShotClassifier
])

example = spark.createDataFrame([['Senaryo çok saçmaydı, beğendim diyemem.']]).toDF("text")
result = pipeline.fit(example).transform(example)
```
```scala
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for .toDS; assumes a SparkSession named `spark`

val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

// Load the Turkish model ("tr"); candidate labels are chosen at runtime
val zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilbert_base_zero_shot_classifier_turkish_cased_allnli", "tr")
  .setInputCols("document", "token")
  .setOutputCol("class")
  .setCaseSensitive(true)
  .setMaxSentenceLength(512)
  .setCandidateLabels(Array("olumsuz", "olumlu"))

val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, zeroShotClassifier))
val example = Seq("Senaryo çok saçmaydı, beğendim diyemem.").toDS.toDF("text")
val result = pipeline.fit(example).transform(example)
```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|distilbert_base_zero_shot_classifier_turkish_cased_allnli|
|Compatibility:|Spark NLP 4.4.1+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[token, document]|
|Output Labels:|[multi_class]|
|Language:|tr|
|Size:|254.3 MB|
|Case sensitive:|true|
Lines changed: 103 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,103 @@
---
layout: model
title: DistilBERT Zero-Shot Classification Base - distilbert_base_zero_shot_classifier_turkish_cased_multinli
author: John Snow Labs
name: distilbert_base_zero_shot_classifier_turkish_cased_multinli
date: 2023-04-20
tags: [zero_shot, tr, turkish, distilbert, base, cased, open_source, tensorflow]
task: Zero-Shot Classification
language: tr
edition: Spark NLP 4.4.1
spark_version: [3.2, 3.0]
supported: true
engine: tensorflow
annotator: DistilBertForZeroShotClassification
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

This model is intended for zero-shot text classification, especially in Turkish. It is fine-tuned on MNLI using the DistilBERT Base Uncased model.

DistilBertForZeroShotClassification uses a ModelForSequenceClassification trained on NLI (natural language inference) tasks. It is the equivalent of DistilBertForSequenceClassification models, except that the number of potential classes does not have to be hardcoded: the candidate classes can be chosen at runtime. This usually makes it slower, but much more flexible.

We used TFDistilBertForSequenceClassification to train this model and the DistilBertForZeroShotClassification annotator in Spark NLP 🚀 for prediction at scale!

## Predicted Entities

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr_4.4.1_3.2_1682014879417.zip){:.button.button-orange}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_multinli_tr_4.4.1_3.2_1682014879417.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use

<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DistilBertForZeroShotClassification
from pyspark.ml import Pipeline

# Start (or reuse) a Spark session with Spark NLP available
spark = sparknlp.start()

# Turn the raw text column into document annotations
document_assembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

# Split each document into tokens
tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

# Load the Turkish model ('tr'); candidate labels are chosen at runtime
zeroShotClassifier = DistilBertForZeroShotClassification \
    .pretrained('distilbert_base_zero_shot_classifier_turkish_cased_multinli', 'tr') \
    .setInputCols(['token', 'document']) \
    .setOutputCol('class') \
    .setCaseSensitive(True) \
    .setMaxSentenceLength(512) \
    .setCandidateLabels(["ekonomi", "siyaset", "spor"])

pipeline = Pipeline(stages=[
    document_assembler,
    tokenizer,
    zeroShotClassifier
])

example = spark.createDataFrame([['Dolar yükselmeye devam ediyor.']]).toDF("text")
result = pipeline.fit(example).transform(example)
```
```scala
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for .toDS; assumes a SparkSession named `spark`

val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

// Load the Turkish model ("tr"); candidate labels are chosen at runtime
val zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilbert_base_zero_shot_classifier_turkish_cased_multinli", "tr")
  .setInputCols("document", "token")
  .setOutputCol("class")
  .setCaseSensitive(true)
  .setMaxSentenceLength(512)
  .setCandidateLabels(Array("ekonomi", "siyaset", "spor"))

val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, zeroShotClassifier))
val example = Seq("Dolar yükselmeye devam ediyor.").toDS.toDF("text")
val result = pipeline.fit(example).transform(example)
```
</div>
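
The pipeline above leaves its predictions in the `class` column as a list of Spark NLP annotations. The following is a small sketch of pulling the predicted label out of collected rows. So that it runs without a cluster, it works on plain dictionaries shaped like `Row.asDict(recursive=True)` output; the helper name `predicted_labels` is hypothetical, and the values in `rows` are placeholders rather than real model output.

```python
def predicted_labels(rows, output_col="class"):
    """Return the `result` string of every annotation in `output_col`, one list per row."""
    return [[ann["result"] for ann in row[output_col]] for row in rows]


# Stand-in for:
#   [r.asDict(recursive=True) for r in result.select("text", "class").collect()]
# The label and metadata below are invented for illustration only.
rows = [
    {
        "text": "Dolar yükselmeye devam ediyor.",
        "class": [
            {
                "annotatorType": "category",
                "begin": 0,
                "end": 29,
                "result": "ekonomi",
                "metadata": {"sentence": "0"},
                "embeddings": [],
            }
        ],
    }
]

print(predicted_labels(rows))
```

On a live DataFrame, `result.select("class.result").show(truncate=False)` gives a quick look at the same field without collecting rows to the driver.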

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|distilbert_base_zero_shot_classifier_turkish_cased_multinli|
|Compatibility:|Spark NLP 4.4.1+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[token, document]|
|Output Labels:|[multi_class]|
|Language:|tr|
|Size:|254.3 MB|
|Case sensitive:|true|
Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
---
layout: model
title: DistilBERT Zero-Shot Classification Base - distilbert_base_zero_shot_classifier_turkish_cased_snli
author: John Snow Labs
name: distilbert_base_zero_shot_classifier_turkish_cased_snli
date: 2023-04-20
tags: [zero_shot, tr, turkish, distilbert, base, cased, open_source, tensorflow]
task: Zero-Shot Classification
language: tr
edition: Spark NLP 4.4.1
spark_version: [3.2, 3.0]
supported: true
engine: tensorflow
annotator: DistilBertForZeroShotClassification
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

This model is intended for zero-shot text classification, especially in Turkish. It is fine-tuned on MNLI using the DistilBERT Base Uncased model.

DistilBertForZeroShotClassification uses a ModelForSequenceClassification trained on NLI (natural language inference) tasks. It is the equivalent of DistilBertForSequenceClassification models, except that the number of potential classes does not have to be hardcoded: the candidate classes can be chosen at runtime. This usually makes it slower, but much more flexible.

We used TFDistilBertForSequenceClassification to train this model and the DistilBertForZeroShotClassification annotator in Spark NLP 🚀 for prediction at scale!

## Predicted Entities

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_snli_tr_4.4.1_3.2_1682015986268.zip){:.button.button-orange}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/distilbert_base_zero_shot_classifier_turkish_cased_snli_tr_4.4.1_3.2_1682015986268.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
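
The archive names in the download links appear to follow a `name_lang_sparknlpversion_sparkversion_timestamp.zip` pattern. The sketch below splits such a name into its parts. The pattern is inferred only from the URLs on this page, and reading the trailing 13-digit number as a Unix timestamp in milliseconds is likewise an assumption; `parse_archive_name` is a hypothetical helper, not part of Spark NLP.

```python
import re
from datetime import datetime, timezone

# Pattern inferred from the download URLs above; treat it as an assumption.
ARCHIVE_RE = re.compile(
    r"^(?P<name>.+)_(?P<lang>[a-z]{2,3})_(?P<nlp>\d+\.\d+\.\d+)"
    r"_(?P<spark>\d+\.\d+)_(?P<ts>\d{13})\.zip$"
)


def parse_archive_name(filename):
    """Split a model archive file name into name, language, versions and timestamp."""
    match = ARCHIVE_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected archive name: {filename}")
    info = match.groupdict()
    # Assumption: the trailing number is milliseconds since the Unix epoch.
    info["built"] = datetime.fromtimestamp(int(info.pop("ts")) / 1000, tz=timezone.utc)
    return info


info = parse_archive_name(
    "distilbert_base_zero_shot_classifier_turkish_cased_snli_tr_4.4.1_3.2_1682015986268.zip"
)
print(info["name"], info["lang"], info["nlp"], info["spark"], info["built"].date())
```

The language code recovered this way (`tr`) is the value that belongs in the second argument of `pretrained(...)`.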

## How to use

<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
import sparknlp
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Tokenizer, DistilBertForZeroShotClassification
from pyspark.ml import Pipeline

# Start (or reuse) a Spark session with Spark NLP available
spark = sparknlp.start()

# Turn the raw text column into document annotations
document_assembler = DocumentAssembler() \
    .setInputCol('text') \
    .setOutputCol('document')

# Split each document into tokens
tokenizer = Tokenizer() \
    .setInputCols(['document']) \
    .setOutputCol('token')

# Load the Turkish model ('tr'); candidate labels are chosen at runtime
zeroShotClassifier = DistilBertForZeroShotClassification \
    .pretrained('distilbert_base_zero_shot_classifier_turkish_cased_snli', 'tr') \
    .setInputCols(['token', 'document']) \
    .setOutputCol('class') \
    .setCaseSensitive(True) \
    .setMaxSentenceLength(512) \
    .setCandidateLabels(["olumsuz", "olumlu"])

pipeline = Pipeline(stages=[
    document_assembler,
    tokenizer,
    zeroShotClassifier
])

example = spark.createDataFrame([['Senaryo çok saçmaydı, beğendim diyemem.']]).toDF("text")
result = pipeline.fit(example).transform(example)
```
```scala
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for .toDS; assumes a SparkSession named `spark`

val document_assembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

val tokenizer = new Tokenizer()
  .setInputCols("document")
  .setOutputCol("token")

// Load the Turkish model ("tr"); candidate labels are chosen at runtime
val zeroShotClassifier = DistilBertForZeroShotClassification.pretrained("distilbert_base_zero_shot_classifier_turkish_cased_snli", "tr")
  .setInputCols("document", "token")
  .setOutputCol("class")
  .setCaseSensitive(true)
  .setMaxSentenceLength(512)
  .setCandidateLabels(Array("olumsuz", "olumlu"))

val pipeline = new Pipeline().setStages(Array(document_assembler, tokenizer, zeroShotClassifier))
val example = Seq("Senaryo çok saçmaydı, beğendim diyemem.").toDS.toDF("text")
val result = pipeline.fit(example).transform(example)
```
</div>

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|distilbert_base_zero_shot_classifier_turkish_cased_snli|
|Compatibility:|Spark NLP 4.4.1+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[token, document]|
|Output Labels:|[multi_class]|
|Language:|tr|
|Size:|254.3 MB|
|Case sensitive:|true|
