<b>The Dataset for Hate Speech Detection in the Indonesian Language</b><br>

<b>Dataset</b><br>
- The dataset is a two columns data of: label - tweet, consist of 713 tweets in the Indonesian language.
- The label is Non_HS or HS. Non_HS for "non-hate-speech" tweet and HS for "hate-speech" tweet.
- Number of Non_HT tweets: 453
- Number of HT tweets: 260
- Since this dataset is unbalance, you might have to do over-sampling/down-sampling in order to create a balanced dataset.
+ The dataset is a two columns data of: label - tweet, consist of 713 tweets in the Indonesian language. <br>
+ The label is Non_HS or HS. Non_HS for "non-hate-speech" tweet and HS for "hate-speech" tweet. <br>
+ Number of Non_HT tweets: 453<br>
+ Number of HT tweets: 260 <br>
+ Since this dataset is unbalance, you might have to do over-sampling/down-sampling in order to create a balanced dataset. <br>
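
As a rough illustration of the over-sampling mentioned in the changed lines, the sketch below randomly duplicates minority-class rows until both classes are the same size. The `oversample` helper and the placeholder rows are hypothetical and not part of the repository; only the class sizes (453 and 260) come from the README, and the `Non_HS`/`HS` label strings are assumed from its label description.

```python
import random
from collections import Counter, defaultdict


def oversample(rows, seed=0):
    """Randomly duplicate minority-class rows until every class matches the largest one.

    `rows` is a list of (label, tweet) pairs. Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for label, tweet in rows:
        by_label[label].append((label, tweet))

    # Size of the largest class; every other class is topped up to this size.
    target = max(len(group) for group in by_label.values())

    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap to the majority-class size.
        balanced.extend(rng.choices(group, k=target - len(group)))

    rng.shuffle(balanced)
    return balanced


# Placeholder rows using the class sizes stated in the README (not real tweets).
rows = [("Non_HS", f"placeholder tweet {i}") for i in range(453)]
rows += [("HS", f"placeholder tweet {i}") for i in range(260)]

balanced = oversample(rows)
counts = Counter(label for label, _ in balanced)
```

Resampling is usually applied to the training split only, after any train/test split, so that duplicated tweets do not leak into the evaluation data.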

The dataset may be used freely, but if you want to publish paper using the dataset, please cite this publication:
