Create dataset loader for IJELID (Indonesian-Javanese-English Code-Mixed Language Identification)


  NusaCatalogue: https://indonlp.github.io/nusa-catalogue/card.html?ijelid
  
  | Dataset     | ijelid |
  |-------------|--------|
  | Description | This is a clean version of code-mixed Indonesian-Javanese-English data for token-level language identification. We name this dataset IJELID (Indonesian-Javanese-English Language Identification). It contains tokenized tweets, with each token paired with its language label. There are seven language labels in the dataset: ID (Indonesian), JV (Javanese), EN (English), MIX_ID_EN (mixed Indonesian-English), MIX_ID_JV (mixed Indonesian-Javanese), MIX_JV_EN (mixed Javanese-English), and OTH (Other). |
  | License     | CC-BY 4.0 |
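
  As a starting point for the loader, the seven-label tag set from the description can be sketched as below. This is only a rough sketch: the on-disk format is an assumption (one tab-separated `token<TAB>label` pair per line, with a blank line between tweets), and the names `LABELS`, `LABEL2ID`, and `read_token_label_lines` are hypothetical. Check the actual files before relying on any of it.

```python
from typing import Iterable, Iterator, List, Tuple

# The seven language labels listed in the dataset description.
LABELS = ["ID", "JV", "EN", "MIX_ID_EN", "MIX_ID_JV", "MIX_JV_EN", "OTH"]
LABEL2ID = {label: idx for idx, label in enumerate(LABELS)}


def read_token_label_lines(lines: Iterable[str]) -> Iterator[Tuple[List[str], List[str]]]:
    """Group token/label lines into (tokens, labels) pairs, one pair per tweet.

    ASSUMPTION: each non-blank line is "token<TAB>label" and a blank line
    separates tweets. The real IJELID files may be laid out differently.
    """
    tokens: List[str] = []
    labels: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current tweet, if one is open.
            if tokens:
                yield tokens, labels
                tokens, labels = [], []
            continue
        # Split on the last tab so a token containing a tab stays intact.
        token, label = line.rsplit("\t", 1)
        label = label.strip()
        if label not in LABEL2ID:
            raise ValueError(f"Unknown language label: {label!r}")
        tokens.append(token)
        labels.append(label)
    # Emit the final tweet when the input does not end with a blank line.
    if tokens:
        yield tokens, labels
```

  Passing an open file handle (or any iterable of lines) yields one `(tokens, labels)` pair per tweet; `LABEL2ID` can then map the string labels to integer class ids for a sequence-labeling schema.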
  


Create dataset loader for IJELID (Indonesian-Javanese-English Code-Mixed Language Identification) #345


