A deep learning approach for the Romanized Tunisian dialect identification
Joint Authors
Yunus, Jihene
Ashur, Hadhemi
Suwaysi, Aminah
Ferchichi, Ahmad
Source
The International Arab Journal of Information Technology
Issue
Vol. 17, Issue 6 (30 Nov. 2020), pp. 935-946, 12 p.
Publisher
Zarqa University Deanship of Scientific Research
Publication Date
2020-11-30
Country of Publication
Jordan
No. of Pages
12
Main Subjects
Information Technology and Computer Science
Abstract EN
Language identification is an important task in natural language processing that consists of determining the language of a given text.
It has increasingly piqued the interest of researchers over the past few years, especially for informal, code-switched textual content.
This paper focuses on identifying the Romanized, user-generated Tunisian dialect on the social web.
We segmented and annotated a corpus extracted from social media and propose a deep learning approach for the identification task.
A Bidirectional Long Short-Term Memory neural network with Conditional Random Fields decoding (BLSTM-CRF) is used.
For word embeddings, we combine a word-character BLSTM vector representation with FastText embeddings, which take character n-gram features into account.
The overall accuracy obtained is 98.65%.
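The character n-gram features mentioned in the abstract can be illustrated with a minimal sketch. It assumes the FastText-style convention of wrapping each word in `<` and `>` boundary markers before slicing. The function name `char_ngrams`, the 3-to-6 n-gram range, and the sample word are illustrative assumptions; this record does not state the settings the authors used.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word`, FastText-style.

    The word is wrapped in '<' and '>' so that prefixes and suffixes
    are distinguishable from word-internal character sequences.
    The default n-gram range is an assumption, not taken from the paper.
    """
    marked = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n across the marked word.
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


# Illustrative Romanized token; only trigrams are requested here.
print(char_ngrams("barcha", min_n=3, max_n=3))
```

Because spelling in Romanized dialect text is not standardized, two spelling variants of the same word tend to share many of these subword units, which is the usual motivation for n-gram-based embeddings in this setting.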
American Psychological Association (APA)
Yunus, Jihene & Ashur, Hadhemi & Suwaysi, Aminah & Ferchichi, Ahmad. 2020. A deep learning approach for the Romanized Tunisian dialect identification. The International Arab Journal of Information Technology, Vol. 17, no. 6, pp. 935-946.
https://search.emarefa.net/detail/BIM-1434011
Modern Language Association (MLA)
Yunus, Jihene…[et al.]. A deep learning approach for the Romanized Tunisian dialect identification. The International Arab Journal of Information Technology, Vol. 17, no. 6 (Nov. 2020), pp. 935-946.
https://search.emarefa.net/detail/BIM-1434011
American Medical Association (AMA)
Yunus, Jihene & Ashur, Hadhemi & Suwaysi, Aminah & Ferchichi, Ahmad. A deep learning approach for the Romanized Tunisian dialect identification. The International Arab Journal of Information Technology. 2020. Vol. 17, no. 6, pp. 935-946.
https://search.emarefa.net/detail/BIM-1434011
Data Type
Journal Articles
Language
English
Notes
Includes bibliographical references : p. 943-946
Record ID
BIM-1434011