A deep learning approach for the Romanized Tunisian dialect identification

Co-authors

Yunus, Jihene
Ashur, Hadhemi
Suwaysi, Aminah
Ferchichi, Ahmad

Source

The International Arab Journal of Information Technology

Issue

Vol. 17, no. 6 (30 November 2020), pp. 935-946, 12 p.

Publisher

Zarqa University, Deanship of Scientific Research

Publication date

2020-11-30

Country of publication

Jordan

Number of pages

12

Main subjects

Information Technology and Computer Science

Abstract (EN)

Language identification is an important task in natural language processing that consists of determining the language of a given text.

It has increasingly piqued the interest of researchers over the past few years, especially for code-switched informal textual content.

This paper focuses on the identification of the Romanized user-generated Tunisian dialect on the social web.

We segment and annotate a corpus extracted from social media and propose a deep learning approach for the identification task.

A Bidirectional Long Short-Term Memory neural network with Conditional Random Fields decoding (BLSTM-CRF) is used.
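
To illustrate what CRF decoding adds on top of per-token scores, the following is a minimal Viterbi sketch in plain Python. It is not the authors' implementation: the tag names `TUN` and `OTHER`, the emission scores (standing in for BLSTM outputs), and the transition scores are all hypothetical values chosen for illustration.

```python
def viterbi_decode(emissions, transitions, tags):
    """Return the highest-scoring tag sequence for a non-empty token sequence.

    emissions:   list of {tag: score}, one dict per token (hypothetical BLSTM outputs)
    transitions: {(prev_tag, cur_tag): score} (hypothetical CRF transition scores)
    """
    # best[t][tag] = best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t-1][tag] = best previous tag for `tag` at position t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[(p, cur)])
            scores[cur] = best[-1][prev] + transitions[(prev, cur)] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical two-tag scheme and scores, for illustration only
tags = ["TUN", "OTHER"]
emissions = [
    {"TUN": 2.0, "OTHER": 0.5},
    {"TUN": 0.9, "OTHER": 1.0},  # taken alone, this token slightly prefers OTHER
    {"TUN": 1.5, "OTHER": 0.2},
]
transitions = {
    ("TUN", "TUN"): 0.5, ("TUN", "OTHER"): -0.5,
    ("OTHER", "TUN"): -0.5, ("OTHER", "OTHER"): 0.5,
}
print(viterbi_decode(emissions, transitions, tags))
```

With these made-up scores, picking the best tag for each token independently would label the middle token `OTHER`, while decoding the whole sequence keeps it `TUN` because the transition scores favor staying in the same tag.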

For word embeddings, we use a combination of word-character BLSTM vector representations and FastText embeddings, which take character n-gram features into account.
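
As a rough illustration of the character n-gram features mentioned here, the sketch below extracts FastText-style n-grams from a single token, using `<` and `>` as word-boundary markers. The function name `char_ngrams`, the n-gram range of 3 to 6, and the sample token are assumptions made for this example; the abstract does not report the settings actually used.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of `word` with boundary markers added."""
    marked = f"<{word}>"  # markers let prefixes and suffixes differ from inner n-grams
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


# Illustrative Romanized token, not taken from the paper's corpus
print(char_ngrams("barcha", 3, 3))
# ['<ba', 'bar', 'arc', 'rch', 'cha', 'ha>']
```

Because spelling in Romanized user-generated text is not standardized, variants of the same word tend to share many of these n-grams, which is one reason subword features are a common choice for this kind of input.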

The overall accuracy obtained is 98.65%.

American Psychological Association (APA) citation style

Yunus, Jihene & Ashur, Hadhemi & Suwaysi, Aminah & Ferchichi, Ahmad. 2020. A deep learning approach for the Romanized Tunisian dialect identification. The International Arab Journal of Information Technology, Vol. 17, no. 6, pp. 935-946.
https://search.emarefa.net/detail/BIM-1434011

Modern Language Association (MLA) citation style

Yunus, Jihene…[et al.]. A deep learning approach for the Romanized Tunisian dialect identification. The International Arab Journal of Information Technology Vol. 17, no. 6 (Nov. 2020), pp. 935-946.
https://search.emarefa.net/detail/BIM-1434011

American Medical Association (AMA) citation style

Yunus, Jihene & Ashur, Hadhemi & Suwaysi, Aminah & Ferchichi, Ahmad. A deep learning approach for the Romanized Tunisian dialect identification. The International Arab Journal of Information Technology. 2020. Vol. 17, no. 6, pp. 935-946.
https://search.emarefa.net/detail/BIM-1434011

Data type

Articles

Text language

English

Notes

Includes bibliographical references: p. 943-946

Record number

BIM-1434011