Challenges in building corpora for Algerian Arabic from CMC content

Co-authors

Bu Hania, Bashir
Umari, Muhammad

Source

El-Hakika (The Truth) Journal for Social And Human Sciences

Issue

Vol. 21, no. 4 (31 December 2022), pp. 594-617, 24 p.

Publisher

Ahmed Draia University

Publication date

2022-12-31

Country of publication

Algeria

Number of pages

24

Main subjects

Arabic language and literature

Abstract (English)

Algerian Arabic is an under-resourced Arabic dialect. Few corpora and natural language processing tools have been developed for it. This is due to a variety of factors, such as its lack of written content and of a standard orthography, as well as the frequent code-switching and script-switching exhibited by its speakers. These factors make developing homogeneous corpora for the dialect more challenging than for other Arabic dialects, where such factors are less pronounced. The objective of this work is to examine the challenges and issues encountered in developing a corpus of Algerian Arabic extracted from computer-mediated communication (CMC) content, primarily content on the social media platform Facebook and the story-publishing website Wattpad.

American Psychological Association (APA) citation style

Umari, Muhammad & Bu Hania, Bashir. 2022. Challenges in building corpora for Algerian Arabic from CMC content. El-Hakika (The Truth) Journal for Social And Human Sciences, Vol. 21, no. 4, pp. 594-617.
https://search.emarefa.net/detail/BIM-1467282

Modern Language Association (MLA) citation style

Umari, Muhammad & Bu Hania, Bashir. Challenges in building corpora for Algerian Arabic from CMC content. El-Hakika (The Truth) Journal for Social And Human Sciences Vol. 21, no. 4 (Dec. 2022), pp. 594-617.
https://search.emarefa.net/detail/BIM-1467282

American Medical Association (AMA) citation style

Umari, Muhammad & Bu Hania, Bashir. Challenges in building corpora for Algerian Arabic from CMC content. El-Hakika (The Truth) Journal for Social And Human Sciences. 2022. Vol. 21, no. 4, pp. 594-617.
https://search.emarefa.net/detail/BIM-1467282

Data type

Articles

Text language

English

Notes

Includes bibliographical references: p. 612-617.

Record number

BIM-1467282