Challenges in building corpora for Algerian Arabic from CMC content
Joint Authors
Bu Hania, Bashir
Umari, Muhammad
Source
El-Hakika (The Truth) Journal for Social And Human Sciences
Issue
Vol. 21, Issue 4 (31 Dec. 2022), pp. 594-617, 24 p.
Publisher
Publication Date
2022-12-31
Country of Publication
Algeria
No. of Pages
24
Main Subjects
Arabic language and Literature
Topics
- Social networking Web sites
- Algeria
- Computational linguistics
- Dialects
- Arabic language
- Facebook (Electronic resource)
Abstract EN
Algerian Arabic is an under-resourced Arabic dialect.
Few corpora and natural language processing tools have been developed for it.
This is due to a variety of factors, such as its lack of written content and of a standard orthography, as well as the frequent code-switching and script-switching exhibited by its speakers.
These factors make developing homogeneous corpora for the dialect more challenging than for other Arabic dialects, where such factors are less pronounced.
The objective of this work is to examine the challenges and issues encountered in developing a corpus of Algerian Arabic extracted from computer-mediated communication (CMC) content, primarily content on the social media platform Facebook and the story-publishing website Wattpad.
American Psychological Association (APA)
Umari, Muhammad & Bu Hania, Bashir. 2022. Challenges in building corpora for Algerian Arabic from CMC content. El-Hakika (The Truth) Journal for Social And Human Sciences, Vol. 21, no. 4, pp. 594-617.
https://search.emarefa.net/detail/BIM-1467282
Modern Language Association (MLA)
Umari, Muhammad & Bu Hania, Bashir. Challenges in building corpora for Algerian Arabic from CMC content. El-Hakika (The Truth) Journal for Social And Human Sciences Vol. 21, no. 4 (Dec. 2022), pp. 594-617.
https://search.emarefa.net/detail/BIM-1467282
American Medical Association (AMA)
Umari, Muhammad & Bu Hania, Bashir. Challenges in building corpora for Algerian Arabic from CMC content. El-Hakika (The Truth) Journal for Social And Human Sciences. 2022. Vol. 21, no. 4, pp. 594-617.
https://search.emarefa.net/detail/BIM-1467282
Data Type
Journal Articles
Language
English
Notes
Includes bibliographical references: p. 612-617
Record ID
BIM-1467282