An Arabic lemma-based stemmer for latent topic modeling
Joint Authors
Benyettou, Abd al-Qadir
Brahmi, Abd al-Razzaq
al-Sharif, Ahmad T.
Source
The International Arab Journal of Information Technology
Issue
Vol. 10, Issue 2 (31 Mar. 2013), 9 p.
Publisher
Publication Date
2013-03-31
Country of Publication
Jordan
No. of Pages
9
Main Subjects
Languages & Comparative Literature
Topics
Abstract EN
Development in Arabic information retrieval has not kept pace with the growing use of the Arabic Web over the last decade.
Semantic indexing in a language with rich inflectional morphology, such as Arabic, is not a trivial task and requires text analysis in the original language.
Apart from cross-language retrieval methods and a few limited studies, the main efforts to develop semantic analysis methods and topic modeling have not included Arabic text.
This paper describes our approach for analyzing semantics in Arabic texts.
A new lemma-based stemmer is developed and compared with a root-based one for characterizing Arabic text.
The Latent Dirichlet Allocation (LDA) model is adapted to extract Arabic latent topics from various real-world corpora.
In addition to the interesting subjects discovered in press articles from the 2007-2009 period, experiments show that classification performance in the topic space improves with lemma-based stemming compared to root-based stemming.
American Psychological Association (APA)
Brahmi, Abd al-Razzaq & al-Sharif, Ahmad T. & Benyettou, Abd al-Qadir. 2013. An Arabic lemma-based stemmer for latent topic modeling. The International Arab Journal of Information Technology, Vol. 10, no. 2.
https://search.emarefa.net/detail/BIM-311948
Modern Language Association (MLA)
Benyettou, Abd al-Qadir…[et al.]. An Arabic lemma-based stemmer for latent topic modeling. The International Arab Journal of Information Technology Vol. 10, no. 2 (Mar. 2013).
https://search.emarefa.net/detail/BIM-311948
American Medical Association (AMA)
Brahmi, Abd al-Razzaq & al-Sharif, Ahmad T. & Benyettou, Abd al-Qadir. An Arabic lemma-based stemmer for latent topic modeling. The International Arab Journal of Information Technology. 2013. Vol. 10, no. 2.
https://search.emarefa.net/detail/BIM-311948
Data Type
Journal Articles
Language
English
Notes
Includes bibliographical references.
Record ID
BIM-311948