An Arabic lemma-based stemmer for latent topic modeling

Co-authors

Benyettou, Abd al-Qadir
Brahmi, Abd al-Razzaq
al-Sharif, Ahmad T.

Source

The International Arab Journal of Information Technology

Issue

Vol. 10, no. 2 (31 March 2013), 9 p.

Publisher

Zarqa University

Publication date

2013-03-31

Country of publication

Jordan

Number of pages

9

Main subjects

Languages and Comparative Literature


Abstract (EN)

Development in Arabic information retrieval has not kept pace with the growing use of the Arabic Web over the last decade.

Semantic indexing in a language with highly inflectional morphology, such as Arabic, is not a trivial task and requires text analysis in the original language.

Apart from cross-language retrieval methods and a few limited studies, the main efforts to develop semantic analysis methods and topic modeling have not included Arabic text.

This paper describes our approach to analyzing semantics in Arabic texts.

A new lemma-based stemmer is developed and compared with a root-based one for characterizing Arabic text.

The Latent Dirichlet Allocation (LDA) model is adapted to extract Arabic latent topics from various real-world corpora.

In addition to the interesting subjects discovered in press articles from the 2007-2009 period, experiments show that classification performance in the topic space improves with lemma-based stemming compared with root-based stemming.

American Psychological Association (APA) citation style

Brahmi, Abd al-Razzaq & al-Sharif, Ahmad T. & Benyettou, Abd al-Qadir. 2013. An Arabic lemma-based stemmer for latent topic modeling. The International Arab Journal of Information Technology, Vol. 10, no. 2.
https://search.emarefa.net/detail/BIM-311948

Modern Language Association (MLA) citation style

Benyettou, Abd al-Qadir … [et al.]. An Arabic lemma-based stemmer for latent topic modeling. The International Arab Journal of Information Technology Vol. 10, no. 2 (Mar. 2013).
https://search.emarefa.net/detail/BIM-311948

American Medical Association (AMA) citation style

Brahmi, Abd al-Razzaq & al-Sharif, Ahmad T. & Benyettou, Abd al-Qadir. An Arabic lemma-based stemmer for latent topic modeling. The International Arab Journal of Information Technology. 2013. Vol. 10, no. 2.
https://search.emarefa.net/detail/BIM-311948

Data type

Articles

Text language

English

Notes

Includes bibliographical references.

Record number

BIM-311948