A Markovian approach for Arabic root extraction

Co-authors

Boudlal, Abd al-Rahman
Belahbib, Rashid
Lakhouaja, Abd al-Haqq
Mazroui, Izz al-Din
Mizyan, Abd al-ouafi
Bebah, Muhammad

Source

The International Arab Journal of Information Technology

Issue

Vol. 8, no. 1 (31 January 2011), pp. 91-98, 8 p.

Publisher

Zarqa University

Publication date

2011-01-31

Country of publication

Jordan

Number of pages

8

Main subjects

Information technology and computer science

Abstract (EN)

In this paper, we present an Arabic morphological analysis system that assigns to each word of an unvoweled Arabic sentence a single root, chosen according to the context.

The proposed system is composed of two modules.

The first performs an out-of-context analysis.

In this module, we segment each word of the sentence into its elementary morphological units in order to identify its possible roots.

For that, we adopt the segmentation of the word into three parts (prefix, stem, suffix).
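The three-part segmentation described above can be illustrated with a minimal sketch. The affix lists below are tiny, hypothetical inventories written in Buckwalter transliteration, and the function name `segmentations` and the `min_stem` threshold are assumptions made for illustration; the abstract does not specify the actual affix lists or how stems are mapped to roots.

```python
# Hypothetical, deliberately tiny affix inventories in Buckwalter
# transliteration; they are NOT the lists used in the paper.
PREFIXES = ("", "Al", "w", "b", "wAl")
SUFFIXES = ("", "p", "wn", "At", "hA")


def segmentations(word, min_stem=2):
    """Return every (prefix, stem, suffix) split allowed by the toy lists."""
    results = []
    for prefix in PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in SUFFIXES:
            if not rest.endswith(suffix):
                continue
            stem = rest[:len(rest) - len(suffix)]
            # Discard splits that leave an implausibly short stem.
            if len(stem) >= min_stem:
                results.append((prefix, stem, suffix))
    return results


# "wAlktAb" transliterates a word meaning "and the book".
for split in segmentations("wAlktAb"):
    print(split)
```

Each surviving stem would then be matched against patterns or a lexicon to propose candidate roots; that step is omitted here because the abstract gives no details about it.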

In the second module, we use the context to identify the correct root among all the possible roots of the word.

For this purpose, we use a hidden Markov model (HMM) approach, in which the observations are the words and the hidden states are their possible roots.
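To make the roles of observations and hidden states concrete, here is a minimal sketch of first-order Viterbi decoding restricted to each word's candidate roots. It is a generic textbook formulation, not necessarily the authors' exact model. The function name `viterbi_roots`, the probability tables, the `floor` fallback for unseen events, and the placeholder symbols `w1`, `w2`, `rA`, `rB`, `rC` are all assumptions made for illustration.

```python
import math


def viterbi_roots(words, candidates, start_p, trans_p, emit_p, floor=1e-6):
    """Pick the most probable root sequence, one root per word.

    candidates[w] lists the roots proposed out of context for word w.
    Probabilities missing from a table fall back to `floor`.
    """
    def lg(table, key):
        return math.log(table.get(key, floor))

    # best[r] = (log-probability, path) of the best path ending in root r.
    best = {
        r: (lg(start_p, r) + lg(emit_p, (r, words[0])), [r])
        for r in candidates[words[0]]
    }
    for word in words[1:]:
        step = {}
        for r in candidates[word]:
            # Best predecessor for root r, scored by the transition into r.
            score, path = max(
                (prev_score + lg(trans_p, (prev, r)), prev_path)
                for prev, (prev_score, prev_path) in best.items()
            )
            step[r] = (score + lg(emit_p, (r, word)), path + [r])
        best = step
    return max(best.values())[1]


# Made-up numbers: word w1 is ambiguous between roots rA and rB,
# and only the transition into rC separates the two readings.
words = ["w1", "w2"]
candidates = {"w1": ["rA", "rB"], "w2": ["rC"]}
start_p = {"rA": 0.5, "rB": 0.5}
trans_p = {("rA", "rC"): 0.1, ("rB", "rC"): 0.6}
emit_p = {("rA", "w1"): 0.5, ("rB", "w1"): 0.5, ("rC", "w2"): 1.0}

result = viterbi_roots(words, candidates, start_p, trans_p, emit_p)
print(result)  # ['rB', 'rC']
```

In this toy case the two candidate roots of `w1` are equally likely in isolation, so the choice is made entirely by the following word, which is the kind of contextual disambiguation the second module is described as performing.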

We validate the approach using the NEMLAR Arabic Written Corpus, which consists of 500,000 words.

The system returns the correct root for more than 98% of the words in the training set and for almost 94% of the words in the test set.

American Psychological Association (APA) citation style

Boudlal, Abd al-Rahman & Belahbib, Rashid & Lakhouaja, Abd al-Haqq & Mazroui, Izz al-Din & Mizyan, Abd al-ouafi & Bebah, Muhammad. 2011. A Markovian approach for Arabic root extraction. The International Arab Journal of Information Technology, Vol. 8, no. 1, pp. 91-98.
https://search.emarefa.net/detail/BIM-244543

Modern Language Association (MLA) citation style

Boudlal, Abd al-Rahman…[et al.]. A Markovian approach for Arabic root extraction. The International Arab Journal of Information Technology Vol. 8, no. 1 (Jan. 2011), pp. 91-98.
https://search.emarefa.net/detail/BIM-244543

American Medical Association (AMA) citation style

Boudlal, Abd al-Rahman & Belahbib, Rashid & Lakhouaja, Abd al-Haqq & Mazroui, Izz al-Din & Mizyan, Abd al-ouafi & Bebah, Muhammad. A Markovian approach for Arabic root extraction. The International Arab Journal of Information Technology. 2011. Vol. 8, no. 1, pp. 91-98.
https://search.emarefa.net/detail/BIM-244543

Data type

Articles

Text language

English

Notes

Includes bibliographical references: p. 96-98.

Record number

BIM-244543