Contextual text categorization : an improved stemming algorithm to increase the quality of categorization in Arabic text

Co-authors

Qadri, Said
Musawi, Abd al-Wahhab

Source

The International Arab Journal of Information Technology

Issue

Vol. 14, no. 6 (30 November 2017), 7 p.

Publisher

Zarqa University

Publication date

2017-11-30

Country of publication

Jordan

Number of pages

7

Main subjects

Information Technology and Computer Science

Abstract (EN)

One method used to reduce the size of the term vocabulary in Arabic text categorization is to replace the different variants (forms) of a word with their common root.

This process is called stemming based on root extraction.

However, extracting the root of a word is more difficult in Arabic than in other languages, because Arabic is a very rich language with a complex morphology and a very different structure.

Many algorithms have been proposed in this field.

Some are based on morphological rules and grammatical patterns; they are therefore quite difficult to apply and require deep linguistic knowledge.

Others are statistical; they are less difficult and rely only on calculations.

In this paper, we propose an improved stemming algorithm based on root extraction and the n-gram technique, which returns the stems of Arabic words without using any morphological rules or grammatical patterns.
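
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea of n-gram-based root matching, not the authors' method. It assumes contiguous character bigrams, the Dice coefficient as the similarity measure, and a small hand-picked list of candidate roots; the function names (`char_ngrams`, `dice`, `best_root`) and the sample words are hypothetical and chosen for illustration.

```python
from typing import Iterable, Set


def char_ngrams(word: str, n: int = 2) -> Set[str]:
    """Return the set of contiguous character n-grams of `word`."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def dice(a: Set[str], b: Set[str]) -> float:
    """Dice coefficient between two n-gram sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def best_root(word: str, roots: Iterable[str], n: int = 2) -> str:
    """Pick the candidate root whose n-grams are most similar to the word's.

    No morphological rules are used: the choice rests only on n-gram overlap.
    Ties are resolved in favor of the first candidate in `roots`.
    """
    word_grams = char_ngrams(word, n)
    return max(roots, key=lambda r: dice(word_grams, char_ngrams(r, n)))


# Assumed toy root list (k-t-b "write", d-r-s "study", 3-l-m "know").
ROOTS = ["كتب", "درس", "علم"]

if __name__ == "__main__":
    for w in ["مكتوب", "مدرسة", "معلم"]:
        print(w, "->", best_root(w, ROOTS))
```

Contiguous bigrams are the simplest choice, but they can miss roots whose letters are separated by infixes in the derived word; a fuller approach would need a richer n-gram definition or a larger root lexicon.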

American Psychological Association (APA) citation style

Qadri, Said & Musawi, Abd al-Wahhab. 2017. Contextual text categorization : an improved stemming algorithm to increase the quality of categorization in Arabic text. The International Arab Journal of Information Technology, Vol. 14, no. 6.
https://search.emarefa.net/detail/BIM-853083

Modern Language Association (MLA) citation style

Qadri, Said & Musawi, Abd al-Wahhab. Contextual text categorization : an improved stemming algorithm to increase the quality of categorization in Arabic text. The International Arab Journal of Information Technology Vol. 14, no. 6 (Nov. 2017).
https://search.emarefa.net/detail/BIM-853083

American Medical Association (AMA) citation style

Qadri, Said & Musawi, Abd al-Wahhab. Contextual text categorization : an improved stemming algorithm to increase the quality of categorization in Arabic text. The International Arab Journal of Information Technology. 2017. Vol. 14, no. 6.
https://search.emarefa.net/detail/BIM-853083

Data type

Articles

Text language

English

Notes

Includes bibliographical references

Record number

BIM-853083