An intelligent system for Arabic text categorization

Co-authors

Siyam, M. M.
Fayid, Z. T.
Habib, M. B.

Source

International Journal of Intelligent Computing and Information Sciences

Issue

Vol. 6, no. 1 (31 January 2006), 19 p.

Publisher

Ain Shams University, Faculty of Computer and Information Sciences

Publication date

2006-01-31

Country of publication

Egypt

Number of pages

19

Main subjects

Information Technology and Computer Science

Topics

Abstract (EN)

Text categorization (classification) is the process of assigning documents to a predefined set of categories based on their content.

In this paper, an intelligent Arabic text categorization system is presented.

The system is built on machine learning algorithms.

Several stemming and feature selection algorithms are evaluated.

Moreover, each document is represented using several term weighting schemes, and finally the k-nearest neighbor and Rocchio classifiers are used for the classification process.
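The weighting and classification steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy English tokens, the choice of cosine similarity, and `k=3` are all assumptions made for illustration, and the Rocchio classifier is shown in its simplified centroid form, without a negative-example term.

```python
import math
from collections import Counter, defaultdict

def weigh(doc, idf):
    """L2-normalized tf-idf vector (a dict) for one tokenized document."""
    tf = Counter(t for t in doc if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}

def tfidf_vectors(docs):
    """Return normalized tf-idf vectors for all docs, plus the idf table."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    idf = {t: math.log(n / d) for t, d in df.items()}
    return [weigh(doc, idf) for doc in docs], idf

def cosine(a, b):
    """Dot product of two unit-length sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def rocchio_train(vectors, labels):
    """One normalized centroid per class (simplified Rocchio, assumed form)."""
    sums = defaultdict(lambda: defaultdict(float))
    for vec, lab in zip(vectors, labels):
        acc = sums[lab]
        for t, w in vec.items():
            acc[t] += w
    centroids = {}
    for lab, acc in sums.items():
        norm = math.sqrt(sum(w * w for w in acc.values()))
        centroids[lab] = {t: w / norm for t, w in acc.items()} if norm else {}
    return centroids

def rocchio_predict(centroids, vec):
    """Assign the class whose centroid is most similar to the document."""
    return max(centroids, key=lambda lab: cosine(centroids[lab], vec))

def knn_predict(vectors, labels, vec, k=3):
    """Majority vote among the k most similar training documents."""
    ranked = sorted(zip(vectors, labels),
                    key=lambda pair: cosine(pair[0], vec), reverse=True)
    votes = Counter(lab for _, lab in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy corpus (hypothetical data, English tokens for readability).
train_docs = [["match", "goal", "team"], ["team", "player", "goal"],
              ["market", "stock", "price"], ["price", "bank", "market"]]
train_labels = ["sports", "sports", "economy", "economy"]

vectors, idf = tfidf_vectors(train_docs)
centroids = rocchio_train(vectors, train_labels)
query = weigh(["goal", "player"], idf)
print(rocchio_predict(centroids, query), knn_predict(vectors, train_labels, query))
```

Rocchio compares a document against one prototype per class, while k-nearest neighbor compares it against every training document, so the two differ mainly in prediction cost and in sensitivity to outlying training examples.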

Experiments are performed on a self-collected corpus, and the results show that the suggested hybrid method of statistical and light stemmers is the most suitable stemming algorithm for the Arabic language.
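As a rough illustration of the light-stemming half of such a hybrid, the sketch below strips at most one common prefix and one common suffix from an Arabic word. The affix lists, the `min_len` threshold, and the function name `light_stem` are assumptions drawn from commonly used Arabic light-stemming conventions, not from the paper, and the statistical component of the hybrid is not reproduced here.

```python
# Assumed affix lists; the paper's own lists are not given in the abstract.
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "لل", "و")
SUFFIXES = ("ها", "ان", "ات", "ون", "ين", "يه", "ية", "ه", "ة", "ي")

def light_stem(word, min_len=3):
    """Remove at most one prefix and one suffix, longest match first,
    keeping at least min_len characters of the word."""
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word

for w in ["الكتاب", "والمدرسة", "معلمون"]:
    print(w, "->", light_stem(w))
```

The length guard keeps short words from being stripped down to one or two letters, which is the usual motivation for "light" rather than root-based stemming.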

The results also show that a hybrid approach of document frequency and information gain is the preferable feature selection criterion, and that normalized tf-idf is the best weighting scheme.
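The abstract does not say how document frequency and information gain are combined. One plausible reading, sketched below purely as an assumption, is to discard terms below a document-frequency threshold and then rank the survivors by information gain, $IG(t) = H(C) - [P(t)\,H(C \mid t) + P(\bar{t})\,H(C \mid \bar{t})]$. The names `min_df` and `top_k` and the toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy (bits) of a class-count distribution."""
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

def information_gain(docs, labels, term):
    """Reduction in class entropy from knowing whether `term` occurs."""
    with_t = Counter(lab for doc, lab in zip(docs, labels) if term in doc)
    without_t = Counter(lab for doc, lab in zip(docs, labels) if term not in doc)
    n = len(docs)
    n_with = sum(with_t.values())
    conditional = ((n_with / n) * entropy(with_t.values())
                   + ((n - n_with) / n) * entropy(without_t.values()))
    return entropy(Counter(labels).values()) - conditional

def select_features(docs, labels, min_df=2, top_k=100):
    """Assumed hybrid: filter by document frequency, then rank by IG."""
    df = Counter(t for doc in docs for t in set(doc))
    candidates = [t for t, d in df.items() if d >= min_df]
    candidates.sort(key=lambda t: (-information_gain(docs, labels, t), t))
    return candidates[:top_k]

# Toy corpus (hypothetical data).
docs = [["goal", "team", "the"], ["goal", "player", "the"],
        ["market", "price", "the"], ["price", "bank", "the"]]
labels = ["sports", "sports", "economy", "economy"]
print(select_features(docs, labels, min_df=2, top_k=2))
```

In this toy corpus the document-frequency filter removes rare terms, and information gain then pushes a term that appears in every document to the bottom of the ranking because it carries no class information.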

Finally, the Rocchio classifier has an advantage over the k-nearest neighbor classifier in the classification process.

The experimental results illustrate that the proposed model is efficient and achieves a generalization accuracy of about 98%.

American Psychological Association (APA) citation style

Siyam, M. M., Fayid, Z. T., & Habib, M. B. (2006). An intelligent system for Arabic text categorization. International Journal of Intelligent Computing and Information Sciences, Vol. 6, no. 1.
https://search.emarefa.net/detail/BIM-284442

Modern Language Association (MLA) citation style

Siyam, M. M. … [et al.]. An intelligent system for Arabic text categorization. International Journal of Intelligent Computing and Information Sciences, Vol. 6, no. 1 (Jan. 2006).
https://search.emarefa.net/detail/BIM-284442

American Medical Association (AMA) citation style

Siyam, M. M., Fayid, Z. T., Habib, M. B. An intelligent system for Arabic text categorization. International Journal of Intelligent Computing and Information Sciences. 2006. Vol. 6, no. 1.
https://search.emarefa.net/detail/BIM-284442

Data type

Articles

Text language

English

Notes

Includes bibliographical references.

Record ID

BIM-284442