Direct text classifier for thematic Arabic discourse documents

Co-authors

al-Khatib, Raid
al-Shunnaq, Muawiyah
Daradikah, Muhammad
Malkawi, Rami
Nahar, Khalid

Source

The International Arab Journal of Information Technology

Issue

Vol. 17, No. 3 (31 May 2020), pp. 394-403, 10 p.

Publisher

Zarqa University, Deanship of Scientific Research

Publication date

2020-05-31

Country of publication

Jordan

Number of pages

10

Main subjects

Information Technology and Computer Science

Abstract (EN)

Maintaining topical coherence while writing a discourse is a major challenge for novice and experienced writers alike.

This challenge is even greater in Arabic discourse because of the complex morphology and the prevalence of synonyms in the Arabic language.

In this research, we present a direct classification of Arabic discourse documents during writing.

The proposed prescriptive framework consists of the following stages: data collection, pre-processing, construction of a Language Model (LM), topic identification, topic classification, and topic notification.

To demonstrate the proposed framework, we designed a system and applied it to a corpus of 2800 Arabic discourse documents synthesized into four predefined topics: Culture, Economy, Sport, and Religion.
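The stages listed above can be illustrated with a minimal Python sketch. It is not the authors' system: the abstract does not specify the pre-processing steps or the form of the Language Model, so the diacritic stripping, the add-one-smoothed unigram model, the two-topic toy corpus `TRAIN`, and the function names `preprocess`, `build_models`, `classify`, and `notify` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus (assumption); the paper's 2800-document corpus
# and its four topics are not reproduced here.
TRAIN = {
    "Sport": ["فاز الفريق في المباراة", "سجل اللاعب هدفا في المباراة"],
    "Economy": ["ارتفع سعر النفط في السوق", "البنك يرفع سعر الفائدة"],
}

# Arabic short-vowel marks (U+064B..U+0652) and tatweel (U+0640).
DIACRITICS = re.compile("[\u064B-\u0652\u0640]")


def preprocess(text):
    """Illustrative pre-processing: strip diacritics, unify alef forms, tokenize."""
    text = DIACRITICS.sub("", text)
    # Map alef with madda/hamza (U+0622, U+0623, U+0625) to bare alef (U+0627).
    text = re.sub("[\u0622\u0623\u0625]", "\u0627", text)
    return re.findall("[\u0621-\u064A]+", text)


def build_models(train):
    """One unigram count table per topic (a stand-in for the paper's LM)."""
    return {
        topic: Counter(tok for doc in docs for tok in preprocess(doc))
        for topic, docs in train.items()
    }


def classify(text, models):
    """Return the topic whose add-one-smoothed unigram model scores the text highest."""
    vocab = set().union(*models.values())
    tokens = preprocess(text)

    def score(counts):
        # The extra +1 reserves probability mass for unseen tokens.
        total = sum(counts.values()) + len(vocab) + 1
        return sum(math.log((counts[t] + 1) / total) for t in tokens)

    return max(models, key=lambda topic: score(models[topic]))


def notify(intended_topic, text_so_far, models):
    """Return a warning when the draft drifts from the intended topic, else None."""
    predicted = classify(text_so_far, models)
    if predicted != intended_topic:
        return f"Draft looks like '{predicted}', not '{intended_topic}'"
    return None
```

In this sketch, `notify` would be called on the partial text each time the writer finishes a sentence, which is one plausible reading of classification "while writing"; the paper itself may trigger notification differently.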

System performance was analysed in terms of accuracy, recall, precision, and F-measure.

The results demonstrated that the proposed topic modeling-based decision framework can classify topics while a discourse is being written, with an accuracy of 91.0%.
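For readers unfamiliar with the four measures named above, the following Python sketch shows how they are conventionally computed for a multi-class classifier. The function name `evaluate` and the sample labels in the usage note are hypothetical; the abstract does not state whether the authors averaged per-class scores (macro or micro), so the sketch simply reports per-class values alongside overall accuracy.

```python
from collections import Counter


def evaluate(y_true, y_pred):
    """Return overall accuracy and per-class (precision, recall, F-measure)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        report[label] = (precision, recall, f_measure)

    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, report
```

As a usage example with made-up labels, `evaluate(["Sport", "Sport", "Economy", "Culture"], ["Sport", "Economy", "Economy", "Culture"])` gives an accuracy of 0.75, with precision 1.0 and recall 0.5 for Sport. These numbers are illustrative only and are unrelated to the 91.0% reported in the paper.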

American Psychological Association (APA) citation style

Nahar, Khalid & al-Khatib, Raid & al-Shunnaq, Muawiyah & Daradikah, Muhammad & Malkawi, Rami. 2020. Direct text classifier for thematic Arabic discourse documents. The International Arab Journal of Information Technology, Vol. 17, no. 3, pp. 394-403.
https://search.emarefa.net/detail/BIM-962353

Modern Language Association (MLA) citation style

Nahar, Khalid…[et al.]. Direct text classifier for thematic Arabic discourse documents. The International Arab Journal of Information Technology Vol. 17, no. 3 (May 2020), pp. 394-403.
https://search.emarefa.net/detail/BIM-962353

American Medical Association (AMA) citation style

Nahar, Khalid & al-Khatib, Raid & al-Shunnaq, Muawiyah & Daradikah, Muhammad & Malkawi, Rami. Direct text classifier for thematic Arabic discourse documents. The International Arab Journal of Information Technology. 2020. Vol. 17, no. 3, pp. 394-403.
https://search.emarefa.net/detail/BIM-962353

Data type

Articles

Text language

English

Notes

Includes bibliographical references: p. 400-402

Record number

BIM-962353