Automatic Arabic text categorization using efficient classification techniques

Other titles

التصنيف التلقائي للنصوص العربية باستخدام تقنيات التصنيف ذات الكفاءة

Thesis author

al-Awadi, Muhammad Mahmud

Thesis advisor

Hammad, Mustafa Muhammad

Committee members

al-Hasanat, Ahmad Bashir
al-Maani, Mudir Musa
al-Hammuri, Awni Mansur

University

Mutah University

Faculty

Faculty of Information Technology

University country

Jordan

Degree

Master's

Degree date

2015

English abstract

Arabic is a complex language that requires special treatment.

However, most previous studies have used statistical methods for Arabic text classification, and these methods neglect the meaning of terms.

First, we built an Arabic database and made it freely available for research on the Arabic language. We then designed a framework for preprocessing Arabic text, which consists of multiple steps and modeling techniques, such as stop-word removal and stemming, to improve the results of Arabic text categorization.
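
A preprocessing pipeline of this general shape can be sketched as follows. This is only an illustration, not the framework or the hybrid stemmer described in the thesis: the stop-word list, the affix lists, and the function names (`normalize`, `light_stem`, `preprocess`) are all assumptions made for the example, and the stemmer is a simple affix-stripping ("light") stemmer.

```python
import re

# Hypothetical, very small stop-word list (illustration only).
STOP_WORDS = {"في", "من", "على", "إلى", "عن", "و", "أن", "هذا", "هذه"}

# Illustrative affix lists; real light stemmers use larger, tuned lists.
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "لل")
SUFFIXES = ("ات", "ون", "ين", "ها", "ية", "ة")

# Short-vowel diacritics (U+064B..U+0652) and tatweel (U+0640).
DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")


def normalize(token):
    """Strip diacritics and tatweel, and unify hamzated alef forms."""
    token = DIACRITICS.sub("", token)
    return re.sub("[أإآ]", "ا", token)


# Normalize the stop words too, so they match normalized tokens.
NORMALIZED_STOP_WORDS = {normalize(w) for w in STOP_WORDS}


def light_stem(token):
    """Remove at most one prefix and one suffix, keeping >= 3 letters."""
    for prefix in PREFIXES:
        if token.startswith(prefix) and len(token) - len(prefix) >= 3:
            token = token[len(prefix):]
            break
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            token = token[:-len(suffix)]
            break
    return token


def preprocess(text):
    """Tokenize, normalize, drop stop words, then stem."""
    tokens = re.findall(r"[\u0621-\u0652]+", text)
    tokens = [normalize(t) for t in tokens]
    tokens = [t for t in tokens if t not in NORMALIZED_STOP_WORDS]
    return [light_stem(t) for t in tokens]


if __name__ == "__main__":
    print(preprocess("ذهب الطلاب إلى المدرسة في الصباح"))
```

Normalizing before stop-word filtering is a design choice in this sketch: it lets one stop-word entry match several spelling variants of the same word.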

This thesis focuses on semantic techniques and proposes a hybrid stemmer for the Arabic language.

Various techniques are used to implement Arabic text classification and to verify our hybrid stemmer.

These techniques include latent semantic analysis (LSA) and five machine learning approaches.

LSA is used to reduce dimensionality in order to improve the accuracy of categorization systems.
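
As a reminder of how LSA reduces dimensionality in its usual textbook formulation (the symbols below are assumptions for illustration and are not taken from the thesis): let $A$ be an $m \times n$ term-document matrix and $k$ the number of retained dimensions.

```latex
% Full SVD of the term-document matrix
A = U \Sigma V^{\top}

% Rank-k approximation: keep only the k largest singular values
A_k = U_k \Sigma_k V_k^{\top}, \qquad k \ll \min(m, n)

% Projection of a document vector d (length m) into the k-dimensional space
\hat{d} = \Sigma_k^{-1} U_k^{\top} d
```

Documents are then compared in the $k$-dimensional space rather than in the original $m$-dimensional term space.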

The experimental results showed the effectiveness of our Arabic stemmer in terms of classification accuracy and speed.

The best performance was achieved by combining Singular Value Decomposition (SVD) with the cosine similarity measure and Manhattan distance. Finally, we experimentally compared Hassanat's distance with Euclidean distance, Manhattan distance, and cosine distance to choose the best way to calculate the similarity between vectors under five text representations.
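
The four measures named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis's code: the function names are hypothetical, and the Hassanat distance follows its commonly published per-dimension definition, which may differ in detail from the variant used in the experiments.

```python
import math


def euclidean(a, b):
    """Square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity; defined here as 1.0 if either vector is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def hassanat(a, b):
    """Sum of per-dimension terms, each bounded in [0, 1)."""
    total = 0.0
    for x, y in zip(a, b):
        lo, hi = min(x, y), max(x, y)
        if lo >= 0:
            total += 1.0 - (1.0 + lo) / (1.0 + hi)
        else:
            # Shift both values by |lo| so the smaller one becomes zero.
            shift = abs(lo)
            total += 1.0 - (1.0 + lo + shift) / (1.0 + hi + shift)
    return total


if __name__ == "__main__":
    u, v = [1, 0, 2], [0, 1, 2]
    for name, fn in [("euclidean", euclidean), ("manhattan", manhattan),
                     ("cosine", cosine_distance), ("hassanat", hassanat)]:
        print(name, fn(u, v))
```

Because each Hassanat term is bounded, a single dimension with a very large difference cannot dominate the total, unlike in the Euclidean and Manhattan distances.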

Main subjects

Information Technology and Computer Science

Number of pages

109

Table of contents

Table of contents.

Abstract.

Abstract in Arabic.

Chapter One: Introduction.

Chapter Two: The background.

Chapter Three: The proposed Arabic stemmer: Arabic text preprocessing.

Chapter Four: Experiments and results.

References.

American Psychological Association (APA) citation style

al-Awadi, Muhammad Mahmud. (2015). Automatic Arabic text categorization using efficient classification techniques (Master's thesis). Mutah University, Jordan.
https://search.emarefa.net/detail/BIM-729773

Modern Language Association (MLA) citation style

al-Awadi, Muhammad Mahmud. Automatic Arabic text categorization using efficient classification techniques. Master's thesis. Mutah University, 2015.
https://search.emarefa.net/detail/BIM-729773

American Medical Association (AMA) citation style

al-Awadi, Muhammad Mahmud. (2015). Automatic Arabic text categorization using efficient classification techniques (Master's thesis). Mutah University, Jordan.
https://search.emarefa.net/detail/BIM-729773

Text language

English

Data type

Theses

Record ID

BIM-729773