Comparison some of Arabic text classification techniques using a multinomial mixture model

Other Title(s)

مقارنة بعض تقنيات تصنيف النصوص العربية باستخدام نموذج خليط متعدد الحدود

Dissertant

Hasan, Siham Abd al-Hadi

Thesis advisor

Kanan, Ghassan

Committee Members

al-Shalabi, Riyad F.
Dabbas, Umar

University

Amman Arab University

Faculty

College of Computer Sciences and Informatics

Department

Department of Computer Science

University Country

Jordan

Degree

Master

Degree Date

2014

English Abstract

Text Classification (TC) assigns documents to one or more predefined categories based on their contents.

This project compares three automatic TC techniques, the Rocchio, K-Nearest Neighbor (KNN), and Naïve Bayes (NB) classifiers, using a multinomial mixture model (MMM) on the Arabic language.

To evaluate these techniques using the MMM, an Arabic TC corpus of 1445 Arabic documents is used; the documents are classified into nine categories: Computer, Economics, Education, Sport, Politics, Engineer, Medicine, Law, and Religion.

The main goal of this project is to compare several automatic text classification techniques using a multinomial mixture model on the Arabic language.

The classification effectiveness has been compared with the SVM model.

This model was applied in another project that used the same traditional classifiers and the same collection.

Moreover, the experimental results are presented in terms of macro-averaged precision, macro-averaged recall, and macro-averaged F1 measures.
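
As an illustration of the macro-averaged measures mentioned here, the following is a minimal sketch, not the thesis's own evaluation code. It assumes single-label classification and takes macro-F1 as the mean of the per-category F1 scores (one common convention; the abstract does not say which one was used). The function name `macro_scores` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_scores(gold, predicted):
    """Return (macro_precision, macro_recall, macro_f1).

    Each measure is computed per category and then averaged with equal
    weight per category, regardless of how many documents it contains.
    """
    labels = sorted(set(gold))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1      # correct assignment to category g
        else:
            fp[p] += 1      # p was assigned but is wrong
            fn[g] += 1      # g was the true category but was missed

    precisions, recalls, f1s = [], [], []
    for c in labels:
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy data with two of the nine category names (illustrative only).
gold = ["Sport", "Sport", "Law", "Law"]
predicted = ["Sport", "Law", "Law", "Law"]
macro_p, macro_r, macro_f1 = macro_scores(gold, predicted)
```

Because every category counts equally, a small category such as Law influences the macro average as much as a large one, which is why macro-averaging is often reported for corpora with uneven category sizes.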

Furthermore, the results reveal that Naïve Bayes using the MMM works best for Arabic TC tasks and outperforms the KNN and Rocchio classifiers.
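
For readers unfamiliar with the approach, the sketch below shows a multinomial Naïve Bayes classifier in its textbook form. It is an assumption that the thesis's MMM corresponds to this standard multinomial event model (documents as bags of word counts); the add-one smoothing parameter `alpha`, the function names, and the English toy tokens are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed log word likelihoods per category.

    docs   -- list of token lists (already tokenized documents)
    labels -- list of category names, one per document
    alpha  -- additive (Laplace) smoothing constant, an assumed default
    """
    vocab = {w for tokens in docs for w in tokens}
    docs_per_class = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)

    model = {}
    for c, n_docs in docs_per_class.items():
        denom = sum(word_counts[c].values()) + alpha * len(vocab)
        log_prior = math.log(n_docs / len(docs))
        log_lik = {w: math.log((word_counts[c][w] + alpha) / denom)
                   for w in vocab}
        model[c] = (log_prior, log_lik)
    return model


def predict(model, tokens):
    """Return the category with the highest posterior log score.

    Tokens not seen during training are ignored.
    """
    def score(c):
        log_prior, log_lik = model[c]
        return log_prior + sum(log_lik[w] for w in tokens if w in log_lik)
    return max(model, key=score)


# Toy training set (illustrative only; the real corpus is Arabic text).
train_docs = [["goal", "match", "team"], ["match", "player"],
              ["court", "judge"], ["law", "court", "judge"]]
train_labels = ["Sport", "Sport", "Law", "Law"]
model = train_multinomial_nb(train_docs, train_labels)
sport_guess = predict(model, ["match", "team"])
law_guess = predict(model, ["judge", "court"])
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied for long documents.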

Main Subjects

Mathematics

No. of Pages

63

Table of Contents

Table of contents.

Abstract.

Abstract in Arabic.

Chapter One: Introduction.

Chapter Two: Literature review.

Chapter Three: Methodology.

Chapter Four: Experiments and evaluation.

Chapter Five: Conclusion and future work.

References.

American Psychological Association (APA)

Hasan, Siham Abd al-Hadi. (2014). Comparison some of Arabic text classification techniques using a multinomial mixture model (Master's thesis). Amman Arab University, Jordan.
https://search.emarefa.net/detail/BIM-561894

Modern Language Association (MLA)

Hasan, Siham Abd al-Hadi. Comparison some of Arabic text classification techniques using a multinomial mixture model. Master's thesis. Amman Arab University. (2014).
https://search.emarefa.net/detail/BIM-561894

American Medical Association (AMA)

Hasan, Siham Abd al-Hadi. (2014). Comparison some of Arabic text classification techniques using a multinomial mixture model (Master's thesis). Amman Arab University, Jordan.
https://search.emarefa.net/detail/BIM-561894

Language

English

Data Type

Arab Theses

Record ID

BIM-561894