Arabic text classification techniques using the multivariate Bernoulli model
Other Title(s)
تقنيات تصنيف النصوص العربية باستخدام نموذج متعدد المتغيرات برنولي
Dissertant
Thesis advisor
Committee Members
al-Dabbas, Umar Ṣuhaib
al-Hamami, Ala Husayn
University
Amman Arab University
Faculty
College of Computer Sciences and Informatics
Department
Department of Computer Science
University Country
Jordan
Degree
Master
Degree Date
2013
English Abstract
Document classification is currently one of the most important areas of information retrieval.
It aims to map text documents to one or more predefined classes or categories based on the keywords they contain.
This research study focuses on the problem of Arabic text classification using Naïve Bayes (NB) and the Multivariate Bernoulli Model (MBM). The NB and MBM classifiers are compared with the k-nearest neighbor (k-NN) and Rocchio classifiers.
Experiments were conducted on a corpus of more than 1445 Arabic documents classified into nine categories.
The research evaluates these techniques using the standard measures of recall, precision, and F-measure as the basis of comparison.
The experiments show that the NB classifier using MBM is highly effective.
It outperformed the k-NN and Rocchio classifiers.
MBM macro-precision and macro-recall reached 0.86 and 0.831, respectively.
The NB classifier using MBM has the best precision.
The Rocchio classifier comes in second place.
The worst classifier on this data set was k-NN. The results could be slightly better if the number of documents were increased to 5070.
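The multivariate Bernoulli model named in the abstract represents each document as a binary vector of term presence or absence, and the reported scores are macro-averages over classes. The following is a minimal Python sketch of both ideas, not the thesis's implementation. The function names, the Laplace smoothing, and the toy English tokens are assumptions made for illustration, and no Arabic preprocessing is shown.

```python
import math
from collections import defaultdict


def train_bernoulli_nb(docs, labels):
    """Fit a multivariate Bernoulli Naive Bayes model.

    docs   : list of token lists (only presence/absence of a term is used)
    labels : list of class labels, one per document
    """
    vocab = sorted({t for tokens in docs for t in tokens})
    n_docs = defaultdict(int)                    # documents per class
    df = defaultdict(lambda: defaultdict(int))   # df[c][t]: docs of class c containing t
    for tokens, c in zip(docs, labels):
        n_docs[c] += 1
        for t in set(tokens):                    # count each term once per document
            df[c][t] += 1
    prior = {c: n / len(docs) for c, n in n_docs.items()}
    # Assumed Laplace smoothing: P(t | c) = (df + 1) / (N_c + 2)
    cond = {c: {t: (df[c][t] + 1) / (n_docs[c] + 2) for t in vocab}
            for c in n_docs}
    return vocab, prior, cond


def predict(model, tokens):
    """Return the class with the highest log-posterior for one document."""
    vocab, prior, cond = model
    present = set(tokens)
    best, best_score = None, -math.inf
    for c in prior:
        score = math.log(prior[c])
        for t in vocab:
            p = cond[c][t]
            # Bernoulli model: absent terms also contribute, via (1 - p)
            score += math.log(p if t in present else 1.0 - p)
        if score > best_score:
            best, best_score = c, score
    return best


def macro_precision_recall(gold, pred):
    """Average per-class precision and recall, weighting each class equally."""
    classes = sorted(set(gold))
    precisions, recalls = [], []
    for c in classes:
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn = sum(1 for g, p in zip(gold, pred) if g == c and p != c)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return sum(precisions) / len(classes), sum(recalls) / len(classes)


# Toy data (hypothetical, two classes instead of the thesis's nine)
docs = [["goal", "match", "team"], ["team", "coach", "goal"],
        ["market", "bank", "price"], ["bank", "loan", "price"]]
labels = ["sport", "sport", "economy", "economy"]
model = train_bernoulli_nb(docs, labels)
print(predict(model, ["goal", "coach"]))
print(macro_precision_recall(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
```

Unlike a multinomial model, this formulation ignores how often a term occurs and explicitly penalizes a class for terms it expects but the document lacks.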
Main Subjects
Information Technology and Computer Science
No. of Pages
67
Table of Contents
Table of contents.
Abstract.
Abstract in Arabic.
Chapter One : Introduction.
Chapter Two : Literature reviews.
Chapter Three : The methodology.
Chapter Four : Experiments and evaluation.
Chapter Five : Conclusion and future work.
References.
American Psychological Association (APA)
al-Arqat, Latifah Faraj. (2013). Arabic text classification techniques using the multivariate Bernoulli model. (Master's thesis). Amman Arab University, Jordan
https://search.emarefa.net/detail/BIM-529295
Modern Language Association (MLA)
al-Arqat, Latifah Faraj. Arabic text classification techniques using the multivariate Bernoulli model. (Master's thesis). Amman Arab University. (2013).
https://search.emarefa.net/detail/BIM-529295
American Medical Association (AMA)
al-Arqat, Latifah Faraj. (2013). Arabic text classification techniques using the multivariate Bernoulli model. (Master's thesis). Amman Arab University, Jordan
https://search.emarefa.net/detail/BIM-529295
Language
English
Data Type
Arab Theses
Record ID
BIM-529295