A hybrid statistical and morphological Arabic language diacritizing system
Other Title(s)
نظام تشكيل اللغة العربية الهجين الإحصائي و الصرفي
Dissertant
Thesis advisor
Committee Members
University
Middle East University
Faculty
Faculty of Information Technology
Department
Computer Science Department
University Country
Jordan
Degree
Master
Degree Date
2012
English Abstract
This thesis presents a hybrid Arabic diacritizing system.
The main objective of the thesis is to build a system that diacritizes Arabic text automatically using a statistical model and a morpho-syntactic model.
The first part of the system determines the most likely diacritics by selecting, for each Arabic sub-sentence, the full-form diacritization with the highest weight and estimated probability.
The second part of the system factorizes and tokenizes each Arabic word into its possible morpho-syntactic constituents: pattern, prefix, suffix, stem, and root.
After factorization, the morpho-syntactic part selects the most likely diacritization sequence from among the different factorizations of the word.
Most previous work on diacritization relies on tools such as the Hidden Markov Model Toolkit (HTK) and/or solely on higher-level linguistic knowledge such as morphology and syntax, whereas this system uses a statistical machine translation algorithm and the ElixirFM morphological analyzer.
The hybrid system achieves a higher accuracy rate than those reported in earlier studies, while covering a larger domain of Arabic words.
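The two-part idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the phrase table, affix lists, stem lexicon, probabilities, and function names are all hypothetical toy values, words are written in a Buckwalter-style Latin transliteration, and the fallback from the statistical part to the morphological part is an assumed way of combining the two, since the abstract does not say how they interact. Patterns and roots are not modeled.

```python
from typing import Dict, List, Tuple

# Hypothetical phrase table: undiacritized sub-sentence -> (diacritized form, toy probability).
PHRASE_TABLE: Dict[str, List[Tuple[str, float]]] = {
    "ktb Alwld": [("kataba Alwaladu", 0.7), ("kutiba Alwaladu", 0.3)],
}

# Hypothetical affixes with their diacritized forms, and a toy stem lexicon.
PREFIX_DIAC: Dict[str, str] = {"": "", "Al": "Al", "w": "wa"}
SUFFIX_DIAC: Dict[str, str] = {"": "", "At": "At"}
STEM_LEXICON: Dict[str, List[Tuple[str, float]]] = {
    "ktAb": [("kitAb", 0.9), ("kut~Ab", 0.1)],
}


def factorize(word: str) -> List[Tuple[str, str, str]]:
    """Return every (prefix, stem, suffix) split whose stem is in the lexicon."""
    splits = []
    for prefix in PREFIX_DIAC:
        for suffix in SUFFIX_DIAC:
            if len(prefix) + len(suffix) >= len(word):
                continue  # nothing would be left for the stem
            if word.startswith(prefix) and word.endswith(suffix):
                stem = word[len(prefix):len(word) - len(suffix)]
                if stem in STEM_LEXICON:
                    splits.append((prefix, stem, suffix))
    return splits


def diacritize_word(word: str) -> str:
    """Pick the most probable diacritization over all factorizations of a word."""
    best, best_prob = word, 0.0  # unknown words are returned unchanged
    for prefix, stem, suffix in factorize(word):
        for diac_stem, prob in STEM_LEXICON[stem]:
            if prob > best_prob:
                best = PREFIX_DIAC[prefix] + diac_stem + SUFFIX_DIAC[suffix]
                best_prob = prob
    return best


def diacritize(sub_sentence: str) -> str:
    """Statistical lookup first; assumed fallback to word-level morphology."""
    candidates = PHRASE_TABLE.get(sub_sentence)
    if candidates:
        return max(candidates, key=lambda c: c[1])[0]
    return " ".join(diacritize_word(w) for w in sub_sentence.split())
```

In this sketch, a sub-sentence found in the phrase table is resolved in one step by taking its highest-probability entry, while any other input is handled word by word through prefix, stem, and suffix splits.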
Main Subjects
Information Technology and Computer Science
No. of Pages
69
Table of Contents
Table of contents.
Abstract.
Abstract in Arabic.
Chapter One : Introduction.
Chapter Two : Literature survey and related work.
Chapter Three : Arabic morpho-syntactical analysis.
Chapter Four : Statistical machine translation.
Chapter Five : Proposed model and methodology.
Chapter Six : Experiments results.
Chapter Seven : Conclusion and future work.
References.
American Psychological Association (APA)
Hattab, Abd Allah al-Mamun. (2012). A hybrid statistical and morphological Arabic language diacritizing system. (Master's thesis). Middle East University, Jordan
https://search.emarefa.net/detail/BIM-693803
Modern Language Association (MLA)
Hattab, Abd Allah al-Mamun. A hybrid statistical and morphological Arabic language diacritizing system. (Master's thesis). Middle East University. (2012).
https://search.emarefa.net/detail/BIM-693803
American Medical Association (AMA)
Hattab, Abd Allah al-Mamun. (2012). A hybrid statistical and morphological Arabic language diacritizing system. (Master's thesis). Middle East University, Jordan
https://search.emarefa.net/detail/BIM-693803
Language
English
Data Type
Arab Theses
Record ID
BIM-693803