Automatic topic classification system of spoken Arabic news
Other Title(s)
النظام الآلي للتصنيف الموضوعي للأخبار المنطوقة باللغة العربية
Dissertant
Abu Sulayman, Nasir Sadiq Abd Allah
Thesis advisor
al-Hanjuri, Muhammad Ahmad Muhammad
University
Islamic University
Faculty
Faculty of Engineering
Department
Department of Computer Engineering
University Country
Palestine (Gaza Strip)
Degree
Master
Degree Date
2017
English Abstract
One of the most important consequences of what is known as the "Internet era" is the wide spread of varied electronic data.
This proliferation urgently requires an automated system to classify the data, making it easier to search for and access a topic of interest.
Such systems are commonly applied to written texts.
Because of the huge increase in spoken files nowadays, there is an acute need to build an automatic system that classifies spoken files by topic.
Such systems have been studied in previous research on spoken English, but spoken Arabic has rarely been considered because the Arabic language is challenging and datasets for it are scarce.
To deal with this challenge, a new dataset is built by converting into spoken form the common written corpus (ALJAZEERA-NEWS), which is widely used in research on written-text classification.
Then, a keyword extraction method based on dynamic time warping is implemented to extract the keywords representing each class.
Finally, a topic identification system is built with Mel-frequency Cepstral Coefficients and Relative Spectral Transform - Perceptual Linear Prediction as speech features, and Dynamic Time Warping and Hidden Markov Models as classifiers. It uses a technique that differs from the traditional approach of using automatic speech recognition to extract transcriptions. A segmentation method is proposed to split spoken files into words.
The system is evaluated using accuracy, F1-measure, precision, and recall.
The proposed system shows positive results in topic classification.
On average, the topic identification system achieves an F1-measure of 90.26% with the dynamic time warping classifier and 91.36% with the hidden Markov model classifier.
In addition, the system achieves a keyword identification accuracy of 89.65%.
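As an illustration of the dynamic time warping idea mentioned in the abstract, the following is a minimal sketch and not the thesis's implementation. It assumes each spoken word is already represented as a sequence of feature vectors (in practice these might be MFCC frames; here they are toy one-dimensional values). The function names `dtw_distance` and `classify`, the template dictionary, and the class labels are all hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of equal-dimension feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances while b stays on the same frame
                cost[i][j - 1],      # b advances while a stays on the same frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label whose template is nearest to the query under DTW.

    `templates` maps a (hypothetical) class label to a list of template
    sequences for that class.
    """
    best_label, best_cost = None, float("inf")
    for label, sequences in templates.items():
        for seq in sequences:
            c = dtw_distance(query, seq)
            if c < best_cost:
                best_label, best_cost = label, c
    return best_label


# Toy data: one-dimensional "features" standing in for real speech frames.
templates = {
    "sports": [[[0.0], [1.0], [2.0]]],
    "politics": [[[5.0], [5.0], [5.0]]],
}
query = [[0.0], [0.1], [1.0], [2.1]]  # a slightly stretched, noisy "sports" pattern
predicted = classify(query, templates)
```

Because DTW allows frames to be repeated during alignment, a time-stretched copy of a template can still match it with zero cost, which is why the technique suits words spoken at different speeds.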
Main Subjects
Information Technology and Computer Science
No. of Pages
96
Table of Contents
Table of contents.
Abstract.
Abstract in Arabic.
Chapter One : Introduction.
Chapter Two : Related works.
Chapter Three : Background theory.
Chapter Four : Proposed work.
Chapter Five : Results and discussion.
Chapter Six : Conclusions and recommendations.
References.
American Psychological Association (APA)
Abu Sulayman, Nasir Sadiq Abd Allah. (2017). Automatic topic classification system of spoken Arabic news. (Master's thesis). Islamic University, Palestine (Gaza Strip)
https://search.emarefa.net/detail/BIM-905179
Modern Language Association (MLA)
Abu Sulayman, Nasir Sadiq Abd Allah. Automatic topic classification system of spoken Arabic news. (Master's thesis). Islamic University. (2017).
https://search.emarefa.net/detail/BIM-905179
American Medical Association (AMA)
Abu Sulayman, Nasir Sadiq Abd Allah. (2017). Automatic topic classification system of spoken Arabic news. (Master's thesis). Islamic University, Palestine (Gaza Strip)
https://search.emarefa.net/detail/BIM-905179
Language
English
Data Type
Arab Theses
Record ID
BIM-905179