Automatic topic classification system of spoken Arabic news
Other titles
النظام الآلي للتصنيف الموضوعي للأخبار المنطوقة باللغة العربية
Thesis author
Abu Sulayman, Nasir Sadiq Abd Allah
Thesis supervisor
al-Hanjuri, Muhammad Ahmad Muhammad
University
Islamic University
Faculty
Faculty of Engineering
Academic department
Department of Computer Engineering
University country
Palestine (Gaza Strip)
Degree
Master's
Degree date
2017
English abstract
One of the most important consequences of what is known as the "Internet era" is the widespread availability of varied electronic data.
This proliferation urgently requires an automated system that classifies these data to make it easier to search for and access a given topic.
Such systems are commonly applied to written texts.
Because of the huge increase in spoken files nowadays, there is an acute need for an automatic system that classifies spoken files by topic.
Such systems have been discussed in previous research on spoken English, but spoken Arabic has rarely been considered because the Arabic language is challenging and datasets for it are scarce.
To address this challenge, a new spoken dataset is built by converting the common written-text corpus (ALJAZEERA-NEWS), which is widely used in research on written text classification.
Then, a keyword extraction method based on dynamic time warping is implemented to extract the keywords that represent each class.
Finally, a topic identification system is built with Mel-frequency Cepstral Coefficients and Relative Spectral Transform - Perceptual Linear Prediction as speech features and Dynamic Time Warping and Hidden Markov Models as classifiers. This technique differs from the traditional approach, which uses automatic speech recognition to extract transcriptions. A segmentation method is also proposed to split spoken files into words.
The system is evaluated using accuracy, F1-measure, precision, and recall.
The proposed system shows positive results in topic classification.
On average, the F1-measure of the topic identification system is 90.26% with the dynamic time warping classifier and 91.36% with the hidden Markov models classifier.
In addition, the system achieves a keyword identification accuracy of 89.65%.
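The abstract names dynamic time warping as one of the classifiers. As a rough illustration only, and not the thesis's implementation, the following minimal Python sketch computes a DTW cost between two sequences of feature vectors and assigns a query to the class of its nearest template. The function names, the toy one-dimensional features, and the class labels are all assumptions made for this example; real input would be per-frame speech features such as MFCC vectors.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of feature vectors.

    Each sequence is a list of equal-length lists of floats (one per frame).
    Uses the classic unconstrained recurrence with Euclidean frame distance.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(query, templates):
    """Return the label whose closest template has the lowest DTW cost.

    `templates` maps a (hypothetical) class label to a list of sequences.
    """
    return min(
        templates,
        key=lambda label: min(dtw_distance(query, t) for t in templates[label]),
    )


if __name__ == "__main__":
    # Toy data: labels and values are illustrative, not from the thesis.
    templates = {
        "sports": [[[0.0], [1.0], [2.0]]],
        "politics": [[[5.0], [5.0], [5.0]]],
    }
    query = [[0.0], [0.0], [1.0], [2.0]]  # a time-stretched variant of "sports"
    print(classify(query, templates))
```

Because DTW aligns sequences of different lengths, the stretched query still matches its template at zero cost, which is the property that makes the method attractive for comparing spoken words uttered at different speeds.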
Main subjects
Information Technology and Computer Science
Number of pages
96
Table of contents
Table of contents.
Abstract.
Abstract in Arabic.
Chapter One: Introduction.
Chapter Two: Related works.
Chapter Three: Background theory.
Chapter Four: Proposed work.
Chapter Five: Results and discussion.
Chapter Six: Conclusions and recommendations.
References.
American Psychological Association (APA) citation style
Abu Sulayman, Nasir Sadiq Abd Allah. (2017). Automatic topic classification system of spoken Arabic news. (Master's thesis). Islamic University, Palestine (Gaza Strip)
https://search.emarefa.net/detail/BIM-905179
Modern Language Association (MLA) citation style
Abu Sulayman, Nasir Sadiq Abd Allah. Automatic topic classification system of spoken Arabic news. (Master's thesis). Islamic University. (2017).
https://search.emarefa.net/detail/BIM-905179
American Medical Association (AMA) citation style
Abu Sulayman, Nasir Sadiq Abd Allah. (2017). Automatic topic classification system of spoken Arabic news. (Master's thesis). Islamic University, Palestine (Gaza Strip)
https://search.emarefa.net/detail/BIM-905179
Text language
English
Data type
University theses
Record number
BIM-905179