The semantic similarity measures using Arabic ontology
Other titles
مقاييس التشابه الدلالي باستخدام الانتولوجي العربي
Thesis author
Thesis supervisor
Committee members
al-Sadi, Jihad
Abu Shurayhah, Ahmad
University
Middle East University
Faculty
Faculty of Information Technology
Academic department
Department of Computer Science
University country
Jordan
Degree
Master's
Degree date
2017
English abstract
Semantic similarity measures are used in many applications, including information retrieval and natural language processing.
Many measures use a lexical database such as WordNet to calculate the similarity between English concepts.
However, few studies have examined semantic similarity measures using Arabic WordNet.
Traditional semantic similarity measures fall into four categories: path-based measures, information content-based measures, feature-based measures, and hybrid measures.
Several measures from different categories were applied to Arabic WordNet to determine which measure performs best on it.
A human-rated benchmark was used to evaluate the performance of these measures over Arabic WordNet.
Experimental results show that the WuP measure achieved the lowest mean squared error (MSE), 1.64%, and the highest correlation coefficient with human ratings (0.92).
These results indicate that the WuP measure performs best on Arabic WordNet among the existing measures tested.
The results also show that the PATH measure performs worst.
This thesis proposes a new semantic similarity measure that uses the taxonomy of Arabic WordNet.
The new measure takes three factors into account: the depth of the concepts in the Arabic WordNet tree, the distance between the two compared concepts, and the information content of the least common concept that subsumes the two compared concepts.
The weights of these factors can be adjusted manually.
Several experiments were conducted to find the weights that achieve the minimum MSE.
To evaluate the new measure, it was tested on the same Arabic dataset that was used to evaluate the other measures.
The results of applying the new measure over Arabic WordNet were then compared with the results of the other measures.
The new measure achieved the highest correlation coefficient with human ratings (0.96). It also obtained a very good MSE (1.89%) compared with the other measures.
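As a rough illustration of the quantities named in the abstract, the sketch below computes the WuP and PATH measures on a tiny is-a taxonomy and scores a measure against human ratings with MSE and the Pearson correlation coefficient. The taxonomy, the concept names, and the convention that the root has depth 1 are all assumptions made for this example; none of it is taken from Arabic WordNet or from the thesis. The formulas used are the commonly cited forms, WuP = 2·depth(LCS) / (depth(a) + depth(b)) and PATH = 1 / (1 + shortest path length), and the thesis may define them slightly differently.

```python
import math

# Hypothetical toy is-a taxonomy (child -> parent); not taken from Arabic WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "mammal": "animal",
    "bird": "animal",
    "cat": "mammal",
    "dog": "mammal",
}


def ancestors(concept):
    """Return the chain from the concept up to the root, concept first."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def depth(concept):
    """Depth counted in nodes, so the root has depth 1 (an assumption)."""
    return len(ancestors(concept))


def least_common_subsumer(a, b):
    """Deepest concept that subsumes both a and b."""
    seen = set(ancestors(a))
    for node in ancestors(b):  # walks upward, so the first hit is the deepest
        if node in seen:
            return node
    raise ValueError("concepts do not share a root")


def wup(a, b):
    """Wu-Palmer similarity: 2 * depth(LCS) / (depth(a) + depth(b))."""
    return 2 * depth(least_common_subsumer(a, b)) / (depth(a) + depth(b))


def path_sim(a, b):
    """Path similarity: 1 / (1 + number of edges on the shortest is-a path)."""
    edges = depth(a) + depth(b) - 2 * depth(least_common_subsumer(a, b))
    return 1 / (1 + edges)


def mse(predicted, human):
    """Mean squared error between measure scores and human ratings."""
    return sum((p - h) ** 2 for p, h in zip(predicted, human)) / len(human)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

In an evaluation of this kind, each measure would be run over every word pair in the benchmark, and the resulting scores compared with the human ratings using `mse` and `pearson`. Reporting MSE as a percentage presumably means the ratings and scores were first normalized to the range 0 to 1, but that is an assumption here, not something the abstract states.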
Main subjects
Information Technology and Computer Science
Topics
Number of pages
87
Table of contents
Table of contents.
Abstract.
Abstract in Arabic.
Chapter One : Introduction.
Chapter Two : Literature review and related works.
Chapter Three : Experimental work and new proposed measure.
Chapter Four : Experimental results and measures evaluation.
Chapter Five : Conclusions and future work.
References.
American Psychological Association (APA) citation style
al-Dayri, Muhammad Ghandi. (2017). The semantic similarity measures using Arabic ontology. (Master's thesis). Middle East University, Jordan
https://search.emarefa.net/detail/BIM-762685
Modern Language Association (MLA) citation style
al-Dayri, Muhammad Ghandi. The semantic similarity measures using Arabic ontology. (Master's thesis). Middle East University. (2017).
https://search.emarefa.net/detail/BIM-762685
American Medical Association (AMA) citation style
al-Dayri, Muhammad Ghandi. (2017). The semantic similarity measures using Arabic ontology. (Master's thesis). Middle East University, Jordan
https://search.emarefa.net/detail/BIM-762685
Text language
English
Data type
Theses
Record number
BIM-762685