Supervised keywords extraction and evaluation for Arabic text

Other titles

استخراج و تقييم الكلمات المفتاحية لنصوص اللغة العربية باستخدام نهج التعلم [Extraction and evaluation of keywords for Arabic-language texts using the learning approach]

Thesis supervisor

Ujan, Arafat

Committee members

Atum, Jalal Yusuf
Umar, Khamis
Ubayd, Nadim

University

Princess Sumaya University for Technology

Faculty

King Hussein Faculty of Computing Sciences

Academic department

Department of Computer Science

University country

Jordan

Degree

Master's

Degree date

2014

English abstract

Keywords are phrases, consisting of one or more words, that describe the meaning of a document.

Keyword extraction is the process of extracting these phrases. It is considered a core technology for many automatic text processing tasks, such as text summarization and automatic indexing. Many algorithms have been implemented to solve the keyword extraction problem.

Most of the work in this area has been carried out for English and other European languages. This research describes a method for extracting keywords from Arabic documents.

The method identifies keywords by combining linguistic and statistical analysis of the text with a supervised learning technique, the support vector machine (SVM).
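As a rough illustration of the supervised step, the sketch below trains a small linear SVM by subgradient descent on the regularized hinge loss and uses it to separate key from non-key words. It is a stand-in written for this example, not the thesis implementation (the table of contents indicates that LIBSVM was used). The two features (a normalized term frequency and a first-occurrence position score), the toy values, the hyperparameters, and the names `train_linear_svm` and `predict` are all assumptions.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def hinge_objective(w, b, X, y, lam):
    """Regularized average hinge loss of a linear SVM."""
    loss = sum(max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y))
    return 0.5 * lam * dot(w, w) + loss / len(X)


def train_linear_svm(X, y, lam=0.01, step=0.1, iterations=1000):
    """Full-batch subgradient descent; returns the best (w, b) seen."""
    n = len(X)
    w, b = [0.0] * len(X[0]), 0.0
    best = (hinge_objective(w, b, X, y, lam), w, b)
    for _ in range(iterations):
        # Subgradient: regularizer term plus every margin-violating sample.
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1.0:
                grad_w = [g - yi * xj / n for g, xj in zip(grad_w, xi)]
                grad_b -= yi / n
        w = [wj - step * g for wj, g in zip(w, grad_w)]
        b -= step * grad_b
        objective = hinge_objective(w, b, X, y, lam)
        if objective < best[0]:
            best = (objective, w, b)
    return best[1], best[2]


def predict(w, b, x):
    """Return +1 for 'key' and -1 for 'not key'."""
    return 1 if dot(w, x) + b > 0 else -1


# Hypothetical word features: [normalized term frequency, position score].
X = [[0.9, 0.8], [0.8, 0.9], [0.7, 1.0],   # words labelled as keywords
     [0.1, 0.2], [0.2, 0.1], [0.0, 0.3]]   # words labelled as non-keywords
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
print(predict(w, b, [0.85, 0.85]), predict(w, b, [0.15, 0.15]))
```

Keeping the best iterate rather than the last one is a common safeguard for subgradient methods, whose objective does not decrease monotonically.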

The Arabic documents are preprocessed by applying tokenization, stemming, stop word removal, frequency calculation, and n-gram extraction before an SVM classifier determines the final list of keywords. We treated keyword extraction as a word classification problem: every word in the text has a label (key or not key). Experimental results indicate that the proposed SVM-based method can significantly outperform the baseline methods for Arabic text keyword extraction.
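The preprocessing steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the thesis pipeline: the stop list, the affix lists in the naive light stemmer, the sample sentence, and all function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list; a real system would use a full one.
STOP_WORDS = {"في", "من", "على", "إلى", "عن", "هذا", "هذه", "التي", "الذي", "و"}

# Hypothetical affix lists for a naive light stemmer (illustration only).
PREFIXES = ("وال", "بال", "كال", "فال", "لل", "ال")
SUFFIXES = ("ات", "ون", "ين", "ها", "ية", "ة")


def tokenize(text):
    """Strip diacritics and tatweel, then split into runs of Arabic letters."""
    text = re.sub(r"[\u064B-\u0652\u0640]", "", text)
    return re.findall(r"[\u0621-\u064A]+", text)


def light_stem(word):
    """Remove at most one prefix and one suffix, keeping at least 3 letters."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= 3:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[:-len(suffix)]
            break
    return word


def preprocess(text):
    """Tokenize, drop stop words, and stem the remaining tokens."""
    return [light_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


def ngrams(tokens, n):
    """Return all contiguous n-grams of the token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


text = "استخراج الكلمات المفتاحية من النصوص العربية. الكلمات المفتاحية تصف معنى النصوص."
stems = preprocess(text)
unigram_freq = Counter(stems)            # term frequencies
bigram_freq = Counter(ngrams(stems, 2))  # candidate two-word phrases

print(unigram_freq.most_common(3))
print(bigram_freq.most_common(1))
```

For brevity, the n-grams here are built over the filtered stem sequence and ignore sentence boundaries; a fuller implementation would likely restrict candidates to phrases within a sentence.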

Main subjects

Information Technology and Computer Science

Topics

Number of pages

95

Table of contents

Table of contents.

Abstract.

Chapter One : Introduction.

Chapter Two : Literature review.

Chapter Three : System framework.

Chapter Four : LIBSVM implementation.

Chapter Five : Experimental results.

Chapter Six : Conclusion and future work.

References.

American Psychological Association (APA) citation style

(2014). Supervised keywords extraction and evaluation for Arabic text. (Master's thesis). Princess Sumaya University for Technology, Jordan.
https://search.emarefa.net/detail/BIM-413750

Modern Language Association (MLA) citation style

Supervised keywords extraction and evaluation for Arabic text. (Master's thesis). Princess Sumaya University for Technology. (2014).
https://search.emarefa.net/detail/BIM-413750

American Medical Association (AMA) citation style

(2014). Supervised keywords extraction and evaluation for Arabic text. (Master's thesis). Princess Sumaya University for Technology, Jordan.
https://search.emarefa.net/detail/BIM-413750

Text language

English

Data type

University theses

Record ID

BIM-413750