Supervised keywords extraction and evaluation for Arabic text

Other Title(s)

استخراج و تقييم الكلمات المفتاحية لنصوص اللغة العربية باستخدام نهج التعلم

Thesis advisor

Ujan, Arafat

Committee Members

Atum, Jalal Yusuf
Umar, Khamis
Ubayd, Nadim

University

Princess Sumaya University for Technology

Faculty

King Hussein Faculty for Computing Sciences

Department

Department of Computer Sciences

University Country

Jordan

Degree

Master

Degree Date

2014

English Abstract

Keywords are phrases of one or more words that describe the meaning of a document.

Keyword extraction is the process of extracting these phrases. It is considered a core technology for many automatic text-processing tasks, such as text summarization and automatic indexing. Many algorithms have been implemented to solve the keyword extraction problem.

Most of the work in this area has been carried out for English and other European languages. This research describes a method for extracting keywords from Arabic documents.

The method identifies keywords by combining linguistic and statistical analysis of the text with a supervised learning technique, the support vector machine (SVM).

The Arabic documents are preprocessed by applying tokenization, stemming, stop-word removal, frequency calculation, and n-gram extraction before an SVM classifier determines the final list of keywords. We treat keyword extraction as a word classification problem: every word in the text has a label (key or not key). Experimental results indicate that the proposed SVM-based method can significantly outperform the baseline methods for Arabic text keyword extraction.
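
The preprocessing and labeling steps named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the stop-word list, the affix lists used for light stemming, the three features per candidate, and all function names are assumptions chosen for illustration. The resulting (feature vector, label) pairs are the kind of input an SVM library such as LIBSVM could be trained on.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list; a real system would use a much fuller one.
STOP_WORDS = {"في", "من", "على", "عن", "هذا", "هذه", "التي", "الذي"}

# Illustrative affixes for light stemming (an assumption, not the thesis's stemmer).
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "لل")
SUFFIXES = ("ات", "ون", "ين", "ها", "ية", "ة")


def tokenize(text):
    """Split text into runs of Arabic letters (U+0621 to U+064A)."""
    return re.findall(r"[\u0621-\u064A]+", text)


def light_stem(word):
    """Strip at most one prefix and one suffix, keeping at least 3 letters."""
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= 3:
            word = word[len(p):]
            break
    for s in SUFFIXES:
        if word.endswith(s) and len(word) - len(s) >= 3:
            word = word[:-len(s)]
            break
    return word


def preprocess(text):
    """Tokenize, drop stop words, then stem the remaining tokens."""
    return [light_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


def candidate_features(text, max_n=3):
    """Map each n-gram candidate to [relative frequency, length, first position]."""
    tokens = preprocess(text)
    total = len(tokens) or 1
    counts, first_pos = Counter(), {}
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            cand = " ".join(tokens[i:i + n])
            counts[cand] += 1
            first_pos.setdefault(cand, i)
    return {c: [counts[c] / total, len(c.split()), first_pos[c] / total]
            for c in counts}


def label_candidates(features, gold_keywords):
    """Pair each feature vector with 1 (key) or 0 (not key)."""
    gold = {" ".join(preprocess(k)) for k in gold_keywords}
    return [(vec, 1 if cand in gold else 0) for cand, vec in features.items()]
```

Calling `label_candidates(candidate_features(document), author_keywords)` on each training document would yield the labeled examples. The same `candidate_features` output, without labels, is what a trained classifier would score at prediction time.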

Main Subjects

Information Technology and Computer Science

No. of Pages

95

Table of Contents

Table of contents.

Abstract.

Chapter One : Introduction.

Chapter Two : Literature review.

Chapter Three : System framework.

Chapter Four : LIBSVM implementation.

Chapter Five : Experimental results.

Chapter Six : Conclusion and future work.

References.

American Psychological Association (APA)

(2014). Supervised keywords extraction and evaluation for Arabic text. (Master's thesis). Princess Sumaya University for Technology, Jordan
https://search.emarefa.net/detail/BIM-413750

Modern Language Association (MLA)

Supervised keywords extraction and evaluation for Arabic text. (Master's thesis). Princess Sumaya University for Technology. (2014).
https://search.emarefa.net/detail/BIM-413750

American Medical Association (AMA)

(2014). Supervised keywords extraction and evaluation for Arabic text. (Master's thesis). Princess Sumaya University for Technology, Jordan
https://search.emarefa.net/detail/BIM-413750

Language

English

Data Type

Arab Theses

Record ID

BIM-413750