Automatic Arabic text summarization
Other Title(s)
تلخيص تلقائي للنصوص العربية
Dissertant
Thesis advisor
Committee Members
al-Hulays, Ala Mustafa
Abu Nasir, Sami Salim
University
Islamic University
Faculty
Faculty of Engineering
Department
Department of Computer Engineering
University Country
Palestine (Gaza Strip)
Degree
Master
Degree Date
2015
English Abstract
Arabic is one of the best-known languages in the world; its importance stems from being the fifth most widely spoken language by number of native speakers.
Creating a good summary of a text is one of the most important tasks in linguistics, because it gives the reader the most important paragraphs of the text they want to read.
Some techniques exist for summarizing Arabic text, but they are still few and need improvement.
One approach used in text summarization is graph-based, but it still needs enhancement.
This thesis builds a new algorithm called GBATSS (Graph Based Arabic Text Summarizer) that summarizes Arabic text using NLP and Google's PageRank algorithm.
The system works on three basic units.
These units are the rooted stem, the light stem, and finally no-stem.
The system uses a compression ratio of 40%.
The summarization process has 12 stages, including data collection, text preprocessing, text normalization, text tokenization, stemming, stop-word removal, building the graph, calculating edge weights, applying PageRank, and finally extracting the summary.
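The graph-building, edge-weighting, PageRank, and extraction stages named above can be illustrated with a minimal sketch. It is not the GBATSS implementation: the abstract does not state the edge-weight formula, so Jaccard word overlap is assumed here, and the function names (split_sentences, tokenize, edge_weight, pagerank, summarize), the damping factor of 0.85, and the reading of "compression ratio of 40%" as "keep 40% of the sentences" are all illustrative assumptions. Words are used unstemmed, which corresponds to the no-stem unit.

```python
import math
import re


def split_sentences(text):
    # Split on Latin and Arabic sentence-ending punctuation.
    parts = re.split(r"[.!?؟]+", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence, stop_words=frozenset()):
    # "No-stem" unit: words are used as-is. A rooted or light stemmer
    # would be applied to each word at this point.
    return {w for w in sentence.split() if w not in stop_words}


def edge_weight(a, b):
    # ASSUMPTION: Jaccard overlap of the two word sets; the abstract
    # does not specify how edges are weighted.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pagerank(weights, damping=0.85, iterations=50):
    # Weighted PageRank by power iteration over a dense weight matrix.
    # Isolated sentences (no edges) keep only the base score.
    n = len(weights)
    scores = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            incoming = sum(
                scores[j] * weights[j][i] / out_sum[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def summarize(text, ratio=0.4, stop_words=frozenset()):
    # ASSUMPTION: the ratio is the fraction of sentences kept.
    sentences = split_sentences(text)
    if not sentences:
        return []
    bags = [tokenize(s, stop_words) for s in sentences]
    n = len(sentences)
    weights = [
        [edge_weight(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    scores = pagerank(weights)
    keep = max(1, math.ceil(ratio * n))
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:keep]
    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(top)]
```

Calling summarize(text) on a document returns roughly 40% of its sentences, chosen by PageRank score and listed in document order.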
Finally, we tested the system on the EASC data set, using recall, precision, and F-measure for evaluation.
The results show that using the rooted stem as the basic unit gives the best results, followed by no-stem and finally light-stem.
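For reference, the three evaluation measures named in the abstract can be computed as in the following sketch. It assumes a sentence-level comparison, in which the system summary and a reference summary are each treated as a set of selected sentences; the abstract does not say how the thesis matches system output against EASC references, and the function name precision_recall_f1 is illustrative.

```python
def precision_recall_f1(system, reference):
    # ASSUMPTION: both summaries are collections of sentence identifiers
    # (or sentence strings) and are compared as sets.
    system, reference = set(system), set(reference)
    overlap = len(system & reference)
    precision = overlap / len(system) if system else 0.0
    recall = overlap / len(reference) if reference else 0.0
    total = precision + recall
    # F-measure is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

For example, a system summary that selects sentences {0, 2, 3} against a reference of {2, 3, 5, 6} shares two sentences with it, giving precision 2/3, recall 1/2, and F-measure 4/7.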
Main Subjects
Information Technology and Computer Science
No. of Pages
108
Table of Contents
Table of contents.
Abstract.
Abstract in Arabic.
Chapter One : Introduction.
Chapter Two : Related work.
Chapter Three : Theoretical background.
Chapter Four : Methodology.
Chapter Five : Results, evaluation and discussion.
Chapter Six : Conclusion and future work.
References.
American Psychological Association (APA)
al-Farra, Iyad Jihad Rafiq. (2015). Automatic Arabic text summarization. (Master's thesis). Islamic University, Palestine (Gaza Strip)
https://search.emarefa.net/detail/BIM-727106
Modern Language Association (MLA)
al-Farra, Iyad Jihad Rafiq. Automatic Arabic text summarization. (Master's thesis). Islamic University. (2015).
https://search.emarefa.net/detail/BIM-727106
American Medical Association (AMA)
al-Farra, Iyad Jihad Rafiq. (2015). Automatic Arabic text summarization. (Master's thesis). Islamic University, Palestine (Gaza Strip)
https://search.emarefa.net/detail/BIM-727106
Language
English
Data Type
Arab Theses
Record ID
BIM-727106