Sentences ordering approach for multi-document summarization in domain specific text document

Other titles

Multi-document summarization and sentence ordering in the domain of text documents

Thesis author

al-Nuaymi, Hamid Ali Husayn

Thesis advisor

al-Mashayikhi, Akram Uthman

Committee members

Kanan, Tariq
Kanan, Ghassan

University

Amman Arab University

Faculty

Faculty of Computer Sciences and Informatics

Academic department

Department of Computer Science

University country

Jordan

Degree

Master's

Degree date

2016

English abstract

This thesis presents three approaches to sentence-ordering summarization, one of which is a novel graph-based formulation.

In the first approach, a normalized importance score is computed for each sentence (TF-IDF threshold tf = 0.0), based on different semantic similarity measures and semantic features (with cosine -normal, 0.4- train), in order to choose the sentences that best represent the document.
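As a rough illustration of this kind of scoring, the sketch below sums TF-IDF term weights per sentence and scales the result to [0, 1]. The tokenizer, the choice to treat each sentence as its own "document" for IDF, and the name `tfidf_sentence_scores` are assumptions made for illustration; the abstract does not say how the thesis computes these values.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of ASCII letters (assumed tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_sentence_scores(sentences, tf_threshold=0.0):
    """Return one importance score per sentence, scaled to [0, 1].

    Each sentence is treated as its own 'document' for IDF purposes
    (an assumption; the thesis may compute IDF over source documents).
    """
    token_lists = [tokenize(s) for s in sentences]
    n = len(token_lists)
    # Document frequency: number of sentences containing each term.
    df = Counter(term for tokens in token_lists for term in set(tokens))

    raw = []
    for tokens in token_lists:
        counts = Counter(tokens)
        score = 0.0
        for term, count in counts.items():
            tf = count / len(tokens)
            if tf <= tf_threshold:
                continue  # skip terms at or below the TF threshold
            score += tf * math.log(n / df[term])
        raw.append(score)

    # Normalize by the highest raw score (guarding against all zeros).
    top = max(raw) if raw and max(raw) > 0 else 1.0
    return [r / top for r in raw]
```

Sentences made of terms that are rare across the input receive the highest scores; terms that occur in every sentence contribute nothing because their IDF is zero.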

A stack decoder algorithm (summary length = 100, sentence length = 6) is used as the model on which the system builds to create the summaries closest to the original document.
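A simplified stack decoder can be sketched as follows: one stack per summary length in words, each holding the best partial summaries of that length, extended one sentence at a time. This is not the thesis implementation. Reading "sentence length=6" as a minimum sentence length, the `beam` size, and the additive score are all assumptions.

```python
import heapq


def stack_decode(sentences, scores, max_words=100, min_sentence_words=6, beam=10):
    """Approximately pick the sentence subset with the highest total score
    that fits within max_words.

    stacks[k] holds partial summaries that use exactly k words. Treating
    'sentence length' as a minimum length is an assumption.
    """
    lengths = [len(s.split()) for s in sentences]
    candidates = [i for i, n in enumerate(lengths) if n >= min_sentence_words]

    # Each stack maps a frozenset of sentence indices to its total score.
    stacks = [dict() for _ in range(max_words + 1)]
    stacks[0][frozenset()] = 0.0

    for used in range(max_words + 1):
        # Prune: keep only the `beam` best partial summaries of this length.
        best = heapq.nlargest(beam, stacks[used].items(), key=lambda kv: kv[1])
        stacks[used] = dict(best)
        for chosen, total in best:
            for i in candidates:
                new_len = used + lengths[i]
                if i in chosen or new_len > max_words:
                    continue
                stacks[new_len][chosen | {i}] = total + scores[i]

    # Best summary over all stacks, returned in original sentence order.
    best_set, _ = max(
        ((c, t) for stack in stacks for c, t in stack.items()),
        key=lambda ct: ct[1],
    )
    return [sentences[i] for i in sorted(best_set)]
```

Pruning each stack to a fixed beam keeps the search tractable at the cost of exactness, which is the usual trade-off in this style of decoder.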

In the second approach, sentences are clustered with K-means according to their semantic similarity scores, and a representative sentence is selected from each cluster for inclusion in the summary.
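A minimal sketch of cluster-then-select, assuming each sentence has already been mapped to a numeric feature vector. The Euclidean distance, the deterministic initialization, and the rule of picking the member nearest each centroid are illustrative choices, not details taken from the thesis.

```python
import math


def kmeans(vectors, k, iterations=20):
    """Plain K-means with Euclidean distance; returns (labels, centroids).

    Centroids start at the first k vectors so the result is deterministic
    (an illustrative choice, not taken from the thesis).
    """
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def representatives(vectors, k):
    """For each non-empty cluster, return the index of the vector closest
    to that cluster's centroid."""
    labels, centroids = kmeans(vectors, k)
    picked = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            picked.append(
                min(members, key=lambda i: math.dist(vectors[i], centroids[c]))
            )
    return sorted(picked)
```

The returned indices can then be used to pull the corresponding sentences into the summary in their original order.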

The third approach is a novel graph formulation (threshold = 0.5) in which the summary is generated from cliques found in the constructed graph.

The graph is built with edges between sentences that share a topic but are not semantically similar.
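The clique-finding step can be sketched with the basic Bron-Kerbosch algorithm. The pairwise `scores` matrix and the rule that an edge exists when a score reaches the threshold are assumptions; the abstract gives the threshold value (0.5) but not the exact edge criterion used in the thesis.

```python
def build_graph(scores, threshold=0.5):
    """Adjacency sets: i and j are linked when scores[i][j] >= threshold.

    `scores` is a hypothetical symmetric matrix of pairwise sentence scores.
    """
    n = len(scores)
    return {
        i: {j for j in range(n) if j != i and scores[i][j] >= threshold}
        for i in range(n)
    }


def maximal_cliques(graph):
    """Enumerate maximal cliques with the basic Bron-Kerbosch algorithm."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(current))  # nothing can extend this clique
            return
        for v in list(candidates):
            expand(current | {v}, candidates & graph[v], excluded & graph[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(graph), set())
    return sorted(cliques)
```

Each clique is a group of sentences that are all pairwise linked, so a summarizer could draw one sentence per clique to cover the group.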

A linear combination of feature values is used as the importance function.

The feature weights are learned by training on DUC 2002 data and then applied to score sentence importance in the test data.
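A linear importance function of this kind takes the form score = w1*f1 + w2*f2 + ... + wn*fn. The sketch below applies a fixed weight vector to per-sentence feature rows and ranks the sentences. The weights and feature values in the usage example are made up for illustration; the abstract does not report the learned weights or the training procedure.

```python
def importance(features, weights):
    """Linear combination: sum of weight * feature value."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * f for w, f in zip(weights, features))


def rank_sentences(feature_rows, weights):
    """Return sentence indices ordered from most to least important."""
    scores = [importance(row, weights) for row in feature_rows]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical weights and feature rows, for illustration only.
example_weights = [0.5, 0.3, 0.2]
example_rows = [[1, 0, 0], [0, 1, 0], [1, 1, 1]]
example_ranking = rank_sentences(example_rows, example_weights)
```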

We apply this approach to produce 100-word summaries of a dataset available as part of DUC 2004 and discuss the development of the system, its analysis, and the algorithm.

ROUGE scores are used to evaluate the system's performance.
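For orientation, ROUGE-N recall is the fraction of reference n-grams that also appear in the candidate summary, with counts clipped. The sketch below is a bare-bones version with whitespace tokenization and a single reference, both of which are simplifying assumptions. It is not the official ROUGE toolkit, and the abstract does not say which ROUGE variants the thesis reports.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Clipped n-gram overlap divided by the number of reference n-grams."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    return overlap / total
```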

Main subjects

Information Technology and Computer Science

Number of pages

95

Table of contents

Table of contents.

Abstract.

Abstract in Arabic.

Chapter One : General framework of the thesis.

Chapter Two : Related work.

Chapter Three : General framework of automatic summarization.

Chapter Four : Summarization methodology.

Chapter Five : Experiment and evaluation.

Chapter Six : Conclusion and future work.

References.

American Psychological Association (APA) citation style

al-Nuaymi, Hamid Ali Husayn. (2016). Sentences ordering approach for multi-document summarization in domain specific text document. (Master's thesis). Amman Arab University, Jordan
https://search.emarefa.net/detail/BIM-722658

Modern Language Association (MLA) citation style

al-Nuaymi, Hamid Ali Husayn. Sentences ordering approach for multi-document summarization in domain specific text document. (Master's thesis). Amman Arab University. (2016).
https://search.emarefa.net/detail/BIM-722658

American Medical Association (AMA) citation style

al-Nuaymi, Hamid Ali Husayn. (2016). Sentences ordering approach for multi-document summarization in domain specific text document. (Master's thesis). Amman Arab University, Jordan
https://search.emarefa.net/detail/BIM-722658

Text language

English

Data type

Theses

Record ID

BIM-722658