Multi methods for text summarization based on learning technique

Dissertant

al-Zuhairy, Nur Amjad Hassan

University

University of Technology

Faculty

-

Department

Computer Sciences Department

University Country

Iraq

Degree

Master

Degree Date

2008

English Abstract

A text summarizer is a computer program that summarizes a text.

The summarizer removes redundant information from the input text and produces a shorter non-redundant output text.

The output is an extract from the original text.

Automatic text summarization is extremely useful in combination with a search engine on the Web.

Text summarization can also be used to summarize a text before an automatic speech synthesizer reads it, thus reducing the time needed to absorb the essential parts of a document.

The proposed system generates a summary in several stages. These can be grouped into three main approaches that are integrated to give accurate results: statistical methods, linguistic approaches using natural language processing, and machine learning using association rules.

The system integrates text summarization with data mining (DM): text summarization benefits DM by extracting structured data from textual documents, which can then be mined using traditional methods. The proposed system performs two tasks at the same time. The first task is to build a database of related words by learning them from several input texts; the relations of each word affect its weight, which the system relies on to select the tokens that must appear in the summary. The second task is to generate the summary text.
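The first task can be illustrated with a minimal sketch. The abstract does not specify how relations are learned or how they enter the weight, so everything here is an assumption: relations are taken to be word pairs that co-occur in a sentence at least `min_support` times (a simple stand-in for association-rule support), and a word's weight is its frequency scaled by the number of related words. The names `learn_relations`, `word_weights` and `min_support` are hypothetical.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations


def split_sentences(text):
    # Naive splitter: break on whitespace that follows '.', '!' or '?'.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase alphabetic tokens only.
    return re.findall(r"[a-z]+", sentence.lower())


def learn_relations(texts, min_support=2):
    """Build a word-relation table from several texts (assumed scheme).

    Two words are treated as related when they co-occur in at least
    `min_support` sentences across all input texts.
    """
    pair_counts = Counter()
    for text in texts:
        for sentence in split_sentences(text):
            words = sorted(set(tokenize(sentence)))
            pair_counts.update(combinations(words, 2))

    relations = defaultdict(set)
    for (a, b), count in pair_counts.items():
        if count >= min_support:
            relations[a].add(b)
            relations[b].add(a)
    return relations


def word_weights(text, relations):
    """Assumed weighting: term frequency * (1 + number of related words)."""
    freq = Counter(w for s in split_sentences(text) for w in tokenize(s))
    return {w: f * (1 + len(relations.get(w, ()))) for w, f in freq.items()}
```

For example, learning from `["Cats chase mice. Cats chase birds.", "Dogs chase cats."]` relates only "cats" and "chase", so those two words receive a higher weight than "mice" when the sentence "Cats chase mice." is weighted.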

The output of the proposed system is an extract from the original text that preserves the meaning of the original.

The number of sentences in the output text is 25-40% of the number of sentences in the original text.
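The extraction ratio can be sketched as follows. This is not the thesis's scoring method, which the abstract does not detail: sentences are scored here by the average frequency of their words, an assumed placeholder, and the function name `extract_summary` and its `ratio` parameter are hypothetical. The sketch only shows how a fixed share of sentences can be kept while preserving their original order.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break on whitespace that follows '.', '!' or '?'.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_summary(text, ratio=0.3):
    """Return roughly `ratio` of the sentences, in their original order.

    Scoring is an assumed placeholder: the mean document frequency of
    the words in each sentence.
    """
    sentences = split_sentences(text)
    if not sentences:
        return []

    freq = Counter(re.findall(r"[a-z]+", text.lower()))

    def score(sentence):
        words = re.findall(r"[a-z]+", sentence.lower())
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Keep at least one sentence; very short texts may fall outside 25-40%.
    keep = max(1, round(ratio * len(sentences)))
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:keep])  # restore document order
    return [sentences[i] for i in chosen]
```

With a ten-sentence input and `ratio=0.3`, the function returns three sentences; any value between 0.25 and 0.40 matches the range stated above.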

Experiments show that the proposed system obtains good results, of about 95%, when its summaries are compared with texts summarized manually by humans.
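The abstract does not say how the comparison with human summaries was scored. One plausible measure, given purely as an assumption, is the share of human-selected sentences that the system also selected. The function name `sentence_agreement` is hypothetical.

```python
def sentence_agreement(system_summary, human_summary):
    """Percentage of human-selected sentences also chosen by the system.

    Assumed metric (sentence-level recall); sentences are compared after
    trimming whitespace and lowercasing.
    """
    human = {s.strip().lower() for s in human_summary}
    if not human:
        return 0.0
    system = {s.strip().lower() for s in system_summary}
    return 100.0 * len(human & system) / len(human)
```

Both arguments are lists of sentences; a score of 100.0 means the system recovered every sentence the human chose.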

Main Subjects

Information Technology and Computer Science

American Psychological Association (APA)

al-Zuhairy, Nur Amjad Hassan. (2008). Multi methods for text summarization based on learning technique (Master's thesis). University of Technology, Iraq.
https://search.emarefa.net/detail/BIM-305385

Modern Language Association (MLA)

al-Zuhairy, Nur Amjad Hassan. Multi methods for text summarization based on learning technique. Master's thesis. University of Technology, 2008.
https://search.emarefa.net/detail/BIM-305385

American Medical Association (AMA)

al-Zuhairy, Nur Amjad Hassan. Multi methods for text summarization based on learning technique (Master's thesis). University of Technology, Iraq; 2008.
https://search.emarefa.net/detail/BIM-305385

Language

English

Data Type

Arab Theses

Record ID

BIM-305385