Design and implementation of text-based information retrieval system using fuzzy logic

Other Title(s)

تصميم و تطبيق منظومة استرجاع للمعلومات النصية بالاعتماد على المنطق المضبب [Design and implementation of a textual information retrieval system based on fuzzy logic]

Dissertant

Hamid, Ibrahim Amir

Thesis advisor

Jurj, Luayy Adwar

University

University of Baghdad

Faculty

College of Science

Department

Department of Computer Science

University Country

Iraq

Degree

Master

Degree Date

2011

English Abstract

Because the Web is a rich source of information, assistance systems are needed that search it efficiently and automatically and nominate the most suitable Web page documents.

Tools of this kind help users find what they need on the Web without wasting time and effort.

The purpose of this study is to construct an automated information retrieval system that retrieves HTML documents whose textual content is similar to that of any document chosen by the user.

The textual content in every electronic repository (including the Web) is statistically variable and has complex behavior.

The classical similarity criteria were observed not to give encouraging results in terms of accuracy and rationality metrics.

The proposed system therefore develops a logical paradigm that relies fundamentally on fuzzy logic criteria to soften the matching decisions between textual contents, making the results more rational and reasonable for the user.

The proposed system is composed of two phases: the enrollment phase and the retrieval phase.

In the enrollment phase, the Web documents are preprocessed using a sequence of text mining operations; useful knowledge (i.e., the set of keywords) is then extracted and deposited, along with the keywords' textual features, in a dedicated database.
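The enrollment step can be illustrated with a minimal sketch. It is not the thesis's implementation: the stop-word list, the `extract_keywords` name, the `top_n` cutoff, and the use of relative frequency as the stored "textual feature" are all assumptions made for illustration.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Assumption: a tiny illustrative stop-word list; the thesis does not list its own.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on"}


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document, skipping script/style."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def extract_keywords(html, top_n=10):
    """Return {keyword: relative frequency} for the top_n most frequent terms.

    Hypothetical helper: relative frequency stands in for the keyword
    features that the abstract mentions but does not define.
    """
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    tokens = [t for t in re.findall(r"[a-z]+", text) if t not in STOP_WORDS]
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.most_common(top_n)}
```

The returned dictionary plays the role of the feature vector that would be deposited in the database for each enrolled document.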

In the retrieval phase, when the client issues a query document request, the system matches the query document against every document stored in the database (after converting the query document to a format compatible with the indexed feature vectors).

Each match yields a score value that depends on the number of common keywords found in both matched documents and on the relative significance of these common keywords in each document.
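One way such a score could be computed is sketched below. The abstract does not give the formula, so summing the smaller of the two weights over the shared keywords is an assumption, as are the `match_score` name and the keyword-to-weight dictionaries.

```python
def match_score(query_weights, doc_weights):
    """Score two documents by their shared keywords.

    Assumption: each argument maps keyword -> relative significance, and the
    score sums min(weight in query, weight in document) over common keywords.
    More shared keywords, or more significant ones, raise the score.
    """
    common = query_weights.keys() & doc_weights.keys()
    return sum(min(query_weights[k], doc_weights[k]) for k in common)
```

With weights that sum to 1 in each document, this score stays in the range 0 to 1, which makes it convenient as input to a membership function.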

Each matching instance is fuzzified using an S-shaped membership function.
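For readers unfamiliar with S-shaped membership functions, the sketch below implements the standard Zadeh S-function. Whether the thesis uses exactly this form, and which breakpoints it chooses, is not stated in the abstract; the parameters `a` and `b` are placeholders.

```python
def s_membership(x, a, b):
    """Zadeh S-function: 0 at or below a, 1 at or above b, smooth in between.

    Assumption: a < b are illustrative breakpoints, not values from the thesis.
    """
    if a >= b:
        raise ValueError("a must be smaller than b")
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    midpoint = (a + b) / 2.0
    if x <= midpoint:
        # Lower half: convex rise from 0 to 0.5.
        return 2.0 * ((x - a) / (b - a)) ** 2
    # Upper half: concave rise from 0.5 to 1.
    return 1.0 - 2.0 * ((x - b) / (b - a)) ** 2
```

A raw match score passed through this function becomes a degree of membership in the fuzzy set "similar", so scores near the breakpoints change gradually instead of flipping at a hard threshold.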

Then, applying the criterion "the higher the match score, the better the match", the list of scores (produced by matching the query against all documents pre-deposited in the database) is sorted in descending order, so that the top-listed documents are nominated as the query results.
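The ranking step amounts to a descending sort followed by a cutoff. In this minimal sketch, the `rank_documents` name, the document-id-to-score dictionary, and the `top_k` cutoff are assumptions for illustration.

```python
def rank_documents(scores, top_k=3):
    """Return the top_k (document id, score) pairs, best match first.

    Assumption: scores maps a document id to its (fuzzified) match score.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]
```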

The results of the conducted performance tests showed that using fuzzy logic softens the matching decisions and leads to better results than traditional hard computing techniques.

The best found precision and recall values were 0.99 and 0.66, respectively.
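Precision and recall follow their standard set-based definitions, sketched here. The function name and the example identifiers in the usage note are hypothetical and are not data from the thesis.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) from retrieved and relevant document ids.

    precision = |retrieved ∩ relevant| / |retrieved|
    recall    = |retrieved ∩ relevant| / |relevant|
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, retrieving three documents of which two are relevant, out of four relevant documents overall, gives a precision of 2/3 and a recall of 1/2.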

Main Subjects

Information Technology and Computer Science

No. of Pages

126

Table of Contents

Table of contents.

Abstract.

Abstract in Arabic.

Chapter One : General introduction.

Chapter Two : Theoretical background.

Chapter Three : TBIR system.

Chapter Four : Test results.

Chapter Five : Conclusion and future work.

References.

American Psychological Association (APA)

Hamid, Ibrahim Amir. (2011). Design and implementation of text-based information retrieval system using fuzzy logic. (Master's thesis). University of Baghdad, Iraq
https://search.emarefa.net/detail/BIM-605672

Modern Language Association (MLA)

Hamid, Ibrahim Amir. Design and implementation of text-based information retrieval system using fuzzy logic. (Master's thesis). University of Baghdad. (2011).
https://search.emarefa.net/detail/BIM-605672

American Medical Association (AMA)

Hamid, Ibrahim Amir. (2011). Design and implementation of text-based information retrieval system using fuzzy logic. (Master's thesis). University of Baghdad, Iraq
https://search.emarefa.net/detail/BIM-605672

Language

English

Data Type

Arab Theses

Record ID

BIM-605672