Design and implementation of text-based information retrieval system using fuzzy logic
Other Title(s)
تصميم و تطبيق منظومة استرجاع للمعلومات النصية بالاعتماد على المنطق المضبب
Dissertant
Thesis advisor
University
University of Baghdad
Faculty
College of Science
Department
Department of Computer Science
University Country
Iraq
Degree
Master
Degree Date
2011
English Abstract
Since the Web is a rich source of information, assistance systems are needed to search it efficiently and automatically and to nominate the most suitable Web documents.
Such tools help users find what they need on the Web without wasting time and effort.
The purpose of this study is to construct an automated information retrieval system that retrieves HTML documents whose textual content is similar to a document chosen by the user.
The textual content in every electronic repository (including the Web) is statistically variable and has complex behavior.
It is noticed that the classical similarity criteria do not give encouraging results in terms of accuracy and rationality.
The proposed system therefore develops a logical paradigm that relies fundamentally on fuzzy logic criteria to soften the matching decisions between textual contents and to make the results more rational and reasonable for the user.
The proposed system is composed of two phases: the enrollment phase and the retrieval phase.
In the enrollment phase, the Web documents are preprocessed using a sequence of text mining operations; useful knowledge (i.e., the set of keywords) is then extracted and stored, along with the keywords' textual features, in a dedicated database.
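The enrollment step can be illustrated with a minimal sketch. The abstract does not list the actual text mining operations or the stored features, so everything here is an assumption made for illustration: the tag stripping, the tiny stop-word list, the `extract_keywords` name, and the use of relative frequency as the keyword feature.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Illustrative stop-word list only; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document (tags are dropped)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def extract_keywords(html, top_k=10):
    """Return {keyword: relative frequency} for the top_k most frequent terms.

    Hypothetical helper: relative frequency stands in for the
    'textual features' mentioned in the abstract.
    """
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    tokens = [t for t in re.findall(r"[a-z]+", text)
              if t not in STOP_WORDS and len(t) > 2]
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.most_common(top_k)}
```

The returned dictionary plays the role of one row in the keyword database; note that this sketch does not filter out script or style content.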
In the retrieval phase, when the client submits a query document, the system converts it to a format compatible with the indexed feature vectors and matches it against every document stored in the database.
Each match yields a score that depends on the number of keywords common to both documents and on the relative significance of these common keywords in both documents.
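One possible reading of this scoring rule is sketched below. The abstract does not give the actual formula, so the choice made here (summing, over the common keywords, the smaller of the two significance weights) is an assumption, as is the `match_score` name and the representation of each document as a keyword-to-weight dictionary.

```python
def match_score(query, doc):
    """Score two documents given as {keyword: significance} dictionaries.

    Assumed formula: each common keyword contributes the smaller of its
    two weights, so the score grows with both the number of shared
    keywords and their significance in both documents.
    """
    common = query.keys() & doc.keys()
    return sum(min(query[k], doc[k]) for k in common)


# Illustrative weights, not data from the thesis.
query = {"fuzzy": 0.5, "logic": 0.25, "sets": 0.25}
stored = {"fuzzy": 0.25, "logic": 0.5, "web": 0.25}
print(match_score(query, stored))  # 0.5
```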
Each matching instance is fuzzified using an S-shaped membership function.
Then, applying the criterion "the higher the match score, the better the match", the list of scores (obtained by matching the query against all documents previously stored in the database) is sorted in descending order, and the top-listed documents are nominated as the query results.
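The fuzzification and ranking steps might look like the following sketch. The abstract does not say which S-shaped function or which parameters were used; the standard quadratic S-function with breakpoints `a` and `c` is assumed here, and the names `s_membership` and `rank_documents` are hypothetical.

```python
def s_membership(x, a, c):
    """Standard S-shaped membership function rising from 0 at a to 1 at c."""
    if c <= a:
        raise ValueError("c must be greater than a")
    b = (a + c) / 2  # crossover point, where membership is 0.5
    if x <= a:
        return 0.0
    if x >= c:
        return 1.0
    if x <= b:
        return 2 * ((x - a) / (c - a)) ** 2
    return 1 - 2 * ((x - c) / (c - a)) ** 2


def rank_documents(scores, a=0.0, c=1.0, top_n=5):
    """Fuzzify raw match scores and return the top_n documents, best first."""
    fuzzy = {doc_id: s_membership(s, a, c) for doc_id, s in scores.items()}
    return sorted(fuzzy.items(), key=lambda item: item[1], reverse=True)[:top_n]


# Illustrative raw scores, not data from the thesis.
print(rank_documents({"d1": 0.25, "d2": 0.75, "d3": 0.5}))
# [('d2', 0.875), ('d3', 0.5), ('d1', 0.125)]
```

Because the S-function is monotonic, it does not change the ordering of the raw scores; what it changes is how strongly mid-range scores are separated from low and high ones.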
The results of the conducted performance tests showed that using fuzzy logic softens the matching decisions and leads to better results than traditional hard computing techniques.
The best precision and recall values found were 0.99 and 0.66, respectively.
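For reference, precision and recall can be computed as in the short sketch below, using the standard set-based definitions. The document identifiers are made up for illustration and do not reproduce the thesis's test data.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant retrieved / all retrieved
    recall    = relevant retrieved / all relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 documents returned, 3 actually relevant, 2 in common.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"])
print(p)  # 0.5
```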
Main Subjects
Information Technology and Computer Science
Topics
- Mathematical models
- Set theory
- Data processing
- Word processing
- Information retrieval
- Data mining
- Internet searching
- Fuzzy logic
No. of Pages
126
Table of Contents
Table of contents.
Abstract.
Abstract in Arabic.
Chapter One : General introduction.
Chapter Two : Theoretical background.
Chapter Three : TBIR system.
Chapter Four : Test results.
Chapter Five : Conclusion and future work.
References.
American Psychological Association (APA)
Hamid, Ibrahim Amir. (2011). Design and implementation of text-based information retrieval system using fuzzy logic. (Master's thesis). University of Baghdad, Iraq
https://search.emarefa.net/detail/BIM-605672
Modern Language Association (MLA)
Hamid, Ibrahim Amir. Design and implementation of text-based information retrieval system using fuzzy logic. (Master's thesis). University of Baghdad. (2011).
https://search.emarefa.net/detail/BIM-605672
American Medical Association (AMA)
Hamid, Ibrahim Amir. (2011). Design and implementation of text-based information retrieval system using fuzzy logic. (Master's thesis). University of Baghdad, Iraq
https://search.emarefa.net/detail/BIM-605672
Language
English
Data Type
Arab Theses
Record ID
BIM-605672