A data mining approach for categorising web documents
Dissertant
Thesis advisor
Yusuf, Nadiyah I.
Thabtah, Fadi
University
Philadelphia University
Faculty
Faculty of Information Technology
Department
Department of Computer Science
University Country
Jordan
Degree
Master
Degree Date
2008
English Abstract
Data Mining is the core step of knowledge discovery in databases (KDD).
Data Mining applies efficient algorithms to extract interesting patterns and regularities from the data.
As the volume of information in digital form increases, Text Categorization techniques, which aim at finding relevant information, become more necessary.
To improve the quality of classification from textual data sets, this thesis evaluates Associative Classification, which uses association rule discovery techniques to construct classification systems.
In particular, we developed a vertical mining representation for an associative classification algorithm in order to improve the accuracy of the classification phase and to reduce the memory required to store intermediate transaction identifier sets (TIDs) during mining.
Because a vertical data structure supports fast frequency counting through intersection operations on TIDs, this representation is expected to improve accuracy and decrease memory usage.
This thesis addresses the problem of applying Associative Classification to Text Categorization and uses the Diffset structure as its mining approach.
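The abstract refers to frequency counting by TID intersection and to the Diffset structure. The sketch below illustrates the general idea only; it is not taken from the thesis, and the toy documents, terms, and function names (to_vertical, support_by_tidset, diffset, support_by_diffset) are assumptions made for illustration.

```python
# Toy horizontal database: transaction (document) id -> set of terms.
# The data is hypothetical and only serves the illustration.
transactions = {
    1: {"data", "mining", "web"},
    2: {"data", "web"},
    3: {"mining", "text"},
    4: {"data", "mining", "web", "text"},
}


def to_vertical(db):
    """Invert the database so each item maps to the set of TIDs containing it."""
    vertical = {}
    for tid, items in db.items():
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical


def support_by_tidset(vertical, itemset):
    """Support of a non-empty itemset: size of the intersection of its tidsets."""
    tidsets = [vertical[item] for item in itemset]
    return len(set.intersection(*tidsets))


def diffset(vertical, prefix, item):
    """TIDs that contain every item of `prefix` but do not contain `item`."""
    prefix_tids = set.intersection(*(vertical[i] for i in prefix))
    return prefix_tids - vertical[item]


def support_by_diffset(vertical, prefix, item):
    """support(prefix + item) = support(prefix) - |diffset(prefix, item)|."""
    return support_by_tidset(vertical, prefix) - len(diffset(vertical, prefix, item))


vertical = to_vertical(transactions)
print(support_by_tidset(vertical, ["data", "mining"]))
print(diffset(vertical, ["data"], "mining"))
print(support_by_diffset(vertical, ["data"], "mining"))
```

In this sketch both counting routes give the same support for an itemset; the diffset route stores only the TIDs that drop out when an item is added, which tends to be smaller than the full intersection when the data is dense.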
No. of Pages
88
Table of Contents
Table of contents.
Abstract.
Chapter One : Introduction.
Chapter Two : Literature review : associative rule, and classification.
Chapter Three : Integrating classification and association rule mining.
Chapter Four : Vertical text categorization (VTC).
Chapter Five : Conclusions and future work.
References.
American Psychological Association (APA)
Qaddum, Kifayah Said. (2008). A data mining approach for categorising web documents. (Master's thesis). Philadelphia University, Jordan
https://search.emarefa.net/detail/BIM-546174
Modern Language Association (MLA)
Qaddum, Kifayah Said. A data mining approach for categorising web documents. (Master's thesis). Philadelphia University. (2008).
https://search.emarefa.net/detail/BIM-546174
American Medical Association (AMA)
Qaddum, Kifayah Said. (2008). A data mining approach for categorising web documents. (Master's thesis). Philadelphia University, Jordan
https://search.emarefa.net/detail/BIM-546174
Language
English
Data Type
Arab Theses
Record ID
BIM-546174