A data mining approach for categorising web documents

Dissertant

Qaddum, Kifayah Said

Thesis advisor

Yusuf, Nadiyah I.
Thabtah, Fadi

University

Philadelphia University

Faculty

Faculty of Information Technology

Department

Department of Computer Science

University Country

Jordan

Degree

Master

Degree Date

2008

English Abstract

The core step of knowledge discovery in databases (KDD) is Data Mining.

Data Mining applies efficient algorithms to extract interesting patterns and regularities from the data.

As the volume of information in digital form increases, the use of Text Categorization techniques, which aim at finding relevant information, becomes more necessary.

To improve the quality of the classification process for textual data sets, this thesis evaluates Associative Classification, which uses association rule discovery techniques to construct classification systems.

In particular, we developed an associative classification algorithm based on a vertical mining representation, in order to improve the accuracy of the classification phase and to reduce the memory required to store intermediate TIDS during the mining process.

Because a vertical data structure supports fast frequency counting via intersection operations on transaction identifiers (TIDS), this approach should improve accuracy and decrease memory usage.
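
As an illustration of frequency counting by TID intersection, the following minimal Python sketch converts a small horizontal data set into a vertical layout and counts the support of an itemset. It is not taken from the thesis: the toy transactions and the names `to_vertical` and `support` are hypothetical, chosen only to show the general idea.

```python
from collections import defaultdict

# Hypothetical toy data (an assumption, not from the thesis):
# transaction id -> terms occurring in that document.
transactions = {
    1: {"data", "mining", "web"},
    2: {"data", "web"},
    3: {"mining", "text"},
    4: {"data", "mining", "text"},
}

def to_vertical(horizontal):
    """Map each item to the set of transaction ids (its tidset) containing it."""
    vertical = defaultdict(set)
    for tid, items in horizontal.items():
        for item in items:
            vertical[item].add(tid)
    return dict(vertical)

def support(vertical, itemset):
    """Count the transactions containing every item by intersecting tidsets."""
    tidsets = [vertical[item] for item in itemset]
    return len(set.intersection(*tidsets))

vertical = to_vertical(transactions)
data_mining_support = support(vertical, ["data", "mining"])
```

In the vertical layout no rescan of the transactions is needed: the support of an itemset is simply the size of the intersection of its items' tidsets.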

This thesis addresses the problem of using Associative Classification for Text Categorization, and utilizes the Diffset structure as its mining approach.
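
The Diffset idea can be sketched in a few lines of Python. This is a hedged illustration of the standard diffset definition, not the thesis's own algorithm: instead of storing the full tidset of an extended itemset, only the TIDs that are lost when the itemset is extended are kept, and support is obtained by subtraction. The tidsets and the helper `diffset` below are hypothetical.

```python
def diffset(tidset_px, tidset_py):
    """Diffset of PXY relative to PX: tids that contain PX but not PY."""
    return tidset_px - tidset_py

# Hypothetical tidsets for two single terms (assumed values for illustration).
tidset_data = {1, 2, 4}
tidset_mining = {1, 3, 4}

# Only the difference is stored, not the full intersection.
d_data_mining = diffset(tidset_data, tidset_mining)

# support(PXY) = support(PX) - |diffset(PXY)|
support_data_mining = len(tidset_data) - len(d_data_mining)
```

When itemsets share most of their transactions, the diffset is much smaller than the corresponding tidset, which is the source of the memory saving the abstract refers to.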

Main Subjects

Mathematics

No. of Pages

88

Table of Contents

Table of contents.

Abstract.

Chapter One : Introduction.

Chapter Two : Literature review : associative rule, and classification.

Chapter Three : Integrating classification and association rule mining.

Chapter Four : Vertical text categorization (VTC).

Chapter Five : Conclusions and future work.

References.

American Psychological Association (APA)

Qaddum, Kifayah Said. (2008). A data mining approach for categorising web documents. (Master's thesis). Philadelphia University, Jordan.
https://search.emarefa.net/detail/BIM-546174

Modern Language Association (MLA)

Qaddum, Kifayah Said. A data mining approach for categorising web documents. (Master's thesis). Philadelphia University. (2008).
https://search.emarefa.net/detail/BIM-546174

American Medical Association (AMA)

Qaddum, Kifayah Said. (2008). A data mining approach for categorising web documents. (Master's thesis). Philadelphia University, Jordan.
https://search.emarefa.net/detail/BIM-546174

Language

English

Data Type

Arab Theses

Record ID

BIM-546174