Using unlabeled data set for mining know from distributed data base system

Dissertant

al-Zubaydi, al-Azhar Haih

University

University of Technology

Faculty

-

Department

Computer Sciences Department

University Country

Iraq

Degree

Master

Degree Date

2008

English Abstract

The need to understand large, complex, information-rich data sets is common to virtually all fields of business, science, and engineering. In the business world, where corporate and customer data are increasingly recognized as a strategic asset, this need calls for more effective approaches to dealing with such data.

This thesis tries to extract as much information as possible from the available data. It proposes a technique called clustering for classification and formalizes it for the two environments of distributed database systems, known as homogeneous and heterogeneous distributed database systems.

Two algorithms are introduced to describe and compare the application of the proposed technique to the two types of distributed database system.

The first proposed algorithm, Homogeneous Distributed Clustering For Classification (HOMDCFC), learns a classification model from unlabeled datasets distributed homogeneously over a network. It builds a local clustering model on the datasets held at three sites in the network, then builds a local classification model at each site from the labeled data produced by the clustering model. A global classification model is then built on one computer, designated the control computer, and used for future prediction.

The second proposed algorithm, Heterogeneous Distributed Clustering For Classification (HETDCFC), builds a classification model over unlabeled datasets distributed heterogeneously over the sites of a network. In this algorithm the datasets are collected on one central computer, where the clustering model and then the classification model are built.

The two proposed algorithms are compared on three criteria: the accuracy of the resulting classifier, the execution time, and the cost of central storage.
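The abstract does not say which clustering or classification methods the thesis uses, so the following is only a minimal sketch of the clustering-for-classification idea under stated assumptions: k-means with a deterministic farthest-first start stands in for the clustering step, a nearest-centroid rule stands in for the classifier, and the global model in the homogeneous case is formed by clustering the local class means at the control computer. All function names (`cluster`, `train_classifier`, `homogeneous`, `heterogeneous`) and the sample data are hypothetical and are not taken from the thesis.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, p):
    """Index of the centroid closest to p; also serves as the predict step."""
    return min(range(len(centroids)), key=lambda i: dist2(centroids[i], p))

def mean(points):
    return tuple(sum(col) / len(points) for col in zip(*points))

def cluster(points, k, iters=10):
    """Assumed clustering step: k-means with farthest-first initialization.
    Returns one cluster label per point, turning unlabeled rows into labeled rows."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        # Keep the previous centroid if a cluster happens to be empty.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return [nearest(centroids, p) for p in points]

def train_classifier(points, labels):
    """Assumed classifier: nearest centroid, stored as one mean vector per class."""
    classes = sorted(set(labels))
    return [mean([p for p, l in zip(points, labels) if l == c]) for c in classes]

def homogeneous(sites, k):
    """Every site holds rows with the same columns. Each site clusters and
    trains locally; only the local models travel to the control computer."""
    local_models = []
    for rows in sites:
        labels = cluster(rows, k)                            # local clustering model
        local_models.append(train_classifier(rows, labels))  # local classification model
    # Control computer (assumption): align and merge local class means by clustering them.
    all_means = [m for model in local_models for m in model]
    return train_classifier(all_means, cluster(all_means, k))

def heterogeneous(fragments, k):
    """Every site holds different columns of the same records, keyed by a
    record id. The fragments are joined centrally, then clustered and classified."""
    ids = sorted(set.intersection(*(set(f) for f in fragments)))
    rows = [sum((f[i] for f in fragments), ()) for i in ids]  # concatenate column tuples
    return train_classifier(rows, cluster(rows, k))

# Hypothetical data: three sites sharing the schema (x, y).
sites = [
    [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)],
    [(0.5, 0.5), (1, 1), (9, 9), (9.5, 10)],
    [(0, 0.5), (10, 9), (11, 11), (1, 0.5)],
]
global_model = homogeneous(sites, k=2)
print(nearest(global_model, (0.2, 0.3)), nearest(global_model, (10.5, 10.2)))
```

In this sketch the homogeneous variant ships only a handful of class means to the control computer, whereas the heterogeneous variant must move every column fragment to the central machine before any model can be built. That difference is one plausible source of the central storage cost the abstract lists as a comparison criterion.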

Main Subjects

Information Technology and Computer Science

American Psychological Association (APA)

al-Zubaydi, al-Azhar Haih. (2008). Using unlabeled data set for mining know from distributed data base system (Master's thesis). University of Technology, Iraq.
https://search.emarefa.net/detail/BIM-305378

Modern Language Association (MLA)

al-Zubaydi, al-Azhar Haih. Using unlabeled data set for mining know from distributed data base system. Master's thesis, University of Technology, 2008.
https://search.emarefa.net/detail/BIM-305378

American Medical Association (AMA)

al-Zubaydi, al-Azhar Haih. Using unlabeled data set for mining know from distributed data base system [master's thesis]. Iraq: University of Technology; 2008.
https://search.emarefa.net/detail/BIM-305378

Language

English

Data Type

Arab Theses

Record ID

BIM-305378