Using unlabeled data set for mining know from distributed data base system
Dissertant
University
University of Technology
Faculty
-
Department
Computer Sciences Department
University Country
Iraq
Degree
Master
Degree Date
2008
English Abstract
The need to understand large, complex, information-rich data sets is common to virtually all fields of business, science, and engineering. In the business world, where corporate and customer data are increasingly recognized as a strategic asset, this need calls for more effective approaches to handling such data.
This thesis tries to extract as much information as possible from the available data. It proposes a technique called clustering for classification and formalizes it for the two environments of distributed database systems: homogeneous and heterogeneous.
Two algorithms are introduced to describe and compare the application of the proposed technique to the two types of distributed database systems.
The first proposed algorithm, Homogeneous Distributed Clustering For Classification (HOMDCFC), learns a classification model from unlabeled data sets distributed homogeneously over a network. It builds a local clustering model on the data sets held at three sites in the network, then builds a local classification model at each site from the labeled data produced by the clustering model. On one computer, designated the control computer, a global classification model is built and then used for future prediction.
The second proposed algorithm, Heterogeneous Distributed Clustering For Classification (HETDCFC), builds a classification model over unlabeled data sets distributed heterogeneously over the sites of a network. In this algorithm the data sets are collected on one central computer, where the clustering model and then the classification model are built.
The work compares the two proposed algorithms on three criteria: the accuracy of the resulting classifier, the execution time, and the cost of central storage.
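The two workflows described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not name the clustering or classification methods, so k-means and a nearest-centroid classifier are assumed stand-ins, the global model is assumed to be built by clustering the local centroids at the control computer, and the heterogeneous case is assumed to mean that each site holds different attributes of the same records. All function and variable names are hypothetical.

```python
# Illustrative sketch only. k-means and the nearest-centroid classifier are
# ASSUMED stand-ins; the abstract does not say which methods the thesis uses.
from math import dist


def predict(centroids, p):
    """Nearest-centroid classifier: the label is the index of the closest centroid."""
    return min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # pick the point farthest from every centroid chosen so far
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[predict(centroids, p)].append(p)
        # recompute each centroid as its group mean; keep the old one if the group is empty
        centroids = [tuple(sum(x) / len(g) for x in zip(*g)) if g else c
                     for g, c in zip(groups, centroids)]
    return centroids


def homogeneous_global_model(sites, k):
    """HOMDCFC-style flow (assumed): same attributes at every site, different rows."""
    # Each site clusters its own rows. The cluster-labeled rows train a local
    # nearest-centroid classifier, which is fully described by the centroids.
    local_models = [kmeans(rows, k) for rows in sites]
    # Control computer (assumed merge step): cluster the local centroids so that
    # cluster labels are aligned across sites, giving one global classifier.
    all_centroids = [c for model in local_models for c in model]
    return kmeans(all_centroids, k)


def heterogeneous_global_model(site_columns, k):
    """HETDCFC-style flow (assumed): each site holds different attributes per record id."""
    # Collect everything on one central computer by joining on the shared record id.
    ids = sorted(set.intersection(*(set(s) for s in site_columns)))
    joined = [sum((s[i] for s in site_columns), ()) for i in ids]
    # Cluster centrally, then classify with the resulting centroids.
    return kmeans(joined, k)


# Three sites holding rows with the same two attributes.
sites = [
    [(1.0, 1.0), (1.2, 0.8), (9.0, 9.0), (8.8, 9.2)],
    [(0.9, 1.1), (9.1, 8.9), (1.1, 1.0), (9.2, 9.1)],
    [(1.0, 0.9), (8.9, 9.0), (9.0, 8.8), (1.1, 1.2)],
]
global_model = homogeneous_global_model(sites, k=2)
label_low = predict(global_model, (1.0, 1.0))
label_high = predict(global_model, (9.0, 9.0))
```

In this sketch the homogeneous flow sends only centroids to the control computer, while the heterogeneous flow moves all rows to the central computer. That difference is one plausible source of the execution-time and central-storage trade-offs the abstract says were compared.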
Main Subjects
Information Technology and Computer Science
Topics
American Psychological Association (APA)
al-Zubaydi, al-Azhar Haih. (2008). Using unlabeled data set for mining know from distributed data base system. (Master's thesis). University of Technology, Iraq
https://search.emarefa.net/detail/BIM-305378
Modern Language Association (MLA)
al-Zubaydi, al-Azhar Haih. Using unlabeled data set for mining know from distributed data base system. (Master's thesis). University of Technology. (2008).
https://search.emarefa.net/detail/BIM-305378
American Medical Association (AMA)
al-Zubaydi, al-Azhar Haih. (2008). Using unlabeled data set for mining know from distributed data base system. (Master's thesis). University of Technology, Iraq
https://search.emarefa.net/detail/BIM-305378
Language
English
Data Type
Arab Theses
Record ID
BIM-305378