Estimating null value using rough set theory and bees algorithm

Other Title(s)

تخمين القيمة المفقودة باستخدام نظرية المجموعة الصارمة و خوارزمية النحل

Dissertant

Shakir, Samir Adil

Thesis advisor

Duaymi, Mahdi Kazaz
al-Ubaydi, Ahmad Tariq Sadiq

University

University of Baghdad

Faculty

College of Science

Department

Department of Computer Science

University Country

Iraq

Degree

Master

Degree Date

2012

English Abstract

Most real-world databases suffer from an unavoidable problem of incompleteness.

This thesis deals with the problem of null values that arise randomly in databases through user entries.

It presents a hybrid approach for solving the problem.

It hybridizes rough set theory with a swarm intelligence algorithm.

The proposed approach is a supervised learning model.

A large set of complete data, called the learning data, is used to find the decision rule sets that are then used to solve the incomplete data problem.

The swarm intelligence algorithm is used for feature selection: the bees algorithm serves as the heuristic search algorithm, combined with rough set theory as the evaluation function.
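A minimal sketch of how a bees-style search can be paired with a rough set evaluation function is given here for illustration only. It assumes the evaluation function is the rough set dependency degree (the share of records whose equivalence class is consistent on the decision attribute), and it uses a simplified scheme of scouts, elite sites, and recruited neighbourhood search. The function names, parameters, tie-breaking rule, and toy data are hypothetical and are not taken from the thesis.

```python
import random
from collections import defaultdict

# Hypothetical toy decision table: attribute "a" alone determines decision "d".
ROWS = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "no"},
    {"a": 1, "b": 0, "c": 0, "d": "yes"},
    {"a": 1, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 0, "b": 0, "c": 1, "d": "no"},
]
ATTRS = ["a", "b", "c"]

def dependency(rows, attrs, decision):
    """Rough set dependency degree: share of rows in decision-consistent classes."""
    decisions = defaultdict(set)
    sizes = defaultdict(int)
    for row in rows:
        key = tuple(row[a] for a in sorted(attrs))
        decisions[key].add(row[decision])
        sizes[key] += 1
    consistent = sum(sizes[k] for k, d in decisions.items() if len(d) == 1)
    return consistent / len(rows)

def neighbour(subset, all_attrs, rng):
    """Flip one attribute in or out of the subset (never return an empty set)."""
    flipped = set(subset) ^ {rng.choice(all_attrs)}
    return frozenset(flipped) if flipped else frozenset(subset)

def bees_feature_selection(rows, all_attrs, decision, scouts=6, best_sites=2,
                           recruits=3, iterations=20, seed=0):
    rng = random.Random(seed)

    def random_subset():
        size = rng.randint(1, len(all_attrs))
        return frozenset(rng.sample(all_attrs, size))

    def score(subset):
        # Assumed fitness: higher dependency first, then fewer attributes.
        return (dependency(rows, subset, decision), -len(subset))

    population = [random_subset() for _ in range(scouts)]
    for _ in range(iterations):
        population.sort(key=score, reverse=True)
        next_population = []
        # Recruited bees search the neighbourhood of each elite site;
        # the site itself is kept as a candidate, so quality never drops.
        for site in population[:best_sites]:
            candidates = [site] + [neighbour(site, all_attrs, rng)
                                   for _ in range(recruits)]
            next_population.append(max(candidates, key=score))
        # The remaining bees scout new random subsets.
        next_population += [random_subset() for _ in range(scouts - best_sites)]
        population = next_population
    return max(population, key=score)
```

Calling `bees_feature_selection(ROWS, ATTRS, "d")` returns a frozenset of attribute names; on this toy table any subset containing `a` reaches a dependency degree of 1.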

Furthermore, another feature selection algorithm, Iterative Dichotomiser 3 (ID3), is presented; it works as a statistical algorithm instead of an intelligent one.
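As an illustration of the statistical alternative, the sketch below ranks attributes by information gain, which is the standard ID3 selection criterion. Whether the thesis uses exactly this formulation is an assumption; the function names and the toy table are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy decision table with decision attribute "play".
ROWS = [
    {"outlook": "sunny", "windy": "no",  "play": "yes"},
    {"outlook": "sunny", "windy": "no",  "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no",  "play": "yes"},
    {"outlook": "rainy", "windy": "no",  "play": "no"},
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of decision values."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, attr, decision):
    """Entropy reduction obtained by splitting the rows on attr."""
    base = entropy([row[decision] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[decision])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

def best_attribute(rows, attrs, decision):
    """Pick the attribute with the highest information gain."""
    return max(attrs, key=lambda a: information_gain(rows, a, decision))
```

On this toy table, `best_attribute(ROWS, ["outlook", "windy"], "play")` selects `outlook`, because splitting on it leaves less uncertainty about `play` than splitting on `windy`.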

The proposed approach uses rough approximation sets to write the rules: rules with confidence equal to one are written from the lower approximation sets, and rules with confidence less than one from the upper approximation sets.
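The link between approximation sets and rule confidence can be sketched as follows. The example assumes that confidence is the fraction of records in an equivalence class that carry the rule's decision value: a class lying entirely inside one decision class belongs to its lower approximation and gives confidence 1, while a class that only overlaps it belongs to the upper approximation alone and gives confidence below 1. The function names and the toy table are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy decision table with decision attribute "play".
ROWS = [
    {"outlook": "sunny", "windy": "no",  "play": "yes"},
    {"outlook": "sunny", "windy": "no",  "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no",  "play": "yes"},
    {"outlook": "rainy", "windy": "no",  "play": "no"},
]

def equivalence_classes(rows, attrs):
    """Group row indices that are indiscernible on the given attributes."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].append(i)
    return classes

def rules(rows, attrs, decision):
    """Return (conditions, decision_value, confidence) triples.

    A class with a single decision value yields one rule with confidence 1
    (lower approximation); a mixed class yields one rule per decision value
    with confidence below 1 (upper approximation only).
    """
    out = []
    for key, members in equivalence_classes(rows, attrs).items():
        counts = defaultdict(int)
        for i in members:
            counts[rows[i][decision]] += 1
        for value, n in counts.items():
            out.append((dict(zip(attrs, key)), value, n / len(members)))
    return out
```

For `rules(ROWS, ["outlook", "windy"], "play")`, the class (sunny, no) produces a certain rule for `yes`, while the class (rainy, no) produces two possible rules, one for `yes` and one for `no`, each with confidence 0.5.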

The left-hand-side conditional attributes of the rules are selected using the bees algorithm's evaluation function in the intelligent algorithm or ID3's selection function in the statistical algorithm.

A reduct is a minimal subset of attributes that enables the same classification as the whole set of attributes.

The reduct is therefore used as the stopping condition for the proposed approach.

In other words, attribute selection and rule writing stop once the set of left-hand-side attributes represents a reduct.
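The reduct-based stopping condition can be sketched as below. The example assumes that "same classification" means the same positive region as the full attribute set, and that attributes arrive in an order supplied by some selection function. Note that a forward loop of this kind stops at the first subset that preserves the classification, which may still contain a redundant attribute; `is_reduct` checks minimality separately. All names and the toy table are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy decision table: attribute "a" alone determines decision "d".
ROWS = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "no"},
    {"a": 1, "b": 0, "c": 0, "d": "yes"},
    {"a": 1, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 0, "b": 0, "c": 1, "d": "no"},
]
ATTRS = ["a", "b", "c"]

def positive_region(rows, attrs, decision):
    """Indices of rows whose equivalence class has a single decision value."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].append(i)
    region = set()
    for members in classes.values():
        if len({rows[i][decision] for i in members}) == 1:
            region.update(members)
    return region

def preserves_classification(rows, subset, all_attrs, decision):
    """True if subset classifies exactly the rows that all_attrs classifies."""
    return (positive_region(rows, subset, decision)
            == positive_region(rows, all_attrs, decision))

def is_reduct(rows, subset, all_attrs, decision):
    """True if subset preserves classification and no attribute can be dropped."""
    if not preserves_classification(rows, subset, all_attrs, decision):
        return False
    return all(
        not preserves_classification(
            rows, [a for a in subset if a != drop], all_attrs, decision)
        for drop in subset
    )

def select_until_preserved(rows, ranked_attrs, all_attrs, decision):
    """Add attributes in ranked order; stop once classification is preserved."""
    chosen = []
    for attr in ranked_attrs:
        chosen.append(attr)
        if preserves_classification(rows, chosen, all_attrs, decision):
            break
    return chosen
```

With the ranking `["a", "b", "c"]` the loop stops after selecting `a`, which is a reduct of this table; with the ranking `["b", "a", "c"]` it stops at `["b", "a"]`, which preserves the classification but is not minimal.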

To assess the system and show the performance of the intelligent approach to null value estimation, the intelligent and statistical approaches are compared on three parameters when working with rough set theory: rule set size, rule set complexity, and accuracy of estimation.

The results obtained from most code sets show that the bees algorithm outperforms ID3 both in reducing the number of extracted rules without affecting accuracy and in increasing the accuracy ratio of null value estimation, especially as the number of null values increases.

Using the best code set, with one null value in each record of the testing data, the proposed system is able to estimate 95.9% of null values as approximate estimations and 91.6% of them as true estimations.

When two null values occur within each record, 92.7% of the null values are approximately estimated, while 78.08% of them are absolutely true estimates.

The proposed system is also tested with three null values occurring within each record.

In this case, 84.8% of the null values are approximately estimated and 73.61% of them are absolutely true estimates.

Main Subjects

Information Technology and Computer Science

No. of Pages

110

Table of Contents

Table of contents.

Abstract.

Abstract in Arabic.

Chapter One : Introduction.

Chapter Two : Theoretical concepts.

Chapter Three : The proposed estimating null value approach.

Chapter Four : Experimental results.

Chapter Five : Conclusions and future work.

References.

American Psychological Association (APA)

Shakir, Samir Adil. (2012). Estimating null value using rough set theory and bees algorithm. (Master's thesis). University of Baghdad, Iraq
https://search.emarefa.net/detail/BIM-605415

Modern Language Association (MLA)

Shakir, Samir Adil. Estimating null value using rough set theory and bees algorithm. (Master's thesis). University of Baghdad. (2012).
https://search.emarefa.net/detail/BIM-605415

American Medical Association (AMA)

Shakir, Samir Adil. (2012). Estimating null value using rough set theory and bees algorithm. (Master's thesis). University of Baghdad, Iraq
https://search.emarefa.net/detail/BIM-605415

Language

English

Data Type

Arab Theses

Record ID

BIM-605415