A Hadoop MapReduce Implementation of C5.0 decision tree algorithm

Other Title(s)

إنشاء و تطبيق خوارزمية شجرة القرارات "C5.0" باستخدام "Hadoop MapReduce"

Dissertant

Abu Labbad, Mamun Fawwaz.

Thesis advisor

al-Sharbatji, Bassam

University

Middle East University

Faculty

Faculty of Information Technology

Department

Computer Science Department

University Country

Jordan

Degree

Master

Degree Date

2020

English Abstract

Recently, many research institutes have been working to improve the accuracy and efficiency of classification techniques.

To date, considerable effort has been spent on enhancing such techniques.

In addition, the growing volume of data produced daily raises further issues that need to be resolved and poses challenges for standard Decision Tree (DT) algorithms.

Likewise, generating a DT is complicated, and completing the computation on one machine is time-consuming when the datasets become large, because a single machine cannot keep the whole training dataset, or most of it, in memory.

Some computations are therefore moved to secondary storage, which increases the cost of input/output (I/O).

In this thesis, the researcher implements the standard DT algorithm C5.0 using Hadoop MapReduce and compares its error rate, number of leaf nodes, and rules with those of C4.5.

The procedure used in this thesis is to transform the standard algorithm into a series of Map and Reduce steps, to implement data structures that reduce the cost of communication, and to carry out comprehensive experiments on a very large dataset.
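As an illustration of what recasting a tree-building step as Map and Reduce can look like, the following is a minimal single-process sketch in Python. It is not the thesis's implementation and does not use the Hadoop API: the function names (`map_record`, `reduce_counts`, `gain_ratios`, `best_attribute`) are hypothetical, the records are assumed to be tuples of categorical attribute values ending in a class label, and the split criterion is assumed to be the gain ratio used by C4.5-family trees. The point it shows is that the choice of split attribute can be made from aggregated counts alone, without holding the full training set in memory.

```python
import math
from collections import Counter, defaultdict


def map_record(record):
    """Map step (hypothetical): emit ((attribute_index, value, label), 1)."""
    *features, label = record
    for idx, value in enumerate(features):
        yield (idx, value, label), 1


def reduce_counts(pairs):
    """Reduce step (hypothetical): sum the counts emitted for each key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


def entropy(counts):
    """Shannon entropy (base 2) of a collection of non-negative counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


def gain_ratios(totals):
    """Compute the gain ratio of each attribute from aggregated counts only."""
    by_attr = defaultdict(lambda: defaultdict(Counter))
    for (idx, value, label), n in totals.items():
        by_attr[idx][value][label] += n

    ratios = {}
    for idx, values in by_attr.items():
        class_totals = Counter()
        for label_counts in values.values():
            class_totals.update(label_counts)
        n = sum(class_totals.values())
        base = entropy(class_totals.values())
        sizes = [sum(lc.values()) for lc in values.values()]
        # Weighted entropy of the class labels after splitting on this attribute.
        cond = sum(
            s / n * entropy(lc.values()) for s, lc in zip(sizes, values.values())
        )
        split_info = entropy(sizes)
        ratios[idx] = (base - cond) / split_info if split_info else 0.0
    return ratios


def best_attribute(records):
    """Run the map and reduce steps, then pick the highest gain ratio."""
    pairs = (pair for record in records for pair in map_record(record))
    ratios = gain_ratios(reduce_counts(pairs))
    return max(ratios, key=ratios.get)
```

In a real cluster job the map and reduce functions would run on separate nodes over partitions of the data, and only the small table of counts would be communicated, which is one plausible way to keep communication cost low.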

The results of the study revealed that the MapReduce C5.0 tree addresses the memory limitation and improves the execution time of the algorithm, and that it is suitable for very large datasets.

The algorithm is also scalable in a cluster environment and time-efficient.

Main Topic

Information Technology and Computer Science

No. of Pages

36

Table of Contents

Table of contents.

Abstract.

Abstract in Arabic.

Chapter One : Introduction.

Chapter Two : Theoretical background and related works.

Chapter Three : Methodology and the proposed approach.

Chapter Four : Experimental design and results.

Chapter Five : Conclusion and future work.

References.

American Psychological Association (APA)

Abu Labbad, Mamun Fawwaz. (2020). A Hadoop MapReduce Implementation of C5.0 decision tree algorithm. (Master's thesis). Middle East University, Jordan
https://search.emarefa.net/detail/BIM-970873

Modern Language Association (MLA)

Abu Labbad, Mamun Fawwaz. A Hadoop MapReduce Implementation of C5.0 decision tree algorithm. (Master's thesis). Middle East University. (2020).
https://search.emarefa.net/detail/BIM-970873

American Medical Association (AMA)

Abu Labbad, Mamun Fawwaz. (2020). A Hadoop MapReduce Implementation of C5.0 decision tree algorithm. (Master's thesis). Middle East University, Jordan
https://search.emarefa.net/detail/BIM-970873

Language

English

Data Type

Arab Theses

Record ID

BIM-970873