An improved BIRCH algorithm for breast cancer clustering

Other Title(s)

تصنيف مرض سرطان الثدي باستخدام خوارزمية (BIRCH)‎ المحسنة

Dissertant

Barham, Maysarah Muhammad Husayn

Thesis advisor

al-Zubi, Ahmad Ghazi

University

Middle East University

Faculty

Faculty of Information Technology

Department

Computer Science Department

University Country

Jordan

Degree

Master

Degree Date

2020

English Abstract

Breast cancer has become a common disease affecting women worldwide, but in most cases treatment is possible when it is discovered early.

Screening tests play an important role in identifying tumors before they become cancerous, the stage at which diagnosis of breast cancer is most effective.

Over the past few decades, computer-aided diagnosis of cancer has been an active subject of research and has achieved significant advances.

However, automatic clustering and analysis of patient records in real time remains a challenging task, largely because it depends on how the BIRCH parameters, linkage methods, and similarity metrics are selected.

Clustering is an unsupervised machine learning technique used to group data elements without prior knowledge of the group definitions.

Applying aggregation algorithms to large amounts of data can lead to efficiency and accuracy problems.

To help specialists make sound decisions when dealing with patient records, this thesis proposes an improved version of the clustering algorithm known as balanced iterative reducing and clustering using hierarchies (BIRCH).

This approach transforms the medical records, including the disease features, and clusters them into subclusters so that similar features are grouped and analyzed.

The proposed improved BIRCH consists of four main components: feature selection, feature rescaling, efficient automatic threshold initialization, and empirical selection of linkage methods and distance metrics.

Specifically, the basic BIRCH algorithm is fed the selected, normalized features together with an automatically determined threshold value that controls the tree-based subclustering, and different linkage and similarity measures are applied.
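
As a rough illustration of how rescaled features and a threshold interact in BIRCH-style subclustering, the following is a minimal sketch and not the thesis implementation. It assumes min-max rescaling, replaces the CF-tree with a flat list of clustering features, and takes the threshold as a plain argument because the abstract does not describe the automatic initialization. The names `min_max_rescale`, `CF`, and `cf_subcluster`, and the toy rows, are hypothetical.

```python
import math


def min_max_rescale(rows):
    """Rescale every feature column to the range [0, 1] (assumed scheme)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


class CF:
    """Clustering feature: point count N, linear sum LS, squared sum SS."""

    def __init__(self, dim):
        self.n = 0
        self.ls = [0.0] * dim
        self.ss = 0.0

    def centroid(self):
        return [s / self.n for s in self.ls]

    def radius_if_added(self, point):
        """Radius the subcluster would have after absorbing `point`."""
        n = self.n + 1
        ls = [s + p for s, p in zip(self.ls, point)]
        ss = self.ss + sum(p * p for p in point)
        # radius^2 = SS/N - ||LS/N||^2, clamped against rounding error
        r2 = ss / n - sum((s / n) ** 2 for s in ls)
        return math.sqrt(max(r2, 0.0))

    def add(self, point):
        self.n += 1
        self.ls = [s + p for s, p in zip(self.ls, point)]
        self.ss += sum(p * p for p in point)


def cf_subcluster(points, threshold):
    """Flat simplification of BIRCH insertion (no CF-tree, no splits).

    Each point joins the subcluster with the nearest centroid if the
    resulting radius stays within `threshold`; otherwise it starts a
    new subcluster.
    """
    entries, labels = [], []
    for p in points:
        best, best_dist = None, math.inf
        for i, cf in enumerate(entries):
            d = math.dist(p, cf.centroid())
            if d < best_dist:
                best, best_dist = i, d
        if best is not None and entries[best].radius_if_added(p) <= threshold:
            entries[best].add(p)
            labels.append(best)
        else:
            cf = CF(len(p))
            cf.add(p)
            entries.append(cf)
            labels.append(len(entries) - 1)
    return entries, labels


# Hypothetical toy records with two features on very different scales.
rows = [[1, 10], [2, 12], [9, 90], [10, 100]]
entries, labels = cf_subcluster(min_max_rescale(rows), threshold=0.2)
print(len(entries), labels)
```

A smaller threshold yields more, tighter subclusters, while a larger one merges records more aggressively, which is why the choice of threshold matters for both speed and cluster quality.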

The Breast Cancer Wisconsin dataset is used to evaluate the proposed algorithm.

The experimental results show that the improved BIRCH algorithm achieves a clustering accuracy of 97.7% in only 0.0004 seconds, which confirms its efficiency in helping doctors analyze patient records and make decisions.

Main Topic

Information Technology and Computer Science

No. of Pages

54

Table of Contents

Table of contents.

Abstract.

Abstract in Arabic.

Chapter One: Study background and motivation.

Chapter Two: Related work.

Chapter Three: Methodology and proposed model.

Chapter Four: Experimental results and discussion.

Chapter Five: Conclusion and future work.

References.

American Psychological Association (APA)

Barham, Maysarah Muhammad Husayn. (2020). An improved BIRCH algorithm for breast cancer clustering. (Master's thesis). Middle East University, Jordan.
https://search.emarefa.net/detail/BIM-970874

Modern Language Association (MLA)

Barham, Maysarah Muhammad Husayn. An improved BIRCH algorithm for breast cancer clustering. (Master's thesis). Middle East University. (2020).
https://search.emarefa.net/detail/BIM-970874

American Medical Association (AMA)

Barham, Maysarah Muhammad Husayn. (2020). An improved BIRCH algorithm for breast cancer clustering. (Master's thesis). Middle East University, Jordan.
https://search.emarefa.net/detail/BIM-970874

Language

English

Data Type

Arab Theses

Record ID

BIM-970874