
An enhancement over BIRCH hierarchical clustering algorithms for better partitioning of medical data
Thesis author
Thesis advisor
University
Isra University
Faculty
Faculty of Information Technology
Academic department
Department of Software Engineering
University country
Jordan
Degree
Master's
Degree date
2020
English abstract
Over the years, technology has revolutionized our world and daily lives. Information is becoming more accessible and is shared with public users, and big data across the web are being collected and saved in all forms, from text to various media files. Machine learning algorithms use these data to learn more about them, which in turn can make the algorithms more useful and applicable in the real world. Clustering algorithms are unsupervised machine learning algorithms that can be used in many fields, including pattern recognition and image analysis. There are many clustering algorithms, such as K-means and Agglomerative Hierarchical Clustering (AHC); however, they work well only on specific data sets.
Clustering algorithms can be applied to medical data to find undiscovered patterns, which in turn improves the medical field's knowledge about patients and different diseases. This thesis focuses on one of the most dangerous diseases, cancer. The SEER databases provide a large amount of data, from 1973 until now, about cancer patients from various locations and sources throughout the United States. Finding useful patterns in these data requires a clustering algorithm that can handle such big data, and BIRCH is one of the most effective clustering algorithms for big data.
This thesis investigates new techniques and proposes the MD-BIRCH algorithm, an enhanced version of the BIRCH algorithm that applies Manhattan distance across multiple phases of BIRCH: the early stage of compacting data points into an initial Clustering Feature (CF) tree, the middle stages of descending deeper into the tree, and the late stages of removing outliers and performing global clustering on the whole tree with another modified clustering algorithm based on Manhattan distance.
The experiments were conducted on the SEER medical dataset over multiple clustering iterations, with BIRCH and MD-BIRCH each executed 8 times on a big data sample of cancer patients. The results showed that the MD-BIRCH algorithm outperformed the BIRCH algorithm in terms of quality and had slightly better performance.
This work was implemented in the Python 3.7 programming language.
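The abstract does not give the formulas used in MD-BIRCH, so the following is only a minimal illustrative sketch of the general idea it describes: summarizing points as Clustering Features and using Manhattan distance to decide which sub-cluster absorbs a new point. The names `ClusteringFeature`, `manhattan`, and `absorb`, the flat list in place of a CF tree, and the centroid-distance threshold test are all assumptions made for illustration and are not taken from the thesis.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ClusteringFeature:
    """BIRCH-style summary of a sub-cluster: count, linear sum, squared sum."""
    n: int
    linear_sum: List[float]
    squared_sum: float

    @classmethod
    def from_point(cls, point: Sequence[float]) -> "ClusteringFeature":
        return cls(1, list(point), sum(x * x for x in point))

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.linear_sum]

    def add(self, point: Sequence[float]) -> None:
        # CF entries are additive, so absorbing a point is a simple update.
        self.n += 1
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]
        self.squared_sum += sum(x * x for x in point)


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def absorb(points: Sequence[Sequence[float]], threshold: float) -> List[ClusteringFeature]:
    """Hypothetical flat stand-in for CF-tree insertion.

    Each point joins the nearest CF (by Manhattan distance to its centroid)
    if that distance is within `threshold`; otherwise it starts a new CF.
    """
    features: List[ClusteringFeature] = []
    for p in points:
        if features:
            nearest = min(features, key=lambda cf: manhattan(cf.centroid(), p))
            if manhattan(nearest.centroid(), p) <= threshold:
                nearest.add(p)
                continue
        features.append(ClusteringFeature.from_point(p))
    return features


sample = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (11.0, 10.0)]
summaries = absorb(sample, threshold=2.0)
```

With this sample and threshold, the four points collapse into two summaries of two points each. Standard BIRCH tests the radius or diameter of the merged sub-cluster against the threshold and organizes the summaries in a height-balanced tree; the sketch leaves both out to keep the distance substitution visible.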
Main subjects
Information Technology and Computer Science
Topics
Number of pages
52
Table of contents
Table of contents.
Abstract.
Chapter One: Introduction.
Chapter Two: Literature review.
Chapter Three: Methodology.
Chapter Four: Design, analysis and implementation.
Chapter Five: Results.
Chapter Six: Conclusion and future work.
References.
American Psychological Association (APA) citation style
al-Nusur, Rad Muhammad Jamil. (2020). An enhancement over BIRCH hierarchical clustering algorithms for better partitioning of medical data. (Master's thesis). Isra University, Jordan
https://search.emarefa.net/detail/BIM-985129
Modern Language Association (MLA) citation style
al-Nusur, Rad Muhammad Jamil. An enhancement over BIRCH hierarchical clustering algorithms for better partitioning of medical data. (Master's thesis). Isra University. (2020).
https://search.emarefa.net/detail/BIM-985129
American Medical Association (AMA) citation style
al-Nusur, Rad Muhammad Jamil. (2020). An enhancement over BIRCH hierarchical clustering algorithms for better partitioning of medical data. (Master's thesis). Isra University, Jordan
https://search.emarefa.net/detail/BIM-985129
Text language
English
Data type
Theses and dissertations
Record number
BIM-985129