Distance and similarity measures effect on the performance of k-nearest neighbor classifier

Other titles

تأثير مقاييس المسافة و التشابه على أداء مصنف أقرب جار

Thesis author

Abu al-Filat, Hanin Arafat

Thesis advisor

al-Hasanat, Ahmad Bashir A.

Committee members

al-Abadilah, Ahmad Hamad Hammud
Salman, Hamzah Iyal
al-Hasanat, Mahmud Bashir

University

Mutah University

Faculty

Faculty of Information Technology

Academic department

Computer Department

University country

Jordan

Degree

Master's

Degree date

2017

English abstract

The K-Nearest Neighbor (KNN) classifier is one of the simplest and most common classifiers, yet its performance competes with the most complex classifiers in the literature.

The core of this classifier depends mainly on measuring the distance or similarity between the tested example and the training examples.
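The dependence on the distance function can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (`knn_predict`, `euclidean`, `manhattan`), the majority-vote rule, and the toy data are assumptions chosen only to show that the distance measure is a swappable component.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    """Label `query` by majority vote among its k nearest training examples.

    `distance` is any callable taking two vectors and returning a number,
    so a different measure can be plugged in without touching the classifier.
    """
    # Rank training indices by their distance to the query example.
    order = sorted(range(len(train_X)),
                   key=lambda i: distance(train_X[i], query))
    # Count the labels of the k closest examples and return the most common.
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two well-separated clusters.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["a", "a", "a", "b", "b", "b"]

label_l2 = knn_predict(train_X, train_y, (1.1, 1.0), k=3)
label_l1 = knn_predict(train_X, train_y, (5.0, 5.1), k=3, distance=manhattan)
```

On data this cleanly separated both measures agree; differences between measures would be expected to show up on harder or noisier data.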

This raises a major question: among the large number of available distance and similarity measures, which should be used for the KNN classifier? This thesis attempts to answer that question by evaluating the performance (measured by accuracy, precision, and recall) of KNN with a large number of distance measures, tested on a number of real-world datasets, with and without different levels of added noise. The experimental results show that the performance of the KNN classifier depends significantly on the distance used, with large gaps between the performances of different distances.
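The evaluation described above can be sketched roughly as follows. The abstract does not say how noise was injected or how precision and recall were averaged, so everything here is an assumption: `add_attribute_noise` replaces the features of a chosen fraction of examples with uniform random values inside each feature's observed range, and `precision_recall` scores a single positive class.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def add_attribute_noise(X, level, rng):
    """Return a copy of X in which round(level * len(X)) rows are noisy.

    Assumed noise model: every feature of a selected row is replaced by a
    uniform random value between that feature's observed minimum and maximum.
    """
    n_features = len(X[0])
    lows = [min(row[j] for row in X) for j in range(n_features)]
    highs = [max(row[j] for row in X) for j in range(n_features)]
    noisy_rows = set(rng.sample(range(len(X)), round(level * len(X))))
    return [
        tuple(rng.uniform(lows[j], highs[j]) for j in range(n_features))
        if i in noisy_rows else tuple(row)
        for i, row in enumerate(X)
    ]


# Hypothetical labels and features, for illustration only.
y_true = ["a", "a", "b", "b"]
y_pred = ["a", "b", "b", "b"]
acc = accuracy(y_true, y_pred)
prec, rec = precision_recall(y_true, y_pred, positive="a")

X = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (4.0, 40.0)]
X_noisy = add_attribute_noise(X, level=0.5, rng=random.Random(0))
```

A comparison in this spirit would repeat the same train-and-score loop for each distance measure and each noise level, then compare the resulting metrics.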

For example, we found that the Hassanat distance performed best on most datasets compared with the other tested distances. In addition, the performance of KNN degraded by only about 20% as the noise level reached 90%; this holds for all the distances used.

This means that the KNN classifier using any of the top 10 distances tolerates noise to a certain degree.

Moreover, the results show that some distances are less affected by the added noise than others; for example, we found that the Hassanat distance performed best on most datasets under different levels of heavy noise.
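The abstract does not define the Hassanat distance. The sketch below follows the formula as it is usually given in the literature, which is an assumption relative to this record: each dimension contributes a value in [0, 1) based on the ratio of the smaller to the larger coordinate, with both shifted to be non-negative when the smaller one is negative. The function name `hassanat` is chosen here for illustration.

```python
def hassanat(a, b):
    """Hassanat distance between two equal-length numeric vectors.

    Per-dimension term (formula as commonly stated, assumed here):
      1 - (1 + lo) / (1 + hi)                      if lo >= 0
      1 - (1 + lo + |lo|) / (1 + hi + |lo|)        if lo <  0
    where lo = min(a_i, b_i) and hi = max(a_i, b_i).
    """
    total = 0.0
    for x, y in zip(a, b):
        lo, hi = min(x, y), max(x, y)
        if lo >= 0:
            total += 1.0 - (1.0 + lo) / (1.0 + hi)
        else:
            # Shift both values by |lo| so the ratio stays well defined.
            total += 1.0 - (1.0 + lo + abs(lo)) / (1.0 + hi + abs(lo))
    return total


d_same = hassanat((2.0, 2.0), (2.0, 2.0))   # identical vectors
d_pos = hassanat((1.0,), (3.0,))            # 1 - 2/4
d_neg = hassanat((-1.0,), (1.0,))           # 1 - 1/3
```

Because each per-dimension term is bounded, a single extreme attribute value can contribute less than 1 to the total, which is one plausible reason a measure of this shape might be less sensitive to noisy attributes than an unbounded one.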

Main subjects

Information Technology and Computer Science

Topics

Number of pages

179

Table of contents

Table of contents.

Abstract.

Abstract in Arabic.

Chapter One : Introduction.

Chapter Two : Literature review.

Chapter Three : Methodology.

Chapter Four : Results and conclusions.

References.

American Psychological Association (APA) citation style

Abu al-Filat, Hanin Arafat. (2017). Distance and similarity measures effect on the performance of k-nearest neighbor classifier. (Master's thesis). Mutah University, Jordan
https://search.emarefa.net/detail/BIM-749301

Modern Language Association (MLA) citation style

Abu al-Filat, Hanin Arafat. Distance and similarity measures effect on the performance of k-nearest neighbor classifier. (Master's thesis). Mutah University. (2017).
https://search.emarefa.net/detail/BIM-749301

American Medical Association (AMA) citation style

Abu al-Filat, Hanin Arafat. (2017). Distance and similarity measures effect on the performance of k-nearest neighbor classifier. (Master's thesis). Mutah University, Jordan
https://search.emarefa.net/detail/BIM-749301

Text language

English

Data type

Theses

Record number

BIM-749301