Distance and similarity measures effect on the performance of k-nearest neighbor classifier
Other Title(s)
تأثير مقاييس المسافة و التشابه على أداء مصنف أقرب جار
Dissertant
Thesis advisor
Committee Members
al-Abadilah, Ahmad Hamad Hammud
Salman, Hamzah Iyal
al-Hasanat, Mahmud Bashir
University
Mutah University
Faculty
Information Technology College
Department
Computer Science Department
University Country
Jordan
Degree
Master
Degree Date
2017
English Abstract
The K-Nearest Neighbor (KNN) classifier is one of the simplest and most common classifiers, yet its performance competes with the most complex classifiers in the literature.
The core of this classifier depends mainly on measuring the distance or similarity between the tested example and the training examples.
This raises a major question: among the large number of distance and similarity measures available, which should be used with the KNN classifier? This thesis attempts to answer that question by evaluating the performance (measured by accuracy, precision, and recall) of KNN using a large number of distance measures, tested on a number of real-world datasets, with and without different levels of added noise. The experimental results show that the performance of the KNN classifier depends significantly on the distance used, with large gaps between the performances of different distances.
For example, we found that the Hassanat distance performed best on most datasets compared with the other tested distances. In addition, the performance of KNN degraded by only about 20% when the noise level reached 90%; this held for all the distances used.
This means that the KNN classifier using any of the top 10 distances tolerates noise to a certain degree.
Moreover, the results show that some distances are less affected by the added noise than others; for example, the Hassanat distance performed best on most datasets under different levels of heavy noise.
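To make the abstract's central idea concrete, the following is a minimal sketch of a KNN classifier whose distance function can be swapped. It is not the thesis's code. The function names (`knn_predict`, `accuracy`, `hassanat`) and the toy data are illustrative assumptions, and the Hassanat distance follows its commonly published per-dimension definition, which this record does not itself state.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hassanat(a, b):
    """Hassanat distance (assumed definition, not taken from this record).

    Each dimension contributes 1 - (1 + min) / (1 + max); when min is
    negative, both terms are shifted by |min|. Each contribution lies
    in [0, 1), so no single feature can dominate the total.
    """
    total = 0.0
    for x, y in zip(a, b):
        lo, hi = min(x, y), max(x, y)
        shift = abs(lo) if lo < 0 else 0.0
        total += 1.0 - (1.0 + lo + shift) / (1.0 + hi + shift)
    return total


def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3, distance=euclidean):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(knn_predict(train_X, train_y, q, k, distance) == t
               for q, t in zip(test_X, test_y))
    return hits / len(test_y)


# Toy data (hypothetical): two well-separated classes.
train_X = [(0, 0), (0, 1), (5, 5), (6, 5)]
train_y = ["a", "a", "b", "b"]

for name, fn in [("euclidean", euclidean),
                 ("manhattan", manhattan),
                 ("hassanat", hassanat)]:
    print(name, knn_predict(train_X, train_y, (0.2, 0.3), k=3, distance=fn))
```

Comparing distances in the way the abstract describes amounts to calling `accuracy` once per distance function on the same train/test split; precision and recall would be computed analogously from the per-class predictions.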
Main Subjects
Information Technology and Computer Science
No. of Pages
179
Table of Contents
Table of contents.
Abstract.
Abstract in Arabic.
Chapter One : Introduction.
Chapter Two : Literature review.
Chapter Three : Methodology.
Chapter Four : Results and conclusions.
References.
American Psychological Association (APA)
Abu al-Filat, Hanin Arafat. (2017). Distance and similarity measures effect on the performance of k-nearest neighbor classifier (Master's thesis). Mutah University, Jordan.
https://search.emarefa.net/detail/BIM-749301
Modern Language Association (MLA)
Abu al-Filat, Hanin Arafat. Distance and similarity measures effect on the performance of k-nearest neighbor classifier. (Master's thesis). Mutah University. (2017).
https://search.emarefa.net/detail/BIM-749301
American Medical Association (AMA)
Abu al-Filat, Hanin Arafat. (2017). Distance and similarity measures effect on the performance of k-nearest neighbor classifier (Master's thesis). Mutah University, Jordan.
https://search.emarefa.net/detail/BIM-749301
Language
English
Data Type
Arab Theses
Record ID
BIM-749301