Distance and similarity measures effect on the performance of k-nearest neighbor classifier

Other Title(s)

تأثير مقاييس المسافة و التشابه على أداء مصنف أقرب جار

Dissertant

Abu al-Filat, Hanin Arafat

Thesis advisor

al-Hasanat, Ahmad Bashir A.

Committee Members

al-Abadilah, Ahmad Hamad Hammud
Salman, Hamzah Iyal
al-Hasanat, Mahmud Bashir

University

Mutah University

Faculty

Information Technology College

Department

Computer Science Department

University Country

Jordan

Degree

Master

Degree Date

2017

English Abstract

The K-Nearest Neighbor (KNN) classifier is one of the simplest and most common classifiers, yet its performance competes with the most complex classifiers in the literature.

The core of this classifier depends mainly on measuring the distance or similarity between the tested example and the training examples.

This raises a major question: which of the many available distance and similarity measures should be used with the KNN classifier? This thesis attempts to answer that question by evaluating the performance (measured by accuracy, precision, and recall) of KNN using a large number of distance measures, tested on a number of real-world datasets, with and without different levels of added noise. The experimental results show that the performance of the KNN classifier depends significantly on the distance used, with large gaps between the performances of different distances.

For example, we found that the Hassanat distance performed best on most datasets compared with the other tested distances. In addition, the performance of KNN degraded by only about 20% when the noise level reached 90%; this holds for all the distances used.

This means that the KNN classifier using any of the top 10 distances tolerates noise to a certain degree.

Moreover, the results show that some distances are less affected by the added noise than others; for example, the Hassanat distance performed best on most datasets under different levels of heavy noise.
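
As an illustration of the idea described in the abstract, the following minimal Python sketch shows a KNN classifier whose distance function can be swapped. It is not taken from the thesis. The function names, the toy data, and the choice of k are assumptions made for this example, and the Hassanat distance is written as it is commonly defined in the literature, since this record does not give its formula.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length numeric vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hassanat(a, b):
    """Hassanat distance as commonly stated in the literature (assumption:
    this record does not define it). Each dimension contributes a value
    in [0, 1), so no single feature can dominate the sum."""
    total = 0.0
    for x, y in zip(a, b):
        lo, hi = min(x, y), max(x, y)
        if lo >= 0:
            total += 1 - (1 + lo) / (1 + hi)
        else:
            # Shift both values by |lo| so the ratio stays well defined.
            total += 1 - (1 + lo + abs(lo)) / (1 + hi + abs(lo))
    return total


def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    """Return the majority label among the k training examples closest
    to `query` under the given distance function."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data, for illustration only.
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5),
           (8.0, 8.0), (9.0, 8.5), (8.5, 9.5)]
train_y = ["a", "a", "a", "b", "b", "b"]

for dist in (euclidean, manhattan, hassanat):
    print(dist.__name__, knn_predict(train_X, train_y, (2.0, 2.0), distance=dist))
```

Passing the distance as a parameter keeps the classifier itself unchanged while the measures are compared, which mirrors the kind of evaluation the abstract describes.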

Main Subjects

Information Technology and Computer Science

No. of Pages

179

Table of Contents

Table of contents.

Abstract.

Abstract in Arabic.

Chapter One : Introduction.

Chapter Two : Literature review.

Chapter Three : Methodology.

Chapter Four : Results and conclusions.

References.

American Psychological Association (APA)

Abu al-Filat, Hanin Arafat. (2017). Distance and similarity measures effect on the performance of k-nearest neighbor classifier. (Master's thesis). Mutah University, Jordan
https://search.emarefa.net/detail/BIM-749301

Modern Language Association (MLA)

Abu al-Filat, Hanin Arafat. Distance and similarity measures effect on the performance of k-nearest neighbor classifier. (Master's thesis). Mutah University. (2017).
https://search.emarefa.net/detail/BIM-749301

American Medical Association (AMA)

Abu al-Filat, Hanin Arafat. (2017). Distance and similarity measures effect on the performance of k-nearest neighbor classifier. (Master's thesis). Mutah University, Jordan
https://search.emarefa.net/detail/BIM-749301

Language

English

Data Type

Arab Theses

Record ID

BIM-749301