Validation of fuzzy and crisp c-partitions

Thesis author

Hassar, Hisham

Thesis advisor

Bin Said, Amini

University

Al Akhawayn University

Faculty

Faculty of Engineering and Science

Academic department

Computer Science

University country

Morocco

Degree

Master's

Degree date

1998

English abstract

Clustering has long been a popular approach to unsupervised pattern recognition.

One major concern for many clustering algorithms is that the number of clusters has to be known a priori.

For unlabeled data, there is no simple way to automatically determine the number of clusters c that best describes the sample data.

Traditionally, cluster validity measures have been used to evaluate crisp or fuzzy partitions over a range of values of c, in the hope of finding the number of clusters that best describes the natural underlying structure of the data.

A good validation of the clustering relies on a good representation of the "actual structure" of the data.

The concept of fuzzy sets offers special advantages in modeling the real structure of the data and in providing "overlap" and substructure information.

Judicious use of these fuzziness properties in cluster validity measures may be valuable in finding the right number of clusters c* in the data.

In this thesis, we present a comparative study of crisp and fuzzy cluster validity measures, focusing mainly on the generalized Dunn's indices (GDI).
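For orientation, the following is a minimal sketch of the original (crisp) Dunn index, which the generalized indices build on. It is not taken from the thesis. It assumes Euclidean distance, the smallest point-to-point distance as the separation between two clusters, and the largest within-cluster distance as the diameter. Generalized variants are usually described as swapping in other choices for these two ingredients. The function name `dunn_index` and the toy data are illustrative only.

```python
from itertools import combinations
from math import dist


def dunn_index(points, labels):
    """Crisp Dunn index: smallest between-cluster distance / largest diameter."""
    # Group the points by their crisp cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    # Diameter of a cluster: largest pairwise distance among its members
    # (0.0 for a single-point cluster).
    max_diameter = max(
        max((dist(a, b) for a, b in combinations(members, 2)), default=0.0)
        for members in clusters.values()
    )

    # Separation of two clusters: smallest distance between a point of one
    # and a point of the other (assumed choice; other set distances exist).
    min_separation = min(
        min(dist(a, b) for a in ca for b in cb)
        for ca, cb in combinations(clusters.values(), 2)
    )

    if max_diameter == 0.0:
        return float("inf")
    return min_separation / max_diameter


# Toy data: two tight pairs of points, far apart along the x axis.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = dunn_index(points, [0, 0, 1, 1])  # partition matching the structure
bad = dunn_index(points, [0, 1, 0, 1])   # partition cutting across it
```

A larger value indicates compact, well-separated clusters, so in this toy case the partition that matches the visible structure scores higher than the one that cuts across it.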

Three fuzzy versions of the GDI indices are proposed: plain fuzzy GDI indices that use the fuzzy membership degrees directly, GDI indices that use the fuzzy membership degrees of points found (after defuzzification) to belong crisply to a cluster, and fuzzy GDI indices combined with a measure of cluster fuzziness.
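The second variant depends on defuzzification. A minimal sketch of that step is given here, assuming the common maximum-membership rule (the abstract does not say which rule the thesis uses). The membership matrix `U` is assumed to be stored as one row per cluster and one column per data point, and the helper names are hypothetical.

```python
def defuzzify(U):
    """Harden a fuzzy c-partition: assign each point to its
    maximum-membership cluster (ties go to the lowest cluster index)."""
    n_points = len(U[0])
    return [max(range(len(U)), key=lambda i: U[i][k]) for k in range(n_points)]


def crisp_member_grades(U):
    """For each cluster, keep only the membership grades of the points
    that were hardened into that cluster."""
    labels = defuzzify(U)
    return [
        [U[i][k] for k, label in enumerate(labels) if label == i]
        for i in range(len(U))
    ]


# Illustrative 2-cluster, 4-point membership matrix; each column sums to 1.
U = [
    [0.9, 0.8, 0.4, 0.1],
    [0.1, 0.2, 0.6, 0.9],
]
labels = defuzzify(U)
grades = crisp_member_grades(U)
```

The retained grades are what a "semi-fuzzy" index could weight its distances by, although how the thesis actually combines them with the GDI is not stated in the abstract.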

Two types of measures of cluster fuzziness are defined: (i) classical fuzziness measures H based on the fuzzy membership grades of all data points in a cluster, and (ii) fuzziness measures that use only the fuzzy membership values of data points found to belong crisply to the cluster.
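The difference between the two types of measure can be illustrated with a short sketch. The entropy-style function below is an assumed stand-in for H, since the abstract does not give the thesis's definition. The maximum-membership hardening rule and the names `point_fuzziness` and `cluster_fuzziness` are likewise assumptions.

```python
from math import log


def point_fuzziness(u):
    """Entropy-style fuzziness of one membership grade, scaled to [0, 1].
    It is 0 for crisp grades (0 or 1) and largest at u = 0.5."""
    if u <= 0.0 or u >= 1.0:
        return 0.0
    return -(u * log(u) + (1.0 - u) * log(1.0 - u)) / log(2.0)


def cluster_fuzziness(U, i, crisp_only=False):
    """Mean fuzziness of cluster i over all points (type i), or only over
    the points hardened into cluster i by maximum membership (type ii)."""
    n_points = len(U[0])
    if crisp_only:
        indices = [
            k for k in range(n_points)
            if max(range(len(U)), key=lambda j: U[j][k]) == i
        ]
    else:
        indices = list(range(n_points))
    if not indices:
        return 0.0
    return sum(point_fuzziness(U[i][k]) for k in indices) / len(indices)


# Illustrative membership matrix: rows are clusters, columns are points.
U = [
    [0.9, 0.6, 0.1],
    [0.1, 0.4, 0.9],
]
h_all = cluster_fuzziness(U, 0)                      # type (i)
h_crisp = cluster_fuzziness(U, 0, crisp_only=True)   # type (ii)
```

In this example the third point belongs mostly to the other cluster, so it contributes to the type (i) value for cluster 0 but is excluded from the type (ii) value.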

Numerical experiments are conducted on nine data sets to compare the performance of the crisp and fuzzy GDI indices.

The best indices tested were found to be the crisp GDI indices and the "semi-fuzzy" version of the GDI indices.

These two indices gave correct prescriptions of the number of clusters in 8 out of 9 cases.

Further, a fuzzy GDI combined with measures of fuzziness produced 7 out of 9 correct prescriptions of the number of clusters.

Main subjects

Information Technology and Computer Science

Number of pages

69

Table of contents

Table of contents.

Abstract.

Abstract in Arabic.

Chapter One : Introduction.

Chapter Two : Crisp and fuzzy clustering and partitions.

Chapter Three : Cluster Validity.

Chapter Four : (Fuzzifying) the crisp Generalized Dunn's indices.

Chapter Five : Limit analysis study.

Chapter Six : Experiments and discussion.

Chapter Seven : Summary and conclusions.

References.

American Psychological Association (APA) citation style

Hassar, Hisham. (1998). Validation of fuzzy and crisp c-partitions. (Master's thesis). Al Akhawayn University, Morocco
https://search.emarefa.net/detail/BIM-645906

Modern Language Association (MLA) citation style

Hassar, Hisham. Validation of fuzzy and crisp c-partitions. (Master's thesis). Al Akhawayn University. (1998).
https://search.emarefa.net/detail/BIM-645906

American Medical Association (AMA) citation style

Hassar, Hisham. (1998). Validation of fuzzy and crisp c-partitions. (Master's thesis). Al Akhawayn University, Morocco
https://search.emarefa.net/detail/BIM-645906

Text language

English

Record type

Theses

Record number

BIM-645906