Efficient high dimension data clustering using constraint-partitioning K-means algorithm

Author

Jurj, Aloysius

Source

The International Arab Journal of Information Technology

Issue

Vol. 10, no. 6 (30 November 2013), 10 p.

Publisher

Zarqa University

Publication date

2013-11-30

Country of publication

Jordan

Number of pages

10

Main subjects

Media and Communication

Topics

Abstract (EN)

With the ever-increasing size of data, clustering high-dimensional databases is a demanding task that must satisfy the requirements of both computational efficiency and result quality.

To meet both requirements, clustering in the feature space rather than in the original data space has gained importance among data mining researchers.

Accordingly, we first clustered a high-dimensional dataset directly with the Constraint-Partitioning K-Means clustering algorithm. Applied this way, the algorithm was neither effective nor efficient: because of the intrinsic sparsity of high-dimensional data, it produced indefinite and inaccurate clusters.

Hence, we cluster high-dimensional datasets in two steps.

First, we reduce the dimensionality of the high-dimensional dataset using Principal Component Analysis (PCA) as a preprocessing step for clustering.
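As an illustration of the dimensionality-reduction step, the following is a minimal sketch of PCA computed through the singular value decomposition of the centered data. The function name `pca_reduce`, the toy data, and the choice of one retained component are assumptions made for this example; the abstract does not state how many components the authors kept or which implementation they used.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top n_components principal directions."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature
    # SVD of the centered data: the rows of Vt are the principal directions,
    # ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    explained_ratio = (S ** 2) / (S ** 2).sum()
    return Xc @ components.T, components, explained_ratio[:n_components]

# Toy data (hypothetical, not from the paper): 50 points in 3-D that vary
# almost entirely along a single direction, plus a little noise.
rng = np.random.default_rng(0)
t = rng.normal(size=(50, 1))
direction = np.array([[2.0, 1.0, 0.5]])
X = t @ direction + 0.01 * rng.normal(size=(50, 3))

Z, components, ratio = pca_reduce(X, n_components=1)
print(Z.shape)  # reduced data: one coordinate per point
print(ratio)    # share of variance captured by the retained component
```

The reduced matrix `Z` is what a subsequent clustering step would operate on in place of `X`.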

Then, we apply the Constraint-Partitioning K-Means clustering algorithm to the dimension-reduced dataset to produce accurate clusters.
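The abstract does not define the constraints used by the clustering step. The sketch below assumes pairwise must-link and cannot-link constraints in the style of COP-KMeans, which may differ from the authors' formulation. The names `violates` and `constrained_kmeans`, the explicit `init_centers` argument, and the one-dimensional toy points (standing in for PCA-reduced data) are all hypothetical.

```python
import math

def violates(i, cluster, labels, must_link, cannot_link):
    """True if placing point i in `cluster` breaks a constraint with an already-labelled point."""
    for a, b in must_link:
        other = b if a == i else a if b == i else None
        if other is not None and labels[other] not in (None, cluster):
            return True
    for a, b in cannot_link:
        other = b if a == i else a if b == i else None
        if other is not None and labels[other] == cluster:
            return True
    return False

def constrained_kmeans(points, init_centers, must_link=(), cannot_link=(), iters=20):
    centers = [list(c) for c in init_centers]
    k = len(centers)
    labels = [None] * len(points)
    for _ in range(iters):
        new_labels = [None] * len(points)
        for i, p in enumerate(points):
            # Try clusters from nearest to farthest and take the first feasible one.
            order = sorted(range(k), key=lambda c: math.dist(p, centers[c]))
            for c in order:
                if not violates(i, c, new_labels, must_link, cannot_link):
                    new_labels[i] = c
                    break
            else:
                raise ValueError(f"no feasible cluster for point {i}")
        # Recompute each center as the mean of its members.
        for c in range(k):
            members = [points[i] for i in range(len(points)) if new_labels[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centers

# Toy 1-D points (hypothetical): the middle point sits slightly closer to the left group.
points = [(0.0,), (0.1,), (2.4,), (5.0,), (5.1,)]
init = [(0.0,), (5.1,)]

plain, _ = constrained_kmeans(points, init)
constrained, _ = constrained_kmeans(points, init, cannot_link=[(1, 2)])
print(plain)        # [0, 0, 0, 1, 1]
print(constrained)  # [0, 0, 1, 1, 1]
```

In this toy run the single cannot-link pair moves the middle point into the right-hand cluster. Note that greedy assignment of this kind depends on the order in which points are visited, so a different ordering can yield a different feasible partition.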

The approach is evaluated on high-dimensional datasets such as the Parkinson's dataset and the Ionosphere dataset.

The experimental results show that the proposed approach is very effective in producing accurate and precise clusters.

American Psychological Association (APA) citation style

Jurj, Aloysius. 2013. Efficient high dimension data clustering using constraint-partitioning K-means algorithm. The International Arab Journal of Information Technology, Vol. 10, no. 6.
https://search.emarefa.net/detail/BIM-311839

Modern Language Association (MLA) citation style

Jurj, Aloysius. Efficient high dimension data clustering using constraint-partitioning K-means algorithm. The International Arab Journal of Information Technology Vol. 10, no. 6 (Nov. 2013).
https://search.emarefa.net/detail/BIM-311839

American Medical Association (AMA) citation style

Jurj, Aloysius. Efficient high dimension data clustering using constraint-partitioning K-means algorithm. The International Arab Journal of Information Technology. 2013. Vol. 10, no. 6.
https://search.emarefa.net/detail/BIM-311839

Data type

Articles

Text language

English

Notes

Includes bibliographical references.

Record number

BIM-311839