Self-Adaptive K-Means Based on a Covering Algorithm

Publication Date

2018-08-01

Country of Publication

Egypt

Main Subjects

Philosophy

Abstract EN

The K-means algorithm is one of the ten classic algorithms in the area of data mining and has been studied by researchers in numerous fields for a long time.

However, the number of clusters k in the K-means algorithm is not always easy to determine, and the selection of the initial centers is vulnerable to outliers.

This paper proposes an improved K-means clustering algorithm called the covering K-means algorithm (C-K-means).

The C-K-means algorithm not only produces efficient and accurate clustering results but also self-adaptively provides a reasonable number of clusters based on the data features.

It includes two phases: the initialization of the covering algorithm (CA) and the Lloyd iteration of the K-means.

The first phase executes the CA.

CA self-organizes and recognizes the number of clusters k based on the similarities in the data, and it requires neither the number of clusters to be prespecified nor the initial centers to be manually selected.

Therefore, it has a “blind” feature, that is, k is not preselected.

The second phase performs the Lloyd iteration based on the results of the first phase.

The C-K-means algorithm combines the advantages of CA and K-means.

Experiments are carried out on the Spark platform, and the results verify the good scalability of the C-K-means algorithm.

This algorithm can effectively solve the problem of large-scale data clustering.

Extensive experiments on real data sets show that the C-K-means algorithm outperforms existing algorithms in both accuracy and efficiency under both sequential and parallel conditions.
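
The two-phase structure described in the abstract can be illustrated with a small sketch. The abstract does not specify how the covering algorithm chooses its covers, so the `covering_init` function below is a simplified stand-in, not the paper's CA: it assumes a fixed, user-chosen `radius` (a hypothetical parameter) and greedily covers the data with balls of that radius. The point it illustrates is only that k falls out of the covering step rather than being given up front, and that the resulting centers seed a standard Lloyd iteration.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def covering_init(points, radius):
    """Phase 1 (simplified stand-in, NOT the paper's CA).

    Greedily cover the data with balls of a fixed radius. Each cover
    contributes one initial center, so k is discovered, not preselected.
    `radius` is an assumption of this sketch.
    """
    uncovered = list(points)
    centers = []
    while uncovered:
        seed = uncovered[0]  # any uncovered point starts a new cover
        covered = [p for p in uncovered if dist(p, seed) <= radius]
        centers.append(mean(covered))
        uncovered = [p for p in uncovered if dist(p, seed) > radius]
    return centers


def lloyd(points, centers, max_iter=100):
    """Phase 2: standard Lloyd iteration starting from the given centers."""
    centers = list(centers)
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            clusters[j].append(p)
        # Keep the old center if a cluster happens to be empty.
        new_centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    labels = [min(range(len(centers)), key=lambda i: dist(p, centers[i])) for p in points]
    return centers, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    init = covering_init(data, radius=3.0)
    final_centers, labels = lloyd(data, init)
    print(len(init), labels)
```

In this sketch the choice of `radius` takes over the role that k plays in plain K-means; according to the abstract, the paper's CA derives the clusters from similarities in the data without such manual input.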

American Psychological Association (APA)

Zhang, Yiwen & Zhou, Yuanyuan & Guo, Xing & Wu, Jintao & He, Qiang & Liu, Xiao…[et al.]. 2018. Self-Adaptive K-Means Based on a Covering Algorithm. Complexity, Vol. 2018, no. 2018, pp. 1-16.
https://search.emarefa.net/detail/BIM-1135921

Modern Language Association (MLA)

Zhang, Yiwen…[et al.]. Self-Adaptive K-Means Based on a Covering Algorithm. Complexity No. 2018 (2018), pp. 1-16.
https://search.emarefa.net/detail/BIM-1135921

American Medical Association (AMA)

Zhang, Yiwen & Zhou, Yuanyuan & Guo, Xing & Wu, Jintao & He, Qiang & Liu, Xiao…[et al.]. Self-Adaptive K-Means Based on a Covering Algorithm. Complexity. 2018. Vol. 2018, no. 2018, pp. 1-16.
https://search.emarefa.net/detail/BIM-1135921

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references

Record ID

BIM-1135921
