An Initialization Method Based on Hybrid Distance for k-Means Algorithm.

Neural Comput

College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 200234, China

Published: November 2017

The traditional k-means algorithm has been widely used as a simple and efficient clustering method. However, the performance of this algorithm is highly dependent on the selection of initial cluster centers. Therefore, the method adopted for choosing initial cluster centers is extremely important. In this letter, we redefine the density of a point according to the number of its neighbors, as well as the distance between the point and its neighbors. In addition, we define a new distance measure that considers both Euclidean distance and density. Based on that, we propose an algorithm for selecting initial cluster centers that can dynamically adjust the weighting parameter. Furthermore, we propose a new internal clustering validation measure, the clustering validation index based on the neighbors (CVN), which can be exploited to select the optimal result among multiple clustering results. Experimental results show that the proposed algorithm outperforms existing initialization methods on real-world data sets and demonstrate its adaptability to data sets with various characteristics.
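
The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea (density from neighbors, then a distance-plus-density score for picking centers), not the authors' method. Everything in it is an assumption made for illustration: the density definition (inverse mean distance to the nearest neighbors), the scoring rule, the function names `knn_density` and `select_initial_centers`, and the parameters `n_neighbors` and `weight`. In particular, `weight` is fixed here, whereas the paper adjusts its weighting parameter dynamically.

```python
import numpy as np


def knn_density(X, n_neighbors=5):
    """Assumed density: inverse of the mean distance to the nearest neighbors."""
    # Full pairwise Euclidean distance matrix (fine for small data sets).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the zero distance to the point itself.
    nearest = np.sort(dist, axis=1)[:, 1:n_neighbors + 1]
    density = 1.0 / (nearest.mean(axis=1) + 1e-12)
    return density, dist


def select_initial_centers(X, n_clusters, n_neighbors=5, weight=0.5):
    """Pick centers that are both dense and far from already chosen centers.

    Hypothetical scoring rule (not taken from the paper):
        score = weight * normalized distance to nearest chosen center
              + (1 - weight) * normalized density
    """
    density, dist = knn_density(X, n_neighbors)
    rho = density / density.max()
    # Start from the densest point.
    chosen = [int(np.argmax(rho))]
    while len(chosen) < n_clusters:
        # Distance from every point to its closest already chosen center.
        d_min = dist[:, chosen].min(axis=1)
        d_norm = d_min / max(d_min.max(), 1e-12)
        score = weight * d_norm + (1.0 - weight) * rho
        score[chosen] = -np.inf  # never pick the same point twice
        chosen.append(int(np.argmax(score)))
    return X[chosen]
```

The returned array can be passed as explicit starting centers to any k-means implementation that accepts them. A larger `weight` favors spread-out centers, while a smaller one favors dense regions.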

Source
http://dx.doi.org/10.1162/neco_a_01014
