Clustering validation indexes are intended to assess the quality of clustering results. Many methods for estimating the number of clusters rely on a validation index as the key element for finding the correct answer. This paper presents a new validation index based on graph concepts, designed to find arbitrary-shaped clusters by exploiting the spatial layout of the patterns and their cluster labels. The new index is combined with a solid statistical detection framework, the gap statistic. The resulting method is able to find the right number of arbitrary-shaped clusters in diverse situations, as we show with examples where this number is known. A comparison with several relevant validation methods is carried out using artificial and gene expression data sets. The results are very encouraging, showing that the underlying structure in the data can be detected more accurately with the new clustering index. Our gene expression results also indicate that the new index is stable under perturbation of the input data.
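
As a rough illustration of the gap statistic framework mentioned in the abstract, the sketch below compares the log of a validation index on the data with its average over uniform reference data sets, then picks the smallest k whose gap is within one standard error of the next. It is only a sketch under stated assumptions: the abstract does not define the graph-based index, so the within-cluster sum of squares and a plain k-means stand in for it, and every function name here (`kmeans_labels`, `dispersion`, `best_log_dispersion`, `gap_statistic`) is hypothetical.

```python
import math
import random


def kmeans_labels(points, k, rng, iters=30):
    """Plain Lloyd's k-means on a list of tuples; returns one label per point."""
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def dispersion(points, labels):
    """Stand-in index (assumption): within-cluster sum of squared distances."""
    total = 0.0
    for c in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroid = [sum(x) / len(members) for x in zip(*members)]
        total += sum(math.dist(p, centroid) ** 2 for p in members)
    return total


def best_log_dispersion(points, k, rng, n_init=3):
    """Log of the lowest dispersion found over a few k-means restarts."""
    return math.log(min(dispersion(points, kmeans_labels(points, k, rng))
                        for _ in range(n_init)))


def gap_statistic(points, k_max, n_refs=10, seed=0):
    """Return (chosen k, list of gap values for k = 1..k_max)."""
    rng = random.Random(seed)
    lows = [min(x) for x in zip(*points)]
    highs = [max(x) for x in zip(*points)]
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        log_w = best_log_dispersion(points, k, rng)
        ref_logs = []
        for _ in range(n_refs):
            # Reference data: uniform samples over the bounding box of the data.
            ref = [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
                   for _ in points]
            ref_logs.append(best_log_dispersion(ref, k, rng))
        mean = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((v - mean) ** 2 for v in ref_logs) / n_refs)
        gaps.append(mean - log_w)
        errs.append(sd * math.sqrt(1 + 1 / n_refs))
    # Smallest k such that Gap(k) >= Gap(k+1) - s_(k+1).
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k, gaps
    return k_max, gaps
```

Calling `gap_statistic(points, k_max=4)` on a list of 2-D tuples returns the selected number of clusters together with the gap curve. Swapping `dispersion` for a different index is the natural extension point, although k-means itself would still limit the sketch to roughly convex clusters.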

Source: http://dx.doi.org/10.1109/TCBB.2013.32

Similar Publications

A mutual neighbor-based clustering method and its medical applications.

Comput Biol Med

November 2022

Shaoxing University, Shaoxing 312000, PR China.

Clustering analysis has been widely used in various real-world applications. Owing to its simplicity, K-means has become the most popular clustering technique in practice. Unfortunately, the performance of K-means relies heavily on the initial centers, which must be specified in advance.

Low-rank discrete multi-view spectral clustering.

Neural Netw

September 2023

Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing 400065, China.

Spectral clustering has attracted intensive attention in multimedia applications due to its good performance on arbitrarily shaped clusters and its well-defined mathematical framework. However, most existing multi-view spectral clustering methods still have the following shortcomings: (1) They ignore useful complementary information embedded in the indicator matrices of different views. (2) The conventional post-processing methods based on the relax-and-discretize strategy inevitably yield a sub-optimal discrete solution.

Owing to its simplicity, K-means has become a widely used clustering method. However, its clustering result is strongly affected by the initial centers, and its allocation strategy makes it hard to identify manifold clusters. Many improved K-means variants have been proposed to accelerate it and to improve the quality of the initial cluster centers, but few researchers have paid attention to the shortcoming of K-means in discovering arbitrary-shaped clusters.

VIASCKDE Index: A Novel Internal Cluster Validity Index for Arbitrary-Shaped Clusters Based on the Kernel Density Estimation.

Comput Intell Neurosci

June 2022

Department of Computer Engineering, Faculty of Engineering, Tarsus University, Mersin, Turkey.

The cluster evaluation process is of great importance in machine learning and data mining. Evaluating the quality of the resulting clusters shows how competent a proposed approach or algorithm is. Nevertheless, evaluating the quality of a clustering remains an open problem.

We introduce a density functional theory/molecular mechanical approach for computing linear response properties of molecules in heterogeneous environments, such as metal surfaces or nanoparticles embedded in solvents. The heterogeneous embedding environment, consisting of metallic and nonmetallic parts, is described by combined force fields, where conventional force fields are used for the nonmetallic part and capacitance-polarization-based force fields are used for the metallic part. The presented approach enables studies of properties and spectra of systems embedded in or placed at arbitrarily shaped metallic surfaces, clusters, or nanoparticles.
