Background: Uncovering disease subtypes from microarray samples has important clinical implications for survival time and for the sensitivity of individual patients to specific therapies. Unsupervised clustering methods have been used to classify this type of data. However, most existing methods focus on clusters with compact shapes and do not reflect the geometric complexity of high-dimensional microarray clusters, which limits their performance.

Results: We present a cluster-number-based ensemble clustering algorithm, called MULTI-K, for microarray sample classification, which demonstrates remarkable accuracy. The method amalgamates multiple k-means runs performed with varying numbers of clusters and identifies the clusters that manifest the most robust co-memberships of elements. In addition to the original algorithm, we devised an entropy plot to control the separation of singletons or small clusters. Unlike simple k-means and other widely used methods, MULTI-K was able to accurately capture clusters with complex, high-dimensional structures. MULTI-K outperformed other methods, including a recently developed ensemble clustering algorithm, in tests with five simulated and eight real gene-expression data sets.
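The idea of amalgamating k-means runs over varying numbers of clusters can be illustrated with a minimal Python sketch. It is not the authors' C++ implementation: the function names, the toy data, the 0.2 cut-off, and the use of connected components to read off clusters are all assumptions made for illustration. The `size_entropy` helper is likewise only one hypothetical quantity that an entropy plot might track (the Shannon entropy of cluster sizes); the abstract does not define the plot.

```python
import math
import random
from itertools import combinations


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each non-empty center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def co_membership(points, k_values, runs_per_k=5, seed=0):
    """Fraction of runs in which each pair of points shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    total = 0
    for k in k_values:
        for _ in range(runs_per_k):
            labels = kmeans(points, k, rng)
            total += 1
            for i, j in combinations(range(n), 2):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
                    counts[j][i] += 1
    return [[1.0 if i == j else counts[i][j] / total for j in range(n)] for i in range(n)]


def consensus_clusters(freq, threshold):
    """Assumed read-out: connected components of the graph that keeps
    only pairs whose co-membership frequency is >= threshold."""
    n = len(freq)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and freq[i][j] >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def size_entropy(labels):
    """Shannon entropy of the cluster-size distribution (hypothetical
    stand-in for what an entropy plot might show per threshold)."""
    n = len(labels)
    sizes = {}
    for lab in labels:
        sizes[lab] = sizes.get(lab, 0) + 1
    return -sum((s / n) * math.log(s / n) for s in sizes.values())


# Toy data (an assumption): two well-separated 2-D blobs of 15 points each.
data_rng = random.Random(42)
points = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(15)]
points += [(data_rng.gauss(10, 0.5), data_rng.gauss(10, 0.5)) for _ in range(15)]

freq = co_membership(points, k_values=range(2, 6))
labels = consensus_clusters(freq, threshold=0.2)
print(labels, size_entropy(labels))
```

In this sketch, runs with larger k split each blob into pieces, but pairs from the same blob still share a cluster whenever k = 2, so they survive the cut-off, whereas pairs from different blobs almost never co-occur. Raising the threshold would progressively peel off small groups and singletons, which is the kind of trade-off the entropy plot is described as controlling.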

Conclusion: The geometric complexity of clusters should be taken into account for accurate classification of microarray data, and ensemble clustering applied to the number of clusters tackles the problem very well. The C++ code and the data sets tested are available from the authors.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2743671
DOI: http://dx.doi.org/10.1186/1471-2105-10-260
