In many real-world applications of machine learning and data mining, we are confronted with high-dimensional data. Clustering such data remains challenging because of the curse of dimensionality. In this paper, we address this problem through joint dimensionality reduction and clustering. Unlike traditional approaches, which perform dimensionality reduction and clustering in sequence, we propose a novel framework, referred to as discriminative embedded clustering, that alternates between the two iteratively. This framework not only lets us view several traditional approaches in a common setting and reveal their intrinsic relationships, but also motivates a new method. We also propose an effective approach for solving the resulting nonconvex optimization problem. We present comprehensive analyses, including convergence behavior, parameter determination, and computational complexity, together with the relationship to other related approaches. Extensive experimental results on benchmark data sets show that the proposed method outperforms related state-of-the-art clustering approaches and existing joint dimensionality reduction and clustering methods.
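The abstract does not give the objective function or the update rules, so the following is only a minimal sketch of the general idea of alternating between dimensionality reduction and clustering. It is not the paper's algorithm. It assumes an LDA-style projection fitted to the current cluster labels, followed by plain k-means in the projected space. All function names (`kmeans`, `fit_projection`, `alternate_reduce_and_cluster`) and the regularization value `reg` are hypothetical choices made for illustration.

```python
import numpy as np


def kmeans(Z, k, n_iter=50, rng=None):
    """Plain Lloyd's k-means on the rows of Z; returns integer labels."""
    rng = np.random.default_rng(rng)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Recompute centers; an empty cluster keeps its previous center.
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def fit_projection(X, labels, k, dim, reg=1e-3):
    """LDA-style projection using cluster labels as pseudo-classes.

    Maximizes between-cluster scatter relative to (regularized) total
    scatter. `reg` is an illustrative assumption, not a value from the paper.
    """
    Xc = X - X.mean(axis=0)
    St = Xc.T @ Xc + reg * np.eye(X.shape[1])  # regularized total scatter
    Sb = np.zeros_like(St)                     # between-cluster scatter
    for j in range(k):
        members = Xc[labels == j]
        if len(members):
            m = members.mean(axis=0)
            Sb += len(members) * np.outer(m, m)
    # Whiten with the Cholesky factor of St, then solve a symmetric problem.
    L = np.linalg.cholesky(St)
    M = np.linalg.solve(L, np.linalg.solve(L, Sb).T)
    M = (M + M.T) / 2.0
    _, evecs = np.linalg.eigh(M)               # eigenvalues in ascending order
    V = evecs[:, -dim:]                        # top `dim` directions
    return np.linalg.solve(L.T, V)


def alternate_reduce_and_cluster(X, k, dim, n_rounds=10, seed=0):
    """Alternate projection fitting and clustering for a few rounds."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    # Initialize the projection with PCA (top right singular vectors).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:dim].T
    labels = kmeans(Xc @ W, k, rng=rng)
    for _ in range(n_rounds):
        W = fit_projection(X, labels, k, dim)
        new_labels = kmeans(Xc @ W, k, rng=rng)
        # Stops early only if labels match exactly; a relabeled but
        # identical partition simply runs another round.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return W, labels
```

A sequential pipeline would stop after the PCA step and the first k-means call. The loop is what makes the scheme joint: each clustering result is fed back to refit the subspace, which in turn changes the next clustering.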

Source
http://dx.doi.org/10.1109/TNNLS.2014.2337335
