Many real-world applications of machine learning and data mining involve high-dimensional data. Clustering such data remains challenging because of the curse of dimensionality. In this paper, we address this problem through joint dimensionality reduction and clustering. Unlike traditional approaches that perform dimensionality reduction and clustering in sequence, we propose a novel framework, referred to as discriminative embedded clustering, that alternates between the two iteratively. Within this framework, we can not only view several traditional approaches and reveal their intrinsic relationships, but also develop a new method. We also propose an effective approach for solving the resulting nonconvex optimization problem. We present comprehensive analyses covering convergence behavior, parameter determination, and computational complexity, together with the relationship to other related approaches. Extensive experiments on benchmark data sets show that the proposed method outperforms related state-of-the-art clustering approaches and existing joint dimensionality reduction and clustering methods.
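The alternating idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's objective or solver: it assumes an LDA-style projection step (between-cluster scatter against a regularized total scatter) and plain k-means as stand-ins, and every function name and parameter below (`kmeans`, `discriminative_projection`, `alternate_reduce_and_cluster`, `reg`, `n_outer`) is hypothetical.

```python
import numpy as np


def kmeans(Z, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means on the rows of Z; returns integer labels."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    labels = np.zeros(len(Z), dtype=int)
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = Z[labels == j].mean(axis=0)
    return labels


def discriminative_projection(X, labels, k, m, reg=1e-3):
    """Assumed stand-in step: top-m directions of an LDA-style criterion."""
    Xc = X - X.mean(axis=0)
    St = Xc.T @ Xc + reg * np.eye(X.shape[1])  # regularized total scatter
    Sb = np.zeros_like(St)                     # between-cluster scatter
    for j in range(k):
        members = Xc[labels == j]
        if len(members):
            mu = members.mean(axis=0)
            Sb += len(members) * np.outer(mu, mu)
    # Solve Sb w = lambda St w by whitening with the Cholesky factor of St.
    L_inv = np.linalg.inv(np.linalg.cholesky(St))
    _, evecs = np.linalg.eigh(L_inv @ Sb @ L_inv.T)  # ascending eigenvalues
    return L_inv.T @ evecs[:, ::-1][:, :m]           # keep the top-m directions


def alternate_reduce_and_cluster(X, k, m, n_outer=10, seed=0):
    """Alternate between a projection update and clustering in the subspace."""
    Xc = X - X.mean(axis=0)
    # Initialize the projection with PCA (an assumption of this sketch).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:m].T
    labels = kmeans(Xc @ W, k, seed=seed)
    for _ in range(n_outer):
        W = discriminative_projection(X, labels, k, m)
        new_labels = kmeans(Xc @ W, k, seed=seed)
        if np.array_equal(new_labels, labels):  # labels stopped changing
            break
        labels = new_labels
    return W, labels
```

Calling `alternate_reduce_and_cluster(X, k=3, m=2)` on an n-by-d array returns a d-by-2 projection matrix and n cluster labels. Since the between-cluster scatter has rank at most k - 1, choosing m no larger than k - 1 is the natural setting for this stand-in.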
DOI: http://dx.doi.org/10.1109/TNNLS.2014.2337335
Brief Bioinform
November 2024
Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States.
Many human diseases result from a complex interplay of behavioral, clinical, and molecular factors. Integrating low-dimensional behavioral and clinical features with high-dimensional molecular profiles can significantly improve disease outcome prediction and diagnosis. However, while some biomarkers are crucial, many lack informative value.
Sci Data
January 2025
Victor Horsley Department of Neurosurgery, National Hospital for Neurology and Neurosurgery, London, UK.
Pituitary neuroendocrine tumors remain one of the most common intracranial tumors. While radiomic research related to pituitary tumors is progressing, public data sets for external validation remain scarce. We introduce an open dataset comprising high-resolution T1 contrast-enhanced MR scans of 136 patients with pituitary tumors, annotated for tumor segmentation and accompanied by clinical, radiological and pathological metadata.
Genet Epidemiol
January 2025
Department of Biostatistics, University of Washington, Seattle, Washington, USA.
Integrating multi-omics data may help researchers understand the genetic underpinnings of complex traits and diseases. However, the best ways to integrate multi-omics data and use them to address pressing scientific questions remain a challenge. One important and topical problem is how to assess the aggregate effect of multiple genomic data types (e.
Beilstein J Org Chem
January 2025
Institute of Materials Research and Engineering (IMRE), Agency for Science Technology and Research (A*STAR), 2 Fusionopolis Way, Singapore 138634, Republic of Singapore.
The discovery of the optimal conditions for chemical reactions is a labor-intensive, time-consuming task that requires exploring a high-dimensional parametric space. Historically, the optimization of chemical reactions has been performed by manual experimentation guided by human intuition and through the design of experiments where reaction variables are modified one at a time to find the optimal conditions for a specific reaction outcome. Recently, a paradigm change in chemical reaction optimization has been enabled by advances in lab automation and the introduction of machine learning algorithms.
J Appl Stat
June 2024
Department of Biostatistics, University of Florida, Gainesville, FL, USA.
Because disease manifestations are highly heterogeneous, many complex diseases that were once thought to be single diseases are now considered to have subtypes. Disease subtyping analysis, that is, the identification of subgroups of patients with similar characteristics, is the first step toward precision medicine. With the advancement of high-throughput technologies, omics data offer an unprecedented opportunity to reveal disease subtypes.