Model-based multifacet clustering with high-dimensional omics applications.

Biostatistics

Department of Biostatistics, University of Pittsburgh, 130 De Soto St, Pittsburgh, PA 15261, United States.

Published: July 2024

High-dimensional omics data often contain intricate, multifaceted information, so that multiple plausible sample partitions coexist, each based on a different subset of selected features. Conventional clustering methods typically yield only one clustering solution, which limits their capacity to capture all facets of cluster structure in high-dimensional data. To address this challenge, we propose a model-based multifacet clustering (MFClust) method based on a mixture of Gaussian mixture models, where the first mixture assigns gene features to facets and the second assigns samples to clusters. We demonstrate the superior facet and cluster assignment accuracy of MFClust through simulation studies. The proposed method is applied to three transcriptomic applications from postmortem brain and lung disease studies. The results capture multifacet clustering structures associated with critical clinical variables and provide intriguing biological insights for further hypothesis generation and discovery.
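To make the multifacet setting concrete, the following is a minimal simulation sketch of the kind of data the abstract describes: gene features are split into facets, and each facet carries its own partition of the same samples. It is not the MFClust method or the authors' simulation design. The function name `simulate_multifacet`, its parameters, and the choice of unit-variance Gaussian noise are all assumptions made for illustration.

```python
import numpy as np

def simulate_multifacet(n_samples=60, genes_per_facet=(20, 20), n_clusters=(2, 3),
                        separation=3.0, seed=0):
    """Simulate an expression matrix whose gene features fall into facets,
    each facet carrying its own partition of the samples (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    blocks, facet_of_gene, sample_labels = [], [], []
    for facet, (n_genes, k) in enumerate(zip(genes_per_facet, n_clusters)):
        # Each facet draws an independent cluster label for every sample.
        labels = rng.integers(0, k, size=n_samples)
        # Cluster-specific mean for every gene in this facet.
        means = rng.normal(0.0, separation, size=(k, n_genes))
        # Gaussian noise around the mean of the sample's cluster in this facet.
        block = means[labels] + rng.normal(0.0, 1.0, size=(n_samples, n_genes))
        blocks.append(block)
        facet_of_gene.extend([facet] * n_genes)
        sample_labels.append(labels)
    X = np.hstack(blocks)  # samples x genes
    return X, np.array(facet_of_gene), np.vstack(sample_labels)

X, facet_of_gene, sample_labels = simulate_multifacet()
print(X.shape)              # (60, 40)
print(sample_labels.shape)  # (2, 60): one sample partition per facet
```

Because the two facets partition the same 60 samples independently, a method that returns a single clustering on all 40 genes can recover at most a blend of the two partitions, which is the limitation the abstract attributes to conventional approaches.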

Source
http://dx.doi.org/10.1093/biostatistics/kxae020
