Towards clinically more relevant dissection of patient heterogeneity via survival-based Bayesian clustering.

Bioinformatics

Bonn Aachen International Center for Information Technology, University of Bonn, 53127 Bonn, Germany.

Published: November 2017

Motivation: Discovery of clinically relevant disease sub-types is of prime importance in personalized medicine. Disease sub-type identification has often been approached as an unsupervised machine learning problem, in which patients are clustered based on available -omics data such as gene expression. A follow-up analysis then assesses the clinical relevance of the resulting molecular sub-types, for example by comparing their disease progression. This methodology, however, does not guarantee that the sub-types are separable with respect to their subtype-specific survival curves.

Results: We propose a new algorithm, Survival-based Bayesian Clustering (SBC), which simultaneously clusters heterogeneous -omics and clinical end point data (time to event) in order to discover clinically relevant disease subtypes. For this purpose we formulate a novel Hierarchical Bayesian Graphical Model which combines a Dirichlet Process Gaussian Mixture Model with an Accelerated Failure Time model. This ensures that patients are grouped in the same cluster only when they show similar characteristics with respect to both molecular features across data types (e.g. gene expression, miRNA) and survival times. We extensively test our model in simulation studies and apply it to cancer patient data from the Breast Cancer dataset and The Cancer Genome Atlas repository. Notably, our method not only finds clinically relevant sub-groups but also predicts cluster membership and survival on test data better than competing methods.
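The idea of scoring cluster membership on molecular features and survival time jointly can be illustrated with a minimal sketch. This is not the SBC implementation: it assumes a fixed, finite set of clusters in place of the Dirichlet Process, diagonal Gaussian feature distributions, a log-normal Accelerated Failure Time model, and no censoring. All names (`Cluster`, `joint_loglik`, `responsibilities`) and parameter values are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Cluster:
    # Hypothetical container; not part of the SBC R code.
    weight: float           # mixing proportion of the cluster
    mean: Sequence[float]   # per-feature Gaussian mean
    var: Sequence[float]    # per-feature Gaussian variance (diagonal covariance)
    beta0: float            # AFT intercept on the log-time scale
    beta: Sequence[float]   # AFT regression coefficients
    sigma: float            # AFT noise scale (log-normal assumption)


def normal_logpdf(value: float, mean: float, var: float) -> float:
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (value - mean) ** 2 / var)


def joint_loglik(x: Sequence[float], log_t: float, c: Cluster) -> float:
    """Log-likelihood of features AND log survival time under one cluster."""
    # Molecular part: independent Gaussians per feature.
    feature_ll = sum(normal_logpdf(xi, m, v) for xi, m, v in zip(x, c.mean, c.var))
    # Survival part: log T = beta0 + beta . x + sigma * eps, with eps ~ N(0, 1).
    predicted = c.beta0 + sum(b * xi for b, xi in zip(c.beta, x))
    survival_ll = normal_logpdf(log_t, predicted, c.sigma ** 2)
    return feature_ll + survival_ll


def responsibilities(x: Sequence[float], log_t: float,
                     clusters: List[Cluster]) -> List[float]:
    """Posterior cluster probabilities, normalised with log-sum-exp."""
    scores = [math.log(c.weight) + joint_loglik(x, log_t, c) for c in clusters]
    top = max(scores)
    unnorm = [math.exp(s - top) for s in scores]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Two clusters with identical molecular profiles but different survival behaviour.
clusters = [
    Cluster(0.5, [0.0, 0.0], [1.0, 1.0], beta0=1.0, beta=[0.0, 0.0], sigma=0.5),
    Cluster(0.5, [0.0, 0.0], [1.0, 1.0], beta0=3.0, beta=[0.0, 0.0], sigma=0.5),
]
x = [0.1, -0.2]  # same molecular features for both hypothetical patients
short_survivor = responsibilities(x, log_t=1.0, clusters=clusters)
long_survivor = responsibilities(x, log_t=3.0, clusters=clusters)
```

In this toy setting the two patients are indistinguishable on molecular features alone, so only the survival term separates them. Feature-only clustering would have no basis for doing so.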

Availability and Implementation: Our R code is available at https://github.com/ashar799/SBC.

Contact: ashar@bit.uni-bonn.de.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
http://dx.doi.org/10.1093/bioinformatics/btx464
