Integrative cancer patient stratification via subspace merging.

Bioinformatics

Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA.

Published: May 2019

AI Article Synopsis

  • Technologies generating high-throughput omics data are rapidly expanding, creating large public databases that require advanced computational methods to analyze and cluster patient information for specific diseases.
  • The proposed method builds a patient-to-patient similarity graph for each omics data type and merges the graphs through subspace analysis on a Grassmann manifold, with the aim of preserving the complementary information in each data type. It is demonstrated on breast cancer data from The Cancer Genome Atlas (TCGA), integrating gene expression, microRNA and DNA methylation data.
  • The study identifies a highly expressed cluster of genes on chromosome 19p13 that correlates strongly with survival in TCGA breast cancer patients, validates this finding in three additional breast cancer datasets, and reports results comparable or superior to previous integrative clustering approaches.

Article Abstract

Motivation: Technologies that generate high-throughput omics data are flourishing, creating enormous, publicly available repositories of multi-omics data. As many data repositories continue to grow, there is an urgent need for computational methods that can leverage these data to create comprehensive clusters of patients with a given disease.

Results: Our proposed approach creates a patient-to-patient similarity graph for each data type as an intermediate representation of each omics data type and merges the graphs through subspace analysis on a Grassmann manifold. We hypothesize that this approach generates more informative clusters by preserving the complementary information from each level of omics data. We applied our approach to The Cancer Genome Atlas (TCGA) breast cancer dataset and show that by integrating gene expression, microRNA and DNA methylation data, our proposed method can produce clinically useful subtypes of breast cancer. We then investigate the molecular characteristics underlying these subtypes. We discover a highly expressed cluster of genes on chromosome 19p13 that strongly correlates with survival in TCGA breast cancer patients and validate these results in three additional breast cancer datasets. We also compare our approach with previous integrative clustering approaches and obtain comparable or superior results.
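The merging step described above can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository under Availability for that). It assumes one common formulation of subspace merging on a Grassmann manifold: each similarity graph is represented by the bottom-k eigenvectors of its normalized Laplacian, and the merged subspace is taken from the bottom-k eigenvectors of the sum of the Laplacians minus a weighted sum of the per-graph projectors. The Gaussian similarity, the bandwidth rule, the weight `alpha`, the small k-means routine and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def rbf_similarity(X, sigma):
    """Patient-by-patient Gaussian similarity from a (patients x features) matrix."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

def normalized_laplacian(W):
    """Symmetric normalized Laplacian I - D^{-1/2} W D^{-1/2}."""
    deg = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    return np.eye(W.shape[0]) - inv_sqrt[:, None] * W * inv_sqrt[None, :]

def bottom_eigvecs(M, k):
    """Eigenvectors for the k smallest eigenvalues (eigh sorts ascending)."""
    _, vecs = np.linalg.eigh(M)
    return vecs[:, :k]

def merge_subspaces(similarities, k, alpha=0.5):
    """Merged embedding: bottom-k eigenvectors of sum_i (L_i - alpha * U_i U_i^T)."""
    n = similarities[0].shape[0]
    L_mod = np.zeros((n, n))
    for W in similarities:
        L = normalized_laplacian(W)
        U = bottom_eigvecs(L, k)       # subspace representing this data type
        L_mod += L - alpha * (U @ U.T)
    return bottom_eigvecs(L_mod, k)

def kmeans(Y, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [Y[0]]
    for _ in range(1, k):
        d = np.min([((Y - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Y[labels == j].mean(axis=0)
    return labels

# Synthetic stand-ins for three omics data types measured on the same 40 patients
# (hypothetical data: two groups of 20 patients separated by a mean shift).
rng = np.random.default_rng(0)
group = np.repeat([0, 1], 20)
data_types = [rng.normal(size=(40, d)) + 4.0 * group[:, None] for d in (50, 30, 40)]

# One similarity graph per data type; bandwidth sqrt(#features) is an assumed heuristic.
graphs = [rbf_similarity(X, sigma=np.sqrt(X.shape[1])) for X in data_types]
embedding = merge_subspaces(graphs, k=2, alpha=0.5)
labels = kmeans(embedding, k=2)
```

In this sketch, `alpha` trades off the connectivity of the individual graphs against agreement between the merged subspace and each per-type subspace. On real data, the number of clusters `k` and the similarity measure would need to be chosen and validated separately.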

Availability and implementation: https://github.com/michaelsharpnack/GrassmannCluster.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6513164
DOI: http://dx.doi.org/10.1093/bioinformatics/bty866

Publication Analysis

Top Keywords

breast cancer: 16
omics data: 12
data: 9
data type: 8
tcga breast: 8
cancer: 5
integrative cancer: 4
cancer patient: 4
patient stratification: 4
stratification subspace: 4
