Mutation signatures in cancer genomes reflect endogenous and exogenous mutational processes, offering insights into tumour etiology, features for prognostic and biological stratification, and vulnerabilities that can be exploited therapeutically. We present a novel machine learning formalism for improved signature inference, based on multi-modal correlated topic models (MMCTM), which can simultaneously infer signatures from both single nucleotide and structural variation counts derived from cancer genome sequencing data. We exemplify the utility of our approach on two hormone-driven, DNA repair-deficient cancers: breast and ovary (n = 755 samples total). We show how introducing correlated structure both within and between modes of mutation can increase the accuracy of signature discovery, particularly in the context of sparse data. Our study emphasizes the importance of integrating multiple mutation modes for signature discovery and patient stratification, and provides a statistical modeling framework to incorporate additional features of interest for future studies.
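
To make the idea of correlated structure within and between mutation modes concrete, the following is a minimal sketch of the generative side of a two-mode correlated topic model. It is not the authors' implementation: the function and variable names, the number of signatures, the 32 structural variation categories and the covariance values are all illustrative assumptions. Only the 96 trinucleotide-context classes for single nucleotide variants follow the common convention. The sketch draws one logistic-normal vector spanning both modes, so that signature activities can co-vary across modes, and then samples mutation counts for each mode.

```python
import numpy as np


def softmax(x):
    """Map real-valued scores to proportions that sum to 1."""
    z = np.exp(x - x.max())
    return z / z.sum()


def simulate_sample(rng, mu, cov, snv_signatures, sv_signatures, n_snv, n_sv):
    """Simulate one tumour sample from a two-mode correlated topic model.

    A single Gaussian vector covers the signatures of both modes, so the
    covariance matrix can encode correlation within and between modes.
    """
    k_snv = snv_signatures.shape[0]
    eta = rng.multivariate_normal(mu, cov)

    # Each mode gets its own softmax over its own block of eta.
    theta_snv = softmax(eta[:k_snv])
    theta_sv = softmax(eta[k_snv:])

    # Summing over the per-mutation signature assignment gives a multinomial
    # over mutation categories with mixture probabilities theta @ signatures.
    snv_counts = rng.multinomial(n_snv, theta_snv @ snv_signatures)
    sv_counts = rng.multinomial(n_sv, theta_sv @ sv_signatures)
    return theta_snv, theta_sv, snv_counts, sv_counts


rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not taken from the paper).
k_snv, k_sv = 3, 2
n_snv_types, n_sv_types = 96, 32

# Each row is one signature: a distribution over mutation categories.
snv_signatures = rng.dirichlet(np.ones(n_snv_types), size=k_snv)
sv_signatures = rng.dirichlet(np.ones(n_sv_types), size=k_sv)

mu = np.zeros(k_snv + k_sv)
cov = np.eye(k_snv + k_sv)
# Hypothetical coupling between SNV signature 0 and SV signature 0.
cov[0, k_snv] = cov[k_snv, 0] = 0.8

n_snv, n_sv = 2000, 150
theta_snv, theta_sv, snv_counts, sv_counts = simulate_sample(
    rng, mu, cov, snv_signatures, sv_signatures, n_snv, n_sv
)
```

Inference runs in the opposite direction, estimating the signatures, the per-sample proportions and the covariance from observed counts. Intuitively, a learned cross-mode covariance lets the richer mode inform the sparser one, which is one way to read the abstract's claim about sparse data.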

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6402697
DOI: http://dx.doi.org/10.1371/journal.pcbi.1006799
