A machine learning framework to trace tumor tissue-of-origin of 13 types of cancer based on DNA somatic mutation.

Biochim Biophys Acta Mol Basis Dis

Academician Workstation, Changsha Medical University, Changsha 410219, China; Geneis Beijing Co., Ltd., Beijing 100102, China.

Published: November 2020

Carcinoma of unknown primary (CUP), defined as metastatic cancer whose site of origin is unknown, occurs in 3-5 per 100 cancer patients in the United States. The heterogeneity and metastasis of cancer bring great difficulties to the follow-up diagnosis and treatment of CUP. Multiple methods have been proposed to find the tissue-of-origin (TOO) of CUP. However, computed tomography (CT) and positron emission tomography (PET) identify the TOO with accuracies of only 20%-27% and 24%-40%, respectively, which is not enough for determining targeted therapies. In this study, we provide a machine learning framework to trace tumor tissue origin by using gene length-normalized somatic mutation sequencing data. Somatic mutation data were downloaded from the Data Portal (Release 28) of the International Cancer Genome Consortium (ICGC), and 4909 samples covering 13 cancers were used to identify the primary site of cancers. Optimal results were obtained with a 600-gene set by using the random forest algorithm with 10-fold cross-validation; the average accuracy and F1-score across the 13 types of cancer were 0.8822 and 0.8886, respectively. In conclusion, we provide an effective computational framework to infer cancer tissue-of-origin by combining DNA sequencing and machine learning techniques, which is promising for assisting the clinical diagnosis of cancers.
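The abstract does not spell out how the features, folds, or scores were computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that "gene length-normalized" means mutation counts per gene divided by gene length, that the folds are stratified by cancer type, and that the reported F1-score is a macro average over classes. The gene lengths, the `scale` parameter, and all function names are hypothetical, and the random forest classifier itself is omitted.

```python
from collections import Counter

# Hypothetical gene lengths in base pairs (placeholder values, not from the paper).
GENE_LENGTHS = {"TP53": 20000, "KRAS": 45000, "TTN": 300000}


def length_normalized_features(mutated_genes, gene_lengths, scale=1000.0):
    """Return mutations per `scale` bases for each gene, in sorted gene order.

    Genes that are not in `gene_lengths` are ignored.
    """
    counts = Counter(g for g in mutated_genes if g in gene_lengths)
    return [scale * counts[g] / gene_lengths[g] for g in sorted(gene_lengths)]


def stratified_folds(labels, k=10):
    """Assign sample indices to k folds, round-robin within each class.

    No shuffling is done here; a real pipeline would usually shuffle first.
    """
    folds = [[] for _ in range(k)]
    seen = Counter()
    for i, label in enumerate(labels):
        folds[seen[label] % k].append(i)
        seen[label] += 1
    return folds


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

In a full pipeline, each fold from `stratified_folds` would serve once as the held-out test set for a classifier trained on the remaining folds, and the per-fold `accuracy` and `macro_f1` values would then be averaged.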


Source
http://dx.doi.org/10.1016/j.bbadis.2020.165916

