Deep contrastive learning for predicting cancer prognosis using gene expression values.

Brief Bioinform

Department of Electrical and Computer Engineering, University of Miami, Miami, FL 33146, United States.

Published: September 2024

AI Article Synopsis

  • Recent advances in image classification show that contrastive learning (CL) can learn good feature representations from a limited number of samples; this study applies CL to tumor transcriptomes and clinical data to learn feature representations in a low-dimensional space.
  • Classifiers trained on the CL features, using data from The Cancer Genome Atlas (TCGA), achieved an AUC above 0.8 for 14 cancer types and above 0.9 for 3 cancer types, a significant improvement in classification accuracy.
  • The study also introduced CL-based Cox (CLCox) models for cancer prognosis, which significantly outperformed existing methods on 19 cancer types. Models trained on TCGA lung and prostate cancer data were validated on two independent cohorts, and the trained models and Python code are publicly accessible.
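The synopsis does not say which contrastive objective the study uses. As an illustration only, the sketch below assumes a supervised contrastive (SupCon-style) loss, in which samples sharing a risk label are pulled together in the embedding space and all others are pushed apart. The function name, the `temperature` value, and the toy two-dimensional embeddings are hypothetical and are not taken from the paper.

```python
import math


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """SupCon-style loss, averaged over anchors that have at least one positive.

    embeddings: list of equal-length float vectors (one per tumor sample).
    labels: list of group labels (e.g. 1 = high risk, 0 = low risk).
    Assumption: this is a generic contrastive loss, not the paper's exact one.
    """
    # L2-normalize so that dot products are cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # guard zero vectors
        normed.append([x / norm for x in vec])

    n = len(normed)
    total, anchors = 0.0, 0
    for i in range(n):
        # Temperature-scaled similarity between anchor i and every other sample.
        sims = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n)
            if j != i
        }
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-label partner contributes nothing
        # Log-sum-exp over all non-anchor samples, stabilized by the maximum.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Mean negative log-probability of selecting a positive.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy check: embeddings grouped by label should score a lower loss
# than embeddings where same-label samples point in different directions.
labels = [1, 1, 0, 0]
clustered = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
mixed = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
loss_clustered = supervised_contrastive_loss(clustered, labels)
loss_mixed = supervised_contrastive_loss(mixed, labels)
```

In practice the embeddings would come from a trainable encoder and the loss would be minimized with an automatic-differentiation framework; the pure-Python version here only shows what quantity is being minimized.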

Article Abstract

Recent advances in image classification have demonstrated that contrastive learning (CL) can aid downstream learning tasks by acquiring good feature representations from a limited number of data samples. In this paper, we applied CL to tumor transcriptomes and clinical data to learn feature representations in a low-dimensional space. We then used these learned features to train a classifier that assigns tumors to a high- or low-risk group for recurrence. Using data from The Cancer Genome Atlas (TCGA), we demonstrated that CL can significantly improve classification accuracy. Specifically, our CL-based classifiers achieved an area under the receiver operating characteristic curve (AUC) greater than 0.8 for 14 types of cancer, and an AUC greater than 0.9 for 3 types of cancer. We also developed CL-based Cox (CLCox) models for predicting cancer prognosis. Our CLCox models trained with the TCGA data significantly outperformed existing methods in predicting the prognosis of the 19 types of cancer under consideration. The performance of the CLCox models and CL-based classifiers trained with TCGA lung and prostate cancer data was validated using data from two independent cohorts. We also show that the CLCox model trained with the whole transcriptome significantly outperforms the Cox model trained with the 16 genes of Oncotype DX, which is in clinical use for breast cancer patients. The trained models and the Python code are publicly accessible and provide a valuable resource that may find clinical applications for many types of cancer.
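The abstract does not describe how the CLCox models are fitted. For orientation, a Cox model is conventionally trained by minimizing the negative partial log-likelihood, and the sketch below computes that standard quantity from per-patient risk scores. It assumes, hypothetically, that the risk scores are produced from the CL features; the function name and the toy data are illustrative, and tied event times get no special treatment.

```python
import math


def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Negative Cox partial log-likelihood (no special handling of tied times).

    risk_scores: model outputs, one per patient (higher = higher hazard).
    times: follow-up time per patient.
    events: 1 if the event (e.g. recurrence) was observed, 0 if censored.
    """
    nll = 0.0
    for score_i, time_i, event_i in zip(risk_scores, times, events):
        if not event_i:
            continue  # censored patients only appear in other patients' risk sets
        # Risk set: every patient still under observation at time_i.
        at_risk = [s for s, t in zip(risk_scores, times) if t >= time_i]
        # Stabilized log-sum-exp of the risk scores in the risk set.
        m = max(at_risk)
        log_sum = m + math.log(sum(math.exp(s - m) for s in at_risk))
        nll -= score_i - log_sum
    return nll


# Toy data (made up for illustration): four patients, one censored.
times = [2.0, 5.0, 8.0, 11.0]
events = [1, 1, 0, 1]
concordant_scores = [2.0, 1.0, 0.5, 0.0]  # higher risk for earlier events
discordant_scores = [0.0, 0.5, 1.0, 2.0]  # higher risk for later events
nll_concordant = cox_neg_log_partial_likelihood(concordant_scores, times, events)
nll_discordant = cox_neg_log_partial_likelihood(discordant_scores, times, events)
```

Risk scores that rank patients in the order their events occur give a lower value, which is why minimizing this loss pushes a model toward correct prognostic ordering.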

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11521346
DOI: http://dx.doi.org/10.1093/bib/bbae544

Publication Analysis

Top Keywords

  • types cancer (16)
  • clcox models (12)
  • cancer (9)
  • contrastive learning (8)
  • predicting cancer (8)
  • cancer prognosis (8)
  • cl-based classifiers (8)
  • auc greater (8)
  • greater types (8)
  • trained tcga (8)
