Somatic variant detection is an integral part of cancer genomics analysis. While most methods have focused on short-read sequencing, long-read technologies now offer potential advantages in repeat mapping and variant phasing. We present DeepSomatic, a deep learning method for detecting somatic SNVs and insertions and deletions (indels) from both short-read and long-read data. It provides modes for whole-genome and exome sequencing and can run on tumor-normal, tumor-only, and FFPE-prepared samples. To help address the dearth of publicly available training and benchmarking data for somatic variant detection, we generated, and make openly available, a dataset of five matched tumor-normal cell line pairs sequenced with Illumina, PacBio HiFi, and Oxford Nanopore Technologies, along with benchmark variant sets. Across samples and technologies (short-read and long-read), DeepSomatic consistently outperforms existing callers, particularly for indels.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11370364
DOI: http://dx.doi.org/10.1101/2024.08.16.608331
