Linked-read sequencing greatly improves haplotype assembly over standard paired-end analysis. The detection of mosaic single-nucleotide variants benefits from haplotype assembly when the model is informed by the mapping between constituent reads and linked reads. Samovar evaluates haplotype-discordant reads identified through linked-read sequencing, thus enabling phasing and mosaic variant detection across the entire genome. Samovar trains a random forest model to score candidate sites using a dataset that considers read quality, phasing, and linked-read characteristics. Samovar calls mosaic single-nucleotide variants (SNVs) within a single sample with accuracy comparable to what previously required trios or matched tumor/normal pairs, and it outperforms single-sample mosaic variant callers at minor allele frequencies of 5%-50% with at least 30X coverage. In whole-genome sequencing of both tumor and normal samples from 13 pediatric cancer cases, Samovar finds somatic variants that can be corroborated with high recall by whole-exome sequencing. Samovar is available as open source at https://github.com/cdarby/samovar under the MIT license.
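
As an illustration of what "haplotype-discordant reads" means at a single candidate site, the following minimal sketch counts reads that disagree with the majority allele on their own haplotype. It is not Samovar's implementation: the `PhasedRead` type, the function names, and the use of the 30X and 5%-50% figures as hard cutoffs are assumptions made for this example, and the random forest scoring step is not shown.

```python
from collections import Counter
from typing import NamedTuple


class PhasedRead(NamedTuple):
    """Hypothetical input record: one read overlapping the candidate site."""
    haplotype: int  # 1 or 2, as assigned by phasing
    base: str       # base observed at the candidate site


def site_features(reads, ref, alt):
    """Summarise a site: depth, alt allele fraction, haplotype-discordant reads."""
    depth = len(reads)
    alt_count = sum(1 for r in reads if r.base == alt)
    discordant = 0
    for hap in (1, 2):
        bases = Counter(r.base for r in reads if r.haplotype == hap)
        # Reads carrying the minority allele on this haplotype disagree
        # with the other reads phased to the same haplotype.
        discordant += min(bases[ref], bases[alt])
    return {
        "depth": depth,
        "alt_fraction": alt_count / depth if depth else 0.0,
        "discordant_reads": discordant,
    }


def is_candidate(features, min_depth=30, min_af=0.05, max_af=0.50):
    """Illustrative filter only; the thresholds are assumptions for this sketch."""
    return (
        features["depth"] >= min_depth
        and min_af <= features["alt_fraction"] <= max_af
        and features["discordant_reads"] > 0
    )


# Mosaic-like site: the alt allele appears on only some haplotype-1 reads.
mosaic = (
    [PhasedRead(1, "A")] * 12 + [PhasedRead(1, "G")] * 6 + [PhasedRead(2, "A")] * 18
)
# Germline heterozygous site: every haplotype-1 read carries the alt allele.
germline = [PhasedRead(1, "G")] * 18 + [PhasedRead(2, "A")] * 18

mosaic_features = site_features(mosaic, ref="A", alt="G")
germline_features = site_features(germline, ref="A", alt="G")
```

In this toy data the germline site has no discordant reads because each haplotype is internally consistent, while the mosaic site does, which is the kind of signal that phasing makes visible in a single sample.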

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6609817
DOI: http://dx.doi.org/10.1016/j.isci.2019.05.037
