Denoising of Aligned Genomic Data.

Sci Rep

University of Illinois Urbana-Champaign, Carl R. Woese Institute for Genomic Biology, Urbana, 61801, USA.

Published: October 2019

Noise in genomic sequencing data is known to affect various stages of genomic data analysis pipelines. Variant identification is an important step in many of these pipelines and is increasingly used in clinical settings to aid medical practice. We propose a denoising method, dubbed SAMDUDE, which operates on aligned genomic data to improve variant calling performance. Denoising human data with SAMDUDE improved variant identification in both individual-chromosome and whole genome sequencing (WGS) data sets. In the WGS data set, denoising led to the identification of almost 2,000 additional true variants and the elimination of over 1,500 erroneously identified variants. In contrast, we found that denoising with other state-of-the-art denoisers significantly worsens variant calling performance. SAMDUDE is written in Python and is freely available at https://github.com/ihwang/SAMDUDE.
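To make the reported WGS gains concrete, the following minimal sketch shows how gaining true variants and removing erroneous calls moves the usual variant-calling metrics (precision, recall, F1). It is not part of SAMDUDE. The class `CallSetCounts`, the function `apply_denoising_delta`, and the baseline counts are hypothetical and chosen only for illustration. The abstract gives approximate changes ("almost 2,000" and "over 1,500"), which are rounded here, and it does not report baseline totals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallSetCounts:
    """Confusion counts for a variant call set compared against a truth set."""
    true_positives: int
    false_positives: int
    false_negatives: int

    @property
    def precision(self) -> float:
        called = self.true_positives + self.false_positives
        return self.true_positives / called if called else 0.0

    @property
    def recall(self) -> float:
        real = self.true_positives + self.false_negatives
        return self.true_positives / real if real else 0.0

    @property
    def f1(self) -> float:
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r) if (p + r) else 0.0


def apply_denoising_delta(before: CallSetCounts,
                          gained_true: int,
                          removed_false: int) -> CallSetCounts:
    """Return the counts after gaining true calls and dropping false calls.

    Each newly recovered true variant was previously a miss, so it moves
    from the false-negative count to the true-positive count.
    """
    after = CallSetCounts(
        true_positives=before.true_positives + gained_true,
        false_positives=before.false_positives - removed_false,
        false_negatives=before.false_negatives - gained_true,
    )
    if after.false_positives < 0 or after.false_negatives < 0:
        raise ValueError("delta exceeds the baseline counts")
    return after


if __name__ == "__main__":
    # HYPOTHETICAL baseline counts: not taken from the paper.
    baseline = CallSetCounts(true_positives=3_000_000,
                             false_positives=20_000,
                             false_negatives=30_000)
    # Rounded from the abstract: ~2,000 true variants gained,
    # ~1,500 erroneous variants eliminated.
    denoised = apply_denoising_delta(baseline, gained_true=2_000,
                                     removed_false=1_500)
    for label, counts in (("baseline", baseline), ("denoised", denoised)):
        print(f"{label}: precision={counts.precision:.5f} "
              f"recall={counts.recall:.5f} f1={counts.f1:.5f}")
```

Because the changes are small relative to the size of a whole-genome call set, the metric shifts are modest in relative terms under any realistic baseline. The absolute counts are therefore the more informative figures to quote.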

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6803637
DOI: http://dx.doi.org/10.1038/s41598-019-51418-z
