Variant calling and quality control of large-scale human genome sequencing data.

Emerg Top Life Sci

Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, U.S.A.

Published: August 2019

Next-generation sequencing has allowed genetic studies to collect genome sequencing data from large numbers of individuals. However, raw sequencing data are usually not directly interpretable because the genome is sequenced in fragments and the data carry technical biases; analysis of these data therefore requires many computational approaches. First, for each sequenced individual, sequencing data are aligned and further processed to account for technical biases. Then, variant calling is performed to obtain the positions of genetic variants and their corresponding genotypes. Finally, quality control (QC) is applied to identify individuals and genetic variants affected by sequencing errors. These procedures are necessary to generate accurate variant calls from sequencing data, and many computational approaches have been developed for these tasks. This review focuses on widely used current approaches for variant calling and QC.
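To make the QC step concrete, the following is a minimal sketch of one common kind of filter: removing individuals and variants with too many missing genotype calls. It is an illustration only, not a method taken from the review. The genotype encoding (0, 1, 2 for alternate-allele count, `None` for a missing call), the function names `call_rate` and `qc_filter`, and the threshold values are all assumptions chosen for this example.

```python
from typing import List, Optional, Tuple

# Assumed encoding: 0, 1, 2 = alternate-allele count; None = missing call.
Genotype = Optional[int]


def call_rate(calls: List[Genotype]) -> float:
    """Fraction of non-missing genotype calls (0.0 for an empty list)."""
    if not calls:
        return 0.0
    return sum(c is not None for c in calls) / len(calls)


def qc_filter(
    genotypes: List[List[Genotype]],
    min_sample_call_rate: float = 0.9,   # hypothetical threshold
    min_variant_call_rate: float = 0.9,  # hypothetical threshold
) -> Tuple[List[int], List[int]]:
    """Return indices of samples (rows) and variants (columns) passing QC.

    Samples are filtered first; variant call rates are then computed
    over the retained samples only.
    """
    kept_samples = [
        i for i, row in enumerate(genotypes)
        if call_rate(row) >= min_sample_call_rate
    ]
    n_variants = len(genotypes[0]) if genotypes else 0
    kept_variants = [
        j for j in range(n_variants)
        if call_rate([genotypes[i][j] for i in kept_samples])
        >= min_variant_call_rate
    ]
    return kept_samples, kept_variants


if __name__ == "__main__":
    # Toy matrix: 4 individuals (rows) by 4 variants (columns).
    toy = [
        [0, 1, 2, None],
        [0, None, None, None],
        [1, 1, 0, None],
        [2, 0, 1, 0],
    ]
    samples, variants = qc_filter(toy, 0.7, 0.7)
    print("samples kept:", samples)
    print("variants kept:", variants)
```

Filtering individuals before variants is a design choice in this sketch: a single poorly sequenced individual would otherwise lower the call rate of every variant. Real pipelines typically combine several such criteria rather than relying on missingness alone.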

Source
http://dx.doi.org/10.1042/ETLS20190007
