Edge effects in calling variants from targeted amplicon sequencing.

BMC Genomics

Research and Foundation Department, QIAGEN Sciences, Inc., Frederick, MD, USA.

Published: December 2014

Background: Analysis of targeted amplicon sequencing data presents some unique challenges in comparison to the analysis of random fragment sequencing data. Whereas reads from randomly fragmented DNA have arbitrary start positions, the reads from amplicon sequencing have fixed start positions that coincide with the amplicon boundaries. As a result, any variants near the amplicon boundaries can cause misalignments of multiple reads that can ultimately lead to false-positive or false-negative variant calls.
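The edge effect described above can be illustrated with a toy model. The sketch below is not the authors' method: it assumes an ungapped alignment, clipping only at the 5' end of the read, and illustrative scores (match +1, mismatch -4) that are not taken from the paper. The function name `best_local_start` and the sequences are hypothetical. It shows why a variant a few bases from the read start tends to be clipped by a local aligner, and why the same variant is kept when matching primer bases precede it.

```python
# Toy model of the amplicon edge effect (assumed scores, not from the paper).
MATCH, MISMATCH = 1, -4


def best_local_start(read: str, ref: str) -> int:
    """Return the read offset where an ungapped local alignment would start.

    The alignment is anchored at the 3' end of the read; bases before the
    returned offset are treated as soft-clipped because keeping them would
    not raise the alignment score.
    """
    if len(read) != len(ref):
        raise ValueError("toy model expects read and reference of equal length")
    best_score, best_start = 0, len(read)
    score = 0
    # Walk from the 3' end toward the 5' end, scoring the suffix read[i:].
    for i in range(len(read) - 1, -1, -1):
        score += MATCH if read[i] == ref[i] else MISMATCH
        if score > best_score:
            best_score, best_start = score, i
    return best_start


# Hypothetical primer and insert sequences, for illustration only.
primer = "GATTACAGATTACAGATTAC"          # 20 primer bases
insert = "ACGTTGCAAGGCTTAACCGGTTAGC"     # 25 insert bases
mutated = insert[:2] + "T" + insert[3:]  # SNV at the third insert base

# Primer removed before alignment: the read starts at the amplicon boundary.
print(best_local_start(mutated, insert))                    # 3: variant base is clipped

# Primer retained: 22 matching bases now flank the variant on the 5' side.
print(best_local_start(primer + mutated, primer + insert))  # 0: variant base is aligned
```

In this model the two matching bases upstream of the variant cannot pay for the mismatch penalty, so the aligner gains score by clipping through the variant. Retaining the primer supplies enough flanking matches to outweigh the penalty.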

Results: We show that amplicon boundaries are variant calling blind spots where the variant calls are highly inaccurate. We propose that an effective strategy to avoid these blind spots is to incorporate the primer bases in obtaining read alignments and post-processing of the alignments, thereby effectively moving these blind spots into the primer binding regions (which are not used for variant calling). Targeted sequencing data analysis pipelines can provide better variant calling accuracy when primer bases are retained and sequenced.

Conclusions: Read bases beyond the variant site are necessary for analysis of amplicon sequencing data. Enzymatic primer digestion, if used in the target enrichment process, should leave at least a few primer bases to ensure that these bases are available during data analysis. The primer bases should only be removed immediately before the variant calling step to ensure that the variants can be called irrespective of where they occur within the amplicon insert region.
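The recommendation to remove primer bases only immediately before variant calling can be sketched as a post-alignment trimming step. This is a minimal illustration under stated assumptions, not the pipeline used in the paper: reads are ungapped, only a forward-strand 5' primer is handled, and the names `AlignedRead` and `trim_primer` are hypothetical.

```python
from typing import NamedTuple


class AlignedRead(NamedTuple):
    ref_start: int  # 0-based reference position of the first aligned base
    seq: str        # aligned bases (assumed ungapped in this sketch)


def trim_primer(read: AlignedRead, primer_end: int) -> AlignedRead:
    """Drop aligned bases that lie before primer_end (0-based, exclusive).

    Intended to run after alignment and right before variant calling, so the
    primer bases still anchor the alignment but never contribute to a call.
    """
    n = max(0, min(len(read.seq), primer_end - read.ref_start))
    return AlignedRead(read.ref_start + n, read.seq[n:])


# Hypothetical example: the primer binds reference positions 100..119.
read = AlignedRead(ref_start=100, seq="GATTACAGATTACAGATTAC" + "ACTTT")
trimmed = trim_primer(read, primer_end=120)
print(trimmed)  # AlignedRead(ref_start=120, seq='ACTTT')
```

Because trimming happens after alignment, the first insert bases keep the coordinates the aligner assigned with the primer as an anchor. A production pipeline would also need to handle reverse-strand primers, indels, and the alignment's CIGAR string.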


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4302139
DOI: http://dx.doi.org/10.1186/1471-2164-15-1073

