High-throughput sequencing (HTS) of viral samples provides important information on the presence of viral minority variants. However, detection and accurate quantification are limited by the capacity to distinguish biological from artificial variation. In this study, errors related to the Illumina HiSeq2000 library generation and HTS process were investigated by determining minority variant frequencies in an influenza A/WSN/1933(H1N1) virus reverse-genetics plasmid pool. Errors related to amplification and sequencing were determined using the same plasmid pool, by generation of infectious virus using reverse genetics followed by duplicate reverse-transcriptase PCR (RT-PCR) amplification and HTS in the same sequencing run. Results showed that after "best practice" quality control (QC), one minority variant with a frequency >0.5% was identified within the plasmid pool, while 84 and 139 were identified in the duplicate RT-PCR-amplified samples, indicating that RT-PCR amplification artificially increased variation. Detailed analysis showed that artifactual minority variants could be identified by two major technical characteristics: their predominant presence in a single read orientation and an uneven distribution of mismatches over the length of the reads. We demonstrate that, by adding two QC steps, 95% of the artifactual minority variants could be identified. When our analysis approach was applied to three clinical samples, 68% of the initially identified minority variants were identified as artifacts. Our study clearly demonstrated that, without additional QC steps, overestimation of viral minority variants is very likely to occur, mainly as a consequence of the required RT-PCR amplification step. The improved ability to detect and correct for artifactual minority variants increases data resolution and could aid both past and future studies incorporating HTS. The source code has been made available through SourceForge (https://sourceforge.net/projects/mva-ngs).
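The two technical characteristics named in the abstract, read-orientation bias and uneven placement of mismatches along the read, can be illustrated with a minimal Python sketch. This is not the authors' mva-ngs code: the function names, the four-segment binning, and the cut-offs of 0.9 and 0.6 are hypothetical choices made only for illustration, and the abstract does not state how the two additional QC steps were actually implemented.

```python
from collections import Counter


def strand_fraction(fwd: int, rev: int) -> float:
    """Share of variant-supporting reads on the dominant read orientation."""
    total = fwd + rev
    if total == 0:
        raise ValueError("no reads support the variant")
    return max(fwd, rev) / total


def position_skew(positions: list[int], read_length: int, n_bins: int = 4) -> float:
    """Largest share of variant observations found in one read segment.

    positions are 0-based offsets of the mismatch within each supporting read.
    The read is cut into n_bins equal segments (a hypothetical choice).
    """
    if not positions:
        raise ValueError("no positions given")
    bins = Counter(min(p * n_bins // read_length, n_bins - 1) for p in positions)
    return max(bins.values()) / len(positions)


def is_probable_artifact(
    fwd: int,
    rev: int,
    positions: list[int],
    read_length: int,
    max_strand_fraction: float = 0.9,  # hypothetical cut-off, not from the paper
    max_bin_share: float = 0.6,        # hypothetical cut-off, not from the paper
) -> bool:
    """Flag a variant that is strand-biased or clustered in one read segment."""
    return (
        strand_fraction(fwd, rev) > max_strand_fraction
        or position_skew(positions, read_length) > max_bin_share
    )


# A variant seen equally on both strands and spread along the read is kept;
# one seen only on forward reads, or only near read ends, is flagged.
balanced = is_probable_artifact(50, 50, [10, 35, 60, 85], read_length=100)
one_strand = is_probable_artifact(20, 0, [10, 35, 60, 85], read_length=100)
read_end = is_probable_artifact(10, 10, [95, 96, 97, 98], read_length=100)
```

In practice the per-strand counts and within-read offsets would come from the alignment file, and the cut-offs would need to be calibrated against a control such as the plasmid pool described above.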

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4302989
DOI: http://dx.doi.org/10.3389/fmicb.2014.00804

Publication Analysis

Top Keywords

minority variants: 28
viral minority: 12
plasmid pool: 12
rt-pcr amplification: 12
artifactual minority: 12
variants identified: 12
minority: 9
high-throughput sequencing: 8
minority variant: 8
variants: 7

Similar Publications

Oligogenic risk score for Gilles de la Tourette syndrome reveals a genetic continuum of tic disorders.

J Appl Genet

January 2025

Department of Neurogenetics and Functional Genomics, Mossakowski Medical Research Institute, Polish Academy of Sciences, Pawińskiego 5, 02-106, Warsaw, Poland.

Gilles de la Tourette syndrome (GTS) and other tic disorders (TDs) have a substantial genetic component with their heritability estimated at between 60 and 80%. Here we propose an oligogenic risk score of TDs using whole-genome sequencing (WGS) data from a group of Polish GTS patients, their families, and control samples (n = 278). In this study, we first reviewed the literature to obtain a preliminary list of 84 GTS/TD candidate genes.

Kabuki syndrome (KS) is a rare congenital disease with two types, KS1 and KS2, associated with variants in the epigenetic genes KMT2D and KDM6A, respectively. It is associated with multiple abnormalities, such as developmental delay, atypical facial features, cardiac anomalies, minor skeletal anomalies, genitourinary anomalies, and mild to moderate intellectual disability. The syndrome can lead to neonatal hypoglycemia that results from hyperinsulinemia and electrolyte abnormalities.

Pharmacogenetic testing can prevent severe toxicities from several oncology drug therapies; it also has the potential to improve the outcomes from supportive care drugs. Paired tumor and germline sequencing is increasingly common in oncology practice. This sequencing includes pharmacogenes, but the germline pharmacogenetic variants are rarely included in the clinical reports, despite many being clinically actionable. We established an informatics workflow to evaluate the clinical sequencing results for pharmacogenetic variants.

Background: Despite evidence that Alzheimer’s disease (AD) is highly heritable, there remains substantial “missing” heritability, likely due in part to the effect of rare variants and to the past reliance on case‐control analysis. Here, we leverage powerful endophenotypes of AD (cognitive performance across multiple cognitive domains) in a rare variant analysis to identify novel genetic drivers of cognition in aging and disease.

Method: We leveraged 8 cohorts of cognitive aging with whole genome sequencing data from the AD Sequencing Project to conduct rare variant analyses of multiple domains of cognition (N = 9,317; mean age = 73; 56% female; 52% cognitively unimpaired).

Background: 20‐hydroxyeicosatetraenoic acid (20‐HETE) is a potent vasoconstrictor synthesized by the CYP4F2 enzyme. The missense variant rs3093105 A>C (W12G) has been implicated in hypertension and stroke, which are risk factors for Alzheimer’s disease (AD), and in decreased 20‐HETE activity. To explore the potential role of the CYP4F2/20‐HETE pathway in AD, this study investigated associations between the rs3093105 variant and AD phenotypes.
