Background: Sequencing studies of exonic regions aim to identify rare variants contributing to complex traits. With high coverage and large sample size, these studies tend to apply simple variant calling algorithms. However, coverage is often heterogeneous; sites with insufficient coverage may benefit from sophisticated calling algorithms used in low-coverage sequencing studies. We evaluate the potential benefits of different calling strategies by performing a comparative analysis of variant calling methods on exonic data from 202 genes sequenced at 24x in 7,842 individuals. We call variants using individual-based, population-based and linkage disequilibrium (LD)-aware methods with stringent quality control. We measure genotype accuracy by the concordance with on-target GWAS genotypes and between 80 pairs of sequencing replicates. We validate selected singleton variants using capillary sequencing.
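The concordance measure mentioned above can be illustrated with a minimal sketch. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) with `None` for a missing call, and that concordance is the fraction of identical genotypes among sites called in both sets. The function name and coding scheme are hypothetical and chosen for illustration; the study may define concordance differently (for example, restricted to non-reference genotypes).

```python
from typing import Optional, Sequence

# Assumed coding: alternate-allele count (0, 1, 2); None marks a missing call.
Genotype = Optional[int]


def concordance(calls_a: Sequence[Genotype], calls_b: Sequence[Genotype]) -> float:
    """Fraction of identical genotypes among sites called in both sets."""
    if len(calls_a) != len(calls_b):
        raise ValueError("genotype vectors must have the same length")
    # Keep only sites where both sources produced a call.
    compared = [
        (a, b) for a, b in zip(calls_a, calls_b) if a is not None and b is not None
    ]
    if not compared:
        raise ValueError("no sites were called in both sets")
    matches = sum(1 for a, b in compared if a == b)
    return matches / len(compared)


# Toy data: one individual's sequencing calls versus array genotypes.
sequencing = [0, 1, 2, None, 0, 1]
array = [0, 1, 1, 0, 0, None]
print(concordance(sequencing, array))  # 3 of 4 comparable sites match: 0.75
```

The same function applies to a pair of sequencing replicates: pass the two replicates' genotype vectors instead of sequencing and array calls.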
Results: Using these calling methods, we detected over 27,500 variants at the targeted exons; >57% were singletons. The singletons identified by individual-based analyses were of the highest quality. However, individual-based analyses generated more missing genotypes (4.72%) than population-based (0.47%) and LD-aware (0.17%) analyses. Moreover, individual-based genotypes were the least concordant with array-based genotypes and replicates. Population-based genotypes were less concordant than genotypes from LD-aware analyses with extended haplotypes. We reanalyzed the same dataset with a second set of callers and showed again that the individual-based caller identified more high-quality singletons than the population-based caller. We also replicated this result in a second dataset of 57 genes sequenced at 127.5x in 3,124 individuals.
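As a rough sketch of the two summary statistics reported above, the following computes a missing-genotype rate and a singleton fraction from a small genotype matrix. It assumes rows are variant sites, columns are individuals, genotypes are alternate-allele counts with `None` for missing, and a singleton is a site with exactly one observed copy of the alternate allele. These definitions and function names are illustrative assumptions, not the study's pipeline.

```python
from typing import List, Optional

Genotype = Optional[int]  # assumed coding: 0, 1, 2 alternate alleles; None = missing
Matrix = List[List[Genotype]]  # rows = variant sites, columns = individuals


def missing_rate(matrix: Matrix) -> float:
    """Proportion of genotype entries that are missing."""
    total = sum(len(site) for site in matrix)
    if total == 0:
        raise ValueError("empty genotype matrix")
    missing = sum(1 for site in matrix for g in site if g is None)
    return missing / total


def alt_allele_count(site: List[Genotype]) -> int:
    """Number of alternate-allele copies observed at one site."""
    return sum(g for g in site if g is not None)


def singleton_fraction(matrix: Matrix) -> float:
    """Share of variant sites with exactly one alternate-allele copy."""
    counts = [alt_allele_count(site) for site in matrix]
    variant_counts = [c for c in counts if c > 0]
    if not variant_counts:
        raise ValueError("no variant sites in matrix")
    return sum(1 for c in variant_counts if c == 1) / len(variant_counts)


toy = [
    [0, 1, 0, 0],     # singleton: one heterozygous carrier
    [0, 1, 1, None],  # two copies, one missing call
    [None, 0, 0, 1],  # singleton, one missing call
    [2, 0, 0, 0],     # two copies in one homozygous carrier
]
print(missing_rate(toy))        # 2 of 16 entries missing: 0.125
print(singleton_fraction(toy))  # 2 of 4 variant sites: 0.5
```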
Conclusions: We recommend population-based analyses for high-quality variant calls with few missing genotypes. With extended haplotypes, LD-aware methods generate the most accurate and complete genotypes. In addition, individual-based analyses should complement the above methods to obtain the most singleton variants.
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4359451 | PMC |
http://dx.doi.org/10.1186/s12859-015-0489-0 | DOI Listing |
Sci Rep
December 2024
BioMag Laboratory, HUS Diagnostic Center, Helsinki University Hospital, University of Helsinki and Aalto University School of Science, Helsinki, Finland.
A novel variant of paired-associative stimulation (PAS) consisting of high-frequency peripheral nerve stimulation (PNS) and high-intensity transcranial magnetic stimulation (TMS) above the motor cortex, called high-PAS, can lead to improved motor function in patients with incomplete spinal cord injury. In PAS, the interstimulus interval (ISI) between the PNS and TMS pulses plays a significant role in the location of the intended effect of the induced plastic changes. While conventional PAS protocols (single TMS pulse often applied with intensity close to resting motor threshold, and single PNS pulse) usually require precisely defined ISIs, high-PAS can induce plasticity at a wide range of ISIs and also in spite of small ISI errors, which is helpful in clinical settings where precise ISI determination can be challenging.
PLoS One
December 2024
Embrapa Southeast Livestock, São Carlos, Brazil.
Different sheep breeds show distinct phenotypic plasticity in fat deposition in the tails. The genetic background underlying fat deposition in the tail of sheep is complex, multifactorial, and may involve allele-specific expression (ASE) mechanism to modulate allelic expression. ASE is a common phenomenon in mammals and refers to allelic imbalanced expression modified by cis-regulatory genetic variants that can be observed at heterozygous loci.
JCI Insight
December 2024
Department of Ophthalmology and Roger and Karalis Johnson Retina Center, University of Washington, Seattle, United States of America.
Background: Current clinical sequencing methods cannot effectively detect DNA methylation and allele-specific variation to provide parent-of-origin information from the proband alone. Parent-of-origin effects can lead to differential disease and the inability to assign this in de novo cases limits prognostication in the majority of affected individuals with retinoblastoma, a hereditary cancer with suspected parent-of-origin effects.
Methods: To directly assign parent-of-origin in retinoblastoma patients, genomic DNA was extracted from blood samples for sequencing using a programmable, targeted single-molecule long-read DNA genomic and epigenomic approach.
Bioinformatics
December 2024
Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, United States.
Motivation: The Variant Call Format (VCF) is widely used in genome sequencing but scales poorly. For instance, we estimate a 150,000 genome VCF would occupy 900 TiB, making it costly and complicated to produce, analyze, and store. The issue stems from VCF's requirement to densely represent both reference-genotypes and allele-indexed arrays.
Parasit Vectors
December 2024
Department of Biology, College of Arts and Sciences, Baylor University, Waco, TX, USA.
Background: The high burden of malaria in Africa is largely due to the presence of competent and adapted Anopheles vector species. With invasive Anopheles stephensi implicated in malaria outbreaks in Africa, understanding the genomic basis of vector-parasite compatibility is essential for assessing the risk of future outbreaks due to this mosquito. Vector compatibility with P.