AI Article Synopsis

  • Imputing genotypes from whole-genome sequencing (WGS) reference panels is a cost-effective way to augment the SNP content of genome-wide arrays.
  • The UK10K Cohorts project sequenced 3,781 genomes at low depth (average 7x) to characterize genetic variation down to 0.1% minor allele frequency in the British population.
  • Re-phasing WGS reference panels after genotype calling, and combining panels such as UK10K and the 1000 Genomes Project, improves imputation accuracy for rare variants; the authors also introduce an approximation that maintains speed without sacrificing that accuracy.

Article Abstract

Imputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced at low depth (average 7x), aiming to exhaustively characterize genetic variation down to 0.1% minor allele frequency in the British population. Here we demonstrate the value of this resource for improving imputation accuracy at rare and low-frequency variants in both a UK and an Italian population. We show that large increases in imputation accuracy can be achieved by re-phasing WGS reference panels after initial genotype calling. We also present a method for combining WGS panels to improve variant coverage and downstream imputation accuracy, which we illustrate by integrating 7,562 WGS haplotypes from the UK10K project with 2,184 haplotypes from the 1000 Genomes Project. Finally, we introduce a novel approximation that maintains speed without sacrificing imputation accuracy for rare variants.
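
As a rough sanity check on the figures quoted in the abstract, the sketch below works through the haplotype arithmetic: two haplotypes per diploid genome, the size of a combined UK10K plus 1000 Genomes panel, and how many haplotypes would be expected to carry an allele at the 0.1% frequency floor. The function and constant names are hypothetical, and the simple sum assumes no samples are shared or dropped when the panels are merged, which the abstract does not state.

```python
# Illustrative arithmetic only; every figure comes from the abstract.
UK10K_GENOMES = 3_781      # whole genomes sequenced at low depth
UK10K_HAPLOTYPES = 7_562   # UK10K haplotypes quoted in the abstract
KGP_HAPLOTYPES = 2_184     # 1000 Genomes Project haplotypes
MAF_FLOOR = 0.001          # 0.1% minor allele frequency


def haplotypes_from_genomes(n_genomes: int) -> int:
    """Each diploid genome contributes two haplotypes."""
    return 2 * n_genomes


def expected_minor_copies(n_haplotypes: int, maf: float) -> float:
    """Expected number of haplotypes carrying the minor allele."""
    return n_haplotypes * maf


# Assumption: the merged panel size is the plain sum of both panels.
combined_haplotypes = UK10K_HAPLOTYPES + KGP_HAPLOTYPES

print(haplotypes_from_genomes(UK10K_GENOMES))  # 7562
print(combined_haplotypes)                     # 9746
for label, n in [("1000 Genomes", KGP_HAPLOTYPES),
                 ("UK10K", UK10K_HAPLOTYPES),
                 ("combined", combined_haplotypes)]:
    copies = expected_minor_copies(n, MAF_FLOOR)
    print(f"{label}: about {copies:.1f} expected copies at 0.1% MAF")
```

Under these assumptions, an allele at 0.1% frequency is expected on only a handful of haplotypes in any of these panels, which gives some intuition for why panel size and phasing quality matter most for rare variants.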

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4579394 (PMC)
http://dx.doi.org/10.1038/ncomms9111 (DOI)

Publication Analysis

Top Keywords

imputation accuracy (16)
rare variants (8)
reference panels (8)
accuracy rare (8)
improved imputation (4)
imputation low-frequency (4)
low-frequency rare (4)
variants uk10k (4)
uk10k haplotype (4)
haplotype reference (4)

Similar Publications

Binding of transcription factors (TFs) at gene regulatory elements controls cellular epigenetic state and gene expression. Current genome-wide chromatin profiling approaches have inherently limited resolution, complicating assessment of TF occupancy and co-occupancy, especially at individual alleles. In this work, we introduce Accessible Chromatin by Cytosine Editing Site Sequencing with ATAC-seq (ACCESS-ATAC), which harnesses a double-stranded DNA cytosine deaminase (Ddd) enzyme to stencil TF binding locations within accessible chromatin regions.

A novel RFE-GRU model for diabetes classification using PIMA Indian dataset.

Sci Rep

January 2025

Department of Computer Science, Faculty of Computers and Information, Suez University, P. O. Box 43221, Suez, Egypt.

Diabetes is a long-term condition characterized by elevated blood sugar levels. It can lead to a variety of complex disorders such as stroke, renal failure, and heart attack. Machine learning can help diagnose diabetes at an early stage, which matters because the disease cannot be cured and adds significant complications to the health-care system.

Atherosclerotic cardiovascular disease (ASCVD) risk calculators estimate the 10-year incident risk of myocardial infarction (MI), coronary artery disease (CAD) death, or stroke; however, they lack comprehensiveness and accuracy. Carotid intima-media thickness (CIMT) is a surrogate marker that may improve risk estimation acumen. The objective of this study was to derive ASCVD risk scores from historical data and determine whether these risk scores are associated with the history of subclinical CAD and CIMT.

Characterizing features affecting local ancestry inference performance in admixed populations.

Am J Hum Genet

December 2024

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; The Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX 77030, USA. Electronic address:

In recent years, significant efforts have been made to improve methods for genomic studies of admixed populations using local ancestry inference (LAI). Accurate LAI is crucial to ensure that downstream analyses accurately reflect the genetic ancestry of research participants. Here, we test analytic strategies for LAI to provide guidelines for optimal accuracy, focusing on admixed populations reflective of Latin America's primary continental ancestries: African (AFR), Amerindigenous (AMR), and European (EUR).

Performing a Multicenter Retrospective Study.

Hosp Pediatr

January 2025

Department of Pediatrics, Section of Hospital Medicine, Children's Hospital Colorado, University of Colorado School of Medicine, Aurora, Colorado.

Multicenter retrospective studies can provide a pragmatic approach to evaluating uncommon pediatric conditions and are less expensive than prospective research. A well-executed retrospective multicenter study, with rigorous study design, systematic data collection, and robust statistical analysis, can produce clinically important and generalizable findings. A variety of observational designs can be employed, including cross-sectional, cohort, and case-control studies. Selection bias, ascertainment bias, and confounding are common issues in retrospective research.
