In the search for genetic associations with complex traits, population isolates offer the advantage of reduced genetic and environmental heterogeneity. In addition, cost-efficient next-generation association approaches have been proposed for these populations, in which only a subsample of representative individuals is sequenced and genotypes are then imputed into the rest of the population. Gene mapping in such populations thus requires high-quality genetic imputation and preliminary phasing. To identify an effective study design, we compare by simulation a range of phasing and imputation software and strategies. We simulated 1,115,604 variants on chromosome 10 for 477 members of the large complex pedigree of Campora, a village within the established isolate of Cilento in southern Italy. We assessed the phasing performance of the identity-by-descent-based software ALPHAPHASE and SLRP, the LD-based software SHAPEIT2, SHAPEIT3, and BEAGLE, and the new software EAGLE2, which combines both methodologies. For imputation, we compared IMPUTE2, IMPUTE4, MINIMAC3, BEAGLE, and the new software PBWT. Genotyping errors and missing genotypes were simulated to observe their effects on the performance of each software. Highly accurate phased data were achieved by all software, with SHAPEIT2, SHAPEIT3, and EAGLE2 providing the most accurate results. MINIMAC3, IMPUTE4, and IMPUTE2 all performed strongly as imputation software, and our study highlights the considerable gain in imputation accuracy provided by a genome-sequenced reference panel specific to the population isolate.
DOI: http://dx.doi.org/10.1002/gepi.22109
J Am Stat Assoc
October 2023
Department of Statistics, University of Michigan.
In statistical genetics, the sequentially Markov coalescent (SMC) is an important family of models for approximating the distribution of genetic variation data under complex evolutionary models. Methods based on SMC are widely used in genetics and evolutionary biology, with significant applications to genotype phasing and imputation, recombination rate estimation, and inferring population history. SMC allows for likelihood-based inference using hidden Markov models (HMMs), where the latent variable represents a genealogy.
Bioinform Adv
July 2024
Institute of Clinical Molecular Biology, Kiel University, Am Botanischen Garten 11, 24108 Kiel, Germany.
Summary: We recently developed , a free software that combines genotype phasing and imputation in a single tool. By introducing algorithmic and technical improvements we accelerated the classical two-step approach using and . Here, we demonstrate how to use field-programmable gate arrays (FPGAs) to accelerate even further by a factor of up to 93% without loss of phasing and imputation quality.
Transplant Cell Ther
November 2024
Translational Science and Therapeutics Division, Fred Hutchinson Cancer Center, Seattle, Washington; Department of Medicine, University of Washington, Seattle, Washington.
When optimizing transplants, clinical decision-makers consider HLA-A, -B, -C, -DRB1 (8 matched alleles out of 8), and sometimes HLA-DQB1 (10 out of 10) matching between the patient and donor. HLA-DQ is a heterodimer formed by the β chain product of HLA-DQB1 and an α chain product of HLA-DQA1. In addition to molecules defined by the parentally inherited cis haplotypes, α-β trans-dimerization is possible between certain alleles, leading to unique molecules and a potential source of mismatched molecules.
Nat Genet
September 2024
Department of Statistics, University of Oxford, Oxford, UK.
We built a reference panel with 342 million autosomal variants using 78,195 individuals from the Genomics England (GEL) dataset, achieving a phasing switch error rate of 0.18% for European samples and imputation quality of r² = 0.75 for variants with minor allele frequencies as low as 2 × 10⁻⁴ in white British samples.
G3 (Bethesda)
September 2024
United States Department of Agriculture, National Center for Cool and Cold Water Aquaculture, Agricultural Research Service, Kearneysville, WV 25430, USA.
With the rapid and significant cost reduction of next-generation sequencing, low-coverage whole-genome sequencing (lcWGS), followed by genotype imputation, is becoming a cost-effective alternative to single-nucleotide polymorphism (SNP)-array genotyping. The objectives of this study were 2-fold: (1) construct a haplotype reference panel for genotype imputation from lcWGS data in rainbow trout (Oncorhynchus mykiss); and (2) evaluate the concordance between imputed genotypes and SNP-array genotypes in 2 breeding populations. Medium-coverage (12×) whole-genome sequences were obtained from a total of 410 fish representing 5 breeding populations with various spawning dates.