Estimating the genomic location and length of identical-by-descent (IBD) segments among individuals is a crucial step in many genetic analyses. However, the exponential growth in the size of biobank and direct-to-consumer genetic data sets makes accurate IBD inference a significant computational challenge. Here we present the templated positional Burrows-Wheeler transform (TPBWT) to make fast IBD estimates robust to genotype and phasing errors. Using haplotype data simulated over pedigrees with realistic genotyping and phasing errors, we show that the TPBWT outperforms other state-of-the-art IBD inference algorithms in terms of speed and accuracy. For each phase-aware method, we explore the false positive and false negative rates of inferring IBD by segment length and characterize the types of error commonly found. Our results highlight the fragility of most phased IBD inference methods; the accuracy of IBD estimates can be highly sensitive to the quality of haplotype phasing. Additionally, we compare the performance of the TPBWT against a widely used phase-free IBD inference approach that is robust to phasing errors. We introduce both in-sample and out-of-sample TPBWT-based IBD inference algorithms and demonstrate their computational efficiency on massive-scale data sets with millions of samples. Furthermore, we describe a binary file format for TPBWT-compressed haplotypes that enables fast and efficient out-of-sample IBD computation against very large cohort panels. Finally, we demonstrate the utility of the TPBWT in a brief empirical analysis, exploring geographic patterns of haplotype sharing within Mexico. Hierarchical clustering of IBD shared across regions within Mexico reveals geographically structured haplotype sharing and a strong signal of isolation by distance. Our software implementation of the TPBWT is freely available for noncommercial use in the code repository (https://github.com/23andMe/phasedibd, last accessed January 11, 2021).
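For readers unfamiliar with the underlying data structure, the following is a minimal sketch of the plain positional Burrows-Wheeler transform (PBWT) on which the TPBWT builds. It is not the templated algorithm described in the abstract and it is not the API of the phasedibd package. It assumes error-free, fully phased biallelic haplotypes coded as 0/1, measures segment length in sites rather than genetic distance, and uses a hypothetical function name (`pbwt_long_matches`). The sketch reports every pair of haplotypes that is identical over at least `min_len` consecutive sites.

```python
from itertools import combinations


def pbwt_long_matches(haps, min_len):
    """Return sorted (hap_a, hap_b, start, end) tuples for exact matches.

    Illustrative helper (hypothetical name, not part of phasedibd).
    Two haplotypes match on the half-open site interval [start, end)
    and end - start >= min_len. Assumes min_len >= 1 and 0/1 alleles.
    """
    n_hap, n_site = len(haps), len(haps[0])
    ppa = list(range(n_hap))  # haplotypes sorted by their reversed prefixes
    div = [0] * n_hap         # div[i]: start of match between ppa[i] and ppa[i-1]
    out = []

    def report(k):
        # A block is a run of neighbours in sorted order whose shared
        # match up to site k began at or before site k - min_len.
        block = []
        for pos in range(n_hap + 1):
            if pos == n_hap or div[pos] > k - min_len:
                for (p, x), (q, y) in combinations(block, 2):
                    # The match ends at k if the alleles differ there,
                    # or if k is one past the last site.
                    if k == n_site or haps[x][k] != haps[y][k]:
                        start = max(div[p + 1:q + 1])
                        out.append((min(x, y), max(x, y), start, k))
                block = []
            if pos < n_hap:
                block.append((pos, ppa[pos]))

    for k in range(n_site):
        report(k)
        # Stable partition by the allele at site k, carrying divergence values.
        zeros, ones, div_zeros, div_ones = [], [], [], []
        p = q = k + 1
        for i in range(n_hap):
            h = ppa[i]
            p, q = max(p, div[i]), max(q, div[i])
            if haps[h][k] == 0:
                zeros.append(h)
                div_zeros.append(p)
                p = 0
            else:
                ones.append(h)
                div_ones.append(q)
                q = 0
        ppa, div = zeros + ones, div_zeros + div_ones

    report(n_site)  # matches that run to the end of the chromosome
    return sorted(out)


example_haps = [
    [0, 1, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0, 1],
]
print(pbwt_long_matches(example_haps, 3))
# -> [(0, 1, 0, 4), (0, 2, 1, 6), (1, 2, 1, 4)]
```

Because this sketch requires exact identity, a single genotype error or phase switch splits a long segment into shorter ones that may fall below `min_len`. That fragility is what the abstract says the TPBWT is designed to address.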
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8097300 | PMC |
| http://dx.doi.org/10.1093/molbev/msaa328 | DOI Listing |