AI Article Synopsis

  • Accurately estimating identical-by-descent (IBD) segments is increasingly difficult as genetic data sets grow. The article presents the templated positional Burrows-Wheeler transform (TPBWT), a method that makes fast IBD inference robust to genotype and phasing errors.
  • Using haplotype data simulated over pedigrees with realistic errors, the study shows that the TPBWT outperforms other state-of-the-art IBD inference algorithms in speed and accuracy. It also characterizes the common error types and shows that the accuracy of phased IBD estimates can be highly sensitive to the quality of haplotype phasing.
  • The article demonstrates the TPBWT on data sets with millions of samples and includes a brief case study of geographic haplotype sharing within Mexico. The software is freely available for noncommercial use.

Article Abstract

Estimating the genomic location and length of identical-by-descent (IBD) segments among individuals is a crucial step in many genetic analyses. However, the exponential growth in the size of biobank and direct-to-consumer genetic data sets makes accurate IBD inference a significant computational challenge. Here we present the templated positional Burrows-Wheeler transform (TPBWT) to make fast IBD estimates robust to genotype and phasing errors. Using haplotype data simulated over pedigrees with realistic genotyping and phasing errors, we show that the TPBWT outperforms other state-of-the-art IBD inference algorithms in terms of speed and accuracy. For each phase-aware method, we explore the false positive and false negative rates of inferring IBD by segment length and characterize the types of error commonly found. Our results highlight the fragility of most phased IBD inference methods; the accuracy of IBD estimates can be highly sensitive to the quality of haplotype phasing. Additionally, we compare the performance of the TPBWT against a widely used phase-free IBD inference approach that is robust to phasing errors. We introduce both in-sample and out-of-sample TPBWT-based IBD inference algorithms and demonstrate their computational efficiency on massive-scale data sets with millions of samples. Furthermore, we describe the binary file format for TPBWT-compressed haplotypes that results in fast and efficient out-of-sample IBD computes against very large cohort panels. Finally, we demonstrate the utility of the TPBWT in a brief empirical analysis, exploring geographic patterns of haplotype sharing within Mexico. Hierarchical clustering of IBD shared across regions within Mexico reveals geographically structured haplotype sharing and a strong signal of isolation by distance. Our software implementation of the TPBWT is freely available for noncommercial use in the code repository (https://github.com/23andMe/phasedibd, last accessed January 11, 2021).
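As a rough illustration of the kind of data structure the method builds on, the following Python sketch finds long exact matches between haplotypes with the standard positional Burrows-Wheeler transform (prefix and divergence arrays). It is not the TPBWT: it has no templating and tolerates no genotype or phasing errors, so a single mismatch ends a match. The function name `pbwt_long_matches`, the 0/1 allele encoding, and the toy haplotypes are assumptions made for illustration only, and the reporting scan is deliberately naive rather than optimized.

```python
def pbwt_long_matches(haps, min_len):
    """Return maximal exact matches spanning at least min_len sites.

    haps: list of equal-length lists of 0/1 alleles (one list per haplotype).
    Each result is (hap_a, hap_b, start, end) with haps identical on [start, end).
    """
    n_haps = len(haps)
    n_sites = len(haps[0]) if haps else 0
    order = list(range(n_haps))  # haplotypes sorted by their reversed prefixes
    div = [0] * n_haps           # div[j]: where the match between order[j-1] and order[j] starts
    matches = []

    for k in range(n_sites + 1):
        # Report matches that end at site k (alleles differ at k, or data ends).
        for i in range(n_haps):
            start = 0
            for j in range(i + 1, n_haps):
                # The match between order[i] and order[j] starts at max(div[i+1..j]).
                start = max(start, div[j])
                if k - start < min_len:
                    break  # start only grows with j, so later pairs are shorter too
                x, y = order[i], order[j]
                if k == n_sites or haps[x][k] != haps[y][k]:
                    matches.append((min(x, y), max(x, y), start, k))
        if k == n_sites:
            break

        # Update the prefix and divergence arrays using the alleles at site k.
        zeros, ones, div_zeros, div_ones = [], [], [], []
        p = q = k + 1
        for idx, d in zip(order, div):
            p, q = max(p, d), max(q, d)
            if haps[idx][k] == 0:
                zeros.append(idx)
                div_zeros.append(p)
                p = 0
            else:
                ones.append(idx)
                div_ones.append(q)
                q = 0
        order = zeros + ones
        div = div_zeros + div_ones

    return sorted(matches)


# Toy input (made up): three haplotypes over six sites.
example = [
    [0, 1, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 1, 0],
]
result = pbwt_long_matches(example, min_len=3)
print(result)
```

On real data, match length would normally be thresholded in genetic distance rather than in number of sites, and an error-tolerant method such as the one described in the abstract is needed because exact matching fragments true IBD segments wherever an error occurs.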


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8097300
DOI: http://dx.doi.org/10.1093/molbev/msaa328

Publication Analysis

Top Keywords

ibd inference: 20
phasing errors: 12
ibd: 11
templated positional: 8
positional burrows-wheeler: 8
burrows-wheeler transform: 8
data sets: 8
ibd estimates: 8
inference algorithms: 8
haplotype sharing: 8

