AI Article Synopsis

  • Methods for linking regulatory elements to genes using single-cell RNA-seq and ATAC-seq multiome data have varied success due to differences in handling noise and genomic distance, leading to inconsistent results.
  • The new framework, pgBoost, combines multiple linking strategies and incorporates genomic distance, resulting in significantly better identification of SNP-gene relationships than existing methods, especially over larger distances.
  • pgBoost demonstrates superior performance in connecting SNPs to genes validated by CRISPR and GWAS data, revealing important links that other methods failed to identify, especially when focused on specific cell types.

Article Abstract

Methods that analyze single-cell paired RNA-seq and ATAC-seq multiome data have shown great promise in linking regulatory elements to genes. However, existing methods differ in their modeling assumptions and approaches to account for biological and technical noise (leading to low concordance in their linking scores) and do not capture the effects of genomic distance. We propose pgBoost, an integrative modeling framework that trains a non-linear combination of existing linking strategies (including genomic distance) on fine-mapped eQTL data to assign a probabilistic score to each candidate SNP-gene link. We applied pgBoost to single-cell multiome data from 85k cells representing 6 major immune/blood cell types. pgBoost attained higher enrichment for fine-mapped eSNP-eGene pairs (e.g. 21x at distance >10kb) than existing methods (1.2-10x; p-value for difference = 5e-13 vs. distance-based method and < 4e-35 for each other method), with larger improvements at larger distances (e.g. 35x vs. 0.89-6.6x at distance >100kb; p-value for difference < 0.002 vs. each other method). pgBoost also outperformed existing methods in enrichment for CRISPR-validated links (e.g. 4.8x vs. 1.6-4.1x at distance >10kb; p-value for difference = 0.25 vs. distance-based method and < 2e-5 for each other method), with larger improvements at larger distances (e.g. 15x vs. 1.6-2.5x at distance >100kb; p-value for difference < 0.009 for each other method). Similar improvements in enrichment were observed for links derived from Activity-By-Contact (ABC) scores and GWAS data. We further determined that restricting pgBoost to features from a focal cell type improved the identification of SNP-gene links relevant to that cell type. We highlight several examples where pgBoost linked fine-mapped GWAS variants to experimentally validated or biologically plausible target genes that were not implicated by other methods.
In conclusion, a non-linear combination of linking strategies, including genomic distance, improves power to identify target genes underlying GWAS associations.
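
The enrichment figures quoted above (e.g. 21x at distance >10kb) can be made concrete with a small sketch. The abstract does not define enrichment, so the definition used here is an assumption: the fraction of validated links among the top-scoring candidates, divided by the fraction of validated links among all candidates. The function name `enrichment`, the `top_fraction` parameter, and the toy data are hypothetical and are not taken from pgBoost.

```python
from typing import Sequence


def enrichment(
    scores: Sequence[float],
    is_validated: Sequence[bool],
    top_fraction: float,
) -> float:
    """Assumed definition: validated rate in the top-scoring links / overall validated rate."""
    if len(scores) != len(is_validated) or len(scores) == 0:
        raise ValueError("scores and is_validated must be non-empty and equal in length")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")

    # Number of candidate links treated as "predicted" at this threshold.
    n_top = max(1, round(top_fraction * len(scores)))

    # Indices of candidate links, highest score first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    base_rate = sum(is_validated) / len(is_validated)
    if base_rate == 0:
        raise ValueError("no validated links among the candidates")

    top_rate = sum(is_validated[i] for i in ranked[:n_top]) / n_top
    return top_rate / base_rate


# Toy candidate SNP-gene links (made-up values, for illustration only).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_validated = [True, True, False, False, False, False, False, False, False, False]
toy_distance_bp = [50_000, 5_000, 20_000, 150_000, 2_000, 12_000, 300_000, 8_000, 40_000, 11_000]

# Enrichment over all candidates, using the top 20% of links by score.
overall = enrichment(toy_scores, toy_validated, top_fraction=0.2)

# Distance-stratified variant: keep only candidates farther than 10 kb first.
far = [i for i, d in enumerate(toy_distance_bp) if d > 10_000]
far_enrichment = enrichment(
    [toy_scores[i] for i in far],
    [toy_validated[i] for i in far],
    top_fraction=0.2,
)
```

Filtering by distance before computing the ratio mirrors the way the abstract reports enrichment separately at >10kb and >100kb; how the authors choose the score threshold is not stated in the abstract, so `top_fraction` is only a stand-in.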


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11142273
DOI: http://dx.doi.org/10.1101/2024.05.24.24307813


