Introduction: The exponential growth of genomic datasets necessitates advanced analytical tools that can effectively identify genetic loci from large-scale, high-throughput sequencing data. This study presents Deep-Block, a multi-stage deep learning framework that incorporates biological knowledge into its AI architecture to identify genetic regions significantly associated with Alzheimer's disease (AD). The framework employs a three-stage approach: (1) genome segmentation based on linkage disequilibrium (LD) patterns, (2) selection of relevant LD blocks using sparse attention mechanisms, and (3) application of TabNet and Random Forest algorithms to quantify single nucleotide polymorphism (SNP) feature importance, thereby identifying genetic factors contributing to AD risk.
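To make stage (1) concrete, the following is a minimal sketch of LD-based segmentation. The abstract does not say which LD block algorithm Deep-Block uses, so this is an illustrative assumption only: a greedy rule that starts a new block whenever the squared correlation (r²) between adjacent SNPs drops below a cutoff. The function names `r_squared` and `segment_ld_blocks`, the cutoff of 0.5, and the toy genotype matrix are all hypothetical.

```python
from typing import List


def r_squared(x: List[int], y: List[int]) -> float:
    """Squared Pearson correlation between two genotype vectors (allele counts 0/1/2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treat as unlinked
    return cov * cov / (var_x * var_y)


def segment_ld_blocks(genotypes: List[List[int]], threshold: float = 0.5) -> List[List[int]]:
    """Greedy segmentation (an assumed toy rule, not the paper's method).

    genotypes[i] holds the allele counts of SNP i across all samples, with SNPs
    ordered by genomic position. A new block starts whenever r^2 between
    adjacent SNPs falls below `threshold`. Returns blocks as lists of SNP indices.
    """
    if not genotypes:
        return []
    blocks = [[0]]
    for i in range(1, len(genotypes)):
        if r_squared(genotypes[i - 1], genotypes[i]) >= threshold:
            blocks[-1].append(i)  # still in LD with the previous SNP
        else:
            blocks.append([i])  # LD breaks down: open a new block
    return blocks


# Toy data: 4 SNPs x 6 samples. SNPs 0-1 are identical, as are SNPs 2-3.
toy = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],
    [2, 0, 0, 1, 2, 1],
    [2, 0, 0, 1, 2, 1],
]
blocks = segment_ld_blocks(toy)
```

In a pipeline of the kind described, each resulting block would then be passed to the block-selection stage (2) as one unit, so that SNPs in strong LD are scored together rather than independently.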
Methods: Deep-Block was applied to a large-scale whole-genome sequencing (WGS) dataset from the Alzheimer's Disease Sequencing Project (ADSP), comprising 7416 non-Hispanic white (NHW) participants: 3150 cognitively normal older adults (CN) and 4266 individuals with AD.
Results: A total of 30,218 LD blocks were identified and ranked by their relevance to AD. Deep-Block then identified novel SNPs within the top 1500 LD blocks and confirmed previously known variants, including rs429358 and rs769449. Expression quantitative trait loci (eQTL) analysis across 13 brain regions provided functional evidence for the identified variants. The results were cross-validated against established AD-associated loci from the European Alzheimer's and Dementia Biobank (EADB) and the GWAS Catalog.
Discussion: The Deep-Block framework effectively processes large-scale, high-throughput sequencing data while preserving SNP interactions during dimensionality reduction, minimizing bias and information loss. Its findings are supported by tissue-specific eQTL evidence across brain regions, indicating the functional relevance of the identified variants. Deep-Block identified both known and novel genetic variants, enhancing our understanding of the genetic architecture of AD and demonstrating the framework's potential for application in large-scale sequencing studies.
Highlights: Growing genomic datasets require advanced tools to identify genetic loci in sequencing. Deep-Block, a novel AI framework, was used to process large-scale ADSP WGS data. Deep-Block identified both known and novel AD-associated genetic loci. rs429358 was key; rs11556505 and rs34342646 were significant. The AI framework uses biological knowledge to enhance detection of Alzheimer's loci.
Source (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11736638
DOI: http://dx.doi.org/10.1002/trc2.70041