Basic Science and Pathogenesis.

Alzheimers Dement

Penn Neurodegeneration Genomics Center, Dept of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA, USA.

Published: December 2024

AI Article Synopsis

  • NIAGADS is a national data repository providing access to genomic data related to Alzheimer's disease, focusing on curating and standardizing information from various sources for researchers.
  • To improve data navigation, NIAGADS has developed AI enhancements using large language models trained on its documentation and API, which help users with complex queries and facilitate data discovery.
  • These AI improvements include rule-based chatbots and generative search tools that respond to user inquiries, learn from feedback, and help with common issues, ultimately enhancing the user experience on NIAGADS platforms.

Article Abstract

Background: NIAGADS is a national data repository that offers qualified investigators access to genomic data for Alzheimer's disease (AD) and related dementias. In addition, NIAGADS has made a substantial effort to curate, harmonize, standardize, and disseminate AD-relevant variant, gene, and sequence annotations from publications, functional genomics datasets, and summary statistics deposited at NIAGADS. These results are made available to the public in a collection of interactive knowledgebases (AD Variant Portal, FILER Functional Genomics Repository, VariXam, Alzheimer's GenomicsDB & Genome Browser), all of which are accessible programmatically via the NIAGADS API. However, as these offerings grow, navigating them can be challenging. Here, we introduce AI-based enhancements to NIAGADS sites to help guide researchers and facilitate data discovery.

Method: We leverage OpenAI's generative AI to build and train three large language models (LLMs) based on NIAGADS documentation, step-by-step recipes for data-access requests, subject-specific vocabularies, and the OpenAPI specification defining the NIAGADS API that allows programmatic access to the NIAGADS knowledgebases. For users of the API and to enhance search interfaces, we build on the LLMs to construct a framework for handling complex natural language instructions that decomposes an inquiry into tasks and subtasks and then plans, selects, and optionally executes API calls and parses the results.
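The decompose, plan, select, and optionally execute loop described in the Method can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration only: the operation names, the URL paths in CATALOGUE, and the keyword-based decomposition are hypothetical stand-ins, not the NIAGADS API or the authors' implementation. In the described framework an LLM trained on the OpenAPI specification would do the decomposition and selection; plain keyword matching takes its place here so the sketch runs offline.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class ApiCall:
    """One planned call: a hypothetical operation name and its resolved path."""
    operation: str
    path: str


# Hypothetical catalogue standing in for operations read from an OpenAPI spec.
CATALOGUE: Dict[str, str] = {
    "gene_lookup": "/gene/{symbol}",
    "variants_in_gene": "/gene/{symbol}/variants",
}


def decompose(question: str) -> List[str]:
    """Split an inquiry into subtasks (keyword matching stands in for an LLM)."""
    q = question.lower()
    tasks = []
    if "gene" in q:
        tasks.append("gene_lookup")
    if "variant" in q:
        tasks.append("variants_in_gene")
    return tasks


def extract_symbol(question: str) -> Optional[str]:
    """Return the first all-caps token of three or more characters, e.g. BIN1."""
    match = re.search(r"\b[A-Z][A-Z0-9]{2,}\b", question)
    return match.group(0) if match else None


def plan(question: str) -> List[ApiCall]:
    """Select one catalogue operation per subtask and fill in its parameters."""
    symbol = extract_symbol(question)
    if symbol is None:
        return []
    return [ApiCall(op, CATALOGUE[op].format(symbol=symbol))
            for op in decompose(question)]


def run(question: str,
        fetch: Optional[Callable[[str], Any]] = None) -> Dict[str, Any]:
    """Plan the calls; execute them only if a fetch function is supplied."""
    calls = plan(question)
    results = None
    if fetch is not None:
        results = {call.operation: fetch(call.path) for call in calls}
    return {"plan": [call.path for call in calls], "results": results}
```

Calling run(question) without a fetch function returns only the plan, which mirrors the "optionally executes" step: a user can inspect the proposed calls before anything is sent. Passing any callable that maps a path to a parsed response turns the same plan into results.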

Result: Developing these LLMs allows NIAGADS to improve user experiences by integrating topic-specific chatbots and generative AI search tools into NIAGADS sites. Rule-based chatbots that leverage conversational AI on the NIAGADS portal and Data Sharing Service will respond to inquiries with answers inferred from the LLMs, with responses improving with user feedback. These bots will also supplement help requests by suggesting solutions to common inquiries. Planner-enhanced generative AI, based on the LLMs trained on the API specification, will be tied to knowledgebase searches and filters in resources such as the GenomicsDB and FILER, allowing users to ask sophisticated natural-language questions that require multiple API calls to answer.
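One way to picture a rule-based chatbot whose responses improve with user feedback is sketched below in Python. This is an illustrative assumption, not the NIAGADS design: the Rule and HelpBot classes, the scoring scheme, and the retirement threshold are all hypothetical, and the fallback function is a placeholder for a call to a generative model.

```python
import re
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional, Tuple


@dataclass
class Rule:
    """A canned answer that fires when all of its keywords appear in a question."""
    keywords: FrozenSet[str]
    answer: str
    score: int = 0  # net user feedback: +1 per helpful vote, -1 per unhelpful vote


class HelpBot:
    def __init__(self, rules: List[Rule], fallback: Callable[[str], str],
                 min_score: int = -2) -> None:
        self.rules = rules
        self.fallback = fallback    # placeholder for a generative-model call
        self.min_score = min_score  # rules at or below this score stop firing

    def reply(self, question: str) -> Tuple[str, Optional[Rule]]:
        """Return the answer and the rule that produced it (None for fallback)."""
        words = set(re.findall(r"[a-z]+", question.lower()))
        matches = [r for r in self.rules
                   if r.keywords <= words and r.score > self.min_score]
        if matches:
            # Prefer the best-rated rule, then the most specific one.
            best = max(matches, key=lambda r: (r.score, len(r.keywords)))
            return best.answer, best
        return self.fallback(question), None

    def feedback(self, rule: Rule, helpful: bool) -> None:
        """Record a user vote so poorly rated rules are eventually retired."""
        rule.score += 1 if helpful else -1
```

In this sketch a rule that keeps drawing unhelpful votes drops to the threshold and stops firing, so later questions on that topic go to the fallback instead. A production system could equally use the votes to refine prompts or retrain; the abstract does not say which.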

Conclusion: Introducing AI-enhanced search gives NIAGADS users an interactive way to learn new information and discover resources and tools that can supplement their research, which in turn improves NIAGADS' ability to support AD genetics research.


Source
http://dx.doi.org/10.1002/alz.092685

Publication Analysis

Top Keywords

niagads: 13
functional genomics: 8
niagads api: 8
niagads sites: 8
natural language: 8
api calls: 8
api: 5
llms: 5
basic science: 4
science pathogenesis: 4

Similar Publications

Dementia is an umbrella phenotype covering many different underlying pathologies, with Alzheimer's disease (AD) being the most common type. Neuropathological examination remains the gold standard for accurate AD diagnosis; however, most of what we know about AD genetics is based on genome-wide association studies (GWAS) of clinically defined AD. Such studies have identified multiple AD susceptibility variants but leave a significant portion of the heritability unexplained, highlighting the phenotypic and genetic heterogeneity of the clinically defined entity.


Complex genetic interactions affect susceptibility to Alzheimer's disease risk in the BIN1 and MS4A6A loci.

Geroscience

January 2025

Biodemography of Aging Research Unit, Social Science Research Institute, Duke University, Erwin Mill Building, 2024 W. Main St, Durham, NC, 27705, USA.

Genetics is the second strongest risk factor for Alzheimer's disease (AD) after age. More than 70 loci have been implicated in AD susceptibility so far, and the genetic architecture of AD entails both additive and nonadditive contributions from these loci. To better understand the nonadditive impact of single-nucleotide polymorphisms (SNPs) on AD risk, we examined individual, joint, and interacting (SNPxSNP) effects of 139 and 66 SNPs mapped to the BIN1 and MS4A6A AD-associated loci, respectively.


Background: NIAGADS is a national genomics data repository that gives qualified investigators access to genotypic and sequencing data for the study of the genetics of Alzheimer's disease (AD) and related neurological diseases. Collaborations with large consortia and centers such as the Alzheimer's Disease Genetics Consortium (ADGC), the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium, the Alzheimer's Disease Sequencing Project (ADSP), and the Genome Center for Alzheimer's Disease (GCAD) allow NIAGADS to lead the effort in managing large AD datasets that can be easily accessed and fully utilized by the research community.

Method: NIAGADS is supported by the National Institute on Aging (NIA) under a cooperative agreement.


Basic Science and Pathogenesis.

Alzheimers Dement

December 2024

Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.

Background: The Genome Center for Alzheimer's Disease (GCAD) coordinates the integration and meta-analysis of all available Alzheimer's disease (AD)-relevant whole-genome sequencing (WGS) data to help identify AD risk or protective genetic variants and, eventually, therapeutic targets. The WGS datasets are generated through collaboration between scientists from the Alzheimer's Disease Sequencing Project (ADSP) and GCAD. To minimize data heterogeneity introduced by different sequencing protocols and machines, GCAD processes all samples using identical pipelines.


Background: The Alzheimer's Disease Sequencing Project (ADSP) aims to identify genetic variation contributing to the development of, or protection against, Alzheimer's disease (AD) in diverse ancestral populations. The latest ADSP whole-genome sequencing (WGS) data release includes over 36,000 individuals from 37 datasets (NIAGADS NG00067.v11 ADSP R4).

