Selective UMLS knowledge infusion for biomedical question answering.

Sci Rep

Integrated Major in Innovative Medical Science, Seoul National University Graduate School, Seoul, Republic of Korea.

Published: August 2023

AI Article Synopsis

  • The study explores a method to efficiently integrate biomedical knowledge into language models to enhance biomedical question-answering.
  • It focuses on using adapters to infuse key knowledge from the Unified Medical Language System without needing to transfer the entire semantics of the knowledge graph.
  • The results indicate that partitioning the knowledge graph for pretraining improves performance and efficiency: discarding groups with fewer concepts works better for small datasets, merging those groups works better for large datasets, and the results are largely insensitive to how the groups are formulated.
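
To make the adapter idea in the synopsis concrete, here is a minimal sketch of a generic bottleneck adapter in plain Python. It is an illustration under assumptions, not the authors' implementation: the function names (`make_adapter`, `adapter_forward`, `count_parameters`), the ReLU nonlinearity, and the zero-initialized up-projection are all choices made for this example.

```python
import math
import random


def make_adapter(hidden_size, bottleneck_size, seed=0):
    """Create weights for one bottleneck adapter (hypothetical layout)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(hidden_size)
    return {
        # Down-projection: hidden_size -> bottleneck_size
        "down_w": [[rng.gauss(0.0, scale) for _ in range(hidden_size)]
                   for _ in range(bottleneck_size)],
        "down_b": [0.0] * bottleneck_size,
        # Up-projection starts at zero, so the adapter is an identity map
        # until it is trained (an assumption made for this sketch).
        "up_w": [[0.0] * bottleneck_size for _ in range(hidden_size)],
        "up_b": [0.0] * hidden_size,
    }


def adapter_forward(adapter, hidden):
    """Apply down-projection, ReLU, up-projection, then a residual connection."""
    bottleneck = [
        max(0.0, sum(w * h for w, h in zip(row, hidden)) + b)
        for row, b in zip(adapter["down_w"], adapter["down_b"])
    ]
    update = [
        sum(w * z for w, z in zip(row, bottleneck)) + b
        for row, b in zip(adapter["up_w"], adapter["up_b"])
    ]
    return [h + u for h, u in zip(hidden, update)]


def count_parameters(adapter):
    """Count the adapter's trainable weights and biases."""
    return (
        sum(len(row) for row in adapter["down_w"]) + len(adapter["down_b"])
        + sum(len(row) for row in adapter["up_w"]) + len(adapter["up_b"])
    )


adapter = make_adapter(hidden_size=8, bottleneck_size=2)
output = adapter_forward(adapter, [1.0] * 8)
```

With hidden size d and bottleneck size r, such an adapter holds 2dr + d + r parameters, which is small compared with the d × d weight matrices of the frozen model around it when r is much smaller than d.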

Article Abstract

One application of artificial intelligence in the biomedical field is knowledge-intensive question answering. Because domain expertise is particularly crucial in this field, we propose a method for efficiently infusing biomedical knowledge into pretrained language models, ultimately targeting biomedical question answering. Transferring all semantics of a large knowledge graph into the entire model requires too many parameters, increasing computational cost and time. We investigate an efficient approach that leverages adapters to inject Unified Medical Language System (UMLS) knowledge into pretrained language models, and we question the need to use all semantics in the knowledge graph. This study focuses on strategies for partitioning the knowledge graph and either discarding or merging some of the resulting groups for more efficient pretraining. Across three biomedical question-answering fine-tuning datasets, the adapters pretrained on semantically partitioned groups were more efficient in terms of evaluation metrics, required parameters, and time. The results also show that discarding groups with fewer concepts is the better direction for small datasets, and merging these groups is better for large datasets. Furthermore, the metric results show only a slight improvement, indicating that the adapter methodology is rather insensitive to the group formulation.
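
The partition-then-discard-or-merge strategy described in the abstract can be sketched roughly as follows. This is a toy illustration under assumptions, not the paper's pipeline: the triples, the concept-to-group mapping, the `min_concepts` threshold, the `"MERGED"` label, and the function names are all hypothetical, and group size is measured here as the number of distinct head concepts.

```python
from collections import defaultdict


def partition_triples(triples, concept_to_group):
    """Group (head, relation, tail) triples by the semantic group of the head."""
    partitions = defaultdict(list)
    for head, relation, tail in triples:
        partitions[concept_to_group[head]].append((head, relation, tail))
    return dict(partitions)


def concept_count(triples):
    """Number of distinct head concepts in a list of triples."""
    return len({head for head, _, _ in triples})


def reduce_partitions(partitions, min_concepts, strategy):
    """Keep large groups; discard small ones or pool them into one group."""
    if strategy not in ("discard", "merge"):
        raise ValueError("strategy must be 'discard' or 'merge'")
    kept = {}
    pooled = []
    for name, triples in partitions.items():
        if concept_count(triples) >= min_concepts:
            kept[name] = list(triples)
        else:
            pooled.extend(triples)
    if strategy == "merge" and pooled:
        kept["MERGED"] = pooled  # hypothetical label for the pooled group
    return kept


# Toy data (hypothetical, not taken from UMLS).
triples = [
    ("aspirin", "treats", "headache"),
    ("ibuprofen", "treats", "fever"),
    ("headache", "has_finding_site", "head"),
    ("liver", "part_of", "abdomen"),
]
concept_to_group = {
    "aspirin": "Chemicals & Drugs",
    "ibuprofen": "Chemicals & Drugs",
    "headache": "Disorders",
    "liver": "Anatomy",
}

partitions = partition_triples(triples, concept_to_group)
discarded = reduce_partitions(partitions, min_concepts=2, strategy="discard")
merged = reduce_partitions(partitions, min_concepts=2, strategy="merge")
```

In this sketch each surviving group would get its own adapter during pretraining, so discarding reduces the number of adapters outright, while merging keeps the small groups' triples at the cost of one extra adapter.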

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10468517
DOI: http://dx.doi.org/10.1038/s41598-023-41423-8

Publication Analysis

Top Keywords

knowledge graph: 12
biomedical question: 8
question answering: 8
knowledge pretrained: 8
pretrained language: 8
language models: 8
knowledge: 6
biomedical: 5
selective umls: 4
umls knowledge: 4

Similar Publications

The TOXIN knowledge graph: supporting animal-free risk assessment of cosmetics.

Database (Oxford)

January 2025

Department of In Vitro Toxicology and Dermato-Cosmetology (IVTD), Vrije Universiteit Brussel, Laarbeeklaan 103, Brussels 1090, Belgium.

The European Union's ban on animal testing for cosmetic products and their ingredients, combined with the lack of validated animal-free methods, poses challenges in evaluating their potential repeated-dose organ toxicity. To address this, innovative strategies like Next-Generation Risk Assessment (NGRA) are being explored, integrating historical animal data with new mechanistic insights from non-animal New Approach Methodologies (NAMs). This paper introduces the TOXIN knowledge graph (TOXIN KG), a tool designed to retrieve toxicological information on cosmetic ingredients, with a focus on liver-related data.

An Automated Approach for Domain-Specific Knowledge Graph Generation─Graph Measures and Characterization.

J Chem Inf Model

January 2025

Center for Engineering Concepts Development, Department of Mechanical Engineering, University of Maryland, College Park, Maryland 20742, United States.

In 2020, nearly 3 million scientific and engineering papers were published worldwide (White, K. Publications Output: U.S.

Identifying informative low-dimensional features that characterize dynamics in molecular simulations remains a challenge, often requiring extensive manual tuning and system-specific knowledge. Here, we introduce geom2vec, in which pretrained graph neural networks (GNNs) are used as universal geometric featurizers. By pretraining equivariant GNNs on a large dataset of molecular conformations with a self-supervised denoising objective, we obtain transferable structural representations that are useful for learning conformational dynamics without further fine-tuning.

The growing demand for biological products drives many efforts to maximize expression of heterologous proteins. Advances in high-throughput sequencing can produce data suitable for building sequence-to-expression models with machine learning. The most accurate models have been trained on one-hot encodings, a mechanism-agnostic representation of nucleotide sequences.

Background: Recent research indicates that the intestinal microbial community, known as the gut microbiota, may play a crucial role in the pathogenesis of nonalcoholic fatty liver disease (NAFLD). To understand this relationship, this study used a comprehensive bibliometric analysis to explore and analyze the currently little-known connection between gut microbiota and NAFLD, as well as new findings and possible future pathways in this field.

Aim: To provide an in-depth analysis of the current focus issues and research developments on the interaction between gut microbiota and NAFLD.
