On the role of the UMLS in supporting diagnosis generation proposed by Large Language Models.

J Biomed Inform

National Library of Medicine, NIH, HHS, 8600 Rockville Pike, Bethesda, 20894, MD, USA.

Published: September 2024

Objective: Traditional knowledge-based and machine learning diagnostic decision support systems have benefited from integrating the medical domain knowledge encoded in the Unified Medical Language System (UMLS). The emergence of Large Language Models (LLMs) to supplant traditional systems raises questions about the quality and extent of the medical knowledge in the models' internal knowledge representations and about the need for external knowledge sources. The objective of this study is three-fold: to probe the diagnosis-related medical knowledge of popular LLMs, to examine the benefit of providing UMLS knowledge to LLMs (grounding the diagnosis predictions), and to evaluate the correlations between human judgments and the UMLS-based metrics for generations by LLMs.

Methods: We evaluated diagnoses generated by LLMs from consumer health questions and from daily care notes in electronic health records, using the ConsumerQA and Problem Summarization datasets. Probing LLMs for UMLS knowledge was performed by prompting the LLM to complete diagnosis-related UMLS knowledge paths. Grounding the predictions was examined in an approach that integrated UMLS graph paths and clinical notes in prompting the LLMs. The results were compared to prompting without the UMLS paths. The final experiments examined the alignment of different evaluation metrics, UMLS-based and non-UMLS, with human expert evaluation.
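The grounding setup described in the Methods can be pictured with a minimal sketch. The abstract does not give the prompt template, so everything below is an assumption made for illustration: the function names `format_paths` and `build_prompt`, the arrow notation for a path, the instruction wording, and the example triples (which are placeholders, not actual UMLS output). Passing an empty path list yields the ungrounded baseline prompt.

```python
from typing import List, Tuple

# A knowledge path is modeled here as a (source, relation, target) triple.
# This representation is an assumption; the abstract does not specify one.
Path = Tuple[str, str, str]


def format_paths(paths: List[Path]) -> str:
    """Render each triple on its own line in a readable arrow notation."""
    return "\n".join(f"- {src} --[{rel}]--> {dst}" for src, rel, dst in paths)


def build_prompt(note: str, paths: List[Path]) -> str:
    """Combine a clinical note with optional knowledge paths into one prompt.

    With an empty `paths` list, the result is the ungrounded baseline prompt.
    """
    parts = ["Clinical note:", note.strip()]
    if paths:
        parts += ["", "Related knowledge paths:", format_paths(paths)]
    parts += ["", "List the most likely diagnoses."]
    return "\n".join(parts)


if __name__ == "__main__":
    # Placeholder triples for illustration only.
    example_paths = [("dyspnea", "may_be_finding_of", "heart failure")]
    print(build_prompt("Patient reports shortness of breath.", example_paths))
```

Keeping the grounded and ungrounded prompts identical except for the path section makes the comparison between the two conditions easier to attribute to the added knowledge.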

Results: In probing the UMLS knowledge, GPT-3.5 significantly outperformed Llama2 and a simple baseline, yielding an F1 score of 10.9% in completing one-hop UMLS paths for a given concept. Grounding diagnosis predictions with the UMLS paths improved the results for both models on both tasks, with the highest improvement (4%) in the SapBERT score. There was a weak correlation between the widely used evaluation metrics (ROUGE and SapBERT) and human judgments.
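The two kinds of numbers reported in the Results, an F1 score for path completion and a correlation between automatic metrics and human judgments, can be illustrated with a short sketch. The abstract does not state how predictions were matched to gold concepts or which correlation coefficient was used, so exact string matching and Spearman rank correlation are assumptions here, and the function names are hypothetical.

```python
import math
from typing import Iterable, List, Sequence


def set_f1(predicted: Iterable[str], gold: Iterable[str]) -> float:
    """F1 between predicted and gold concept sets, using exact string match."""
    pred, ref = set(predicted), set(gold)
    overlap = len(pred & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; raises if either input has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(metric_scores: Sequence[float], human_scores: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(metric_scores), average_ranks(human_scores))
```

A rank-based coefficient is a common choice when human ratings are ordinal, but a study could equally report Pearson or Kendall correlations.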

Conclusion: We found that while popular LLMs contain some medical knowledge in their internal representations, augmentation with UMLS knowledge provides performance gains in diagnosis generation. The UMLS needs to be tailored for the task to improve the LLMs' predictions. Finding evaluation metrics that are better aligned with human judgments than the traditional ROUGE and BERT-based scores remains an open research question.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11402555
DOI: http://dx.doi.org/10.1016/j.jbi.2024.104707

Publication Analysis

Top Keywords

umls knowledge: 20
medical knowledge: 12
umls paths: 12
evaluation metrics: 12
knowledge: 11
umls: 11
diagnosis generation: 8
large language: 8
language models: 8
llms: 8

Similar Publications

Objective: The objectives of this study are to synthesize findings from recent research on retrieval-augmented generation (RAG) and large language models (LLMs) in biomedicine and to provide clinical development guidelines to improve effectiveness.

Materials And Methods: We conducted a systematic literature review and a meta-analysis. The report was created in adherence to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses 2020 analysis.

Background And Objective: Despite significant investments in the normalization and the standardization of Electronic Health Records (EHRs), free text is still the rule rather than the exception in clinical notes. The use of free text has implications in data reuse methods used for supporting clinical research since the query mechanisms used in cohort definition and patient matching are mainly based on structured data and clinical terminologies. This study aims to develop a method for the secondary use of clinical text by: (a) using Natural Language Processing (NLP) for tagging clinical notes with biomedical terminology; and (b) designing an ontology that maps and classifies all the identified tags to various terminologies and allows for running phenotyping queries.

Natural language processing data services for healthcare providers.

BMC Med Inform Decis Mak

November 2024

CogStack, Guys and St Thomas NHS Trust, London, UK.

Article Synopsis
  • The review emphasizes the necessity of integrating machine learning workflows into hospital settings to align with clinical practices and real-world data.
  • The paper discusses the development and implementation of a novel clinical NLP service within the UK's National Health Service, focusing on creating a framework to incorporate expert clinical insight into NLP models.
  • The project has generated over 26,000 annotations and demonstrated various clinical uses of named entity recognition, suggesting that NLP services will soon be essential in healthcare.
Article Synopsis
  • Kidney stone disease (KSD) is a growing urological issue, and this paper aims to create a knowledge graph to improve the access and understanding of KSD-related information for medical professionals.
  • Text from PubMed was analyzed and integrated with various public databases to build a large-scale Kidney Stone Disease Knowledge Graph (KSDKG), which includes over 90 million data points derived from nearly 30,000 articles.
  • The developed KSDKG demonstrates its utility through case studies, revealing new clinical insights and enabling better understanding of the connections between microbes, drugs, and diseases related to KSD, ultimately contributing significantly to medical research.

The growth of biomedical literature presents challenges in extracting and structuring knowledge. Knowledge Graphs (KGs) offer a solution by representing relationships between biomedical entities. However, manual construction of KGs is labor-intensive and time-consuming, highlighting the need for automated methods.
