Biomedical text readability after hypernym substitution with fine-tuned large language models.

Karl Swanson, Shuhan He, Josh Calvano, David Chen, Talar Telvizian, Lawrence Jiang, Paul Chong, Jacob Schwell, Gin Mak, Jarone Lee

PLOS Digit Health

Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States of America.

Published: April 2024

The advent of patient access to complex medical information online has highlighted the need to simplify biomedical text so that patients can better understand it and take ownership of their health. However, comprehension of biomedical text remains difficult because it requires domain-specific expertise. We aimed to study the simplification of biomedical text via large language models (LLMs), which are commonly used for general natural language processing tasks that involve text comprehension, summarization, generation, and prediction of new text from prompts. Specifically, we fine-tuned three variants of large language models to substitute complex words and phrases in biomedical text with a related hypernym. The output of the substitution process was evaluated by comparing the pre- and post-substitution texts using four readability metrics and two measures of sentence complexity. A sample of 1,000 biomedical definitions from the National Library of Medicine's Unified Medical Language System (UMLS) was processed with the three LLM approaches, and each showed an improvement in readability and sentence complexity after hypernym substitution. Readability scores improved from a collegiate reading level before processing to a US high-school level after processing. Comparison between the three LLMs showed that the GPT-J-6b approach produced the greatest improvement in measures of sentence complexity. This study demonstrates the merit of hypernym substitution for improving the readability of complex biomedical text for the public and highlights the use case for fine-tuning open-access large language models for biomedical natural language processing.
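The pre- versus post-substitution comparison described above can be illustrated with a minimal sketch. It rests on several assumptions that are not from the study: the abstract does not name its four readability metrics, so the Flesch-Kincaid grade level is used here only as one widely known example; the `HYPERNYMS` table and all function names are hypothetical; and a fixed lookup table stands in for the fine-tuned LLMs that chose substitutions in the study. The syllable counter is a crude vowel-group heuristic, so the scores are approximate.

```python
import re

# Hypothetical term-to-hypernym table, for illustration only.
# The study used fine-tuned LLMs, not a lookup table, to pick substitutions.
HYPERNYMS = {
    "myocardial infarction": "heart condition",
    "hepatocellular carcinoma": "liver cancer",
}


def substitute_hypernyms(text, table=HYPERNYMS):
    """Replace each known complex term with its (assumed) hypernym."""
    # Longer terms first, so multi-word phrases win over shorter overlaps.
    for term in sorted(table, key=len, reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        text = re.sub(pattern, table[term], text, flags=re.IGNORECASE)
    return text


def count_syllables(word):
    """Approximate syllables as the number of vowel groups (heuristic)."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent, except in '-le' and '-ee' endings.
    if word.endswith("e") and n > 1 and not word.endswith(("le", "ee")):
        n -= 1
    return max(n, 1)


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level: higher means harder to read."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


if __name__ == "__main__":
    before = "Myocardial infarction is a complication of atherosclerosis."
    after = substitute_hypernyms(before)
    print(after)
    print(flesch_kincaid_grade(before), flesch_kincaid_grade(after))
```

In this toy example the substituted sentence scores a lower grade level than the original because the replacement phrase has fewer syllables per word. Sentence length is unchanged, so the sentence-length term of the formula does not move.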

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11020904
DOI: http://dx.doi.org/10.1371/journal.pdig.0000489

