Background and Aims: Patient-reported outcomes (PROs) are vital for assessing disease activity and treatment outcomes in inflammatory bowel disease (IBD). However, manually extracting PROs from the free text of clinical notes is burdensome. To make free-text information in the electronic health record more available for research and quality improvement, we compared traditional natural language processing (tNLP) and large language models (LLMs) in extracting 3 IBD PROs (abdominal pain, diarrhea, fecal blood) from clinical notes across 2 institutions.

Methods: Clinic notes were annotated for each PRO using preset protocols. Models were developed and internally tested at the University of California, San Francisco, and then externally validated at Stanford University. We compared tNLP and LLM-based models on accuracy, sensitivity, specificity, positive predictive value, and negative predictive value. In addition, we conducted fairness and error assessments.
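The five comparison metrics named above all follow from a 2x2 confusion matrix of annotated labels against model output. The sketch below is illustrative only and is not the authors' code: it assumes each PRO is coded as a binary label per note (1 = present, 0 = absent), and the function name `binary_metrics` and the example labels are hypothetical.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute standard binary classification metrics.

    Assumes labels are coded 1 (PRO present) and 0 (PRO absent);
    this coding is an assumption, not taken from the study.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Tally the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # Return NaN rather than raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


# Made-up labels for eight notes (not study data).
annotated = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(annotated, predicted)
```

Computing the same dictionary separately for each PRO and each institution would give a like-for-like comparison between model families.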

Results: Interrater reliability between annotators was >90%. On the University of California, San Francisco test set (n = 50), the top-performing tNLP models achieved accuracies of 92% (abdominal pain), 82% (diarrhea), and 80% (fecal blood), comparable to Generative Pre-trained Transformer-4 (GPT-4), which was 96%, 88%, and 90% accurate, respectively. On external validation at Stanford (n = 250), tNLP models failed to generalize (61%-62% accuracy), while GPT-4 maintained accuracies >90%. Pathways Language Model-2 and GPT-4 showed similar performance. No biases were detected based on demographics or diagnosis.

Conclusion: LLMs are accurate and generalizable methods for extracting PROs. They maintain excellent accuracy across institutions, despite heterogeneity in note templates and authors. Widespread adoption of such tools has the potential to enhance IBD research and patient care.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11772946
DOI: http://dx.doi.org/10.1016/j.gastha.2024.10.003


Similar Publications

Background: Fetal growth restriction (FGR) is a leading risk factor for stillbirth, yet the diagnosis of FGR confers considerable prognostic uncertainty, as most infants with FGR do not experience any morbidity. Our objective was to use data from a large, deeply phenotyped observational obstetric cohort to develop a probabilistic graphical model (PGM), a type of "explainable artificial intelligence (AI)", as a potential framework to better understand how interrelated variables contribute to perinatal morbidity risk in FGR.

Methods: Using data from 9,558 pregnancies delivered at ≥ 20 weeks with available outcome data, we derived and validated a PGM using randomly selected sub-cohorts of 80% (n = 7,645) and 20% (n = 1,912), respectively, to discriminate cases of FGR resulting in composite perinatal morbidity from those that did not.


Background: Oropharyngeal dysphagia (dysphagia) is a common (up to 86%) and devastating syndrome in hospitalized older adults with dementia.

Objective: To describe the perspectives of dysphagia management in hospitalized patients with dementia among hospital medicine providers (i.e.


Visual diagnosis is one of the key features of squamous cell carcinoma of the oral cavity (OSCC) and oropharynx (OPSCC), both subsets of head and neck squamous cell carcinoma (HNSCC) with a heterogeneous clinical appearance. Advancements in artificial intelligence have recently led to image recognition being introduced into large language models (LLMs) such as ChatGPT 4.0.


TPepRet: a deep learning model for characterizing T cell receptors-antigen binding patterns.

Bioinformatics

January 2025

School of Computer Science and Engineering, Central South University, Changsha, 410083, China.

Motivation: T-cell receptors (TCRs) elicit and mediate the adaptive immune response by recognizing antigenic peptides, a process pivotal for cancer immunotherapy, vaccine design, and autoimmune disease management. Understanding the intricate binding patterns between TCRs and peptides is critical for advancing these clinical applications. While several computational tools have been developed, they neglect the directional semantics inherent in sequence data, which are essential for accurately characterizing TCR-peptide interactions.


Bridging past and present: exploring Cannabis traditions in Armenia through ethnobotanical interviews and bibliographic prospecting.

J Cannabis Res

January 2025

Laboratori de Botànica (UB), Facultat de Farmàcia i Ciències de l'Alimentació-Institut de Recerca de la Biodiversitat (IRBio), Unitat Associada al CSIC, Universitat de Barcelona, Av. Joan XXIII 27-31, Barcelona, Catalonia, 08028, Spain.

Background: Cannabis sativa L. (Cannabaceae) has been widely used by humans throughout its history for a variety of purposes (medicinal, alimentary and other uses). Armenia, with its rich cultural history and diverse ecosystems, offers a unique context for ethnobotanical research about traditional uses of Cannabis.

