Objectives: Expert abstraction of acute toxicities is critical in oncology research but is labor-intensive and variable. We assessed the accuracy of a natural language processing (NLP) pipeline for extracting symptoms from clinical notes, compared with physician abstraction.

Materials and Methods: Two independent reviewers identified present and negated National Cancer Institute Common Terminology Criteria for Adverse Events (CTCAE) v5.0 symptoms in 100 randomly selected notes from on-treatment visits during radiation therapy, with adjudication by a third reviewer. An NLP pipeline based on the Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) was developed and used to extract CTCAE terms. Accuracy was assessed by precision, recall, and F1.
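As an illustration of the accuracy measures named above, the following is a minimal sketch of per-symptom precision, recall, and F1. It assumes annotations are represented as (note, symptom) pairs and scored at the note level; the abstract does not state the unit of analysis, and the function names and toy data here are hypothetical, not taken from the study.

```python
from collections import defaultdict


def prf1(tp, fp, fn):
    """Return (precision, recall, F1) from raw counts, using 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def score_by_symptom(gold, predicted):
    """Score predictions against the reference standard, one result per symptom.

    gold, predicted: sets of (note_id, symptom) pairs (an assumed representation).
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for pair in predicted:
        # A predicted pair is a true positive if the reviewers also recorded it.
        counts[pair[1]]["tp" if pair in gold else "fp"] += 1
    for _, symptom in gold - predicted:
        # Pairs recorded by reviewers but missed by the pipeline.
        counts[symptom]["fn"] += 1
    return {symptom: prf1(**c) for symptom, c in counts.items()}


# Toy data for illustration only; not results from the study.
gold = {(1, "fatigue"), (2, "fatigue"), (3, "nausea")}
predicted = {(1, "fatigue"), (3, "fatigue"), (3, "nausea")}
scores = score_by_symptom(gold, predicted)
```

Recall here corresponds to the sensitivity referred to in the Results: the fraction of reviewer-identified mentions that the pipeline also found.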

Results: The NLP pipeline demonstrated high accuracy for common physician-abstracted symptoms, such as radiation dermatitis (F1 0.88), fatigue (0.85), and nausea (0.88). NLP had poor sensitivity for negated symptoms.

Conclusion: NLP accurately detects a subset of documented present CTCAE symptoms, though it is limited for negated symptoms. It may facilitate strategies to identify toxicities more consistently during cancer therapy.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7886534
DOI: http://dx.doi.org/10.1093/jamiaopen/ooaa064

Publication Analysis

Top Keywords

nlp pipeline: 12
natural language: 8
language processing: 8
nlp: 5
symptoms: 5
processing abstraction: 4
abstraction cancer: 4
cancer treatment: 4
treatment toxicities: 4
accuracy: 4

Similar Publications

Repeatable process for extracting health data from HL7 CDA documents.

J Biomed Inform

December 2024

Institute of Computer Science, University of Tartu, 51009 Tartu, Estonia; STACC, 51009 Tartu, Estonia.

Objective: This study aims to address the gap in the literature on converting real-world Clinical Document Architecture (CDA) data into the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), focusing on the initial steps preceding the mapping phase. We highlight the importance of a repeatable Extract-Transform-Load (ETL) pipeline for health data extraction from HL7 CDA documents in Estonia for research purposes.

Methods: We developed a repeatable ETL pipeline to facilitate the extraction, cleaning, and restructuring of health data from CDA documents to OMOP CDM, ensuring a high-quality and structured data format.


Objectives: Clinical trials (CTs) are essential for improving patient care by evaluating new treatments' safety and efficacy. A key component in CT protocols is the study population defined by the eligibility criteria. This study aims to evaluate the effectiveness of large language models (LLMs) in encoding eligibility criterion information to support CT-protocol design.


MicroRNAs (miRNAs) are small non-coding RNAs (18-26 nucleotides) that regulate gene expression by interacting with target mRNAs, affecting various physiological and pathological processes. miRTarBase, a database of experimentally validated miRNA-target interactions (MTIs), now features over 3 817 550 validated MTIs from 13 690 articles, significantly expanding its previous version. The updated database includes miRNA interactions with therapeutic agents, revealing roles in drug resistance and therapeutic strategies.

Article Synopsis
  • The project focuses on using Large Language Models (LLMs) to extract medical information from dialogues between ambulance staff and patients in order to fill out emergency protocol forms, although established dialogue examples for evaluation are lacking.
  • A pipeline was created using "Zephyr-7b-beta" for dialogue generation, followed by refinement with GPT-4 Turbo. Accuracy was 94% initially and dropped to 87% after refinement. Sentiment analysis showed improved positivity in dialogues post-refinement, emphasizing both the potential and challenges of using LLM
