Objectives: Expert abstraction of acute toxicities is critical in oncology research but is labor-intensive and variable. We assessed the accuracy of a natural language processing (NLP) pipeline to extract symptoms from clinical notes compared to physicians.
Materials and Methods: Two independent reviewers identified present and negated National Cancer Institute Common Terminology Criteria for Adverse Events (CTCAE) v5.0 symptoms from 100 randomly selected notes for on-treatment visits during radiation therapy, with adjudication by a third reviewer. An NLP pipeline based on the Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) was developed and used to extract CTCAE terms. Accuracy was assessed by precision, recall, and F1 score.
Results: The NLP pipeline demonstrated high accuracy for common physician-abstracted symptoms, such as radiation dermatitis (F1 0.88), fatigue (0.85), and nausea (0.88). NLP had poor sensitivity for negated symptoms.
Conclusion: NLP accurately detects a subset of documented present CTCAE symptoms, though it is limited for negated symptoms. It may facilitate strategies to more consistently identify toxicities during cancer therapy.
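The evaluation metrics named in the methods (precision, recall, and F1) can be illustrated with a minimal sketch. It assumes annotations are stored as sets of (note ID, symptom) pairs, which is a hypothetical representation and not the study's actual data format; the function name `per_symptom_scores` and the toy annotations are likewise illustrative only and are not study data.

```python
def per_symptom_scores(gold, predicted):
    """Return {symptom: (precision, recall, f1)}.

    gold, predicted: sets of (note_id, symptom) pairs.
    This pair-based layout is an assumption made for illustration.
    """
    symptoms = {symptom for _, symptom in gold | predicted}
    scores = {}
    for symptom in sorted(symptoms):
        gold_notes = {n for n, s in gold if s == symptom}
        pred_notes = {n for n, s in predicted if s == symptom}
        true_pos = len(gold_notes & pred_notes)
        # Precision: fraction of extracted mentions the reviewers confirmed.
        precision = true_pos / len(pred_notes) if pred_notes else 0.0
        # Recall (sensitivity): fraction of reviewer mentions that were extracted.
        recall = true_pos / len(gold_notes) if gold_notes else 0.0
        # F1: harmonic mean of precision and recall.
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[symptom] = (precision, recall, f1)
    return scores


# Hypothetical toy annotations, not taken from the study.
gold = {(1, "fatigue"), (2, "fatigue"), (3, "nausea"), (4, "nausea")}
predicted = {(1, "fatigue"), (2, "fatigue"), (5, "fatigue"), (3, "nausea")}

scores = per_symptom_scores(gold, predicted)
for symptom, (p, r, f1) in scores.items():
    print(f"{symptom}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Scoring each symptom separately, as here, mirrors how the results report a distinct F1 for each symptom; the same function could be applied to negated mentions by passing a separate pair of sets.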
Source | Link |
---|---|
PMC | http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7886534 |
DOI | http://dx.doi.org/10.1093/jamiaopen/ooaa064 |
J Biomed Inform
December 2024
Institute of Computer Science, University of Tartu, 51009 Tartu, Estonia; STACC, 51009 Tartu, Estonia.
Objective: This study aims to address the gap in the literature on converting real-world Clinical Document Architecture (CDA) data into the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), focusing on the initial steps preceding the mapping phase. We highlight the importance of a repeatable Extract-Transform-Load (ETL) pipeline for health data extraction from HL7 CDA documents in Estonia for research purposes.
Methods: We developed a repeatable ETL pipeline to facilitate the extraction, cleaning, and restructuring of health data from CDA documents to OMOP CDM, ensuring a high-quality and structured data format.
J Am Med Inform Assoc
December 2024
Department of Radiology and Medical Informatics, University of Geneva, 1202 Geneva, Switzerland.
Objectives: Clinical trials (CTs) are essential for improving patient care by evaluating new treatments' safety and efficacy. A key component in CT protocols is the study population defined by the eligibility criteria. This study aims to evaluate the effectiveness of large language models (LLMs) in encoding eligibility criterion information to support CT-protocol design.
Nucleic Acids Res
November 2024
School of Medicine, The Chinese University of Hong Kong, Shenzhen, Guangdong 518172, P.R. China.
MicroRNAs (miRNAs) are small non-coding RNAs (18-26 nucleotides) that regulate gene expression by interacting with target mRNAs, affecting various physiological and pathological processes. miRTarBase, a database of experimentally validated miRNA-target interactions (MTIs), now features over 3 817 550 validated MTIs from 13 690 articles, significantly expanding its previous version. The updated database includes miRNA interactions with therapeutic agents, revealing roles in drug resistance and therapeutic strategies.
Stud Health Technol Inform
November 2024
Bern University of Applied Sciences, Switzerland.