Context: Clinicians document cancer patients' symptoms in free-text format within electronic health record visit notes. Although symptoms are critically important to quality of life and often herald clinical status changes, computational methods to assess the trajectory of symptoms over time are woefully underdeveloped.
Objectives: To create machine learning algorithms capable of extracting patient-reported symptoms from free-text electronic health record notes.
Methods: The data set included 103,564 sentences obtained from the electronic clinical notes of 2695 breast cancer patients receiving paclitaxel-containing chemotherapy at two academic cancer centers between May 1996 and May 2015. We manually annotated 10,000 sentences and trained a conditional random field model to predict words indicating an active symptom (positive label), absence of a symptom (negative label), or no symptom at all (neutral label). Sentences labeled by human coders were divided into training, validation, and test data sets. Final model performance was determined on a held-out 20% test set that was not used in model development or tuning.
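The word-level labeling scheme and data split described in Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the feature names, the example sentence, the 10% validation share, and the helper names `token_features` and `split_sentences` are all assumptions; only the three labels and the 20% test share come from the abstract.

```python
import random

# The three labels named in the abstract.
LABELS = ("positive", "negative", "neutral")


def token_features(tokens, i):
    """Hypothetical per-token features of the kind a CRF tagger consumes."""
    word = tokens[i]
    return {
        "lower": word.lower(),
        "is_title": word.istitle(),
        "suffix3": word[-3:].lower(),
        # Neighboring words let the model see cues such as "denies".
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def split_sentences(sentences, test_frac=0.2, val_frac=0.1, seed=0):
    """Shuffle and split into train / validation / test lists.

    test_frac=0.2 follows the abstract; val_frac=0.1 is an assumption.
    """
    items = list(sentences)
    random.Random(seed).shuffle(items)
    n_test = int(len(items) * test_frac)
    n_val = int(len(items) * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test


# Invented example sentence with one label per token.
tokens = ["Patient", "denies", "nausea", "but", "reports", "fatigue"]
labels = ["neutral", "neutral", "negative", "neutral", "neutral", "positive"]
features = [token_features(tokens, i) for i in range(len(tokens))]

# Stand-in for the 10,000 annotated sentences (sentence ids only).
train, val, test = split_sentences(range(10000))
```

The per-token feature dictionaries and label sequences are the input shape that common CRF toolkits expect; the test split is set aside once and never touched during tuning.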
Results: The final model achieved precision of 0.82, 0.86, and 0.99 and recall of 0.56, 0.69, and 1.00 for positive, negative, and neutral symptom labels, respectively. The most common positive symptoms were pain, fatigue, and nausea. Machine-based labeling of 103,564 sentences took two minutes.
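For readers unfamiliar with the metrics, per-label precision and recall can be computed as in the sketch below. The gold and predicted label lists are invented toy data, and the function name `precision_recall` is hypothetical; this is not the authors' evaluation code.

```python
from collections import Counter


def precision_recall(gold, pred, labels):
    """Return {label: (precision, recall)} from aligned label sequences."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong
            fn[g] += 1  # gold label g was missed
    scores = {}
    for lab in labels:
        p_den = tp[lab] + fp[lab]
        r_den = tp[lab] + fn[lab]
        precision = tp[lab] / p_den if p_den else 0.0
        recall = tp[lab] / r_den if r_den else 0.0
        scores[lab] = (precision, recall)
    return scores


# Toy data: one positive symptom word is missed and tagged neutral.
gold = ["positive", "positive", "negative", "neutral", "neutral", "neutral"]
pred = ["positive", "neutral", "negative", "neutral", "neutral", "neutral"]
scores = precision_recall(gold, pred, ["positive", "negative", "neutral"])
```

In this toy case the positive label has precision 1.0 but recall 0.5, the same qualitative pattern as the reported results: the words the model tags as symptoms are usually correct, but some symptom mentions are missed.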
Conclusion: We demonstrate the potential of machine learning to gather, track, and analyze symptoms experienced by cancer patients during chemotherapy. Although our initial model requires further optimization to improve performance, further model building may yield machine learning methods suitable for deployment in routine clinical care, quality improvement, and research applications.
DOI: http://dx.doi.org/10.1016/j.jpainsymman.2018.02.016
J Am Med Inform Assoc
January 2025
Kennewick, WA 99338, United States.
Objective: This study evaluates the utility of word embeddings, generated by large language models (LLMs), for medical diagnosis by comparing the semantic proximity of symptoms to their eponymic disease embedding ("eponymic condition") and the mean of all symptom embeddings associated with a disease ("ensemble mean").
Materials and Methods: Symptom data for 5 diagnostically challenging pediatric diseases (CHARGE syndrome, Cowden disease, POEMS syndrome, rheumatic fever, and tuberous sclerosis) were collected from PubMed. Using the Ada-002 embedding model, disease names and symptoms were translated into vector representations in a high-dimensional space.
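The comparison between the "eponymic condition" and the "ensemble mean" can be illustrated with cosine similarity on toy vectors. This is only a sketch under stated assumptions: the 3-dimensional vectors are invented stand-ins for real embeddings, no embedding model is called, and the abstract does not state which similarity measure the study used.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


# Invented toy embeddings (real ones have far more dimensions).
disease_vec = [1.0, 0.0, 0.0]        # embedding of the disease name
symptom_vecs = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]                                     # embeddings of its known symptoms
ensemble = mean_vector(symptom_vecs)  # "ensemble mean" of the symptoms

query = [0.0, 1.0, 1.0]               # embedding of a presenting symptom
sim_eponymic = cosine(query, disease_vec)
sim_ensemble = cosine(query, ensemble)
```

Here the query symptom is orthogonal to the disease-name vector but close to the mean of the symptom vectors, which shows how the two reference points can rank the same symptom differently.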
Bioinformatics
January 2025
Institute for Computational Systems Biology, Universität Hamburg, Hamburg, 22761, Germany.
Motivation: Transcription factors (TFs) are DNA-binding proteins that regulate gene expression. Traditional methods predict a protein as a TF if the protein contains any DNA-binding domains (DBDs) of known TFs. However, this approach fails to identify a novel TF that does not contain any known DBDs.
JMIR Perioper Med
January 2025
Department of Surgery, Indiana University School of Medicine, Indianapolis, IN, United States.
Background: Postoperative delirium (POD) is a common complication after major surgery and is associated with poor outcomes in older adults. Early identification of patients at high risk of POD can enable targeted prevention efforts. However, existing POD prediction models require inpatient data collected during the hospital stay, which delays predictions and limits scalability.
JAMA Netw Open
January 2025
Mental Illness Research, Education and Clinical Center, Crescenz Veterans Affairs Medical Center, Philadelphia, Pennsylvania.
Importance: Recently, the US Food and Drug Administration gave premarketing approval to an algorithm based on its purported ability to identify individuals at genetic risk for opioid use disorder (OUD). However, the clinical utility of the candidate genetic variants included in the algorithm has not been independently demonstrated.
Objective: To assess the utility of 15 genetic variants from an algorithm intended to predict OUD risk.
Transl Vis Sci Technol
January 2025
Institute of the Electrical and Biomedical Engineering, UMIT TIROL - Private University for Health Sciences and Health Technology, Hall in Tyrol, Austria.
Purpose: To extract conjunctival bulbar redness from standardized high-resolution ocular surface photographs of a novel imaging system by implementing an image analysis pipeline.
Methods: Data from two trials (healthy; outgoing ophthalmic clinic) were collected, processed, and used to train a machine learning model for ocular surface segmentation. Various regions of interest were defined to globally and locally extract a redness biomarker based on color intensity.