AI Article Synopsis

  • The study aimed to develop and validate a deep learning model to extract symptoms from clinical notes in electronic health records, which are important for cancer research and monitoring.
  • A total of 1,225 outpatient progress notes were analyzed, with 1,125 used for training the model and 100 for testing its accuracy, focusing on detection of 80 symptoms from the National Cancer Institute's PRO-CTCAE framework.
  • The best-performing model, ELECTRA-small, identified symptoms with an F1 score of 0.87 at the token level, suggesting that deep learning can effectively support system-wide symptom tracking in cancer care.

Article Abstract

Purpose: Symptoms are vital outcomes for cancer clinical trials, observational research, and population-level surveillance. Patient-reported outcomes (PROs) are valuable for monitoring symptoms, yet there are many challenges to collecting PROs at scale. We sought to develop, test, and externally validate a deep learning model to extract symptoms from unstructured clinical notes in the electronic health record.

Methods: We randomly selected 1,225 outpatient progress notes from among patients treated at the Dana-Farber Cancer Institute between January 2016 and December 2019 and used 1,125 notes as our training/validation data set and 100 notes as our test data set. We evaluated the performance of 10 deep learning models for detecting 80 symptoms included in the National Cancer Institute's Patient-Reported Outcomes version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE) framework. Model performance as compared with manual chart abstraction was assessed using standard metrics, and the highest performer was externally validated on a sample of 100 physician notes from a different clinical context.
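
The paper does not publish its modeling code, but the task described is token-level symptom tagging with a transformer. A minimal sketch of what inference with an ELECTRA-small backbone could look like, using the Hugging Face transformers API, is shown below; the checkpoint name, the toy two-symptom label set, and the example note are all illustrative assumptions, and the classification head here is untrained (in practice it would first be fine-tuned on the annotated notes):

```python
# Illustrative sketch: token-level symptom tagging with ELECTRA-small.
# Checkpoint, label set, and example note are assumptions for demonstration;
# they are not taken from the paper.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# BIO-style labels for a toy two-symptom subset of the 80 PRO-CTCAE symptoms.
labels = ["O", "B-FATIGUE", "I-FATIGUE", "B-NAUSEA", "I-NAUSEA"]
id2label = dict(enumerate(labels))
label2id = {label: i for i, label in id2label.items()}

model_name = "google/electra-small-discriminator"  # public ELECTRA-small checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(
    model_name,
    num_labels=len(labels),
    id2label=id2label,
    label2id=label2id,  # head is randomly initialized until fine-tuned
)

note = "Patient reports persistent fatigue and intermittent nausea after cycle 2."
inputs = tokenizer(note, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)

# One predicted label per subword token.
predictions = logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, pred in zip(tokens, predictions):
    print(f"{token:15s} {id2label[pred.item()]}")
```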

Results: In our training and test data sets, 75 of the 80 candidate symptoms were identified. The ELECTRA-small model had the highest performance for symptom identification at the token level (ie, at the individual symptom level), with an F1 of 0.87 and a processing time of 3.95 seconds per note. For the 10 most common symptoms in the test data set, the F1 score ranged from 0.98 for anxious to 0.86 for fatigue. For external validation of the same symptoms, the note-level performance ranged from F1 = 0.97 for diarrhea and dizziness to F1 = 0.73 for swelling.
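
For reference, F1 is the harmonic mean of precision and recall. The counts in the sketch below are hypothetical, chosen only to illustrate how a token-level F1 of 0.87 can arise; they are not the paper's confusion-matrix values:

```python
# F1 as the harmonic mean of precision and recall.
# tp/fp/fn counts are hypothetical, for illustration only.
def f1_score(tp: int, fp: int, fn: int) -> float:
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

print(f1_score(tp=87, fp=13, fn=13))  # 0.87: precision = recall = 0.87
```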

Conclusion: Training a deep learning model to identify a wide range of electronic health record-documented symptoms relevant to cancer care is feasible. This approach could be used at the health system scale to complement electronic PROs.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9232368
DOI: http://dx.doi.org/10.1200/CCI.21.00136

Publication Analysis

Top Keywords

deep learning (16)
electronic health (12)
data set (12)
test data (12)
symptoms (9)
unstructured clinical (8)
clinical notes (8)
patient-reported outcomes (8)
learning model (8)
notes (6)
