Mixed-methods evaluation of three natural language processing modeling approaches for measuring documented goals-of-care discussions in the electronic health record.

J Pain Symptom Manage

Department of Medicine (A.M.U., J.R.C., R.A.E., J.T., J.H., S.R.P., E.K.K., R.Y.L.), University of Washington, Seattle, WA; Cambia Palliative Care Center of Excellence at UW Medicine (A.M.U., J.R.C., R.A.E., L.C.B., Y.G., J.S., W.B.L., T.C., J.T., J.H., S.R.P., E.K.K., R.Y.L.), University of Washington, Seattle, WA; Division of Pulmonary, Critical Care, and Sleep Medicine (J.R.C., R.A.E., J.T., J.H., S.R.P., E.K.K., R.Y.L.), Department of Medicine, Harborview Medical Center, University of Washington, Seattle, WA.

Published: June 2022

Context: Documented goals-of-care discussions are an important quality metric for patients with serious illness. Natural language processing (NLP) is a promising approach for identifying goals-of-care discussions in the electronic health record (EHR).

Objectives: To compare three NLP modeling approaches for identifying EHR documentation of goals-of-care discussions and generate hypotheses about differences in performance.

Methods: We conducted a mixed-methods study to evaluate performance and misclassification for three NLP featurization approaches modeled with regularized logistic regression: bag-of-words (BOW), rule-based, and a hybrid approach. From a prospective cohort of 150 patients hospitalized with serious illness over 2018 to 2020, we collected 4391 inpatient EHR notes; 99 (2.3%) contained documented goals-of-care discussions. We used leave-one-out cross-validation to estimate performance, comparing pooled NLP predictions to human abstraction with receiver-operating-characteristic (ROC) and precision-recall (PR) analyses. We qualitatively examined a purposive sample of 70 NLP-misclassified notes using content analysis to identify linguistic features and generate hypotheses about the causes of misclassification.
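The evaluation pipeline described above (BOW featurization, regularized logistic regression, leave-one-out cross-validation with pooled predictions) can be sketched as follows. This is a minimal illustration with synthetic notes, not the study's data or its rule-based features; the vocabulary, labels, and regularization strength are placeholders.

```python
# Sketch: BOW features + L2-regularized logistic regression, evaluated with
# leave-one-out cross-validation (LOOCV). Held-out predictions are pooled
# across folds and scored with ROC analysis. All data below is synthetic.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import roc_auc_score
import numpy as np

# Hypothetical notes: 1 = contains a documented goals-of-care discussion.
notes = [
    "goals of care discussed with patient and family today",
    "family meeting held to discuss goals of care and code status",
    "patient expressed wishes regarding comfort focused care",
    "code status and goals of care reviewed at length",
    "vital signs stable overnight, no acute events",
    "continue current antibiotics, reassess in the morning",
    "physical therapy evaluation completed without difficulty",
    "labs reviewed, electrolytes within normal limits",
    "chest radiograph shows improving infiltrate",
    "diet advanced as tolerated, ambulating in hallway",
    "wound care performed, dressing changed",
    "discharge planning discussed with case management",
]
labels = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0])

# Fitting the vectorizer inside each fold (via a Pipeline) avoids leaking
# vocabulary from the held-out note into training.
from sklearn.pipeline import make_pipeline
model = make_pipeline(
    CountVectorizer(),
    LogisticRegression(C=1.0, max_iter=1000),  # L2 regularization by default
)

# One prediction per note, each made by a model trained on all other notes;
# pooled predictions are then compared against the reference labels.
pooled_scores = cross_val_predict(
    model, notes, labels, cv=LeaveOneOut(), method="predict_proba"
)[:, 1]

roc_auc = roc_auc_score(labels, pooled_scores)
print(f"Pooled LOOCV ROC AUC: {roc_auc:.3f}")
```

A rule-based or hybrid featurizer would replace or augment the `CountVectorizer` step with engineered indicator features before the same regression and cross-validation scheme.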

Results: All three modeling approaches discriminated between notes with and without goals-of-care discussions (ROC AUC: BOW, 0.907; rule-based, 0.948; hybrid, 0.965). Precision and recall were only moderate (precision at 70% recall: BOW, 16.2%; rule-based, 50.4%; hybrid, 49.3%; PR AUC: BOW, 0.505; rule-based, 0.579; hybrid, 0.599). Qualitative analysis revealed patterns underlying the performance differences between the BOW and rule-based approaches.
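The "precision at 70% recall" metric reported above is read off the precision-recall curve: among operating points that achieve at least 70% recall, take the best attainable precision. A minimal sketch with synthetic scores (not the study's predictions) shows the computation:

```python
# Sketch: reading precision at a fixed recall level off a PR curve,
# plus the area under the PR curve. Labels and scores are synthetic.
import numpy as np
from sklearn.metrics import precision_recall_curve, auc

# Hypothetical pooled cross-validated scores for 10 notes (4 positives).
y_true = np.array([0, 0, 0, 0, 1, 1, 0, 1, 0, 1])
scores = np.array([0.10, 0.20, 0.15, 0.30, 0.80, 0.70, 0.40, 0.60, 0.25, 0.35])

precision, recall, _ = precision_recall_curve(y_true, scores)

# Area under the PR curve (recall is returned in decreasing order; auc
# accepts monotonically decreasing x).
pr_auc = auc(recall, precision)

# Best precision among operating points with recall >= 70%.
p_at_70_recall = precision[recall >= 0.70].max()
print(f"PR AUC: {pr_auc:.3f}, precision at 70% recall: {p_at_70_recall:.3f}")
```

With a rare positive class (2.3% of notes in the study), PR analysis is more informative than ROC analysis, because precision is sensitive to the large pool of negatives while the ROC curve is not.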

Conclusion: NLP holds promise for identifying EHR-documented goals-of-care discussions. However, the rarity of goals-of-care content in EHR data limits performance. Our findings highlight opportunities to optimize NLP modeling approaches and support further exploration of different NLP approaches to identifying goals-of-care discussions.


Source

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9124686
DOI: http://dx.doi.org/10.1016/j.jpainsymman.2022.02.006

