Purpose: Although immune checkpoint inhibitors (ICIs) have improved outcomes in certain patients with cancer, they can also cause life-threatening immunotoxicities. Predicting immunotoxicity risks alongside response could provide a personalized risk-benefit profile, inform therapeutic decision making, and improve clinical trial cohort selection. We aimed to build a machine learning (ML) framework using routine electronic health record (EHR) data to predict hepatitis, colitis, pneumonitis, and 1-year overall survival.

Methods: Real-world EHR data of more than 2,200 patients treated with ICI through December 31, 2018, were used to develop predictive models. Using a prediction time point of ICI initiation, a 1-year prediction time window was applied to create binary labels for the four outcomes for each patient. Feature engineering involved aggregating laboratory measurements over appropriate time windows (60-365 days). Patients were randomly partitioned into training (80%) and test (20%) sets. Random forest classifiers were developed using a rigorous model development framework.
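The labeling, laboratory aggregation, and random partitioning steps described above can be illustrated with a minimal sketch using only the Python standard library. This is not the authors' code: the function names, the use of a mean as the aggregate, the inclusive window boundaries, and the example values are all assumptions made for illustration, and the random forest step itself is omitted.

```python
import random
from datetime import date, timedelta
from statistics import mean


def binary_label(ici_start, event_date, window_days=365):
    """Return 1 if an event falls within window_days after ICI initiation, else 0.

    Hypothetical helper: treating a missing event date as "no event" and using
    inclusive boundaries are assumptions, not details from the abstract.
    """
    if event_date is None:
        return 0
    window_end = ici_start + timedelta(days=window_days)
    return int(ici_start <= event_date <= window_end)


def aggregate_lab(measurements, ici_start, lookback_days):
    """Mean of lab values recorded in the lookback window ending at ICI initiation.

    measurements is a list of (date, value) pairs. The mean is an assumed
    aggregate; the abstract does not say which statistics were used.
    """
    window_start = ici_start - timedelta(days=lookback_days)
    values = [v for d, v in measurements if window_start <= d <= ici_start]
    return mean(values) if values else None


def train_test_split_ids(patient_ids, test_fraction=0.2, seed=0):
    """Randomly partition patient IDs into training and test lists."""
    ids = sorted(patient_ids)
    random.Random(seed).shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]


# Made-up example patient: ICI started 2018-01-15, event recorded 2018-09-01.
start = date(2018, 1, 15)
alt = [(date(2017, 6, 1), 30.0), (date(2017, 12, 20), 50.0), (date(2018, 1, 10), 40.0)]

label = binary_label(start, date(2018, 9, 1))
alt_60d = aggregate_lab(alt, start, 60)    # uses the last two measurements
alt_365d = aggregate_lab(alt, start, 365)  # uses all three measurements
train_ids, test_ids = train_test_split_ids(range(100))
```

Splitting on patient IDs rather than on individual rows keeps all records for one patient on the same side of the partition, which is one plausible way to realize a patient-level 80%/20% split.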

Results: The patient cohort had a median age of 63 years and was 61.8% male. Patients predominantly had melanoma (37.8%), lung cancer (27.3%), or genitourinary cancer (16.4%). They were treated with PD-1 (60.4%), PD-L1 (9.0%), and CTLA-4 (19.7%) ICIs. Our models demonstrate reasonably strong performance, with AUCs of 0.739, 0.729, 0.755, and 0.752 for the pneumonitis, hepatitis, colitis, and 1-year overall survival models, respectively. Each model relies on an outcome-specific feature set, though some features are shared among models.
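For readers unfamiliar with the AUC values reported above, the metric can be computed from test-set labels and predicted scores as the fraction of positive-negative pairs in which the positive case receives the higher score, with ties counted as half. The sketch below is a generic illustration with made-up numbers; the function name `roc_auc` is hypothetical and nothing here reproduces the study's evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Made-up example: 2 negatives, 2 positives; 3 of the 4 pairs are ranked correctly.
auc_example = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for values in the 0.73 to 0.76 range.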

Conclusion: To our knowledge, this is the first ML solution that assesses individual ICI risk-benefit profiles based predominantly on routine structured EHR data. As such, use of our ML solution will not require additional data collection or documentation in the clinic.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10919473
DOI: http://dx.doi.org/10.1200/CCI.23.00207

Similar Publications

Every Day I Write the Book: Transforming EHR Data from Daily Practice into New Knowledge.

J Am Board Fam Med

December 2024

From the Center for Professionalism & Value in Health Care, American Board of Family Medicine, Washington, DC (RLP); Research & Policy, American Board of Family Medicine, Washington, DC; and Center for Professionalism & Value in Health Care, Washington, DC (AWB).

Despite producing mountains of data in the daily course of care, the documentation labors of frontline clinicians currently return very little value to them or to the health system. The potential of these painstakingly collected data is enormous, and clinical registries can extract the extraordinary capacity of these data and transform them into research-ready datasets while protecting the confidentiality of patients and clinicians. Clinical registries represent transformative tools for primary care research, bringing together the dimensions of clinical practice, research, quality improvement, and policy impact from a large, nationally reflective, diverse sample of practices and patients.

Interpersonal Continuity of Care May Help Delay Progression to Type 2 Diabetes.

J Am Board Fam Med

December 2024

From the Department of Population Health Sciences, Geisinger, Danville, PA (BLJ, LBD); Center for Obesity and Metabolic Health, Danville, PA (GCW, AC, CDS, LBD); Department of Health Services Research, Management and Policy, University of Florida, Health Sciences Center, Gainesville, FL (AGM); Geisinger Department of Genomic Health, Danville, PA 17822 (AKR); Division of Genomic Medicine, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD (AKR).

Background: The association between interpersonal continuity of care (CoC) and progression from the prediabetic state to Type 2 Diabetes (T2D) remains unknown.

Aim: To evaluate the association between interpersonal CoC and the progression to T2D among persons with prediabetes.

Design And Setting: A retrospective cohort study using electronic health record (EHR) data from 6620 patients at Geisinger, a large rural health care system in Danville, PA.

Objective: To detect and classify features of stigmatizing and biased language in intensive care electronic health records (EHRs) using natural language processing techniques.

Materials And Methods: We first created a lexicon and regular expression lists from literature-driven stem words for linguistic features of stigmatizing patient labels, doubt markers, and scare quotes within EHRs. The lexicon was further extended using Word2Vec and GPT 3.

Background And Aims: Observational healthcare data are an important tool for delineating patients' inflammatory bowel disease (IBD) journey in real-world settings. However, studies that characterize IBD cohorts typically rely on a single resource, apply diverse eligibility criteria, and extract variable sets of attributes, making comparison between cohorts challenging. We aim to longitudinally describe and compare IBD patient cohorts across multiple geographic regions, employing a unified data and analysis framework.

Enhancing patient representation learning with inferred family pedigrees improves disease risk prediction.

J Am Med Inform Assoc

December 2024

Statistical Modeling, Global Computational Biology and Digital Sciences, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach an der Riß 88400, Germany.

Background: Machine learning and deep learning are powerful tools for analyzing electronic health records (EHRs) in healthcare research. Although family health history has been recognized as a major predictor for a wide spectrum of diseases, research has so far adopted a limited view of family relations, essentially treating patients as independent samples in the analysis.

Methods: To address this gap, we present ALIGATEHR, which models inferred family relations in a graph attention network augmented with an attention-based medical ontology representation, thus accounting for the complex influence of genetics, shared environmental exposures, and disease dependencies.
