Prediction of Gastrointestinal Tract Cancers Using Longitudinal Electronic Health Record Data.

Cancers (Basel)

Division of Gastroenterology and Hepatology, Department of Internal Medicine, University of Michigan, Ann Arbor, MI 48109, USA.

Published: February 2023

Background: Luminal gastrointestinal (GI) tract cancers, including esophageal, gastric, small bowel, colorectal, and anal cancers, are often diagnosed at late stages. These tumors can cause gradual GI bleeding, which may go unrecognized but can be detected through subtle laboratory changes. Our aim was to develop models that predict luminal GI tract cancers from laboratory studies and patient characteristics, using logistic regression and random forest machine learning methods.

Methods: The study was a single-center, retrospective cohort study at an academic medical center. It included individuals who had at least two complete blood counts (CBCs), with enrollment from 2004 to 2013 and follow-up until 2018. The primary outcome was the diagnosis of GI tract cancer. Prediction models were developed using multivariable single timepoint logistic regression, longitudinal logistic regression, and random forest machine learning.

Results: The cohort included 148,158 individuals, with 1025 GI tract cancers. For 3-year prediction of GI tract cancers, the longitudinal random forest model performed the best, with an area under the receiver operating characteristic curve (AuROC) of 0.750 (95% CI 0.729-0.771) and Brier score of 0.116, compared to the longitudinal logistic regression model, with an AuROC of 0.735 (95% CI 0.713-0.757) and Brier score of 0.205.
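
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how an AuROC and a Brier score can be computed from predicted probabilities. It is illustrative only: the function names `auroc` and `brier_score` and the toy data are assumptions made here, not the study's code or data, and the abstract does not state which software the authors used. A higher AuROC indicates better discrimination, while a lower Brier score indicates predicted probabilities closer to the observed outcomes.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Fraction of (case, non-case) pairs in which the case gets the higher
    predicted probability; tied pairs count as one half."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_cases = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not cases or not non_cases:
        raise ValueError("both outcome classes must be present")
    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


# Hypothetical toy data: 1 = cancer diagnosed within the horizon, 0 = not.
outcomes = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]

print(f"AuROC: {auroc(outcomes, predicted):.3f}")
print(f"Brier score: {brier_score(outcomes, predicted):.3f}")
```

The pairwise loop is quadratic in cohort size and is meant only to make the definition explicit; a rank-based implementation would be preferable for a cohort of the size described here.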

Conclusions: Prediction models incorporating longitudinal features of the CBC outperformed the single timepoint logistic regression models at 3 years, with a trend toward improved accuracy of prediction using a random forest machine learning model compared to a longitudinal logistic regression model.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10000707
DOI: http://dx.doi.org/10.3390/cancers15051399
