Logistic regression and machine learning predicted patient mortality from large sets of diagnosis codes comparably.

J Clin Epidemiol

Department of Health Services Research and Policy, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK; Clinical Effectiveness Unit, Royal College of Surgeons of England, Lincoln's Inn Fields, London WC2A 3PE, UK.

Published: May 2021

Objective: The objective of the study was to compare the performance of logistic regression and boosted trees for predicting patient mortality from large sets of diagnosis codes in electronic healthcare records.

Study Design And Setting: We analyzed national hospital records and official death records for patients with myocardial infarction (n = 200,119), hip fracture (n = 169,646), or colorectal cancer surgery (n = 56,515) in England in 2015-2017. One-year mortality was predicted from patient age, sex, and socioeconomic status, together with 202 to 257 International Classification of Diseases, 10th Revision (ICD-10) codes, each entered as a binary predictor indicating whether the code was recorded in the preceding year. Performance measures included the c-statistic, scaled Brier score, and several measures of calibration.
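As an illustration of two of the performance measures named above, the following minimal Python sketch computes a c-statistic and a scaled Brier score from observed outcomes and predicted risks. The abstract does not state how the authors computed these measures; the function names `c_statistic` and `scaled_brier` are hypothetical, and the scaled Brier score here assumes the common definition of 1 minus the ratio of the model's Brier score to that of a reference model that predicts the overall event rate for every patient.

```python
import numpy as np


def c_statistic(y_true, y_prob):
    """Probability that a random patient who died has a higher predicted
    risk than a random survivor; tied predictions count as one half."""
    y_true = np.asarray(y_true, dtype=bool)
    y_prob = np.asarray(y_prob, dtype=float)
    died = y_prob[y_true]
    survived = y_prob[~y_true]
    # Compare every (died, survived) pair. This is O(n^2) in memory,
    # so it is only suitable for small illustrative samples.
    diff = died[:, None] - survived[None, :]
    concordant = np.sum(diff > 0)
    ties = np.sum(diff == 0)
    return (concordant + 0.5 * ties) / diff.size


def scaled_brier(y_true, y_prob):
    """1 - Brier / Brier_ref, where Brier_ref is the Brier score of a
    model predicting the observed event rate for everyone (assumed
    definition; 1 is perfect, 0 is no better than the event rate)."""
    y = np.asarray(y_true, dtype=float)
    p = np.asarray(y_prob, dtype=float)
    brier = np.mean((p - y) ** 2)
    event_rate = y.mean()
    brier_ref = event_rate * (1.0 - event_rate)
    return 1.0 - brier / brier_ref


if __name__ == "__main__":
    # Toy data only; not taken from the study.
    outcomes = [0, 0, 1, 1]
    risks = [0.1, 0.4, 0.35, 0.8]
    print(c_statistic(outcomes, risks), scaled_brier(outcomes, risks))
```

The pairwise comparison keeps the definition of the c-statistic visible; for cohorts of the size analyzed here, a rank-based implementation would be needed instead.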

Results: One-year mortality was 17.2% (34,520) after myocardial infarction, 27.2% (46,115) after hip fracture, and 9.3% (5,273) after colorectal cancer surgery. Optimism-adjusted c-statistics for the logistic regression models were 0.884 (95% confidence interval [CI]: 0.882, 0.886), 0.798 (0.796, 0.800), and 0.811 (0.805, 0.817) for myocardial infarction, hip fracture, and colorectal cancer surgery, respectively. The equivalent c-statistics for the boosted tree models were 0.891 (95% CI: 0.889, 0.892), 0.804 (0.802, 0.806), and 0.803 (0.797, 0.809). Model performance was also similar when measured using scaled Brier scores. All models were well calibrated overall.
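The reported one-year mortality percentages can be checked against the death counts and cohort sizes given in the abstract. The short Python sketch below does this arithmetic; the dictionary layout and the function name `mortality_percent` are illustrative assumptions, while the numbers are those stated in the abstract.

```python
# (deaths, patients, reported percentage) per cohort, as given in the abstract.
cohorts = {
    "myocardial infarction": (34_520, 200_119, 17.2),
    "hip fracture": (46_115, 169_646, 27.2),
    "colorectal cancer surgery": (5_273, 56_515, 9.3),
}


def mortality_percent(deaths, patients):
    """One-year mortality as a percentage of the cohort."""
    return 100.0 * deaths / patients


if __name__ == "__main__":
    for name, (deaths, patients, reported) in cohorts.items():
        computed = round(mortality_percent(deaths, patients), 1)
        print(f"{name}: computed {computed}%, reported {reported}%")
```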

Conclusion: In large datasets of electronic healthcare records, logistic regression and boosted tree models of numerous diagnosis codes predicted patient mortality comparably.


Source
http://dx.doi.org/10.1016/j.jclinepi.2020.12.018
