Evaluation of stacked ensemble model performance to predict clinical outcomes: A COVID-19 study.

Int J Med Inform

Department of Bioengineering, University of Louisville, Louisville, KY, USA; James Graham Brown Cancer Center, University of Louisville, Louisville, KY, USA; Center for Predictive Medicine, University of Louisville, Louisville, KY, USA.

Published: July 2023

Background: The application of machine learning (ML) to clinical data with the goal of predicting patient outcomes has garnered increasing attention. Ensemble learning has been used in conjunction with ML to improve predictive performance. Although stacked generalization (stacking), a type of heterogeneous ensemble of ML models, has emerged in clinical data analysis, it remains unclear how to define the best model combinations for strong predictive performance. This study develops a methodology to evaluate the performance of "base" learner models and their optimized combination using "meta" learner models in stacked ensembles, so that performance can be assessed accurately in the context of clinical outcomes.
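As a rough illustration of the base/meta structure described above, the following sketch stacks two toy base learners under a logistic-regression meta learner. It is not the study's pipeline: the synthetic data, the function names (`fit_centroid`, `fit_knn`, `fit_logistic`, `fit_stack`), the choice of learners, and the use of out-of-fold predictions to train the meta learner are all assumptions made for illustration.

```python
import math
import random

def fit_centroid(X, y):
    """Hypothetical base learner 1: score by relative distance to class centroids."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    c0 = mean([x for x, t in zip(X, y) if t == 0])
    c1 = mean([x for x, t in zip(X, y) if t == 1])
    def predict(rows):
        out = []
        for x in rows:
            d0, d1 = math.dist(x, c0), math.dist(x, c1)
            out.append(d0 / (d0 + d1 + 1e-12))  # near class-1 centroid -> score near 1
        return out
    return predict

def fit_knn(X, y, k=3):
    """Hypothetical base learner 2: fraction of positives among the k nearest points."""
    def predict(rows):
        out = []
        for x in rows:
            nearest = sorted(range(len(X)), key=lambda i: math.dist(x, X[i]))[:k]
            out.append(sum(y[i] for i in nearest) / k)
        return out
    return predict

def fit_logistic(Z, y, lr=0.5, epochs=500):
    """Meta learner: logistic regression fitted by batch gradient descent."""
    w, b, n = [0.0] * len(Z[0]), 0.0, len(Z)
    def prob(z):
        return 1.0 / (1.0 + math.exp(-(sum(wi * zi for wi, zi in zip(w, z)) + b)))
    for _ in range(epochs):
        errs = [prob(z) - t for z, t in zip(Z, y)]
        grad_w = [sum(e * z[j] for e, z in zip(errs, Z)) / n for j in range(len(w))]
        grad_b = sum(errs) / n
        for j in range(len(w)):
            w[j] -= lr * grad_w[j]
        b -= lr * grad_b
    return lambda rows: [prob(z) for z in rows]

def fit_stack(X, y, base_fitters, n_folds=5):
    """Train the meta learner on out-of-fold base scores, then refit bases on all data."""
    n = len(X)
    oof = [[0.0] * len(base_fitters) for _ in range(n)]
    for fold in range(n_folds):
        held_out = [i for i in range(n) if i % n_folds == fold]
        kept = [i for i in range(n) if i % n_folds != fold]
        X_tr, y_tr = [X[i] for i in kept], [y[i] for i in kept]
        for j, fit in enumerate(base_fitters):
            preds = fit(X_tr, y_tr)([X[i] for i in held_out])
            for i, p in zip(held_out, preds):
                oof[i][j] = p
    meta = fit_logistic(oof, y)
    bases = [fit(X, y) for fit in base_fitters]
    return lambda rows: meta(list(zip(*[base(rows) for base in bases])))

# Synthetic two-feature data (an assumption; not the clinical dataset).
rng = random.Random(0)
X, y = [], []
for i in range(60):
    label = i % 2
    X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0)])
    y.append(label)

model = fit_stack(X, y, [fit_centroid, fit_knn])
scores = model(X)
accuracy = sum((s >= 0.5) == bool(t) for s, t in zip(scores, y)) / len(y)
print(f"training accuracy of the stacked sketch: {accuracy:.2f}")
```

Training the meta learner on out-of-fold scores, rather than on scores the base learners produce for their own training rows, is a common way to keep the meta learner from rewarding base models that simply memorize the data. Whether the study used this exact scheme is not stated in the abstract.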

Methods: De-identified COVID-19 data were obtained from the University of Louisville Hospital, where a retrospective chart review was performed from March 2020 to November 2021. Three differently sized subsets of features from the overall dataset were chosen to train and evaluate ensemble classification performance. The number of base learners, chosen from several algorithm families and coupled with a complementary meta learner, was varied from a minimum of 2 to a maximum of 8. Predictive performance of these combinations was evaluated for the mortality and severe cardiac event outcomes using area under the receiver operating characteristic curve (AUROC), F1, balanced accuracy, and kappa.
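For readers unfamiliar with the four evaluation metrics named above, the sketch below computes them from scratch for a binary outcome. The labels and scores are made-up toy values, the 0.5 decision threshold is an assumption, and "kappa" is taken here to mean Cohen's kappa, which the abstract does not specify.

```python
def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    tp, _, fp, fn = confusion(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity; assumes both classes are present."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return (tp / (tp + fn) + tn / (tn + fp)) / 2

def cohen_kappa(y_true, y_pred):
    """Agreement beyond chance; assumes predictions are not all one class by chance-level pe == 1."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    n = tp + tn + fp + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (observed - expected) / (1 - expected)

def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy labels and predicted scores (illustrative values only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

results = {
    "AUROC": auroc(y_true, scores),                          # 0.9375
    "F1": f1_score(y_true, y_pred),                          # 0.75
    "balanced_accuracy": balanced_accuracy(y_true, y_pred),  # 0.75
    "kappa": cohen_kappa(y_true, y_pred),                    # 0.5
}
print(results)
```

AUROC is computed from the continuous scores, while the other three depend on the thresholded predictions, so they can rank models differently on the same data.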

Results: The results highlight the potential to accurately predict clinical outcomes, such as severe cardiac events with COVID-19, from routinely acquired in-hospital patient data. Meta learners Generalized Linear Model (GLM), Multi-Layer Perceptron (MLP), and Partial Least Squares (PLS) had the highest AUROC for both outcomes, while K-Nearest Neighbors (KNN) had the lowest. Performance trended lower in the training set as the number of features increased, and exhibited less variance in both training and validation across all feature subsets as the number of base learners increased.

Conclusion: This study offers a methodology to robustly evaluate ensemble ML performance when analyzing clinical data.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10165871
DOI: http://dx.doi.org/10.1016/j.ijmedinf.2023.105090

