Background And Aims: Ensemble machine learning (ML) methods can combine many individual models into a single 'super' model using an optimal weighted combination. Here we demonstrate how an underutilized ensemble model, the superlearner, can be used as a benchmark for model performance in clinical risk prediction. We illustrate this by implementing a superlearner to predict liver fibrosis in patients with non-alcoholic fatty liver disease (NAFLD).

Methods: We trained a superlearner on 23 demographic and clinical variables to predict stage 2 or higher liver fibrosis. The superlearner was trained on data from the Nonalcoholic Steatohepatitis Clinical Research Network observational study (NASH-CRN, n=648) and validated on two external datasets: participants in a randomized trial for NASH (the 'FLINT' trial, n=270) and examinees with NAFLD in the National Health and Nutrition Examination Survey (NHANES, n=1244). We compared the performance of the superlearner with that of existing models, including FIB-4, NFS, Forns, APRI, BARD and SAFE.
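As an illustration of the general technique only (not the authors' implementation), the sketch below shows the two core steps of a superlearner: obtaining cross-validated (out-of-fold) predictions from each base learner, then finding non-negative weights that sum to 1 and minimize squared error. Everything in it is an assumption made for illustration: the three toy base learners, the synthetic data, the 5-fold split, and the use of exponentiated-gradient descent to fit the weights. The abstract does not specify the base learners or the meta-learning algorithm used in the study.

```python
import math
import random

def fit_prevalence(X, y):
    """Toy base learner: always predict the training prevalence."""
    p = sum(y) / len(y)
    return lambda row: p

def make_stump(j):
    """Toy base learner: outcome rate above/below the training median of feature j."""
    def fit(X, y):
        cut = sorted(r[j] for r in X)[len(X) // 2]
        hi = [yi for r, yi in zip(X, y) if r[j] >= cut]
        lo = [yi for r, yi in zip(X, y) if r[j] < cut]
        p_hi = sum(hi) / len(hi) if hi else 0.5
        p_lo = sum(lo) / len(lo) if lo else 0.5
        return lambda row: p_hi if row[j] >= cut else p_lo
    return fit

def out_of_fold(learners, X, y, k=5):
    """Each row is predicted by models fitted without that row's fold."""
    n = len(X)
    Z = [[0.0] * len(learners) for _ in range(n)]
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        Xtr, ytr = [X[i] for i in train], [y[i] for i in train]
        for m, fit in enumerate(learners):
            predict = fit(Xtr, ytr)
            for i in test:
                Z[i][m] = predict(X[i])
    return Z

def mse(Z, y, w):
    """Mean squared error of the weighted combination of base predictions."""
    preds = [sum(wk * z for wk, z in zip(w, row)) for row in Z]
    return sum((p - yi) ** 2 for p, yi in zip(preds, y)) / len(y)

def convex_weights(Z, y, steps=1000, lr=0.5):
    """Exponentiated-gradient descent keeps weights non-negative and summing to 1."""
    m = len(Z[0])
    w = [1.0 / m] * m
    for _ in range(steps):
        resid = [sum(wk * z for wk, z in zip(w, row)) - yi for row, yi in zip(Z, y)]
        grad = [2 * sum(r * row[j] for r, row in zip(resid, Z)) / len(y) for j in range(m)]
        w = [wk * math.exp(-lr * g) for wk, g in zip(w, grad)]
        total = sum(w)
        w = [wk / total for wk in w]
    return w

# Synthetic data (hypothetical): feature 0 drives the outcome, feature 1 is noise.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(400)]
y = [1 if row[0] + rng.gauss(0, 0.5) > 0 else 0 for row in X]

learners = [fit_prevalence, make_stump(0), make_stump(1)]
Z = out_of_fold(learners, X, y)
weights = convex_weights(Z, y)
print([round(w, 3) for w in weights])
```

Fitting the weights on out-of-fold predictions rather than in-sample predictions is what keeps the ensemble from simply rewarding the base learner that overfits the most. In practice the final model would refit each base learner on the full training set and combine them with these weights.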

Results: In the FLINT and NHANES validation sets, the superlearner (derived from 12 base models) discriminated well between patients with and without significant fibrosis, with AUCs of 0.79 (95% CI: 0.73-0.84) and 0.74 (95% CI: 0.68-0.79), respectively. Among the existing scores considered, the SAFE score performed similarly to the superlearner, and both outperformed the FIB-4, APRI, Forns, and BARD scores in the validation datasets. A superlearner derived from 12 base models performed as well as one derived from 90 base models.
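For readers who want to see how a discrimination estimate of this form can be produced, here is a minimal sketch of an AUC with a 95% confidence interval. The abstract does not say how its intervals were computed; the percentile bootstrap used here is an assumption, as are the synthetic scores and labels, the function names, and the number of resamples.

```python
import random

def auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 1/2)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(scores, labels, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (resampling patients with replacement)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes for the AUC to be defined
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Synthetic validation set (hypothetical): risk scores are shifted upward for cases.
rng = random.Random(1)
labels = [1 if rng.random() < 0.4 else 0 for _ in range(150)]
scores = [label + rng.gauss(0, 1) for label in labels]

estimate = auc(scores, labels)
ci_low, ci_high = bootstrap_ci(scores, labels)
print(f"AUC {estimate:.2f} (95% CI: {ci_low:.2f}-{ci_high:.2f})")
```

The same two functions can be applied to any risk score, so a superlearner and a conventional score such as FIB-4 could be compared on one validation set by passing each model's scores with the same labels.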

Conclusions: The superlearner, thought of as the "best-in-class" ML prediction, performed better than most existing models commonly used in practice in detecting fibrotic NASH. The superlearner can be used to benchmark the performance of conventional clinical risk prediction models.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10418571
DOI: http://dx.doi.org/10.1101/2023.08.02.23293569
