Background: Severity of illness scores (Acute Physiology and Chronic Health Evaluation, Simplified Acute Physiology Score, and Sequential Organ Failure Assessment) are current risk stratification and mortality prediction tools used in intensive care units (ICUs) worldwide. Developers of artificial intelligence or machine learning (ML) models predictive of ICU mortality use the severity of illness scores as a reference point when reporting the performance of these computational constructs.

Objective: This study aimed to perform a literature review and meta-analysis of articles that compared binary classification ML models with the severity of illness scores that predict ICU mortality and determine which models have superior performance. This review intends to provide actionable guidance to clinicians on the performance and validity of ML models in supporting clinical decision-making compared with the severity of illness score models.

Methods: Between December 15 and 18, 2020, we conducted a systematic search of PubMed, Scopus, Embase, and IEEE databases and reviewed studies published between 2000 and 2020 that compared the performance of binary ML models predictive of ICU mortality with the performance of severity of illness score models on the same data sets. We assessed the studies' characteristics, synthesized the results, meta-analyzed the discriminative performance of the ML and severity of illness score models, and performed tests of heterogeneity within and among studies.
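As a rough illustration of the heterogeneity testing mentioned in the Methods, the sketch below computes Cochran's Q and the I² statistic from per-study AUROC estimates and their standard errors, using inverse-variance weights. The abstract does not say which heterogeneity statistics the authors used, so the choice of Q and I² is an assumption. The function name `cochran_q_i2` and the input values are hypothetical and are not data from the review.

```python
from typing import Sequence, Tuple


def cochran_q_i2(
    estimates: Sequence[float], std_errors: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (fixed-effect pooled estimate, Cochran's Q, I^2 in percent)."""
    if len(estimates) != len(std_errors) or len(estimates) < 2:
        raise ValueError("need at least two estimates with matching standard errors")
    # Inverse-variance weights: more precise studies count for more.
    weights = [1.0 / (se ** 2) for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    # I^2 is the share of total variation attributed to between-study differences.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical AUROC values and standard errors, for illustration only.
aurocs = [0.75, 0.85, 0.95]
ses = [0.02, 0.02, 0.02]
pooled, q, i2 = cochran_q_i2(aurocs, ses)
print(f"pooled={pooled:.3f}  Q={q:.1f}  I2={i2:.1f}%")
```

A high I² in a calculation like this signals that the studies disagree by more than sampling error alone would explain, which is the kind of finding that limits how far a pooled estimate can be generalized.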

Results: We screened 461 abstracts, of which we assessed the full text of 66 (14.3%) articles. We included in the review 20 (4.3%) studies that developed 47 ML models based on 7 types of algorithms and compared them with 3 types of the severity of illness score models. Of the 20 studies, 4 (20%) were found to have a low risk of bias and applicability in model development, 7 (35%) performed external validation, 9 (45%) reported on calibration, 12 (60%) reported on classification measures, and 4 (20%) addressed explainability. The discriminative performance, reported as the area under the receiver operating characteristic curve (AUROC), ranged between 0.728 and 0.99 for the ML-based models and between 0.58 and 0.86 for the severity of illness score-based models. We noted substantial heterogeneity among the reported models and considerable variation among the AUROC estimates for both ML and severity of illness score model types.
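To make the AUROC figures above concrete, here is a minimal sketch of how the metric can be computed for two models evaluated on the same patients. It uses the rank-based definition: the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The function name `auroc` and all data values are hypothetical, and the pairwise loop is meant for illustration rather than for large data sets.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    # Compare every positive case with every negative case.
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = died in ICU) and predicted risks, for illustration only.
died = [1, 1, 0, 0, 0, 1]
ml_risk = [0.9, 0.7, 0.2, 0.4, 0.1, 0.8]      # assumed ML model output
score_risk = [0.6, 0.3, 0.4, 0.5, 0.1, 0.7]   # assumed severity score output

print(f"ML model AUROC:       {auroc(died, ml_risk):.3f}")
print(f"Severity score AUROC: {auroc(died, score_risk):.3f}")
```

Evaluating both models on the same labels, as done here, mirrors the review's inclusion criterion that ML and severity of illness score models be compared on the same data sets.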

Conclusions: ML-based models can accurately predict ICU mortality as an alternative to traditional scoring models. Although the range of performance of the ML models is superior to that of the severity of illness score models, the results cannot be generalized due to the high degree of heterogeneity. When presented with the option of choosing between severity of illness score or ML models for decision support, clinicians should select models that have been externally validated, tested in the practice environment, and updated to the patient population and practice environment.

Trial Registration: PROSPERO CRD42021203871; https://tinyurl.com/28v2nch8.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9198821
DOI: http://dx.doi.org/10.2196/35293

Publication Analysis

Top Keywords

severity illness: 48
illness score: 28
score models: 20
models: 19
icu mortality: 16
illness: 12
illness scores: 12
models predictive: 12
severity: 11
artificial intelligence: 8

Similar Publications

Background: We aimed to evaluate how the parameters used in the diagnosis of metabolic syndrome (MetS) and parameters such as epicardial adipose tissue (EAT) thickness, insulin resistance (IR), and serum uric acid (SUA) are affected according to the severity of obesity.

Methods: A total of 120 obese patients aged 10-18 years were classified as class 1-2-3 according to their body mass index (BMI) score. SUA was measured and oral glucose tolerance tests were performed on all patients.

Background: This study aimed to investigate the risk factors associated with the severity of the disease, the need for mechanical ventilation (MV) and poor prognosis in the early stages of Guillain-Barré Syndrome (GBS).

Methods: Data of children who met GBS diagnostic criteria were evaluated retrospectively. The sample was divided into three binary subgroups according to severe GBS (Hughes Functional Grading Scale [HFGS] ≥ 4 at admission), mechanical ventilation (MV) requirement, and poor prognosis (inability to walk independently, HFGS ≥ 3 after six months).

The Health-Related Quality of Life scores and joint health in children and young adults with hemophilia.

Turk J Pediatr

December 2024

Division of Pediatric Hematology and Oncology, Department of Pediatrics, İstanbul Faculty of Medicine, İstanbul University, İstanbul, Türkiye.

Background: Patients with hemophilia should be evaluated for joint health and overall health in their visits. The aims of this study were to evaluate joint health and health-related quality of life (HRQoL) in patients with mild, moderate, and severe hemophilia; determine which patient groups to focus on and whether there are any neglected patient groups.

Methods: This was a single-center, cross-sectional study.

COVID-19 remains a significant global health problem with uncertain long-term consequences for convalescents. We investigated the relationships between anti-N protein antibody levels, severe acute respiratory syndrome (SARS)-CoV-2-associated TCR repertoire parameters, HLA type and epidemiological information from three cohorts of 524 SARS-CoV-2-infected subjects subgrouped in acute phase, seronegative and seropositive convalescents from the Emilia Romagna region. Epidemiological information and anti-N antibody index were associated with TCR repertoire data.

Objective: This study aimed to compare the accuracy of four neonatal illness severity scores for predicting mortality in persistent pulmonary hypertension of the newborn (PPHN).

Study Design: This retrospective study included neonates diagnosed with PPHN between 2013 and 2022. The illness severity scores of four commonly used tools were completed for each infant: the Clinical Risk Index for Babies-II (CRIB-II), the Score for Neonatal Acute Physiology-Perinatal Extension version II (SNAPPE-II) in the first 12 h after admission and maximum oxygenation index (OI) and Vasoactive-Inotropic score (VIS) during the first 24 h (OI24max and VIS24max), 48 h (OI48max and VIS48max), and 72 h (OI72max and VIS72max) after admission.
