Machine learning algorithms are increasingly used in the clinical literature, claiming advantages over logistic regression. However, they are generally designed to maximize the area under the receiver operating characteristic curve. While area under the receiver operating characteristic curve and other measures of accuracy are commonly reported for evaluating binary prediction problems, these metrics can be misleading. We aim to give clinical and machine learning researchers a realistic medical example of the dangers of relying on a single measure of discriminatory performance to evaluate binary prediction questions. Prediction of medical complications after surgery is a frequent but challenging task because many post-surgery outcomes are rare. We predicted post-surgery mortality among patients in a clinical registry who received at least one aortic valve replacement. Estimation incorporated multiple evaluation metrics and algorithms typically regarded as performing well with rare outcomes, as well as an ensemble and a new extension of the lasso for multiple unordered treatments. Results demonstrated high accuracy for all algorithms with moderate measures of cross-validated area under the receiver operating characteristic curve. False positive rates were 1%; however, true positive rates were 7%, even when paired with a 100% positive predictive value, and graphical representations of calibration were poor. Similar results were seen in simulations, with the addition of high area under the receiver operating characteristic curve (90%) accompanying low true positive rates. Clinical studies should not primarily report only area under the receiver operating characteristic curve or accuracy.
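The gap between accuracy and true positive rate described above can be illustrated with a small worked example. The sketch below uses hypothetical counts (1,000 patients, 30 deaths, a classifier that flags only 2 of them); these numbers are assumptions chosen for illustration and are not taken from the registry or the simulations in the study. The helper names `confusion_counts` and `summarize` are likewise illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = event)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def summarize(y_true, y_pred):
    """Return accuracy, true positive rate, false positive rate, and PPV."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / len(y_true),
        "tpr": tp / (tp + fn) if (tp + fn) else nan,  # sensitivity
        "fpr": fp / (fp + tn) if (fp + tn) else nan,  # 1 - specificity
        "ppv": tp / (tp + fp) if (tp + fp) else nan,  # precision
    }


# Hypothetical rare-outcome cohort: 30 events among 1,000 patients (3%).
y_true = [1] * 30 + [0] * 970
# Hypothetical classifier: flags 2 of the 30 events and no non-events.
y_pred = [1] * 2 + [0] * 28 + [0] * 970

metrics = summarize(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this hypothetical cohort, accuracy is above 97% and the positive predictive value is 100%, yet the classifier identifies fewer than 7% of the patients who died. Reporting accuracy alone would hide that.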
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8561661 | PMC |
| http://dx.doi.org/10.1177/09622802211038754 | DOI Listing |
Sci Rep
January 2025
Department of Respiratory and Critical Care Medicine, Changhai Hospital, The Second Military Medical University, Shanghai, People's Republic of China.
In recent years, a large body of research has shown that pulmonary embolism (PE) has become a common disease, and PE remains a clinical challenge because of its high rates of mortality, disability, missed diagnosis, and misdiagnosis. To address this, we employed an artificial intelligence-based machine learning algorithm (MLA) to construct a robust predictive model for PE. We retrospectively analyzed 1480 suspected PE patients hospitalized in West China Hospital of Sichuan University between May 2015 and April 2020.
J Cardiothorac Vasc Anesth
December 2024
Department of Anesthesiology, Dokkyo Medical University Saitama Medical Center Saitama, Japan.
Objectives: To assess whether the Quantra-Qplus can provide the cutoff values for predicting transfusion thresholds after cardiopulmonary bypass.
Design: Prospective observational study.
Setting: Single-center university hospital.
J Biomed Inform
January 2025
Harvard Medical School, Boston, MA, USA; VA Boston Healthcare System, Boston, MA, USA; Harvard T.H. Chan School of Public Health, Boston, MA, USA.
Motivation: The increasing availability of electronic health record (EHR) systems has created enormous potential for translational research. Recent developments in representation learning techniques have led to effective large-scale representations of EHR concepts along with knowledge graphs that empower downstream EHR studies. However, most existing methods require training with patient-level data, limiting their abilities to expand the training with multi-institutional EHR data.
Adv Med Sci
January 2025
Shulan (Hangzhou) Hospital Affiliated to Zhejiang Shuren University Shulan International Medical College, Hangzhou, Zhejiang, P.R. China.
Purpose: This study aims to evaluate the prognostic value of contrast-enhanced ultrasound (CEUS) combined with tumour markers in patients with hepatocellular carcinoma (HCC) undergoing microwave ablation (MWA).
Methods: MWA patients with HCC were divided into good prognosis (n = 75) and poor prognosis (n = 69) groups. The levels of alpha-fetoprotein (AFP), carbohydrate antigen (CA19-9), and carcinoembryonic antigen (CEA) before and after MWA were analysed using an independent sample t-test.
Surgery
January 2025
Department of Surgery, Severance Hospital, Yonsei University College of Medicine, Seoul, Korea.
Background/aims: Early warning scores are simple scores obtained by measuring physiological parameters and have been regarded as useful tools for detecting clinical deterioration. This study aimed to evaluate the impact of early warning scores in predicting in-hospital mortality in critically ill patients readmitted to the surgical intensive care unit.
Methods: The study was conducted at a tertiary referral teaching hospital in South Korea.