Artificial intelligence (AI) algorithms that evaluate [supine] chest radiographs ([S]CXRs) have increased markedly in number recently. Since training and validation are often performed on subsets of the same overall dataset, external validation is mandatory to reproduce results and reveal potential training errors. We applied multi-cohort benchmarking to CheXNet, a publicly accessible AI algorithm for (S)CXR analysis, using three clinically relevant study cohorts that differ in patient positioning ([S]CXRs), in the applied reference standards (CT-/[S]CXR-based), and in whether algorithm classification can also be compared with the reading performance of different medical experts. The study cohorts include [1] a cohort of 563 CXRs acquired in the emergency unit that were evaluated by 9 readers (radiologists and non-radiologists) for 4 common pathologies, [2] a collection of 6,248 SCXRs annotated by radiologists for pneumothorax presence, pneumothorax size, and presence of inserted thoracic tube material, which allowed for subgroup and confounding bias analysis, and [3] a cohort of 166 patients with SCXRs that were evaluated by radiologists for underlying causes of basal lung opacities, with every case correlated to a computed tomography scan acquired close in time (SCXR and CT within < 90 min). CheXNet non-significantly exceeded the radiology resident (RR) consensus in the detection of suspicious lung nodules (cohort [1], AUC AI/RR: 0.851/0.839, p = 0.793) and the radiological readers in the detection of basal pneumonia (cohort [3], AUC AI/reader consensus: 0.825/0.782, p = 0.390) and basal pleural effusion (cohort [3], AUC AI/reader consensus: 0.762/0.710, p = 0.336) in SCXR, partly with AUC values higher than originally published ("Nodule": 0.780, "Infiltration": 0.735, "Effusion": 0.864). The classifier "Infiltration" proved to be strongly dependent on patient positioning (best in CXR, worst in SCXR).
The pneumothorax SCXR cohort [2] revealed poor algorithm performance in CXRs without inserted thoracic material and in the detection of small pneumothoraces, which can be explained by a known systematic confounding error in the algorithm training process. The differences in algorithm performance compared with the original publication demonstrate the benefit of clinically relevant external validation. Overall, our multi-cohort benchmarking enables the consideration of confounders, different reference standards, and patient positioning, as well as the comparison of AI performance with medical readers of differing qualification.
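
As an illustration of the subgroup analysis described for cohort [2], the following minimal Python sketch shows how stratifying AUC by a suspected confounder (here, presence of thoracic tube material) can expose performance that the pooled AUC hides. It is not the study's code: the function names `auc` and `subgroup_auc` are hypothetical, and the labels, scores, and tube flags are synthetic toy values chosen only to make the effect visible.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subgroup_auc(labels, scores, in_subgroup) -> float:
    """AUC restricted to the cases flagged True in `in_subgroup`."""
    keep = [i for i, flag in enumerate(in_subgroup) if flag]
    return auc([labels[i] for i in keep], [scores[i] for i in keep])


# Synthetic toy data (NOT study data): 1 = pneumothorax present.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.2, 0.7, 0.6, 0.1, 0.4]
# Hypothetical confounder flag: thoracic tube material visible in the image.
tube = [True, True, False, False, True, False, False, False]
no_tube = [not t for t in tube]

print("pooled AUC:      ", auc(labels, scores))
print("AUC with tube:   ", subgroup_auc(labels, scores, tube))
print("AUC without tube:", subgroup_auc(labels, scores, no_tube))
```

In this toy setup the classifier scores high whenever a tube is visible, so the pooled AUC looks acceptable while the AUC in the tube-free subgroup falls below chance. In practice one would also attach confidence intervals or a significance test to each subgroup estimate.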


Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9329327 (PMC)
http://dx.doi.org/10.1038/s41598-022-16514-7 (DOI)

