Deep Learning Systems for Pneumothorax Detection on Chest Radiographs: A Multicenter External Validation Study.

Radiol Artif Intell

Department of Diagnostic Imaging, National University Hospital, 5 Lower Kent Ridge Rd, Singapore 119074 (Y.L.T., D.N., J.T.P.D.H., P.J., S.Y.S., V.T.Y.T., S.T.Q.); Saw Swee Hock School of Public Health, School of Computer Science, and Yong Loo Lin School of Medicine, National University of Singapore, Singapore (D.N., M.F.); Department of Diagnostic Radiology, Alexandra Hospital, Singapore (J.T.P.D.H.); Department of Diagnostic Radiology, Tan Tock Seng Hospital, Singapore (C.H.T., Y.H.T.); Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore (C.H.T.); Department of Diagnostic Radiology, Ng Teng Fong General Hospital, Singapore (P.L.K.); and Department of Diagnostic Radiology, Khoo Teck Puat Hospital, Singapore (G.G.P.).

Published: July 2021

AI Article Synopsis

  • The model was trained on two large open-source datasets and tested on six external datasets, achieving AUCs of 0.91 to 0.98 for pneumothorax detection, compared with an AUC of 0.93 in internal testing.
  • The model detected large pneumothoraces better than small ones (AUC, 0.96 vs 0.88), and its performance did not differ significantly whether or not a chest tube was present on the radiograph.

Article Abstract

Purpose: To assess the generalizability of a deep learning pneumothorax detection model on datasets from multiple external institutions and examine patient and acquisition factors that might influence performance.

Materials and Methods: In this retrospective study, a deep learning model was trained for pneumothorax detection by merging two large open-source chest radiograph datasets: ChestX-ray14 and CheXpert. It was then tested on six external datasets from multiple independent institutions (labeled A-F) in a retrospective case-control design (data acquired between 2016 and 2019 from institutions A-E; institution F consisted of data from the MIMIC-CXR dataset). Performance on each dataset was evaluated by using area under the receiver operating characteristic curve (AUC) analysis, sensitivity, specificity, and positive and negative predictive values, with two radiologists in consensus being used as the reference standard. Patient and acquisition factors that influenced performance were analyzed.
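As an illustration of the evaluation metrics named above, the following is a minimal sketch in plain Python, not the authors' code. The function names (`auc`, `confusion_metrics`), the 0.5 operating threshold, and the toy labels and scores are all assumptions made for the example; the abstract does not state how the operating point was chosen.

```python
from typing import Dict, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def _ratio(num: int, den: int) -> float:
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def confusion_metrics(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV at a given score threshold.
    A case is called positive when its score is >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = pneumothorax per the reference standard.
    labels = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1, 0.6, 0.7]
    print(auc(labels, scores))
    print(confusion_metrics(labels, scores, threshold=0.5))
```

Note that in a case-control sample, positive and negative predictive values reflect the prevalence of the sampled cases rather than the prevalence in clinical practice, whereas AUC, sensitivity, and specificity do not depend on prevalence in this way.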

Results: The AUCs for pneumothorax detection for external institutions A-F were 0.91 (95% CI: 0.88, 0.94), 0.97 (95% CI: 0.94, 0.99), 0.91 (95% CI: 0.85, 0.97), 0.98 (95% CI: 0.96, 1.0), 0.97 (95% CI: 0.95, 0.99), and 0.92 (95% CI: 0.90, 0.95), respectively, compared with the internal test AUC of 0.93 (95% CI: 0.92, 0.93). The model had lower performance for small compared with large pneumothoraces (AUC, 0.88 [95% CI: 0.85, 0.91] vs AUC, 0.96 [95% CI: 0.95, 0.97]; P = .005). Model performance was not different when a chest tube was present or absent on the radiographs (AUC, 0.95 [95% CI: 0.92, 0.97] vs AUC, 0.94 [95% CI: 0.92, 0.05]; P > .99).

Conclusion: A deep learning model trained with a large volume of data on the task of pneumothorax detection was able to generalize well to multiple external datasets with patient demographics and technical parameters independent of the training data.

Thorax, Computer Applications-Detection/Diagnosis

See also commentary by Jacobson and Krupinski in this issue.

© RSNA, 2021.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8328109
DOI: http://dx.doi.org/10.1148/ryai.2021200190
