Attention-based Saliency Maps Improve Interpretability of Pneumothorax Classification.

Radiol Artif Intell

Munich Institute of Biomedical Engineering and Department of Informatics, Technical University of Munich, Boltzmannstr 11, Garching b., Munich 85748, Germany (A.W., R.G., T.L.); Department of Radiology, University Hospital LMU, Munich, Germany (S.Č., N.F., B.O.S.); and Munich School of Technology in Society, Technical University of Munich, Munich, Germany (T.W.).

Published: March 2023

Purpose: To investigate the chest radiograph classification performance of vision transformers (ViTs) and interpretability of attention-based saliency maps, using the example of pneumothorax classification.

Materials and Methods: In this retrospective study, ViTs were fine-tuned for lung disease classification using four public datasets: CheXpert, Chest X-Ray 14, MIMIC CXR, and VinBigData. Saliency maps were generated using transformer multimodal explainability and gradient-weighted class activation mapping (GradCAM). Classification performance was evaluated on the Chest X-Ray 14, VinBigData, and Society for Imaging Informatics in Medicine-American College of Radiology (SIIM-ACR) Pneumothorax Segmentation datasets using area under the receiver operating characteristic curve (AUC) analysis and compared with that of convolutional neural networks (CNNs). The explainability methods were evaluated with positive and negative perturbation, sensitivity-n, effective heat ratio, intra-architecture repeatability, and interarchitecture reproducibility. In the user study, three radiologists classified 160 chest radiographs for pneumothorax with and without saliency maps and rated the usefulness of the maps.
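As an illustration of the positive and negative perturbation tests named above, the following is a minimal sketch and not the study's implementation. It assumes a flattened image, a saliency score per pixel, and a hypothetical `score_fn` standing in for the classifier; the function names, the zero baseline, and the toy data are all assumptions made for this example. Pixels are masked in order of saliency, and the area under the resulting score curve summarizes how quickly the prediction degrades.

```python
from typing import Callable, List, Sequence


def perturbation_curve(
    pixels: Sequence[float],
    saliency: Sequence[float],
    score_fn: Callable[[Sequence[float]], float],
    steps: int = 4,
    positive: bool = True,
    baseline: float = 0.0,
) -> List[float]:
    """Return model scores as pixels are progressively masked.

    positive=True masks the most salient pixels first;
    positive=False masks the least salient pixels first.
    """
    order = sorted(range(len(pixels)), key=lambda i: saliency[i], reverse=positive)
    masked = list(pixels)
    scores = [score_fn(masked)]  # score of the unperturbed input
    n = len(pixels)
    done = 0
    for step in range(1, steps + 1):
        target = round(step * n / steps)  # number of pixels masked after this step
        for i in order[done:target]:
            masked[i] = baseline  # assumed baseline value for a removed pixel
        done = target
        scores.append(score_fn(masked))
    return scores


def area_under_curve(scores: Sequence[float]) -> float:
    """Trapezoidal area under the score curve, with the x axis scaled to [0, 1]."""
    n = len(scores) - 1
    return sum((scores[k] + scores[k + 1]) / 2 for k in range(n)) / n


# Toy stand-in for a classifier: only pixels 0 and 2 drive the score.
def toy_model(p: Sequence[float]) -> float:
    return (p[0] + p[2]) / 2


toy_pixels = [0.9, 0.1, 0.8, 0.2]
toy_saliency = [1.0, 0.0, 0.9, 0.1]  # a map that ranks pixels 0 and 2 highest

pos_auc = area_under_curve(perturbation_curve(toy_pixels, toy_saliency, toy_model, positive=True))
neg_auc = area_under_curve(perturbation_curve(toy_pixels, toy_saliency, toy_model, positive=False))
```

For a saliency map that tracks what the model actually uses, the positive-perturbation area should be small (the score collapses quickly) and the negative-perturbation area should be large (the score survives until the important pixels are removed).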

Results: ViTs achieved chest radiograph classification AUCs comparable with those of state-of-the-art CNNs: 0.95 (95% CI: 0.94, 0.95) versus 0.83 (95% CI: 0.83, 0.84) on Chest X-Ray 14, 0.84 (95% CI: 0.77, 0.91) versus 0.83 (95% CI: 0.76, 0.90) on VinBigData, and 0.85 (95% CI: 0.85, 0.86) versus 0.87 (95% CI: 0.87, 0.88) on SIIM-ACR. Both saliency map methods revealed a strong bias toward pneumothorax tubes in the models. Radiologists found 47% of the attention-based and 39% of the GradCAM saliency maps useful. The attention-based methods outperformed GradCAM on all metrics.
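For readers unfamiliar with how an AUC and its 95% CI can be obtained, the sketch below shows one common approach. The abstract does not state how the intervals were computed, so the percentile bootstrap used here is an assumption, as are the function names and the toy labels and scores. The AUC itself is computed as the fraction of positive-negative pairs ranked correctly, with ties counted as half.

```python
import random
from typing import Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data, invented for illustration only.
toy_labels = [0, 0, 1, 1, 0, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]

point_estimate = auc(toy_labels, toy_scores)
ci_low, ci_high = bootstrap_ci(toy_labels, toy_scores)
```

With only six cases the interval is very wide; the narrow intervals reported for Chest X-Ray 14 and SIIM-ACR reflect much larger test sets.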

Conclusion: ViTs performed similarly to CNNs in chest radiograph classification, and their attention-based saliency maps were more useful to radiologists and outperformed GradCAM.

Keywords: Conventional Radiography, Thorax, Diagnosis, Supervised Learning, Convolutional Neural Network (CNN)

© RSNA, 2023.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10077084
DOI: http://dx.doi.org/10.1148/ryai.220187
