Deep learning prediction of sex on chest radiographs: a potential contributor to biased algorithms.

Emerg Radiol

University of Maryland Medical Intelligent Imaging (UM2II) Center, Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, 670 W Baltimore St, Room 1172, Baltimore, MD, 21201, USA.

Published: April 2022

Background: Deep convolutional neural networks (DCNNs) for diagnosis of disease on chest radiographs (CXR) have been shown to be biased against males or females if the datasets used to train them have unbalanced sex representation. Prior work has suggested that DCNNs can predict sex on CXR, which could aid forensic evaluations but could also be a source of bias.

Objective: To (1) evaluate the performance of DCNNs for predicting sex across different datasets and architectures and (2) evaluate visual biomarkers used by DCNNs to predict sex on CXRs.

Materials and Methods: Chest radiographs were obtained from the Stanford CheXPert and NIH Chest XRay14 datasets, which comprise 224,316 and 112,120 CXRs, respectively. To control for dataset size and class imbalance, random undersampling was used to reduce each dataset to 97,560 images that were balanced for sex. Each dataset was randomly split into training (70%), validation (10%), and test (20%) sets. Four DCNN architectures pre-trained on ImageNet were used for transfer learning. DCNNs were externally validated using a test set from the opposing dataset. Performance was evaluated using area under the receiver operating characteristic curve (AUC). Class activation mapping (CAM) was used to generate heatmaps visualizing the regions contributing to the DCNN's prediction.
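The undersampling and 70/10/20 split described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the record layout (a dict with a `"sex"` key), the function names, and the seeds are all assumptions, and the split here is per image because the abstract does not state whether splitting was done per patient. With two equally sized classes, a balanced set of 97,560 images would correspond to `per_class=48780`.

```python
import random
from collections import defaultdict


def balance_by_sex(records, per_class, seed=0):
    """Randomly undersample so every sex label keeps exactly `per_class` records.

    `records` is a list of dicts with a "sex" key (hypothetical layout).
    """
    rng = random.Random(seed)
    by_sex = defaultdict(list)
    for rec in records:
        by_sex[rec["sex"]].append(rec)

    balanced = []
    for label in sorted(by_sex):
        group = by_sex[label]
        if len(group) < per_class:
            raise ValueError(f"only {len(group)} records for label {label!r}")
        # sample without replacement: this is the undersampling step
        balanced.extend(rng.sample(group, per_class))
    rng.shuffle(balanced)
    return balanced


def split_70_10_20(records, seed=0):
    """Shuffle and split into training (70%), validation (10%), and test (rest)."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 7 // 10  # integer arithmetic avoids float rounding
    n_val = n // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test
```

For external validation in the sense of the abstract, the `test` portion produced from one dataset would be scored by a model trained on the other dataset.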

Results: On the internal test set, DCNNs achieved AUCs ranging from 0.98 to 0.99. On external validation, the models reached peak cross-dataset AUCs of 0.94 for the VGG19-Stanford model and 0.95 for the InceptionV3-NIH model. Heatmaps highlighted similar regions of attention between model architectures and datasets, localizing to the mediastinal and upper rib regions, as well as to the lower chest/diaphragmatic regions.
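For readers unfamiliar with the metric reported above, AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. The sketch below computes it directly from that definition. The function name and the 0/1 label encoding are assumptions made for illustration; the pairwise loop is quadratic and meant for clarity, not for datasets of the size used in the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from the pairwise-ranking definition.

    labels: iterable of 0/1 class labels (encoding assumed for illustration)
    scores: model scores, higher meaning more likely class 1
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Three of the four positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.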

Conclusion: DCNNs trained on two large CXR datasets accurately predicted sex on internal and external test data with similar heatmap localizations across DCNN architectures and datasets. These findings support the notion that DCNNs can leverage imaging biomarkers to predict sex, which could confound the accurate prediction of disease on CXRs and contribute to biased models. On the other hand, these DCNNs can be beneficial to emergency radiologists for forensic evaluations and for identifying the sex of patients whose identities are unknown, such as in acute trauma.


Source
http://dx.doi.org/10.1007/s10140-022-02019-3

