Impact of Transfer Learning Using Local Data on Performance of a Deep Learning Model for Screening Mammography.

Radiol Artif Intell

From the Australian Institute for Machine Learning (J.J.J.C., V.T., L.O.R., L.J.P.) and School of Public Health (J.J.J.C., V.T., K.A.H., L.O.R., L.J.P.), University of Adelaide, N Terrace, Adelaide, South Australia 5005, Australia; and BreastScreen SA, Adelaide, South Australia, Australia (M.R., A.S.H.).

Published: July 2024

Purpose: To investigate the issues of generalizability and replication of deep learning models by assessing performance of a screening mammography deep learning system developed at New York University (NYU) on a local Australian dataset.

Materials and Methods: In this retrospective study, all individuals with biopsy or surgical pathology-proven lesions and age-matched controls were identified from a South Australian public mammography screening program (January 2010 to December 2016). The primary outcome was deep learning system performance, measured with area under the receiver operating characteristic curve (AUC), in classifying invasive breast cancer or ductal carcinoma in situ (n = 425) versus no malignancy (n = 490) or benign lesions (n = 44). The NYU system, including models without (NYU1) and with (NYU2) heatmaps, was tested in its original form, after training from scratch (without transfer learning), and after retraining with transfer learning.

Results: The local test set comprised 959 individuals (mean age, 62.5 years ± 8.5 [SD]; all female). The original AUCs for the NYU1 and NYU2 models were 0.83 (95% CI: 0.82, 0.84) and 0.89 (95% CI: 0.88, 0.89), respectively. When NYU1 and NYU2 were applied in their original form to the local test set, the AUCs were 0.76 (95% CI: 0.73, 0.79) and 0.84 (95% CI: 0.82, 0.87), respectively. After local training without transfer learning, the AUCs were 0.66 (95% CI: 0.62, 0.69) and 0.86 (95% CI: 0.84, 0.88). After retraining with transfer learning, the AUCs were 0.82 (95% CI: 0.80, 0.85) and 0.86 (95% CI: 0.84, 0.88).

Conclusion: A deep learning system developed using a U.S. dataset showed reduced performance when applied "out of the box" to an Australian dataset. Local retraining with transfer learning using available model weights improved model performance.

Keywords: Screening Mammography, Convolutional Neural Network (CNN), Deep Learning Algorithms, Breast Cancer

© RSNA, 2024

See also commentary by Cadrin-Chênevert in this issue.
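The abstract reports each AUC with a 95% CI but does not say how the intervals were computed. As an illustration only, the sketch below computes AUC as the probability that a randomly chosen malignant case scores higher than a randomly chosen non-malignant case, and estimates an interval with a percentile bootstrap. The bootstrap is an assumption, not the authors' stated method, and the function names `auc` and `bootstrap_auc_ci` and their parameters are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample cases with replacement, keeping label/score pairs together.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    # Toy data, not from the study: 1 = malignant, 0 = not malignant.
    y = [0, 0, 1, 1]
    s = [0.1, 0.4, 0.35, 0.8]
    print(auc(y, s))
    print(bootstrap_auc_ci(y, s, n_boot=200))
```

The pairwise comparison is quadratic in the number of cases, so for a test set the size of the one described here (959 individuals) a rank-based AUC would be faster; the pairwise form is used only because it makes the definition explicit.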


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11294949
DOI: http://dx.doi.org/10.1148/ryai.230383
