Impact of Transfer Learning Using Local Data on Performance of a Deep Learning Model for Screening Mammography.

James J J Condon, Vincent Trinh, Kelly A Hall, Michelle Reintals, Andrew S Holmes, Lauren Oakden-Rayner, Lyle J Palmer

Radiol Artif Intell

From the Australian Institute for Machine Learning (J.J.J.C., V.T., L.O.R., L.J.P.) and School of Public Health (J.J.J.C., V.T., K.A.H., L.O.R., L.J.P.), University of Adelaide, N Terrace, Adelaide, South Australia 5005, Australia; and BreastScreen SA, Adelaide, South Australia, Australia (M.R., A.S.H.).

Published: July 2024

Purpose: To investigate the generalizability and replication of deep learning models by assessing the performance of a screening mammography deep learning system developed at New York University (NYU) on a local Australian dataset.

Materials and Methods: In this retrospective study, all individuals with biopsy or surgical pathology-proven lesions and age-matched controls were identified from a South Australian public mammography screening program (January 2010 to December 2016). The primary outcome was deep learning system performance, measured with area under the receiver operating characteristic curve (AUC), in classifying invasive breast cancer or ductal carcinoma in situ (n = 425) versus no malignancy (n = 490) or benign lesions (n = 44). The NYU system, including models without (NYU1) and with (NYU2) heatmaps, was tested in its original form, after training from scratch (without transfer learning), and after retraining with transfer learning.

Results: The local test set comprised 959 individuals (mean age, 62.5 years ± 8.5 [SD]; all female). The original AUCs for the NYU1 and NYU2 models were 0.83 (95% CI: 0.82, 0.84) and 0.89 (95% CI: 0.88, 0.89), respectively. When NYU1 and NYU2 were applied in their original form to the local test set, the AUCs were 0.76 (95% CI: 0.73, 0.79) and 0.84 (95% CI: 0.82, 0.87), respectively. After local training without transfer learning, the AUCs were 0.66 (95% CI: 0.62, 0.69) and 0.86 (95% CI: 0.84, 0.88), respectively. After retraining with transfer learning, the AUCs were 0.82 (95% CI: 0.80, 0.85) and 0.86 (95% CI: 0.84, 0.88), respectively.

Conclusion: A deep learning system developed using a U.S. dataset showed reduced performance when applied "out of the box" to an Australian dataset. Local retraining with transfer learning using available model weights improved model performance.

Keywords: Screening Mammography, Convolutional Neural Network (CNN), Deep Learning Algorithms, Breast Cancer

© RSNA, 2024

See also commentary by Cadrin-Chênevert in this issue.
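The results above are reported as AUCs with 95% confidence intervals. The abstract does not say how the intervals were computed, so the following is only a minimal sketch of one common approach: the AUC as a pairwise ranking probability, with a percentile bootstrap interval. The function names, the synthetic scores, and the bootstrap settings are all illustrative assumptions, not the authors' method or data.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class; AUC is undefined there.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic example only: 40 "malignant" and 40 "non-malignant" model scores.
demo_rng = random.Random(42)
demo_labels = [1] * 40 + [0] * 40
demo_scores = ([demo_rng.gauss(0.6, 0.2) for _ in range(40)]
               + [demo_rng.gauss(0.4, 0.2) for _ in range(40)])

point = auc(demo_labels, demo_scores)
low, high = bootstrap_ci(demo_labels, demo_scores)
print(f"AUC {point:.2f} (95% CI: {low:.2f}, {high:.2f})")
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a test set of roughly a thousand individuals but would normally be replaced by a rank-based formula or a library routine for larger data.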

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11294949
DOI: http://dx.doi.org/10.1148/ryai.230383

