Impact of Different Mammography Systems on Artificial Intelligence Performance in Breast Cancer Screening.

Radiol Artif Intell

From the Aberdeen Centre for Health Data Science, Institute of Applied Health Sciences (C.F.d.V., M.B., L.A.A.), School of Medicine, Medical Science and Nutrition (S.J.C., R.T.S.), and Grampian Data Safe Haven (DaSH), Aberdeen Centre for Health Data Science, Institute of Applied Health Sciences (J.A.D.), University of Aberdeen, Polwarth Building, Foresterhill, Aberdeen AB24 3FX, Scotland; National Health Service Grampian (NHSG), Aberdeen Royal Infirmary, Aberdeen, Scotland (S.J.C., R.T.S., G.L.); Kheiron Medical Technologies, London, England (J.Y., D.D.); and School of Medicine, University of St Andrews, St Andrews, Scotland (D.J.H.).

Published: May 2023

Artificial intelligence (AI) tools may assist breast screening mammography programs, but limited evidence supports their generalizability to new settings. This retrospective study used a 3-year dataset (April 1, 2016, to March 31, 2019) from a U.K. regional screening program. The performance of a commercially available breast screening AI algorithm was assessed with a prespecified and site-specific decision threshold to evaluate whether its performance was transferable to a new clinical site. The dataset consisted of women (aged approximately 50-70 years) who attended routine screening, excluding self-referrals, those with complex physical requirements, those who had undergone a previous mastectomy, and those who underwent screening that had technical recalls or did not have the four standard image views. In total, 55 916 screening attendees (mean age, 60 years ± 6 [SD]) met the inclusion criteria. The prespecified threshold resulted in high recall rates (48.3%, 21 929 of 45 444), which reduced to 13.0% (5896 of 45 444) following threshold calibration, closer to the observed service level (5.0%, 2774 of 55 916). Recall rates also increased approximately threefold following a software upgrade on the mammography equipment, requiring per-software version thresholds. Using software-specific thresholds, the AI algorithm would have recalled 277 of 303 (91.4%) screen-detected cancers and 47 of 138 (34.1%) interval cancers. AI performance and thresholds should be validated for new clinical settings before deployment, while quality assurance systems should monitor AI performance for consistency.

Keywords: Breast, Screening, Mammography, Computer Applications-Detection/Diagnosis, Neoplasms-Primary, Technology Assessment

© RSNA, 2023.
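The rates quoted in the abstract follow directly from the reported counts, and the idea of a per-software-version threshold can be illustrated with a small sketch. The abstract does not describe how the thresholds were actually calibrated, so the quantile-based approach below is only an assumption, and the names `rate`, `calibrate_thresholds`, `scores_by_version`, and `target_recall_rate` are hypothetical.

```python
def rate(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)


# Counts as reported in the abstract: (numerator, denominator).
REPORTED_COUNTS = {
    "recall rate, prespecified threshold": (21929, 45444),
    "recall rate, calibrated threshold": (5896, 45444),
    "recall rate, observed service level": (2774, 55916),
    "screen-detected cancers recalled": (277, 303),
    "interval cancers recalled": (47, 138),
}


def calibrate_thresholds(scores_by_version, target_recall_rate):
    """Pick one score threshold per software version (hypothetical method).

    For each version, the threshold is set to the score of the case at the
    target quantile, so that roughly `target_recall_rate` of that version's
    cases score at or above it. This is an illustrative assumption, not the
    calibration procedure used in the study.
    """
    thresholds = {}
    for version, scores in scores_by_version.items():
        ordered = sorted(scores, reverse=True)
        # Number of cases to recall, at least one.
        n_recall = max(1, round(target_recall_rate * len(ordered)))
        thresholds[version] = ordered[n_recall - 1]
    return thresholds


if __name__ == "__main__":
    for label, (num, den) in REPORTED_COUNTS.items():
        print(f"{label}: {rate(num, den)}% ({num} of {den})")

    # Made-up scores: the second version produces systematically higher
    # scores, so a single shared threshold would recall more of its cases.
    example_scores = {
        "software_a": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
        "software_b": [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3],
    }
    print(calibrate_thresholds(example_scores, target_recall_rate=0.2))
```

In this toy example each version gets its own cut-off, so both recall the same fraction of cases even though their score distributions differ, which is the motivation the abstract gives for software-specific thresholds.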

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10245180
DOI: http://dx.doi.org/10.1148/ryai.220146
