Correlation Between Screening Mammography Interpretive Performance on a Test Set and Performance in Clinical Practice.

Diana L Miglioretti, Laura Ichikawa, Robert A Smith, Diana S M Buist, Patricia A Carney, Berta Geller, Barbara Monsees, Tracy Onega, Robert Rosenberg, Edward A Sickles, Bonnie C Yankaskas, Karla Kerlikowske

Acad Radiol

Departments of Medicine and Epidemiology and Biostatistics, University of California, San Francisco, California; General Internal Medicine Section, Department of Veterans Affairs, University of California, San Francisco, California.

Published: October 2017

Rationale and Objectives: Evidence is inconsistent about whether radiologists' interpretive performance on a screening mammography test set reflects their performance in clinical practice. This study aimed to estimate the correlation between test set and clinical performance and to determine whether the correlation is influenced by cancer prevalence or lesion difficulty in the test set.

Materials and Methods: This institutional review board-approved study randomized 83 radiologists from six Breast Cancer Surveillance Consortium registries to assess one of four test sets of 109 screening mammograms each; 48 radiologists completed a fifth test set of 110 mammograms 2 years later. Test sets differed in the number of cancer cases and the difficulty of lesion detection. Test set sensitivity and specificity were estimated using woman-level and breast-level recall, with cancer status and expert opinion as gold standards. Clinical performance was estimated using woman-level recall with cancer status as the gold standard. Spearman rank correlations between test set and clinical performance, with 95% confidence intervals (CI), were estimated.
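As an illustration of the quantities named in the methods, the following Python sketch computes sensitivity and specificity from recall decisions and a Spearman rank correlation with an approximate 95% CI. It is not the authors' analysis code. The abstract does not state how the confidence intervals were obtained, so the Fisher z-transformation used here is an assumption, and all function names and inputs are hypothetical.

```python
import math

def sensitivity_specificity(recalled, truth):
    """Sensitivity and specificity from paired booleans (one per woman or breast).

    recalled[i] is True if the reader recalled case i; truth[i] is True if the
    gold standard (cancer status or expert opinion) is positive for case i.
    """
    tp = sum(1 for r, t in zip(recalled, truth) if r and t)
    fn = sum(1 for r, t in zip(recalled, truth) if not r and t)
    tn = sum(1 for r, t in zip(recalled, truth) if not r and not t)
    fp = sum(1 for r, t in zip(recalled, truth) if r and not t)
    return tp / (tp + fn), tn / (tn + fp)

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_with_ci(x, y, z_crit=1.96):
    """Spearman rho (Pearson correlation of ranks) with an approximate CI.

    Assumption: the CI uses the Fisher z-transformation with standard error
    1 / sqrt(n - 3); this needs n > 3 and is only a large-sample approximation.
    """
    n = len(x)
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    rho = cov / (sx * sy)
    if abs(rho) >= 1.0:
        # atanh is undefined at +/-1; the interval collapses to a point.
        return rho, rho, rho
    z = math.atanh(rho)
    se = 1.0 / math.sqrt(n - 3)
    return rho, math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)
```

In this sketch, each radiologist would contribute one pair of values (for example, test set sensitivity and clinical sensitivity), and `spearman_with_ci` would be called on the two lists across radiologists.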

Results: For test sets with fewer cancers (N = 15) that were more difficult to detect, correlations were weak to moderate for sensitivity (woman level = 0.46, 95% CI = 0.16, 0.69; breast level = 0.35, 95% CI = 0.03, 0.61) and weak for specificity (0.24, 95% CI = 0.01, 0.45) relative to expert recall. Correlations for test sets with more cancers (N = 30) were close to 0 and not statistically significant.

Conclusions: Correlations between screening performance on a test set and performance in clinical practice are not strong. Test set performance more accurately reflects performance in clinical practice if cancer prevalence is low and lesions are challenging to detect.

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5591765	PMC
http://dx.doi.org/10.1016/j.acra.2017.03.016	DOI Listing
