Objective: To compare the reliability and validity of two classification systems used to evaluate the quality of mammograms: PGMI ('perfect', 'good', 'moderate' and 'inadequate') and EAR ('excellent', 'acceptable' and 'repeat').
Setting: New South Wales (Australia) population-based mammography screening programme (BreastScreen NSW).
Methods: Thirty sets of mammograms were rated by 21 radiographers and an expert panel. PGMI and EAR criteria were used to assign ratings to the medio-lateral oblique (MLO) and cranio-caudal (CC) views for each set of films. Inter-observer reliability and criterion validity (compared with expert panel ratings) were assessed using mean weighted observed agreement and kappa statistics.
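The kappa statistic used here corrects observed agreement for agreement expected by chance; with ordinal categories such as PGMI, a weighted variant penalises disagreements by category distance. A minimal sketch of linearly weighted Cohen's kappa for two raters follows; the numeric coding of PGMI categories (inadequate = 0 ... perfect = 3) and the example ratings are illustrative assumptions, not data from the study.

```python
from collections import Counter

def weighted_kappa(rater_a, rater_b, n_cats):
    """Linearly weighted Cohen's kappa for two raters' ordinal scores.

    Scores are integers in 0..n_cats-1; disagreements are weighted by
    the absolute distance between the assigned categories.
    """
    n = len(rater_a)
    max_d = n_cats - 1
    # Observed disagreement, normalised by the maximum possible distance.
    obs = sum(abs(a - b) for a, b in zip(rater_a, rater_b)) / (n * max_d)
    # Chance-expected disagreement from each rater's marginal frequencies.
    pa, pb = Counter(rater_a), Counter(rater_b)
    exp = sum(pa[i] * pb[j] * abs(i - j)
              for i in range(n_cats)
              for j in range(n_cats)) / (n * n * max_d)
    return 1 - obs / exp

# Hypothetical PGMI ratings (0 = inadequate ... 3 = perfect) for 8 films.
a = [3, 2, 2, 1, 0, 3, 2, 1]
b = [3, 1, 2, 1, 1, 2, 2, 0]
print(round(weighted_kappa(a, b, 4), 3))  # → 0.515
```

A kappa near 0 (as reported in the Results below) indicates agreement little better than chance, even when raw percentage agreement looks high; this is why the study reports both measures.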
Results: Reliability: Kappa values for both classification systems were low (0.01-0.17). PGMI produced significantly higher values than EAR. Agreement between raters was higher using PGMI than EAR for the MLO view (77% versus 74%, P < 0.05), but was similar for the CC view. Dichotomized ratings ('acceptable' or 'needs repeating') did not improve reliability estimates.
Validity: Kappa values between raters and the reference standard were low for both classification systems (0.05-0.15). Agreement between raters and the reference standard was higher using PGMI than EAR for the MLO view (74% versus 63%), but was similar for the CC view. Dichotomized ratings of the MLO view showed slightly higher observer agreement.
Conclusions: Both PGMI and EAR have poor reliability and validity in evaluating mammogram quality. EAR is not a suitable alternative to PGMI, which must be improved if it is to be useful.
DOI: http://dx.doi.org/10.1258/0969141053279149