Appl Psychol Meas
June 2017
Many indices of interrater agreement on binary tasks have been proposed to assess reliability, but none has escaped criticism. In a series of Monte Carlo simulations, five such indices were evaluated against an unbiased indicator of raters' ability to distinguish the true presence or absence of the characteristic being judged. Two of the coefficients, one more consistently than the other, performed best across variations in characteristic prevalence and in raters' expertise and bias.
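The abstract does not name the five agreement indices it compares. As an illustration only, the sketch below computes Cohen's kappa, one widely used chance-corrected index of interrater agreement for binary judgments; it is a representative example, not necessarily one of the indices the study evaluated.

```python
# Illustrative sketch: Cohen's kappa for two raters making binary judgments.
# Kappa corrects raw percent agreement for the agreement expected by chance.

def cohens_kappa(a, b):
    """Chance-corrected agreement between two equal-length binary rating vectors."""
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n   # observed agreement
    pa1 = sum(a) / n                              # rater A's positive-judgment rate
    pb1 = sum(b) / n                              # rater B's positive-judgment rate
    p_e = pa1 * pb1 + (1 - pa1) * (1 - pb1)       # chance agreement
    return (p_o - p_e) / (1 - p_e)

# Hypothetical ratings for eight cases (1 = characteristic judged present)
rater_a = [1, 1, 0, 0, 1, 0, 1, 1]
rater_b = [1, 0, 0, 0, 1, 0, 1, 1]
print(cohens_kappa(rater_a, rater_b))  # 0.75
```

Kappa can behave poorly when the characteristic's prevalence is extreme, which is one reason studies like this one compare several indices under varying prevalence and rater bias.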