The authors demonstrated that the most common statistical significance test used with r(WG)-type interrater agreement indexes in applied psychology, based on the chi-square distribution, is flawed and inaccurate. The chi-square test is shown to be extremely conservative even for modest, standard significance levels (e.g., .05). The authors present an alternative statistical significance test, based on Monte Carlo procedures, that produces the equivalent of an approximate randomization test for the null hypothesis that the actual distribution of responding is rectangular and demonstrate its superiority to the chi-square test. Finally, the authors provide tables of critical values and offer downloadable software to implement the approximate randomization test for r(WG)-type and for average deviation (AD)-type interrater agreement indexes. The implications of these results for studying a broad range of interrater agreement problems in applied psychology are discussed.
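The authors' own tables and downloadable software are the authoritative implementation; purely as an illustration of the Monte Carlo logic described above, the sketch below shows how an approximate randomization test for a single-item r(WG) index might look. The function names (rwg, rwg_critical_value) and the example data are hypothetical; the null variance (A^2 - 1)/12 is the standard variance of a rectangular (uniform) distribution over A response options.

```python
import numpy as np

def rwg(ratings, n_options):
    """Single-item r(WG): 1 minus observed variance over the rectangular-null variance."""
    sigma_eu = (n_options ** 2 - 1) / 12.0  # expected variance under uniform (rectangular) responding
    return 1.0 - np.var(ratings, ddof=1) / sigma_eu

def rwg_critical_value(n_raters, n_options, alpha=0.05, n_sims=100_000, seed=0):
    """Monte Carlo critical value for r(WG) under the null of rectangular responding."""
    rng = np.random.default_rng(seed)
    # Simulate n_sims groups of n_raters ratings drawn uniformly from 1..n_options
    sims = rng.integers(1, n_options + 1, size=(n_sims, n_raters))
    null_rwg = 1.0 - sims.var(axis=1, ddof=1) / ((n_options ** 2 - 1) / 12.0)
    # Observed r(WG) values exceeding this quantile are significant at level alpha
    return np.quantile(null_rwg, 1.0 - alpha)

# Hypothetical example: 10 raters on a 5-point scale
obs = rwg(np.array([4, 4, 5, 4, 3, 4, 4, 5, 4, 4]), n_options=5)
crit = rwg_critical_value(n_raters=10, n_options=5)
print(f"observed r_WG = {obs:.3f}, .05 critical value = {crit:.3f}, significant: {obs > crit}")
```

The same resampling scheme extends to average deviation (AD) indexes by replacing the variance-based statistic with the mean absolute deviation of the simulated ratings.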
DOI: http://dx.doi.org/10.1037/0021-9010.88.2.356