Background: Cohen's kappa is a statistic that estimates interobserver agreement. It was originally introduced to help develop diagnostic tests. Interpretive readings by 2 observers, for example of a mammogram or other imaging study, were compared at a single point in time. Kappa is known to depend on the prevalence of disease, so kappa values from different settings are hard to compare.
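The prevalence dependence mentioned in the Background can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes two observers with identical sensitivity and specificity whose errors are independent given the true disease status, and the function name `expected_kappa` and the values 0.90, 0.50, and 0.05 are hypothetical choices made only for illustration.

```python
def expected_kappa(prevalence, sensitivity, specificity):
    """Expected Cohen's kappa for two equally accurate observers.

    Assumption (ours, not the paper's): the observers' errors are
    independent given the true disease status.
    """
    p, se, sp = prevalence, sensitivity, specificity

    # Probability that both observers call the case positive / negative.
    both_pos = p * se**2 + (1 - p) * (1 - sp) ** 2
    both_neg = p * (1 - se) ** 2 + (1 - p) * sp**2
    observed = both_pos + both_neg

    # Each observer's marginal rate of positive calls drives chance agreement.
    rate_pos = p * se + (1 - p) * (1 - sp)
    chance = rate_pos**2 + (1 - rate_pos) ** 2

    return (observed - chance) / (1 - chance)


# Hypothetical settings: same observers, different disease prevalence.
kappa_balanced = expected_kappa(prevalence=0.50, sensitivity=0.90, specificity=0.90)
kappa_rare = expected_kappa(prevalence=0.05, sensitivity=0.90, specificity=0.90)
print(f"prevalence 0.50: kappa = {kappa_balanced:.2f}")
print(f"prevalence 0.05: kappa = {kappa_rare:.2f}")
```

Under these assumptions the observed agreement is the same in both settings (0.82), yet kappa falls from about 0.64 to about 0.25 because chance agreement rises as the disease becomes rare.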
Methods: Using simulation, we examine an analogous situation, not previously described, that occurs in clinical trials where sequential measurements are obtained to evaluate disease progression or clinical improvement over time.
Results: We show that weighted kappa, used for multilevel outcomes, changes over the course of the trial even when the performance of the observer is held constant.
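The effect described in the Results can be sketched analytically. The code below is not the authors' simulation: it computes the expected linearly weighted kappa for two observers who share one fixed misclassification matrix, at two time points where only the distribution of true outcome levels differs. The three-level outcome, the `confusion` matrix, and both outcome distributions are hypothetical values chosen for illustration, and the observers are assumed to err independently given the true level.

```python
import numpy as np


def linear_weights(k):
    """Disagreement weights |i - j| / (k - 1) for k ordered levels."""
    idx = np.arange(k)
    return np.abs(idx[:, None] - idx[None, :]) / (k - 1)


def joint_ratings(true_dist, confusion):
    """Joint distribution of two observers' ratings.

    confusion[t, r] = P(observer rates r | true level t). Both observers
    use the same matrix and err independently given the true level
    (an assumption of this sketch).
    """
    true_dist = np.asarray(true_dist, dtype=float)
    return np.einsum("t,ti,tj->ij", true_dist, confusion, confusion)


def weighted_kappa(joint, weights):
    """Weighted kappa = 1 - observed / chance-expected weighted disagreement."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    expected = np.outer(joint.sum(axis=1), joint.sum(axis=0))
    return 1.0 - (weights * joint).sum() / (weights * expected).sum()


# Hypothetical observer performance, identical at every visit.
confusion = np.array([
    [0.8, 0.2, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
])
weights = linear_weights(3)

# Hypothetical true outcome distributions: balanced at baseline,
# concentrated in one level late in the trial.
baseline = [1 / 3, 1 / 3, 1 / 3]
late = [0.05, 0.15, 0.80]

kappa_baseline = weighted_kappa(joint_ratings(baseline, confusion), weights)
kappa_late = weighted_kappa(joint_ratings(late, confusion), weights)
print(f"baseline: weighted kappa = {kappa_baseline:.2f}")
print(f"late:     weighted kappa = {kappa_late:.2f}")
```

With these illustrative numbers the weighted kappa drops from about 0.60 to about 0.41 although the misclassification matrix never changes; the shift comes entirely from the changing mix of true outcome levels.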
Conclusions: Kappa and closely related measures can therefore only be used with great difficulty, if at all, in quality assurance in clinical trials.
DOI: http://dx.doi.org/10.1007/s43441-019-00102-5