Kappa statistics have been widely used in the pathology literature to compare interobserver diagnostic variability (IOV) among different pathologists, but there has been limited discussion about the clinical significance of kappa scores. Five representative recent pathology papers were queried using specific, clinically relevant questions to learn how IOV was evaluated and how the clinical applicability of results was interpreted. The papers supported our anecdotal impression that pathologists usually assess IOV using Cohen's or Fleiss' kappa statistics and interpret the results using some variation of the scale proposed by Landis and Koch. The papers did not cite or propose specific guidelines to comment on the clinical applicability of results. The solutions proposed to decrease IOV included the development of better diagnostic criteria and additional educational efforts, but none of the studies considered the possibility that the entities themselves represented a continuum of morphologic findings rather than distinct diagnostic categories. A dataset from a previous study of IOV reported by Thunnissen et al. was reanalyzed to estimate percent agreement among 19 international lung pathologists for the diagnosis of 74 challenging lung neuroendocrine neoplasms. Kappa scores and diagnostic sensitivity, specificity, and positive and negative predictive values were calculated using the majority consensus diagnosis for each case as the gold reference diagnosis for that case. Diagnostic specificity estimates among multiple pathologists were >90%, although kappa scores were considerably more variable. We explain why kappa scores are of limited clinical applicability in pathology and propose the use of positive and negative percent agreement to evaluate IOV between two raters, and of diagnostic specificity against a gold reference diagnosis to evaluate IOV among multiple raters.
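The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis: the function names, the labels "pos", "neg", "X" and "other", and all ratings below are made-up assumptions chosen for illustration. The first example shows one well-known property of Cohen's kappa (it can be near zero even when raw agreement is high, if one category dominates), which may or may not be the argument the paper itself makes. The second example scores one rater against a majority consensus reference, as the abstract describes.

```python
from collections import Counter


def percent_agreement(a, b):
    """Fraction of cases on which two raters give the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: chance-corrected agreement, (po - pe) / (1 - pe)."""
    n = len(a)
    po = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Expected chance agreement from each rater's marginal label frequencies.
    pe = sum(count_a[label] * count_b[label] for label in set(a) | set(b)) / (n * n)
    return (po - pe) / (1 - pe)


def majority_consensus(ratings_by_case):
    """Most frequent label per case; ties go to the label seen first."""
    return [Counter(case).most_common(1)[0][0] for case in ratings_by_case]


def sensitivity_specificity(rater, reference, positive):
    """Sensitivity and specificity of one rater for one category vs. a reference."""
    pairs = list(zip(rater, reference))
    tp = sum(r == positive and g == positive for r, g in pairs)
    tn = sum(r != positive and g != positive for r, g in pairs)
    n_pos = sum(g == positive for g in reference)
    n_neg = len(reference) - n_pos
    return tp / n_pos, tn / n_neg


# Hypothetical two-rater data: 100 cases, agreement on 90, a rare "pos" category.
rater_a = ["neg"] * 90 + ["pos"] * 5 + ["neg"] * 5
rater_b = ["neg"] * 90 + ["neg"] * 5 + ["pos"] * 5
print(percent_agreement(rater_a, rater_b))  # raw agreement
print(cohen_kappa(rater_a, rater_b))        # chance-corrected agreement

# Hypothetical multi-rater data: 5 cases, 3 raters, category "X" vs. anything else.
ratings_by_case = [
    ["X", "X", "X"],
    ["X", "X", "other"],
    ["other", "other", "other"],
    ["other", "X", "other"],
    ["other", "other", "other"],
]
reference = majority_consensus(ratings_by_case)
third_rater = [case[2] for case in ratings_by_case]
print(sensitivity_specificity(third_rater, reference, "X"))
```

In the two-rater example the raters agree on 90 of 100 cases, yet kappa comes out slightly below zero because almost all of that agreement is what chance would predict from the skewed marginals. Note that in this sketch each rater also contributes to the consensus they are scored against, and that ties in the consensus are broken arbitrarily; a real analysis would need an explicit rule for both.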


Source
http://dx.doi.org/10.1016/j.anndiagpath.2020.151561

