AI Article Synopsis

  • This study introduces a novel approach that uses two coupled convolutional neural networks (CNNs) to jointly learn the reliability of individual annotators and the consensus label from noisy observations, separating annotators' unreliable behavior from the true underlying segmentation.
  • The proposed method was validated on several datasets, including simulated and real-world medical images, and consistently outperformed competing strategies, especially when annotations were few and disagreement among annotators was large.

Article Abstract

Supervised machine learning methods have been widely developed for segmentation tasks in recent years. However, the quality of labels has a high impact on the predictive performance of these algorithms. This issue is particularly acute in the medical image domain, where both the cost of annotation and the inter-observer variability are high. In a typical label acquisition process, different human experts contribute estimates of the "actual" segmentation labels, influenced by their personal biases and competency levels. The performance of automatic segmentation algorithms is limited when these noisy labels are used as the expert consensus label. In this work, we use two coupled CNNs to jointly learn, purely from noisy observations, the reliability of individual annotators and the expert consensus label distributions. The separation of the two is achieved by maximally describing each annotator's "unreliable behavior" (we call it "maximally unreliable") while maintaining high fidelity to the noisy training data. We first create a toy segmentation dataset using MNIST and investigate the properties of the proposed algorithm. We then use three public medical imaging segmentation datasets to demonstrate our method's efficacy, with both simulated (where necessary) and real-world annotations: 1) ISBI2015 (multiple-sclerosis lesions); 2) BraTS (brain tumors); 3) LIDC-IDRI (lung abnormalities). Finally, we create a real-world multiple sclerosis lesion dataset (QSMSC at UCL: Queen Square Multiple Sclerosis Center at UCL, UK) with manual segmentations from 4 different annotators (3 radiologists with different skill levels and 1 expert to generate the expert consensus label). On all datasets, our method consistently outperforms competing methods and relevant baselines, especially when the number of annotations is small and the amount of disagreement is large. The studies also reveal that the system is capable of capturing the complicated spatial characteristics of annotators' mistakes.
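To make the coupled-CNN idea concrete, here is a minimal NumPy sketch of the core computation the abstract describes: one network's output (the consensus label distribution) is multiplied by a per-pixel confusion matrix produced by an annotator-specific network to yield that annotator's noisy-label distribution, and a trace penalty on the confusion matrices encourages the "maximally unreliable" annotator description. The function names and array shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def annotator_noisy_distribution(p_true, confusion):
    """Combine the consensus-label distribution p_true, shape (H, W, C),
    with a per-pixel annotator confusion matrix, shape (H, W, C, C), to get
    the distribution over that annotator's observed (noisy) labels:
        p_noisy[h, w, j] = sum_i confusion[h, w, j, i] * p_true[h, w, i]
    """
    return np.einsum('hwji,hwi->hwj', confusion, p_true)

def trace_regularizer(confusion):
    """Mean trace of the per-pixel confusion matrices. Minimizing this term
    (alongside a data-fidelity loss on the noisy labels) pushes the model to
    attribute as much behavior as possible to annotator noise, which is the
    'maximally unreliable' principle the abstract refers to."""
    return np.einsum('hwii->hw', confusion).mean()

# Tiny demo: with identity confusion matrices (a perfectly reliable
# annotator), the predicted noisy labels equal the consensus labels.
H, W, C = 2, 2, 3
p_true = np.full((H, W, C), 1.0 / C)               # uniform consensus distribution
identity = np.broadcast_to(np.eye(C), (H, W, C, C)).copy()
p_noisy = annotator_noisy_distribution(p_true, identity)
```

In training, each annotator's cross-entropy loss on their own noisy masks is combined with this trace penalty, so the consensus network and the annotator networks disentangle signal from rater-specific error.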

Source

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10533416
DOI: http://dx.doi.org/10.1016/j.patcog.2023.109400

Publication Analysis

Top Keywords

expert consensus: 12
consensus label: 12
medical image: 8
multiple sclerosis: 8
segmentation: 6
learning multiple: 4
multiple annotators: 4
annotators medical: 4
image segmentation: 4
segmentation supervised: 4
