In many different domains, experts can make complex decisions after glancing very briefly at an image. However, the perceptual mechanisms underlying expert performance are still largely unknown. Recently, several machine learning algorithms have been shown to outperform human experts in specific tasks. But these algorithms often behave as black boxes, and their information processing pipeline remains unknown. This lack of transparency and interpretability is highly problematic in applications involving human lives, such as health care. One way to "open the black box" is to compute an artificial attention map from the model, which highlights the pixels of the input image that contributed the most to the model's decision. In this work, we directly compare human visual attention to machine visual attention when performing the same visual task. We designed a medical diagnosis task involving the detection of lesions in small bowel endoscopic images. We collected eye movements from novices and expert gastroenterologists while they classified medical images according to their relevance for Crohn's disease diagnosis. We trained three state-of-the-art deep learning models on our carefully labeled dataset. Both the humans and the models performed the same task. We extracted artificial attention with six different post hoc methods. We show that the model attention maps are significantly closer to human expert attention maps than to novices', especially for pathological images. As the model is trained and its performance approaches that of the human experts, the similarity between model and human attention increases. Through the understanding of the similarities between the visual decision-making process of human experts and deep neural networks, we hope to inform both the training of new doctors and the architecture of new algorithms.
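As a rough illustration of what comparing a human attention map to a model attention map can look like, the sketch below builds smoothed fixation maps from gaze points and scores their overlap with a stand-in model map using Pearson correlation. The abstract does not state which similarity measure or smoothing the authors used, so the metric, the Gaussian width `sigma`, the helper names `fixation_map` and `map_similarity`, and all toy coordinates are assumptions made purely for illustration.

```python
import numpy as np

def fixation_map(fixations, shape, sigma=2.0):
    """Turn (row, col) fixation points into a smooth map that sums to 1.

    Hypothetical helper: each fixation contributes an isotropic Gaussian.
    """
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    heat = np.zeros(shape, dtype=float)
    for r, c in fixations:
        heat += np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
    total = heat.sum()
    return heat / total if total > 0 else heat

def map_similarity(a, b):
    """Pearson correlation between two attention maps of the same shape."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Toy data, invented for this example only.
shape = (32, 32)
# Stand-in for a post hoc model attention map centred on a "lesion".
model_map = fixation_map([(10, 10)], shape, sigma=3.0)
# Expert gaze lands near the same region; novice gaze lands elsewhere.
expert_map = fixation_map([(11, 9), (10, 12)], shape)
novice_map = fixation_map([(25, 25), (5, 28)], shape)

expert_score = map_similarity(model_map, expert_map)
novice_score = map_similarity(model_map, novice_map)
print(f"model vs expert: {expert_score:.3f}")
print(f"model vs novice: {novice_score:.3f}")
```

In this toy setup the expert map overlaps the model map and the novice map does not, so the expert score comes out higher. With real data, the model map would come from a post hoc attribution method and the human maps from recorded eye movements.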

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11008746
DOI: http://dx.doi.org/10.1167/jov.24.4.6

Publication Analysis

Top Keywords

human experts: 12
attention: 9
deep neural: 8
human: 8
human attention: 8
artificial attention: 8
visual attention: 8
attention maps: 8
closer human: 8
experts: 5

Similar Publications

Background: Early Childhood Education and Care (ECEC) centers play an important role in fostering healthy dietary habits. The Nutrition Now project focuses on improving dietary habits during the first 1000 days of life. Central to the project is the implementation of an e-learning resource aimed at promoting feeding practices among staff and healthy dietary behaviours for children aged 0-3 years in ECEC.

Objectives: Pulp stones are ectopic calcifications located in pulp tissue. The aim of this study is to introduce a novel method for detecting pulp stones on panoramic radiography images using a deep learning-based two-stage pipeline architecture.

Materials And Methods: The first stage involved tooth localization with the YOLOv8 model, followed by pulp stone classification using ResNeXt.

This study presents TOM500, a comprehensive multi-organ annotated orbital magnetic resonance imaging (MRI) dataset. It includes clinical data, T2-weighted MRI scans, and corresponding segmentations from 500 patients with thyroid eye disease (TED) during their initial visit. TED is a common autoimmune disorder with distinct orbital MRI features.

Background: After lumbar spine surgery, a Core Outcome Set (COS) for acute pain is essential to ensure that the most meaningful outcomes are monitored consistently in the perioperative period. The aim of the present study was to reach consensus on a COS for assessing the efficacy of acute pain management for patients undergoing lumbar spinal surgery.

Method: A modified Delphi procedure was conducted among a national (Dutch) expert panel.

Purpose: The development of the Diabetic Wound Assessment Learning Tool (DiWALT) has previously been described. However, an examination of its application to a larger, more heterogeneous group of participants is lacking. In order to allow for a more robust assessment of the psychometric properties of the DiWALT, we applied it to a broader group of participants.
