Ultrasound Image Representation Learning by Modeling Sonographer Visual Attention.

Richard Droste, Yifan Cai, Harshita Sharma, Pierre Chatelain, Lior Drukker, Aris T Papageorghiou, J Alison Noble

Inf Process Med Imaging

Department of Engineering Science, University of Oxford, UK.

Published: June 2019

Image representations are commonly learned from class labels, which are a simplistic approximation of human image understanding. In this paper we demonstrate that transferable representations of images can be learned without manual annotations by modeling human visual attention. The basis of our analyses is a unique gaze tracking dataset of sonographers performing routine clinical fetal anomaly screenings. Models of sonographer visual attention are learned by training a convolutional neural network (CNN) to predict gaze on ultrasound video frames through visual saliency prediction or gaze-point regression. We evaluate the transferability of the learned representations to the task of ultrasound standard plane detection in two contexts. Firstly, we perform transfer learning by fine-tuning the CNN with a limited number of labeled standard plane images. We find that fine-tuning the saliency predictor is superior to training from random initialization, with an average F1-score improvement of 9.6% overall and 15.3% for the cardiac planes. Secondly, we train a simple softmax regression on the feature activations of each CNN layer in order to evaluate the representations independently of transfer learning hyper-parameters. We find that the attention models derive strong representations, approaching the precision of a fully-supervised baseline model for all but the last layer.
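The layer-wise evaluation described in the abstract (a softmax regression trained on frozen feature activations) can be illustrated with a minimal sketch. This is not the authors' code: the toy 2-D "features" below are a hypothetical stand-in for CNN layer activations, and the function names, learning rate, and epoch count are assumptions chosen only for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(weights, biases, x):
    """Linear scores, one per class, for a single feature vector."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
            for w, b in zip(weights, biases)]


def train_softmax_probe(features, labels, num_classes, lr=0.1, epochs=300):
    """Fit softmax regression on fixed features with batch gradient descent."""
    dim = len(features[0])
    n = len(features)
    weights = [[0.0] * dim for _ in range(num_classes)]
    biases = [0.0] * num_classes
    for _ in range(epochs):
        grad_w = [[0.0] * dim for _ in range(num_classes)]
        grad_b = [0.0] * num_classes
        for x, y in zip(features, labels):
            probs = softmax(class_scores(weights, biases, x))
            for c in range(num_classes):
                # Gradient of cross-entropy w.r.t. the class-c logit.
                err = probs[c] - (1.0 if c == y else 0.0)
                grad_b[c] += err
                for d in range(dim):
                    grad_w[c][d] += err * x[d]
        for c in range(num_classes):
            biases[c] -= lr * grad_b[c] / n
            for d in range(dim):
                weights[c][d] -= lr * grad_w[c][d] / n
    return weights, biases


def predict(weights, biases, x):
    """Return the index of the highest-scoring class."""
    scores = class_scores(weights, biases, x)
    return max(range(len(scores)), key=scores.__getitem__)


def make_toy_features(num_per_class=30, seed=0):
    """Hypothetical stand-in for layer activations: three Gaussian blobs."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
    feats, labs = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(num_per_class):
            feats.append([cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)])
            labs.append(label)
    return feats, labs


features, labels = make_toy_features()
weights, biases = train_softmax_probe(features, labels, num_classes=3)
accuracy = sum(predict(weights, biases, x) == y
               for x, y in zip(features, labels)) / len(labels)
print(f"probe training accuracy: {accuracy:.2f}")
```

In an actual evaluation of this kind, the feature vectors would be the (flattened or pooled) activations of one CNN layer with the network weights held fixed, and one such probe would be trained per layer; because the probe has no hidden layers, its score reflects how linearly separable the classes are in that layer's representation.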

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6986905
DOI: http://dx.doi.org/10.1007/978-3-030-20351-1_46

Publication Analysis

Top Keywords

visual attention

sonographer visual

standard plane

transfer learning

representations

ultrasound image

image representation

representation learning

learning modeling

modeling sonographer

Similar Publications

Leveraging two-dimensional pre-trained vision transformers for three-dimensional model generation via masked autoencoders.

Sci Rep

January 2025

Department of Electrical Power, Adama Science and Technology University, Adama, 1888, Ethiopia.

Muhammad Sajid Kaleem Razzaq Malik Ateeq Ur Rehman Tauqeer Safdar Malik Masoud Alajmi

Although the Transformer architecture has established itself as the standard for natural language processing tasks, it still has few applications in computer vision. In vision, attention is used either in conjunction with convolutional networks or to replace individual components of convolutional networks while preserving the overall network design. Differences between the two domains, such as the large variation in the scale of visual entities and the higher granularity of pixels in images compared to words in text, make it difficult to transfer the Transformer from language to vision.

Colorectal cancer detection with enhanced precision using a hybrid supervised and unsupervised learning approach.

Sci Rep

January 2025

Ministry of Higher Education, Mataria Technical College, Cairo, 11718, Egypt.

Akella S Narasimha Raju K Venkatesh Ranjith Kumar Gatla Eswara Prasad Konakalla Marwa M Eid

The current work introduces a hybrid ensemble framework for the detection and segmentation of colorectal cancer. The framework incorporates both supervised classification and unsupervised clustering methods to produce more understandable and accurate diagnostic results. The method comprises several steps that combine CNN models (ADa-22 and AD-22), transformer networks, and an SVM classifier, all integrated into the framework.

Clinical features, diagnosis, management, and prognosis of circumscribed choroidal hemangioma.

Surv Ophthalmol

January 2025

Department of Ophthalmology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, 100730, China; Key Lab of Ocular Fundus Diseases, Chinese Academy of Medical Sciences, Beijing 100730, China.

Zuyi Yang Dianzhe Tian Zhixuan Xie Tiantian Cheng Youxin Chen

Because of its benign nature and rarity, circumscribed choroidal hemangioma (CCH) often receives limited attention, leading to a high rate of misdiagnosis and a lack of standardized treatment protocols. We provide a thorough clarification of the demographics, clinical features, diagnosis, management, and prognosis of CCH. We conducted a systematic search of the PubMed, EMBASE, and Ovid databases up to December 2023 to identify relevant studies.

Discovery of an Enzyme-Activated Fluorogenic Probe for Profiling of Acylaminoacyl-Peptide Hydrolase.

Anal Chem

January 2025

Department of Laboratory Medicine, School of Medicine, Yangtze University, Jingzhou 434023, P.R. China.

Shi-Yu Liu Huiling Wang Yue-Yang Zhang Le-Yu Huang

Acylaminoacyl-peptide hydrolase (APEH), a serine peptidase that belongs to the prolyl oligopeptidase (POP) family, catalyzes the removal of N-terminal acetylated amino acid residues from peptides. As a key regulator of protein N-terminal acetylation, APEH is involved in many important physiological processes, and its aberrant expression is correlated with the progression of various diseases such as inflammation, diabetes, Alzheimer's disease (AD), and cancers. However, although APEH-related disease diagnosis and drug discovery are attracting growing attention, the mechanisms linking APEH to disease progression remain unclear; further investigation of the physiological role and function of APEH is therefore of great importance.

An Explainable CNN and Vision Transformer-Based Approach for Real-Time Food Recognition.

Nutrients

January 2025

Department of Computer Engineering, Inje University, Gimhae 50834, Republic of Korea.

Kintoh Allen Nfor Tagne Poupi Theodore Armand Kenesbaeva Periyzat Ismaylovna Moon-Il Joo Hee-Cheol Kim

Background: Food image recognition, a crucial step in computational gastronomy, has diverse applications across nutritional platforms. Convolutional neural networks (CNNs) are widely used for this task due to their ability to capture hierarchical features. However, they struggle with long-range dependencies and global feature extraction, which are vital in distinguishing visually similar foods or images where the context of the whole dish is crucial, thus necessitating transformer architecture.
