Image representations are commonly learned from class labels, which are a simplistic approximation of human image understanding. In this paper we demonstrate that transferable representations of images can be learned without manual annotations by modeling human visual attention. The basis of our analyses is a unique gaze tracking dataset of sonographers performing routine clinical fetal anomaly screenings. Models of sonographer visual attention are learned by training a convolutional neural network (CNN) to predict gaze on ultrasound video frames through visual saliency prediction or gaze-point regression. We evaluate the transferability of the learned representations to the task of ultrasound standard plane detection in two contexts. Firstly, we perform transfer learning by fine-tuning the CNN with a limited number of labeled standard plane images. We find that fine-tuning the saliency predictor is superior to training from random initialization, with an average F1-score improvement of 9.6% overall and 15.3% for the cardiac planes. Secondly, we train a simple softmax regression on the feature activations of each CNN layer in order to evaluate the representations independently of transfer learning hyper-parameters. We find that the attention models derive strong representations, approaching the precision of a fully-supervised baseline model for all but the last layer.
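The second evaluation described in the abstract, fitting a softmax regression on the frozen activations of each layer, can be illustrated with a minimal sketch. This is not the authors' implementation: the activations are synthetic Gaussian blobs standing in for real CNN features, and the layer names, `separation` values, and all function names are hypothetical choices made for illustration. Only the idea of scoring a linear softmax classifier per layer with an F1 metric comes from the text.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def fit_softmax_probe(feats, labels, n_classes, lr=0.1, epochs=300, l2=1e-4):
    """Fit multinomial logistic regression on fixed features by gradient descent."""
    n, d = feats.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        probs = softmax(feats @ W + b)
        grad = (probs - onehot) / n          # gradient of mean cross-entropy w.r.t. logits
        W -= lr * (feats.T @ grad + l2 * W)  # small L2 penalty on the weights
        b -= lr * grad.sum(axis=0)
    return W, b

def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1 scores."""
    per_class = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(per_class))

def fake_layer_activations(rng, separation, n_per_class=200, n_classes=3, dim=16):
    # ASSUMPTION: synthetic stand-in for one layer's activations.
    # Larger `separation` mimics a layer whose features separate the classes better.
    centers = separation * rng.normal(size=(n_classes, dim))
    labels = np.repeat(np.arange(n_classes), n_per_class)
    feats = centers[labels] + rng.normal(size=(labels.size, dim))
    return feats, labels

def probe_layer(feats, labels, n_classes, rng, train_frac=0.7):
    """Train the probe on a random split and return macro F1 on the held-out part."""
    order = rng.permutation(labels.size)
    cut = int(train_frac * labels.size)
    train, test = order[:cut], order[cut:]
    # Standardise using training statistics only.
    mean = feats[train].mean(axis=0)
    std = feats[train].std(axis=0) + 1e-8
    norm = (feats - mean) / std
    W, b = fit_softmax_probe(norm[train], labels[train], n_classes)
    preds = np.argmax(norm[test] @ W + b, axis=1)
    return macro_f1(labels[test], preds, n_classes)

rng = np.random.default_rng(0)
# Hypothetical layers with hand-picked separations; not taken from the paper.
layer_separation = {"layer_a": 0.1, "layer_b": 1.0, "layer_c": 3.0}
scores = {}
for name, sep in layer_separation.items():
    feats, labels = fake_layer_activations(rng, sep)
    scores[name] = probe_layer(feats, labels, 3, rng)
    print(f"{name}: macro F1 = {scores[name]:.3f}")
```

Because the probe has no hidden layers and the CNN weights stay frozen, the resulting score reflects how linearly separable each layer's features are, which is why this kind of evaluation does not depend on fine-tuning hyper-parameters.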
Full-text link | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6986905 | PMC |
http://dx.doi.org/10.1007/978-3-030-20351-1_46 | DOI |
Sci Rep
January 2025
Department of Electrical Power, Adama Science and Technology University, Adama, 1888, Ethiopia.
Although the Transformer architecture has become the de facto standard for natural language processing tasks, its applications in computer vision remain limited. In vision, attention is either used in conjunction with convolutional networks or used to replace individual components of convolutional networks while preserving the overall network design. Differences between the two domains, such as large variations in the scale of visual entities and the much finer granularity of pixels in images compared with words in text, make it difficult to transfer the Transformer from language to vision.
Sci Rep
January 2025
Ministry of Higher Education, Mataria Technical College, Cairo, 11718, Egypt.
This work introduces a hybrid ensemble framework for the detection and segmentation of colorectal cancer. The framework combines supervised classification and unsupervised clustering methods to produce more understandable and accurate diagnostic results. The method comprises several steps that integrate the CNN models ADa-22 and AD-22, transformer networks, and an SVM classifier.
Surv Ophthalmol
January 2025
Department of Ophthalmology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, 100730, China; Key Lab of Ocular Fundus Diseases, Chinese Academy of Medical Sciences, Beijing 100730, China.
Because of its benign nature and rarity, circumscribed choroidal hemangioma (CCH) often receives limited attention, leading to a high rate of misdiagnosis and a lack of standardized treatment protocols. We provide a thorough clarification of the demographics, clinical features, diagnosis, management, and prognosis of CCH. We conducted a systematic search of the PubMed, EMBASE, and Ovid databases up to December 2023 to identify relevant studies.
Anal Chem
January 2025
Department of Laboratory Medicine, School of Medicine, Yangtze University, Jingzhou 434023, P.R. China.
Acylaminoacyl-peptide hydrolase (APEH), a serine peptidase that belongs to the prolyl oligopeptidase (POP) family, catalyzes the removal of N-terminal acetylated amino acid residues from peptides. As a key regulator of protein N-terminal acetylation, APEH is involved in many important physiological processes, and its aberrant expression is correlated with the progression of various diseases such as inflammation, diabetes, Alzheimer's disease (AD), and cancers. However, although APEH has attracted growing attention in disease diagnosis and drug discovery, the mechanisms linking APEH to disease progression remain unclear; further investigation of the physiological role and function of APEH is therefore of great importance.
Nutrients
January 2025
Department of Computer Engineering, Inje University, Gimhae 50834, Republic of Korea.
Background: Food image recognition, a crucial step in computational gastronomy, has diverse applications across nutritional platforms. Convolutional neural networks (CNNs) are widely used for this task due to their ability to capture hierarchical features. However, they struggle with long-range dependencies and global feature extraction, which are vital in distinguishing visually similar foods or images where the context of the whole dish is crucial, thus necessitating transformer architecture.