Fine-Grained Self-Supervised Learning with Jigsaw puzzles for medical image classification.

Comput Biol Med

Department of Software, Ajou University, Republic of Korea; Department of Computer Engineering, Ajou University, Republic of Korea.

Published: May 2024

Classifying fine-grained lesions is challenging because the differences between them in medical images are minor and subtle, making such features very difficult for deep neural networks to learn. Therefore, in this paper, we introduce a Fine-Grained Self-Supervised Learning (FG-SSL) method for classifying subtle lesions in medical images. The proposed method progressively trains the model through hierarchical blocks so that the cross-correlation between the fine-grained Jigsaw puzzle and the regularized original image is close to the identity matrix. We also apply these hierarchical blocks, which extract different information at each step of progressive fine-grained learning, to supervised learning for discovering subtle differences. Our method requires neither an asymmetric model nor a negative sampling strategy, and is not sensitive to batch size. We evaluate the proposed fine-grained self-supervised learning method in comprehensive experiments on various medical image recognition datasets. In these experiments, the proposed method performs favorably against existing state-of-the-art approaches on the widely used ISIC2018, APTOS2019, and ISIC2017 datasets.
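The cross-correlation objective described in the abstract, driving the correlation matrix between two views toward the identity, resembles a Barlow Twins-style redundancy-reduction loss. The sketch below is an illustrative approximation, not the authors' code; the embeddings `z_a` and `z_b` stand in for features of the jigsaw-shuffled and original images, and the weight `lam` is a hypothetical hyperparameter.

```python
import numpy as np

def cross_correlation_loss(z_a, z_b, lam=5e-3):
    """Barlow Twins-style loss: cross-correlation of two views -> identity."""
    # Standardize each embedding dimension over the batch
    z_a = (z_a - z_a.mean(0)) / (z_a.std(0) + 1e-8)
    z_b = (z_b - z_b.mean(0)) / (z_b.std(0) + 1e-8)
    n = z_a.shape[0]
    c = z_a.T @ z_b / n                                   # (d, d) cross-correlation
    on_diag = ((np.diag(c) - 1.0) ** 2).sum()             # pull diagonal toward 1
    off_diag = (c ** 2).sum() - (np.diag(c) ** 2).sum()   # push off-diagonal toward 0
    return on_diag + lam * off_diag
```

When the two views carry the same per-dimension information, the diagonal of `c` is near 1 and the loss is small; decorrelated embedding dimensions keep the off-diagonal term low.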

DOI: http://dx.doi.org/10.1016/j.compbiomed.2024.108460

Publication Analysis

Top Keywords

fine-grained self-supervised (12); self-supervised learning (8); medical image (8); fine-grained lesions (8); subtle differences (8); medical images (8); proposed method (8); hierarchical block (8); fine-grained (7); learning (5)

Similar Publications

Traditional multimodal contrastive learning brings a text and its corresponding image closer together as a positive pair, where the text typically consists of fixed sentence structures or specific descriptive statements, and the image features are generally global features (with some fine-grained work using local features). Similar to unimodal self-supervised contrastive learning, this approach can be seen as enforcing a strict identity constraint in a multimodal context. However, remote sensing images are inherently complex: they cannot be easily described in a single sentence, and they contain rich ancillary information beyond object features alone, so this strict identity constraint may be insufficient.
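The "strict identity constraint" mentioned here is commonly realized as an InfoNCE-style objective in which only the matching (i, i) image-text pair is treated as positive. A minimal sketch follows, assuming paired image and text embeddings; all names and the temperature value are illustrative, not taken from the publication.

```python
import numpy as np

def info_nce(img_emb, txt_emb, temperature=0.07):
    """Contrastive loss where each image matches only its own caption."""
    # Cosine-normalize so dot products are cosine similarities
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature        # (N, N) pairwise similarities
    # Strict identity constraint: the positive target is the diagonal
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    idx = np.arange(len(logits))
    return -log_prob[idx, idx].mean()
```

The diagonal-only target is exactly what the snippet argues may be too rigid for remote sensing, where one image can plausibly match many descriptions.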

Article Synopsis
  • Diffusion models can improve image generation in specialized fields like histopathology and satellite imagery by utilizing self-supervised learning (SSL) embeddings as stand-ins for human labels, which are hard to obtain.
  • This new method allows for high-quality images to be created from these embeddings, and it can even generate larger images by combining smaller patches while maintaining their spatial consistency.
  • The approach enhances classifier performance on both small patch-level and larger scale classification tasks and shows strong adaptability, successfully working with unseen datasets and different input sources, including text descriptions for image synthesis.

Foundation vision-language models are currently transforming computer vision and are on the rise in medical imaging, fueled by their promising generalization capabilities. However, initial attempts to transfer this new paradigm to medical imaging have shown less impressive performance than in other domains, due to the significant domain shift and the complex expert knowledge inherent to medical-imaging tasks. Motivated by the need for domain-expert foundation models, we present FLAIR, a pre-trained vision-language model for universal retinal fundus image understanding.


Molecular representation learning is of great importance for drug molecular analysis. Self-supervised pre-training strategies have shown great promise for overcoming the scarcity of labeled molecular property data. Recent studies concentrate on pre-training molecular representation encoders by integrating both 2D topological and 3D geometric structures.


Semi-supervised Action Quality Assessment (AQA), which uses limited labeled and massive unlabeled samples to achieve high-quality assessment, is an attractive but challenging task. The main challenge lies in how to exploit solid and consistent representations of action sequences to build a bridge between labeled and unlabeled samples. To address this issue, we propose a Self-supervised sub-Action Parsing Network (SAP-Net) that employs a teacher-student network structure to learn consistent semantic representations between labeled and unlabeled samples for semi-supervised AQA.
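Teacher-student structures of the kind mentioned above are often realized with an exponential moving average (EMA) teacher plus a consistency loss on unlabeled data. The following is a generic sketch of that pattern, not SAP-Net itself; the momentum value and function names are assumptions.

```python
import numpy as np

def ema_update(teacher, student, momentum=0.99):
    """Teacher weights track a slow exponential moving average of the student's."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]

def consistency_loss(student_logits, teacher_logits):
    """Mean-squared consistency between the two networks' predictions."""
    return float(np.mean((student_logits - teacher_logits) ** 2))
```

On unlabeled samples the teacher's predictions act as a stable target for the student, while labeled samples are trained with the ordinary supervised loss.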

