Fine-art recognition using convolutional transformers.

PeerJ Comput Sci

School of Arts, Chongqing University, Chongqing, China.

Published: October 2024

AI Article Synopsis

  • The authors developed an improved fine-art recognition system based on convolutional transformers, whose attention mechanism enhances focused learning on the data, with the aim of improving on existing recognition systems.
  • Comparative experiments showed that the new model outperformed models built on four popular pre-trained networks (ResNet50, VGG16, AlexNet, ViT), demonstrating the effectiveness of convolutional transformers for image feature learning.

Article Abstract

Digital image processing is a constantly evolving field encompassing a wide range of techniques and applications. Researchers worldwide are continually developing algorithms across multiple fields to achieve accurate image classification. Advanced computer vision algorithms are crucial for architectural and artistic analysis. The digitalization of art has significantly enhanced the accessibility and conservation of fine-art paintings, yet the risk of art theft remains a significant challenge. Improving art security requires the precise identification of fine-art paintings. Although current recognition systems have shown potential, there is significant scope for enhancing their efficiency. We developed an improved recognition system for categorizing fine-art paintings using convolutional transformers, which are equipped with an attention mechanism to enhance focused learning on the data. Among the most advanced architectures in the deep learning family, transformers are powered by a multi-head attention mechanism, which improves learning efficiency. To assess the performance of our model, we compared it with models built on four pre-trained networks: ResNet50, VGG16, AlexNet, and ViT. Each pre-trained network was integrated into a corresponding state-of-the-art model as its first processing blocks. These four state-of-the-art models were constructed under the transfer learning strategy, one of the most commonly used approaches in this field. The experimental results showed that our proposed system outperformed the other models. Our study also highlighted the effectiveness of using convolutional transformers for learning image features.
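The abstract does not give architectural details, so the following is only a minimal, dependency-free sketch of the general idea behind a convolutional transformer: a strided convolution turns an image into a sequence of tokens, and multi-head self-attention then mixes those tokens. It is not the authors' model. The function names (`conv_tokenize`, `multi_head_self_attention`), the square hand-written kernels, and the omission of learned query/key/value projections are all assumptions made for illustration.

```python
import math


def conv_tokenize(image, kernels, stride):
    """Slide each square kernel over a 2D image (valid padding).

    Returns one token per spatial position; each token holds one
    feature per kernel. Kernels are fixed here, not learned (assumption).
    """
    k = len(kernels[0])
    rows, cols = len(image), len(image[0])
    tokens = []
    for r in range(0, rows - k + 1, stride):
        for c in range(0, cols - k + 1, stride):
            tokens.append([
                sum(kern[i][j] * image[r + i][c + j]
                    for i in range(k) for j in range(k))
                for kern in kernels
            ])
    return tokens


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


def multi_head_self_attention(tokens, num_heads):
    """Split each token into num_heads slices, attend per head, concatenate.

    Learned Q/K/V and output projections are left out (identity), so this
    shows only the head-splitting and attention steps.
    """
    dim = len(tokens[0])
    if dim % num_heads != 0:
        raise ValueError("token dimension must be divisible by num_heads")
    head_dim = dim // num_heads
    heads = []
    for h in range(num_heads):
        part = [t[h * head_dim:(h + 1) * head_dim] for t in tokens]
        heads.append(attention(part, part, part))
    # Re-join the per-head outputs for every token position.
    return [[x for head in heads for x in head[i]] for i in range(len(tokens))]
```

In a trained model the kernels and the attention projections would be learned, and the attention block would be followed by normalization, a feed-forward layer, and a classification head; those parts are outside the scope of this sketch.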

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11622946
DOI: http://dx.doi.org/10.7717/peerj-cs.2409

Publication Analysis

Top Keywords

convolutional transformers (12)
fine-art paintings (12)
attention mechanism (8)
learning (5)
fine-art (4)
fine-art recognition (4)
recognition convolutional (4)
transformers (4)
transformers digital (4)
digital image (4)

Similar Publications

Objective: Detecting and measuring changes in longitudinal fundus imaging is key to monitoring disease progression in chronic ophthalmic diseases, such as glaucoma and macular degeneration. Clinicians assess changes in disease status by either independently reviewing or manually juxtaposing longitudinally acquired color fundus photos (CFPs). Distinguishing variations in image acquisition due to camera orientation, zoom, and exposure from true disease-related changes can be challenging.

Motor Imagery (MI) electroencephalographic (EEG) signal classification is a pioneering research area essential for mobility rehabilitation. This paper proposes an end-to-end hybrid deep network, the "Spatio Temporal Inception Transformer Network (STIT-Net)", for MI classification. The Discrete Wavelet Transform (DWT) is used to derive the alpha (8-13 Hz) and beta (13-30 Hz) EEG sub-bands, which are dominant during motor tasks, to enhance the performance of the proposed model.

Single-pixel imaging (SPI) using deep learning networks, e.g., convolutional neural networks (CNNs) and vision transformers (ViTs), has made significant progress.

Background: Early detection and diagnosis of cancer are vital to improving outcomes for patients. Artificial intelligence (AI) models have shown promise in the early detection and diagnosis of cancer, but there is limited evidence on methods that fully exploit the longitudinal data stored within electronic health records (EHRs). This review aims to summarise methods currently utilised for prediction of cancer from longitudinal data and provides recommendations on how such models should be developed.

Dilated SE-DenseNet for brain tumor MRI classification.

Sci Rep

January 2025

Department of Applied Mathematics, University of Waterloo, Waterloo, ON, N2L 3G1, Canada.

In the field of medical imaging, particularly MRI-based brain tumor classification, we propose an advanced convolutional neural network (CNN) leveraging the DenseNet-121 architecture, enhanced with dilated convolutional layers and Squeeze-and-Excitation (SE) networks' attention mechanisms. This novel approach aims to improve upon state-of-the-art methods of tumor identification. Our model, trained and evaluated on a comprehensive Kaggle brain tumor dataset, demonstrated superior performance over established convolution-based and transformer-based models: ResNet-101, VGG-19, original DenseNet-121, MobileNet-V2, ViT-L/16, and Swin-B across key metrics: F1-score, accuracy, precision, and recall.
