Fine-art recognition using convolutional transformers.

PeerJ Comput Sci

School of Arts, Chongqing University, Chongqing, China.

Published: October 2024

* An improved recognition system built on convolutional transformers uses an attention mechanism to focus learning on the data more effectively than existing methods.
* Comparative experiments showed that the new model outperformed four popular pre-trained networks (ResNet50, VGG16, AlexNet, ViT), demonstrating the effectiveness of convolutional transformers for image feature learning.

Digital image processing is a constantly evolving field encompassing a wide range of techniques and applications. Researchers worldwide continue to develop algorithms across many fields to achieve accurate image classification, and advanced computer vision algorithms are crucial for architectural and artistic analysis. The digitalization of art has significantly improved the accessibility and conservation of fine-art paintings, yet art theft remains a significant challenge. Improving art security requires the precise identification of fine-art paintings. Although current recognition systems have shown potential, there is considerable scope for improving their efficiency. We developed an improved recognition system for categorizing fine-art paintings using convolutional transformers, which rely on an attention mechanism to focus learning on the data. Transformers, among the most advanced architectures in the deep learning family, are built around a multi-head attention mechanism that improves learning efficiency. To assess the performance of our model, we compared it with models built on four pre-trained networks: ResNet50, VGG16, AlexNet, and ViT. Each pre-trained network was integrated into a corresponding state-of-the-art model as its first processing blocks. These four state-of-the-art models were constructed using transfer learning, one of the most commonly used approaches in this field. The experimental results showed that our proposed system outperformed the other models. Our study also highlighted the effectiveness of convolutional transformers for learning image features.
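The multi-head attention mechanism mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: it covers only the attention step, uses plain Python lists, and substitutes random projection matrices for learned weights. The function names (`attention`, `multi_head_self_attention`, `random_matrix`) and the toy sizes (4 tokens, 8 features, 2 heads) are assumptions made for illustration. In a convolutional transformer the input tokens would typically come from convolutional feature maps rather than random numbers.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention; returns (output, attention weights)."""
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights: uniform random values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def multi_head_self_attention(tokens, num_heads, rng):
    """Run one attention pass per head, concatenate the heads, then mix them."""
    d_model = len(tokens[0])
    if d_model % num_heads != 0:
        raise ValueError("feature size must be divisible by num_heads")
    d_head = d_model // num_heads
    head_outputs = []
    for _ in range(num_heads):
        # Each head has its own query, key, and value projections.
        w_q = random_matrix(d_model, d_head, rng)
        w_k = random_matrix(d_model, d_head, rng)
        w_v = random_matrix(d_model, d_head, rng)
        out, _ = attention(matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v))
        head_outputs.append(out)
    # Concatenate head outputs along the feature axis, token by token.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(tokens))]
    w_o = random_matrix(d_model, d_model, rng)  # output projection
    return matmul(concat, w_o)


rng = random.Random(0)
tokens = random_matrix(4, 8, rng)  # 4 toy "patch" tokens with 8 features each
out = multi_head_self_attention(tokens, num_heads=2, rng=rng)
print(len(out), len(out[0]))
```

Each head attends to the same tokens through its own projections, so different heads can weight different parts of the input; the output keeps the input shape (one 8-feature vector per token).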

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11622946
DOI: http://dx.doi.org/10.7717/peerj-cs.2409

Similar Publications

EyeLiner: A Deep Learning Pipeline for Longitudinal Image Registration Using Fundus Landmarks.

Ophthalmol Sci

November 2024

Department of Ophthalmology, University of Colorado Anschutz Medical Campus, Aurora, Colorado.

Yoga Advaith Veturi Steve McNamara Scott Kinder Christopher William Clark Upasana Thakuria

Objective: Detecting and measuring changes in longitudinal fundus imaging is key to monitoring disease progression in chronic ophthalmic diseases, such as glaucoma and macular degeneration. Clinicians assess changes in disease status by either independently reviewing or manually juxtaposing longitudinally acquired color fundus photos (CFPs). Distinguishing variations in image acquisition due to camera orientation, zoom, and exposure from true disease-related changes can be challenging.

STIT-Net- A Wavelet based Convolutional Transformer Model for Motor Imagery EEG Signal Classification in the Sensorimotor Bands.

Clin EEG Neurosci

January 2025

Department of Electronics and Communication Engineering, Mepco Schlenk Engineering College, Sivakasi, India.

Chrisilla S R Shantha SelvaKumari

Motor Imagery (MI) electroencephalographic (EEG) signal classification is a pioneering research area essential for mobility rehabilitation. This paper proposes an end-to-end hybrid deep network, the "Spatio Temporal Inception Transformer Network (STIT-Net)", for MI classification. To enhance the performance of the proposed model, the Discrete Wavelet Transform (DWT) is used to derive the alpha (8-13 Hz) and beta (13-30 Hz) EEG sub-bands, which are dominant during motor tasks.

Mobile-friendly under-sampling single-pixel imaging based on a lightweight hybrid CNN-ViT architecture.

Opt Express

December 2024

Wenjie Jiang Jinze Song Zexi Chen Shuai Qu

Single-pixel imaging (SPI) using deep learning networks, e.g., convolutional neural networks (CNNs) and vision transformers (ViTs), has made significant progress.

Artificial intelligence methods applied to longitudinal data from electronic health records for prediction of cancer: a scoping review.

BMC Med Res Methodol

January 2025

Leeds Institute of Clinical Trials Research, University of Leeds, Clarendon Way, Leeds, LS2 9NL, UK.

Victoria Moglia Owen Johnson Gordon Cook Marc de Kamps Lesley Smith

Background: Early detection and diagnosis of cancer are vital to improving outcomes for patients. Artificial intelligence (AI) models have shown promise in the early detection and diagnosis of cancer, but there is limited evidence on methods that fully exploit the longitudinal data stored within electronic health records (EHRs). This review aims to summarise methods currently utilised for prediction of cancer from longitudinal data and provides recommendations on how such models should be developed.

Dilated SE-DenseNet for brain tumor MRI classification.

Sci Rep

January 2025

Department of Applied Mathematics, University of Waterloo, Waterloo, ON, N2L 3G1, Canada.

Yuannong Mao Jiwook Kim Lena Podina Mohammad Kohandel

In the field of medical imaging, particularly MRI-based brain tumor classification, we propose an advanced convolutional neural network (CNN) leveraging the DenseNet-121 architecture, enhanced with dilated convolutional layers and Squeeze-and-Excitation (SE) networks' attention mechanisms. This novel approach aims to improve upon state-of-the-art methods of tumor identification. Our model, trained and evaluated on a comprehensive Kaggle brain tumor dataset, demonstrated superior performance over established convolution-based and transformer-based models: ResNet-101, VGG-19, original DenseNet-121, MobileNet-V2, ViT-L/16, and Swin-B across key metrics: F1-score, accuracy, precision, and recall.
