Multi-Modality Multi-Attribute Contrastive Pre-Training for Image Aesthetics Computing.

Yipo Huang Leida Li Pengfei Chen Haoning Wu Weisi Lin Guangming Shi

IEEE Trans Pattern Anal Mach Intell

Published: November 2024

In the Image Aesthetics Computing (IAC) field, most prior methods leveraged the off-the-shelf backbones pre-trained on the large-scale ImageNet database. While these pre-trained backbones have achieved notable success, they often overemphasize object-level semantics and fail to capture the high-level concepts of image aesthetics, which may only achieve suboptimal performances. To tackle this long-neglected problem, we propose a multi-modality multi-attribute contrastive pre-training framework, targeting at constructing an alternative to ImageNet-based pre-training for IAC. Specifically, the proposed framework consists of two main aspects. (1) We build a multi-attribute image description database with human feedback, leveraging the competent image understanding capability of the multi-modality large language model to generate rich aesthetic descriptions. (2) To better adapt models to aesthetic computing tasks, we integrate the image-based visual features with the attribute-based text features, and map the integrated features into different embedding spaces, based on which the multi-attribute contrastive learning is proposed for obtaining more comprehensive aesthetic representation. To alleviate the distribution shift encountered when transitioning from the general visual domain to the aesthetic domain, we further propose a semantic affinity loss to restrain the content information and enhance model generalization. Extensive experiments demonstrate that the proposed framework sets new state-of-the-arts for IAC tasks. The code, database and pre-trained weights will be available at https://github.com/yipoh/AesNet.

Download full-text PDF	Source
http://dx.doi.org/10.1109/TPAMI.2024.3492259	DOI Listing

Publication Analysis

Top Keywords

multi-attribute contrastive

image aesthetics

multi-modality multi-attribute

contrastive pre-training

aesthetics computing

database pre-trained

proposed framework

image

pre-training image

computing image

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!