Cross-modality face image synthesis, such as sketch-to-photo, NIR-to-RGB, and RGB-to-depth, has wide applications in face recognition, face animation, and digital entertainment. Conventional cross-modality synthesis methods usually require paired training data, i.e., each subject has images in both modalities. However, paired data can be difficult to acquire, while unpaired data are common. In this paper, we propose a novel semi-supervised cross-modality synthesis method (namely CMOS-GAN) that can leverage both paired and unpaired face images to learn a robust cross-modality synthesis model. Specifically, CMOS-GAN uses a generator with an encoder-decoder architecture for new-modality synthesis. We leverage pixel-wise loss, adversarial loss, classification loss, and face feature loss to exploit the information in both paired multi-modality face images and unpaired face images for model learning. In addition, since we expect the synthesized modality to also help improve face recognition accuracy, we further use a modified triplet loss to retain the discriminative features of the subject in the synthetic modality. Experiments on three cross-modality face synthesis tasks (NIR-to-VIS, RGB-to-depth, and sketch-to-photo) show the effectiveness of the proposed approach compared with the state of the art. We also collect a large-scale RGB-D dataset (VIPL-MumoFace-3K) for the RGB-to-depth synthesis task, and we plan to open-source our code and the VIPL-MumoFace-3K dataset to the community.
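
The abstract names five training losses. The following is a minimal PyTorch-style sketch of how they might be combined on a paired batch; the module names (generator G, discriminator D, identity classifier C, face-feature extractor F_face), the hinge-style adversarial term, and the loss weights are illustrative assumptions, not the authors' released implementation.

import torch.nn.functional as F_nn

def paired_generator_loss(G, D, C, F_face, x_src, y_tgt, labels,
                          w_pix=1.0, w_adv=1.0, w_cls=1.0, w_feat=1.0):
    # Hypothetical composition of the losses named in the abstract.
    y_hat = G(x_src)                                   # synthesize the target modality
    loss_pix = F_nn.l1_loss(y_hat, y_tgt)              # pixel-wise loss
    loss_adv = -D(y_hat).mean()                        # adversarial term (hinge/WGAN-style critic)
    loss_cls = F_nn.cross_entropy(C(y_hat), labels)    # classification loss on synthetic images
    loss_feat = F_nn.l1_loss(F_face(y_hat), F_face(y_tgt))  # face feature loss
    return (w_pix * loss_pix + w_adv * loss_adv
            + w_cls * loss_cls + w_feat * loss_feat)

def triplet_term(F_face, syn_anchor, real_pos, real_neg, margin=0.3):
    # Modified triplet loss: pull the synthetic image's features toward a real
    # image of the same subject and away from a different subject.
    return F_nn.triplet_margin_loss(F_face(syn_anchor), F_face(real_pos),
                                    F_face(real_neg), margin=margin)

In the semi-supervised setting, unpaired images can presumably contribute only the terms that do not require a ground-truth target image (e.g., the adversarial and classification terms).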


Source: http://dx.doi.org/10.1109/TIP.2022.3226413

Publication Analysis

Top Keywords

cross-modality face (12)
cross-modality synthesis (12)
face images (12)
face (10)
face image (8)
synthesis (8)
image synthesis (8)
face recognition (8)
unpaired face (8)
synthetic modality (8)

Similar Publications

This study investigated how exposure to Caucasian and Chinese faces influences native Mandarin-Chinese speakers' learning of emotional meanings for English L2 words. Participants were presented with English pseudowords repeatedly paired with either Caucasian faces or Chinese faces showing emotions of disgust, sadness, or neutrality as a control baseline. Participants' learning was evaluated through both within-modality (i.e., …


Given the diversity of medical images, traditional image segmentation models face the issue of domain shift. Unsupervised domain adaptation (UDA) methods have emerged as a pivotal strategy for cross-modality analysis. These methods typically use generative adversarial networks (GANs) for both image-level and feature-level domain adaptation through the transformation and reconstruction of images, assuming the features between domains are well aligned.
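
As a rough illustration of the feature-level adversarial adaptation such UDA pipelines rely on, here is a hedged sketch; the encoder and domain-discriminator names and the binary cross-entropy objective are generic assumptions, not the cited method.

import torch
import torch.nn.functional as F_nn

def domain_alignment_losses(encoder, domain_disc, x_source, x_target):
    # Feature-level adversarial alignment: the domain discriminator separates
    # source from target features; the encoder is trained to fool it.
    f_s, f_t = encoder(x_source), encoder(x_target)
    logit_s = domain_disc(f_s.detach())   # detach: discriminator update only
    logit_t = domain_disc(f_t.detach())
    d_loss = (F_nn.binary_cross_entropy_with_logits(logit_s, torch.ones_like(logit_s))
              + F_nn.binary_cross_entropy_with_logits(logit_t, torch.zeros_like(logit_t)))
    # Encoder update: make target features indistinguishable from source ones.
    g_logit_t = domain_disc(encoder(x_target))
    g_loss = F_nn.binary_cross_entropy_with_logits(g_logit_t, torch.ones_like(g_logit_t))
    return d_loss, g_loss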


In complex scenarios, classification driven by a single modality suffers from insufficient information, leading to a performance bottleneck. The joint use of multimodal remote sensing data for surface-observation tasks has therefore attracted widespread attention. However, issues such as sample differences between modalities and weak correlation among physical features have limited classification performance.


Cross-modality recognition has many important applications in science, law enforcement, and entertainment. Popular methods for bridging the modality gap include reducing the distributional differences between representations of the different modalities, learning indistinguishable representations, or explicit modality transfer. The first two approaches suffer from a loss of discriminant information when removing modality-specific variations.
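
A common concrete instance of the first approach (reducing distributional differences) is a maximum mean discrepancy (MMD) penalty between feature batches from the two modalities. The sketch below, with an RBF kernel and an illustrative bandwidth, is an assumption for exposition rather than the cited work's formulation.

import torch

def rbf_mmd2(feat_a, feat_b, sigma=1.0):
    # Biased MMD^2 estimate with an RBF kernel; minimizing it pulls the two
    # modalities' feature distributions together.
    def k(x, y):
        return torch.exp(-torch.cdist(x, y).pow(2) / (2.0 * sigma ** 2))
    return (k(feat_a, feat_a).mean() + k(feat_b, feat_b).mean()
            - 2.0 * k(feat_a, feat_b).mean())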


When and how to survey potential respondents is often determined by budgetary and external constraints, but choice of survey modality may have enormous implications for data quality. Different survey modalities may be differentially susceptible to measurement error attributable to interviewer assignment, known as interviewer effects. In this paper, we leverage highly similar surveys, one conducted face-to-face (FTF) and the other via phone, to examine variation in interviewer effects across survey modality and question type.

