Cross-modality face image synthesis, such as sketch-to-photo, NIR-to-RGB, and RGB-to-depth, has wide applications in face recognition, face animation, and digital entertainment. Conventional cross-modality synthesis methods usually require paired training data, i.e., each subject has images in both modalities. However, paired data can be difficult to acquire, while unpaired data are common. In this paper, we propose a novel semi-supervised cross-modality synthesis method (namely CMOS-GAN) that can leverage both paired and unpaired face images to learn a robust cross-modality synthesis model. Specifically, CMOS-GAN uses a generator with an encoder-decoder architecture for new-modality synthesis. We leverage pixel-wise loss, adversarial loss, classification loss, and face feature loss to exploit the information in both paired multi-modality face images and unpaired face images for model learning. In addition, since we expect the synthesized modality to also help improve face recognition accuracy, we further use a modified triplet loss to retain the discriminative features of the subject in the synthetic modality. Experiments on three cross-modality face synthesis tasks (NIR-to-VIS, RGB-to-depth, and sketch-to-photo) show the effectiveness of the proposed approach compared with the state of the art. We also collect a large-scale RGB-D dataset (VIPL-MumoFace-3K) for the RGB-to-depth synthesis task, and we plan to open-source our code and the VIPL-MumoFace-3K dataset to the community.
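
The abstract names five training losses. The following is a minimal PyTorch-style sketch of how they might be combined on a paired batch; the module names (generator G, discriminator D, identity classifier C, face-feature extractor F_face), the hinge-style adversarial term, and the loss weights are illustrative assumptions, not the authors' released implementation.

import torch.nn.functional as F_nn

def paired_generator_loss(G, D, C, F_face, x_src, y_tgt, labels,
                          w_pix=1.0, w_adv=1.0, w_cls=1.0, w_feat=1.0):
    # Hypothetical composition of the losses named in the abstract.
    y_hat = G(x_src)                                   # synthesize the target modality
    loss_pix = F_nn.l1_loss(y_hat, y_tgt)              # pixel-wise loss
    loss_adv = -D(y_hat).mean()                        # adversarial term (hinge/WGAN-style critic)
    loss_cls = F_nn.cross_entropy(C(y_hat), labels)    # classification loss on synthetic images
    loss_feat = F_nn.l1_loss(F_face(y_hat), F_face(y_tgt))  # face feature loss
    return (w_pix * loss_pix + w_adv * loss_adv
            + w_cls * loss_cls + w_feat * loss_feat)

def triplet_term(F_face, syn_anchor, real_pos, real_neg, margin=0.3):
    # Modified triplet loss: pull the synthetic image's features toward a real
    # image of the same subject and away from a different subject.
    return F_nn.triplet_margin_loss(F_face(syn_anchor), F_face(real_pos),
                                    F_face(real_neg), margin=margin)

In the semi-supervised setting, unpaired images can presumably contribute only the terms that do not require a ground-truth target image (e.g., the adversarial and classification terms).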


Source: http://dx.doi.org/10.1109/TIP.2022.3226413

Publication Analysis

Top Keywords

cross-modality face (12)
cross-modality synthesis (12)
face images (12)
face (10)
face image (8)
synthesis (8)
image synthesis (8)
face recognition (8)
unpaired face (8)
synthetic modality (8)

Similar Publications

This study investigated how exposure to Caucasian and Chinese faces influences native Mandarin-Chinese speakers' learning of emotional meanings for English L2 words. Participants were presented with English pseudowords repeatedly paired with either Caucasian faces or Chinese faces showing emotions of disgust, sadness, or neutrality as a control baseline. Participants' learning was evaluated through both within-modality (i.e., …


Given the diversity of medical images, traditional image segmentation models face the issue of domain shift. Unsupervised domain adaptation (UDA) methods have emerged as a pivotal strategy for cross-modality analysis. These methods typically use generative adversarial networks (GANs) for both image-level and feature-level domain adaptation through the transformation and reconstruction of images, assuming the features between domains are well aligned.
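
As a rough illustration of the feature-level adversarial adaptation such UDA pipelines rely on, here is a hedged sketch; the encoder and domain-discriminator names and the binary cross-entropy objective are generic assumptions, not the cited method.

import torch
import torch.nn.functional as F_nn

def domain_alignment_losses(encoder, domain_disc, x_source, x_target):
    # Feature-level adversarial alignment: the domain discriminator separates
    # source from target features; the encoder is trained to fool it.
    f_s, f_t = encoder(x_source), encoder(x_target)
    logit_s = domain_disc(f_s.detach())   # detach: discriminator update only
    logit_t = domain_disc(f_t.detach())
    d_loss = (F_nn.binary_cross_entropy_with_logits(logit_s, torch.ones_like(logit_s))
              + F_nn.binary_cross_entropy_with_logits(logit_t, torch.zeros_like(logit_t)))
    # Encoder update: make target features indistinguishable from source ones.
    g_logit_t = domain_disc(encoder(x_target))
    g_loss = F_nn.binary_cross_entropy_with_logits(g_logit_t, torch.ones_like(g_logit_t))
    return d_loss, g_loss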


In complex scenarios, classification driven by a single modality suffers from insufficient information, leading to a performance bottleneck. The joint use of multimodal remote sensing data for surface-observation tasks has therefore attracted widespread attention. However, issues such as sample differences between modalities and weak correlation among physical features have limited classification performance.


Cross-modality recognition has many important applications in science, law enforcement, and entertainment. Popular methods for bridging the modality gap include reducing the distributional differences between representations of the different modalities, learning indistinguishable representations, or explicit modality transfer. The first two approaches suffer from a loss of discriminant information when removing modality-specific variations.
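
A common concrete instance of the first approach (reducing distributional differences) is a maximum mean discrepancy (MMD) penalty between feature batches from the two modalities. The sketch below, with an RBF kernel and an illustrative bandwidth, is an assumption for exposition rather than the cited work's formulation.

import torch

def rbf_mmd2(feat_a, feat_b, sigma=1.0):
    # Biased MMD^2 estimate with an RBF kernel; minimizing it pulls the two
    # modalities' feature distributions together.
    def k(x, y):
        return torch.exp(-torch.cdist(x, y).pow(2) / (2.0 * sigma ** 2))
    return (k(feat_a, feat_a).mean() + k(feat_b, feat_b).mean()
            - 2.0 * k(feat_a, feat_b).mean())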


When and how to survey potential respondents is often determined by budgetary and external constraints, but choice of survey modality may have enormous implications for data quality. Different survey modalities may be differentially susceptible to measurement error attributable to interviewer assignment, known as interviewer effects. In this paper, we leverage highly similar surveys, one conducted face-to-face (FTF) and the other via phone, to examine variation in interviewer effects across survey modality and question type.

