Publications by authors named "Chaoyou Fu"

Multimodal large language models (MLLMs), represented by GPT-4V, have recently become a research hotspot; they use powerful large language models (LLMs) as a brain to perform multimodal tasks. Their surprising emergent capabilities, such as writing stories based on images and optical character recognition-free math reasoning, are rare in traditional multimodal methods, suggesting a potential path to artificial general intelligence. Accordingly, both academia and industry have endeavored to develop MLLMs that can compete with or even outperform GPT-4V, pushing the limit of research at a surprising speed.

Cross-spectral face hallucination is an intuitive way to mitigate the modality discrepancy in Heterogeneous Face Recognition (HFR). However, due to imaging differences, the hallucination inevitably suffers from a shape misalignment between paired heterogeneous images. Rather than building complicated architectures to circumvent the problem, as previous works do, we propose a simple yet effective method called Shape Alignment FacE (SAFE).

Heterogeneous face recognition (HFR) refers to matching cross-domain faces and plays a crucial role in public security. Nevertheless, HFR faces two challenges: a large domain discrepancy and insufficient heterogeneous data. In this paper, we formulate HFR as a dual generation problem and tackle it via a novel dual variational generation (DVG-Face) framework.
