The goal of talking face generation is to synthesize a sequence of face images of a specified identity whose mouth movements are synchronized with a given audio clip. Recently, image-based talking face generation has emerged as a popular approach: it can generate talking face images synchronized with the audio from only a single facial image of arbitrary identity and an audio clip. Despite these accessible inputs, existing methods forgo exploiting the emotion carried by the audio, so the generated faces suffer from unsynchronized emotion, inaccurate mouth shapes, and deficient image quality. In this article, we build a two-stage audio emotion-aware talking face generation (AMIGO) framework to generate high-quality talking face videos with cross-modally synchronized emotion. Specifically, in stage one we propose a sequence-to-sequence (seq2seq) cross-modal emotional landmark generation network that produces vivid landmarks whose lip movements and emotion are both synchronized with the input audio; meanwhile, we utilize a coordinated visual emotion representation to improve the extraction of its audio counterpart. In stage two, a feature-adaptive visual translation network is designed to translate the synthesized landmarks into facial images. Concretely, we propose a feature-adaptive transformation module that fuses the high-level representations of landmarks and images, yielding a significant improvement in image quality. We perform extensive experiments on the multi-view emotional audio-visual dataset (MEAD) and the crowd-sourced emotional multimodal actors dataset (CREMA-D), demonstrating that our model outperforms state-of-the-art methods.
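The two-stage data flow described in the abstract can be illustrated with a minimal structural sketch. This is not the authors' implementation: the layer shapes, the 68-point landmark layout, and all function names here are illustrative assumptions, with random weights standing in for trained networks; it only shows how audio features pass through an emotion-conditioned landmark generator (stage one) and then a landmark-to-image translator (stage two).

```python
import numpy as np

# Hypothetical dimensions -- not taken from the paper.
AUDIO_DIM, EMO_DIM, LM_POINTS, HID = 80, 16, 68, 64

rng = np.random.default_rng(0)

def encode_audio(mel_frames):
    """Stage 1a: map per-frame audio features (T, AUDIO_DIM) to hidden states (T, HID)."""
    W = rng.standard_normal((AUDIO_DIM, HID)) * 0.01
    return np.tanh(mel_frames @ W)

def encode_emotion(hidden):
    """Stage 1b: pool a clip-level audio emotion embedding (EMO_DIM,) from the hidden states."""
    W = rng.standard_normal((HID, EMO_DIM)) * 0.01
    return np.tanh(hidden.mean(axis=0) @ W)

def decode_landmarks(hidden, emotion):
    """Stage 1c: seq2seq-style decoder producing a (T, LM_POINTS, 2) landmark
    sequence conditioned on both the content and the emotion features."""
    cond = np.concatenate([hidden, np.tile(emotion, (hidden.shape[0], 1))], axis=1)
    W = rng.standard_normal((HID + EMO_DIM, LM_POINTS * 2)) * 0.01
    return (cond @ W).reshape(-1, LM_POINTS, 2)

def feature_adaptive_translate(landmarks, ref_image):
    """Stage 2: stand-in for the landmark-to-image translation network; here we
    merely modulate the reference image per frame to show the data flow."""
    T = landmarks.shape[0]
    scale = 1.0 + 0.01 * landmarks.reshape(T, -1).mean(axis=1)
    return ref_image[None] * scale[:, None, None, None]

T = 25                                # frames in the clip
mel = rng.standard_normal((T, AUDIO_DIM))
ref = rng.random((128, 128, 3))       # single reference face image

h = encode_audio(mel)
emo = encode_emotion(h)
lms = decode_landmarks(h, emo)        # (25, 68, 2) landmark sequence
frames = feature_adaptive_translate(lms, ref)
print(lms.shape, frames.shape)        # (25, 68, 2) (25, 128, 128, 3)
```

The key design point the sketch preserves is that the emotion embedding conditions landmark generation (stage one), so emotional expression is decided before any pixels are synthesized (stage two).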

Source: http://dx.doi.org/10.1109/TNNLS.2023.3274676

Publication Analysis

Top Keywords: talking face (24), face generation (16), face images (8), synchronized audio (8), image quality (8), audio (7), talking (6), face (6), generation (5), emotion (5)

Similar Publications

Introduction: Infants born very preterm (VPT, <32 weeks' gestation) are at increased risk of neurodevelopmental impairments, including motor, cognitive, and behavioural delay. Parents of infants born VPT also have poorer mental health outcomes compared with parents of infants born at term. We have developed an intervention programme called TEDI-Prem (Telehealth for Early Developmental Intervention in babies born very preterm) based on previous research.

Introduction: Communication disorders are among the most common disorders in childhood and, if left untreated, can cause many social, educational, and psychological problems in adulthood. One technology that can help with these disorders is mobile health (m-Health). This study examines patients' attitudes toward and willingness to use this technology, and compares its advantages and challenges with those of face-to-face treatment from the patients' perspective.

Evaluation of the effectiveness of a serious game titled "Kookism" on the receptive lexicon in 4-9-year-old autistic children.

Heliyon

January 2025

Department of Clinical Psychology, School of Behavioral Sciences and Mental Health (Tehran Institute of Psychiatry), Iran University of Medical Sciences, Tehran, Iran.

Background: Autistic children often face difficulties with semantic skills such as the receptive lexicon. Games based on behavioral principles have been emphasized for treating autistic children, and serious games are a new and effective way to alleviate these deficits.

VPT: Video portraits transformer for realistic talking face generation.

Neural Netw

January 2025

School of Automation Science and Engineering, South China University of Technology, China.

Talking face generation is a promising approach within various domains, such as digital assistants, video editing, and virtual video conferences. Previous work on audio-driven talking faces has focused primarily on the synchronization between audio and video. However, existing methods still have limitations in synthesizing photo-realistic video with high identity preservation, audiovisual synchronization, and facial details such as blink movements.

Two Decades of the Walking While Talking Test: A Narrative Review.

J Am Med Dir Assoc

January 2025

Department of Neurology, Renaissance School of Medicine, Stony Brook, NY, United States.

Objectives: Early research reported that older adults who stopped walking when they began a conversation were more likely to fall in the future. As a systematic measure of dual-task performance, Verghese and colleagues developed the Walking While Talking (WWT) test, in which a person walks at a normal pace while reciting alternate letters of the alphabet. The present paper highlights key findings from two decades of research using the WWT test.
