Multimodal Emotion Recognition on RAVDESS Dataset Using Transfer Learning.

Cristina Luna-Jiménez, David Griol, Zoraida Callejas, Ricardo Kleinlein, Juan M Montero, Fernando Fernández-Martínez

Sensors (Basel)

Grupo de Tecnología del Habla y Aprendizaje Automático (THAU Group), Information Processing and Telecommunications Center, E.T.S.I. de Telecomunicación, Universidad Politécnica de Madrid, Avda. Complutense 30, 28040 Madrid, Spain.

Published: November 2021

Emotion recognition is attracting the attention of the research community because of its many application areas, such as healthcare and road safety systems. In this paper, we propose a multimodal emotion recognition system that relies on speech and facial information. For the speech-based modality, we evaluated several transfer-learning techniques, specifically embedding extraction and fine-tuning. The best accuracy was achieved when we fine-tuned the CNN-14 of the PANNs framework, confirming that training was more robust when it did not start from scratch and the tasks were similar. For the facial emotion recognizers, we propose a framework that consists of a Spatial Transformer Network pre-trained on saliency maps and facial images, followed by a bi-LSTM with an attention mechanism. The error analysis showed that frame-based systems could present problems when used directly to solve a video-based task, despite the domain adaptation. This opens a new line of research into ways to correct the mismatch and take advantage of the knowledge embedded in these pre-trained models. Finally, by combining the two modalities with a late fusion strategy, we achieved 80.08% accuracy on the RAVDESS dataset in a subject-wise 5-fold cross-validation (5-CV) evaluation, classifying eight emotions. The results show that both modalities carry relevant information for detecting users' emotional state and that combining them improves system performance.
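To make the late fusion and subject-wise evaluation concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not state the fusion rule, so a weighted average of per-class posteriors is assumed here, and the function names (`late_fusion`, `predict`, `subject_wise_folds`), the `speech_weight` parameter, and the round-robin fold assignment are all hypothetical.

```python
from typing import Dict, List, Sequence


def late_fusion(speech_probs: Sequence[float],
                face_probs: Sequence[float],
                speech_weight: float = 0.5) -> List[float]:
    """Weighted average of two per-class posterior vectors.

    Assumption: the averaging rule and the weight are illustrative only;
    the abstract says "late fusion" without giving the exact rule.
    """
    if len(speech_probs) != len(face_probs):
        raise ValueError("both modalities must score the same classes")
    if not 0.0 <= speech_weight <= 1.0:
        raise ValueError("speech_weight must lie in [0, 1]")
    w = speech_weight
    return [w * s + (1.0 - w) * f for s, f in zip(speech_probs, face_probs)]


def predict(fused: Sequence[float]) -> int:
    """Index of the highest fused score (ties go to the lowest index)."""
    return max(range(len(fused)), key=lambda i: fused[i])


def subject_wise_folds(subject_ids: Sequence[int], k: int = 5) -> Dict[int, int]:
    """Assign each distinct subject to one of k folds, round-robin.

    All recordings of a subject share a fold, so no speaker appears in
    both the training and the test partition of the same split.
    """
    unique = sorted(set(subject_ids))
    return {subject: idx % k for idx, subject in enumerate(unique)}


if __name__ == "__main__":
    # Toy three-class posteriors; the paper itself classifies eight emotions.
    speech = [0.1, 0.6, 0.3]
    face = [0.2, 0.2, 0.6]
    fused = late_fusion(speech, face)
    print("fused scores:", fused)
    print("predicted class index:", predict(fused))

    # RAVDESS contains recordings from 24 actors.
    folds = subject_wise_folds(range(1, 25), k=5)
    print("fold of actor 1:", folds[1])
```

The point of fusing at the score level is that each modality can be trained and tuned separately, with only their outputs combined. Grouping folds by subject keeps the reported accuracy from being inflated by speaker-specific cues.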

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8618559
DOI: http://dx.doi.org/10.3390/s21227665

Similar Publications

Five-year trajectories of symptom severity, physical and mental functioning in patients with persistent somatic symptoms: the PROSPECTS cohort study.

BMJ Open

January 2025

Amsterdam Public Health research institute, Amsterdam, The Netherlands.

Hieke Barends, Henriëtte E van der Horst, Johannes C van der Wouden, Nikki Claassen, Joost Dekker

Objectives: Knowledge about the long-term course and prognosis of persistent somatic symptoms (PSS) is important to improve clinical decision-making and guidance for patients with PSS. Therefore, we aimed to: (1) identify distinct 5-year trajectories of symptom severity, physical and mental functioning in adult patients with PSS and (2) explore patient characteristics associated with these trajectories.

Design: We used longitudinal data (seven measurements over a 5-year period) of the PROSPECTS study: a prospective cohort of adult patients with PSS.

An EEG-Based Emotion Recognition Method by Fusing Multi-Frequency-Spatial Features Under Multi-Frequency Bands.

J Neurosci Methods

January 2025

College of Automation and Electronic Engineering, Qingdao University of Science and Technology, Qingdao, China.

Qiuyu Chen, Xiaoqian Mao, Yuebin Song, Kefa Wang

Background: Recognition of emotion changes is of great significance to a person's physical and mental health. At present, EEG-based emotion recognition methods are mainly focused on time or frequency domains, but rarely on spatial information. Therefore, the goal of this study is to improve the performance of emotion recognition by integrating frequency and spatial domain information under multi-frequency bands.

Reliability and validity of four cognitive interpretation bias measures in the context of social anxiety.

Behav Res Methods

January 2025

Department of Clinical Psychology, Utrecht University, Utrecht, the Netherlands.

Sascha B Duken, Jun Moriya, Colette Hirsch, Marcella L Woud, Bram van Bockstaele

People with social anxiety disorder tend to interpret ambiguous social information in a negative rather than positive manner. Such interpretation biases may cause and maintain anxiety symptoms. However, there is considerable variability in the observed effects across studies, with some not finding a relationship between interpretation biases and social anxiety.

Dynamic domain adaptive EEG emotion recognition based on multi-source selection.

Rev Sci Instrum

January 2025

School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi 710121, China.

Zhongmin Wang, Mengxuan Zhao

Emotion recognition based on electroencephalogram (EEG) has always been a research hotspot. However, due to significant individual variations in EEG signals, cross-subject emotion recognition based on EEG remains a challenging issue to address. In this article, we propose a dynamic domain-adaptive EEG emotion recognition method based on multi-source selection.
