A multi-dilated convolution network for speech emotion recognition.

Samaneh Madanian, Olayinka Adeleye, John Michael Templeton, Talen Chen, Christian Poellabauer, Enshi Zhang, Sandra L Schneider

Sci Rep

Department of Communication Sciences and Disorders, Saint Mary's College, Notre Dame, IN, USA.

Published: March 2025

Speech emotion recognition (SER) is an important application in Affective Computing and Artificial Intelligence. Recently, there has been significant interest in deep neural networks that operate on speech spectrograms. Because the two-dimensional spectrogram representation captures more speech characteristics, convolutional neural networks (CNNs) and other advanced image recognition models are increasingly used to learn deep patterns in spectrograms for SER. Accordingly, in this study, we propose a novel SER model that learns from the utterance-level spectrogram. First, we use the Spatial Pyramid Pooling (SPP) strategy to remove the fixed input-size constraint associated with CNN-based image recognition. The SPP layer then extracts both a global-level prominent feature vector and multi-local-level feature vectors, which an attention model weighs. Finally, we apply the ArcFace layer, typically used for face recognition, to the SER task, thereby improving SER performance. Our model achieved an unweighted accuracy of 67.9% on the IEMOCAP dataset and 77.6% on the EMODB dataset.

DOI: http://dx.doi.org/10.1038/s41598-025-92640-2

Similar Publications

Violence against healthcare workers in Kenya: a cross-sectional sub-analysis of the global vishwas study.

BMC Public Health

March 2025

Department of Internal Medicine, Aga Khan University, Nairobi, Kenya.

Reena Shah, Jasmit Shah, Rakhtan K Qasba, Ruman K Qasba, Annastacia Mbithi

Background: Global concern exists for workplace violence against healthcare workers (HCWs), especially in low and middle-income nations. This violence includes physical, verbal, or sexual abuse and has a significant impact despite initiatives like Occupational Safety and Health Administration (OSHA) guidelines. We conducted a study in Kenya to address this issue.

Does music training improve emotion recognition and cognitive abilities? Longitudinal and correlational evidence from children.

Cognition

March 2025

Centro de Investigação e Intervenção Social (CIS-IUL), Instituto Universitário de Lisboa (ISCTE-IUL), Lisboa, Portugal.

Leonor Neves, Marta Martins, Ana Isabel Correia, São Luís Castro, E Glenn Schellenberg

Music training is widely claimed to enhance nonmusical abilities, yet causal evidence remains inconclusive. Moreover, research tends to focus on cognitive over socioemotional outcomes. In two studies, we investigated whether music training improves emotion recognition in voices and faces among school-aged children.

Design and analysis of teaching early warning system based on multimodal data in an intelligent learning environment.

PeerJ Comput Sci

March 2025

School of Education, Shaanxi Normal University, Xi'an, Shaanxi, China.

Xinxin Kang, Yong Nie

In online teaching environments, the lack of direct emotional interaction between teachers and students poses challenges for teachers to consciously and effectively manage their emotional expressions. The design and implementation of an early warning system for teaching provide a novel approach to intelligent evaluation and improvement of online education. This study focuses on segmenting different emotional segments and recognizing emotions in instructional videos.

Effects of an elimination diet and a healthy diet in children with Attention-Deficit/Hyperactivity Disorder: 1-Year prospective follow-up of a two-arm randomized, controlled study (TRACE study).

JCPP Adv

March 2025

Karakter Child and Adolescent Psychiatry, Nijmegen, The Netherlands.

Annick Huberts-Bosch, Margreet Bierens, Julia J Rucklidge, Verena Ly, Rogier Donders

Background: An Elimination Diet (ED) or Healthy Diet (HD) may be effective in reducing symptoms of Attention-Deficit/Hyperactivity Disorder (ADHD), but long-term maintenance effects and feasibility have never been examined.

Methods: One-year prospective follow-up of a sample of 165 children (5-12 years) with ADHD randomized (unblinded; 1:1) to 5 weeks treatment with either ED (n = 84) or HD (n = 81) and a non-randomized comparator arm including 58 children being treated with Care as Usual (CAU). Dietary participants were allowed to add or switch to CAU treatment after 5 weeks.
