Speech emotion recognition (SER) is an important application in affective computing and artificial intelligence. Recently, there has been significant interest in deep neural networks that operate on speech spectrograms. Because the two-dimensional spectrogram representation captures more speech characteristics, convolutional neural networks (CNNs) and other advanced image recognition models are used to learn deep patterns in a spectrogram and perform SER effectively. Accordingly, in this study, we propose a novel SER model that learns from the utterance-level spectrogram. First, we use the Spatial Pyramid Pooling (SPP) strategy to remove the input-size constraint associated with CNN-based image recognition. The SPP layer then extracts both a global-level prominent feature vector and multi-local-level feature vectors, which an attention model weighs. Finally, we apply the ArcFace layer, typically used for face recognition, to the SER task, thereby improving SER performance. Our model achieved an unweighted accuracy of 67.9% on the IEMOCAP dataset and 77.6% on the EMODB dataset.
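To make two of the components named in the abstract concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes max pooling over pyramid levels of 1x1, 2x2, and 4x4 bins for SPP, and the standard additive angular margin form for ArcFace. The function names, the pyramid levels, and the `scale` and `margin` values are illustrative assumptions; the abstract does not state the settings the paper uses.

```python
import numpy as np


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a (channels, height, width) map into a fixed-length vector.

    The output length is channels * sum(n * n for n in levels), independent
    of height and width. The levels are an assumed, illustrative choice.
    """
    channels, height, width = feature_map.shape
    pooled = []
    for n in levels:
        # Bin edges are chosen so the n x n bins cover the whole map.
        row_edges = np.linspace(0, height, n + 1)
        col_edges = np.linspace(0, width, n + 1)
        for i in range(n):
            r0 = int(np.floor(row_edges[i]))
            r1 = max(int(np.ceil(row_edges[i + 1])), r0 + 1)
            for j in range(n):
                c0 = int(np.floor(col_edges[j]))
                c1 = max(int(np.ceil(col_edges[j + 1])), c0 + 1)
                # One max value per channel for this bin.
                pooled.append(feature_map[:, r0:r1, c0:c1].max(axis=(1, 2)))
    return np.concatenate(pooled)


def arcface_logits(embeddings, class_weights, labels, scale=30.0, margin=0.5):
    """Return scaled cosine logits with an angular margin on the true class.

    embeddings: (batch, dim); class_weights: (num_classes, dim);
    labels: (batch,) integer class indices. scale and margin are assumed values.
    """
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = class_weights / np.linalg.norm(class_weights, axis=1, keepdims=True)
    cosine = np.clip(emb @ w.T, -1.0, 1.0)
    theta = np.arccos(cosine)
    # Add the margin only to the angle of each sample's true class.
    theta[np.arange(len(labels)), labels] += margin
    return scale * np.cos(theta)


# Two "utterances" of different duration yield vectors of the same length.
short_map = np.random.rand(8, 13, 40)
long_map = np.random.rand(8, 7, 95)
print(spatial_pyramid_pool(short_map).shape, spatial_pyramid_pool(long_map).shape)
```

The first block of the pooled vector (the 1x1 level) plays the role of a global-level feature, while the finer levels give local-level features. In a trained model these would be computed on CNN feature maps rather than on random arrays.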
DOI: http://dx.doi.org/10.1038/s41598-025-92640-2
BMC Public Health
March 2025
Department of Internal Medicine, Aga Khan University, Nairobi, Kenya.
Background: Workplace violence against healthcare workers (HCWs) is a global concern, especially in low- and middle-income nations. This violence includes physical, verbal, or sexual abuse and has a significant impact despite initiatives like Occupational Safety and Health Administration (OSHA) guidelines. We conducted a study in Kenya to address this issue.
Sci Rep
March 2025
Department of Communication Sciences and Disorders, Saint Mary's College, Notre Dame, IN, USA.
Speech emotion recognition (SER) is an important application in affective computing and artificial intelligence. Recently, there has been significant interest in deep neural networks that operate on speech spectrograms. Because the two-dimensional spectrogram representation captures more speech characteristics, convolutional neural networks (CNNs) and other advanced image recognition models are used to learn deep patterns in a spectrogram and perform SER effectively.
Cognition
March 2025
Centro de Investigação e Intervenção Social (CIS-IUL), Instituto Universitário de Lisboa (ISCTE-IUL), Lisboa, Portugal.
Music training is widely claimed to enhance nonmusical abilities, yet causal evidence remains inconclusive. Moreover, research tends to focus on cognitive over socioemotional outcomes. In two studies, we investigated whether music training improves emotion recognition in voices and faces among school-aged children.
PeerJ Comput Sci
March 2025
School of Education, Shaanxi Normal University, Xi'an, Shaanxi, China.
In online teaching environments, the lack of direct emotional interaction between teachers and students poses challenges for teachers to consciously and effectively manage their emotional expressions. The design and implementation of an early warning system for teaching provide a novel approach to intelligent evaluation and improvement of online education. This study focuses on segmenting different emotional segments and recognizing emotions in instructional videos.
JCPP Adv
March 2025
Karakter Child and Adolescent Psychiatry, Nijmegen, The Netherlands.
Background: An Elimination Diet (ED) or Healthy Diet (HD) may be effective in reducing symptoms of Attention-Deficit/Hyperactivity Disorder (ADHD), but long-term maintenance effects and feasibility have never been examined.
Methods: One-year prospective follow-up of a sample of 165 children (5-12 years) with ADHD randomized (unblinded; 1:1) to 5 weeks of treatment with either ED (n = 84) or HD (n = 81) and a non-randomized comparator arm including 58 children being treated with Care as Usual (CAU). Dietary participants were allowed to add or switch to CAU treatment after 5 weeks.