A generalizable speech emotion recognition model reveals depression and remission.

Acta Psychiatr Scand

Cognitive Science, School of Communication and Culture, Aarhus University, Aarhus, Denmark.

Published: February 2022

Objective: Affective disorders are associated with atypical voice patterns; however, automated voice analyses suffer from small sample sizes and untested generalizability on external data. We investigated a generalizable approach to aid clinical evaluation of depression and remission from voice using transfer learning: machine learning models were trained on easily accessible non-clinical datasets and tested on novel clinical data in a different language.

Methods: A Mixture of Experts machine learning model was trained to infer happy/sad emotional state using three publicly available emotional speech corpora in German and US English. We examined the model's ability to classify the presence of depression in Danish-speaking healthy controls (N = 42), patients with first-episode major depressive disorder (MDD) (N = 40), and the subset of the same patients who entered remission (N = 25), based on recorded clinical interviews. The model was evaluated on raw, de-noised, and speaker-diarized data.
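As a rough illustration of the Mixture of Experts idea named in the Methods, the sketch below combines the outputs of several simple classifiers through a softmax gate. It is not the study's model: the logistic experts, the linear gate, the two-element feature vector, and every name in the code are assumptions made for illustration only.

```python
import math

def softmax(scores):
    """Turn arbitrary gate scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def linear(weights, bias, features):
    """Weighted sum of the features plus a bias term."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def mixture_of_experts(features, experts, gate):
    """Gate-weighted average of the experts' 'sad' probabilities.

    experts: list of (weights, bias) pairs, one hypothetical logistic expert each
    gate:    list of (weights, bias) pairs, one gate score per expert
    """
    gate_weights = softmax([linear(w, b, features) for w, b in gate])
    expert_probs = [sigmoid(linear(w, b, features)) for w, b in experts]
    return sum(g * p for g, p in zip(gate_weights, expert_probs))

# Toy example with made-up parameters (assumption: two acoustic features).
toy_experts = [([1.0, -0.5], 0.0), ([0.2, 0.8], -0.1)]
toy_gate = [([0.5, 0.0], 0.0), ([0.0, 0.5], 0.0)]
p_sad = mixture_of_experts([0.3, 1.2], toy_experts, toy_gate)
```

Because the gate weights are non-negative and sum to 1, the combined output always lies between the smallest and largest expert probability.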

Results: The model showed separation between healthy controls and depressed patients at the first visit, obtaining an AUC of 0.71. Further, speech from patients in remission was indistinguishable from that of the control group. Model predictions were stable throughout the interview, suggesting that 20-30 s of speech might be enough to accurately screen a patient. Background noise (but not speaker diarization) heavily impacted predictions.
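For readers unfamiliar with the AUC reported above, the following minimal sketch computes it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counting one half. The scores are invented for illustration and are not data from the study; treating depressed patients as the positive class and the model's "sad" probability as the score is likewise an assumption.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    scores_pos: model scores for the positive class (assumed: patients)
    scores_neg: model scores for the negative class (assumed: controls)
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(scores_pos) * len(scores_neg))

# Made-up scores: three of the four patient/control pairs are ordered correctly.
example_auc = auc([0.9, 0.4], [0.5, 0.1])  # 0.75
```

A value of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, so 0.71 means that a patient outscored a control in roughly 71% of such pairings.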

Conclusion: A generalizable speech emotion recognition model can effectively reveal changes in speaker depressive states before and after remission in patients with MDD. Data collection settings and data cleaning are crucial when considering automated voice analysis for clinical purposes.

Source: http://dx.doi.org/10.1111/acps.13388

Publication Analysis

Top Keywords

generalizable speech: 8
speech emotion: 8
emotion recognition: 8
recognition model: 8
depression remission: 8
automated voice: 8
machine learning: 8
healthy controls: 8
model: 6
remission: 5

Similar Publications

Localization of function within the brain and central nervous system is an essential aspect of clinical neuroscience. Classical descriptions of functional neuroanatomy provide a foundation for understanding the functional significance of identifiable anatomic structures. However, individuals exhibit substantial variation, particularly in the presence of disorders that alter tissue structure or impact function.

Background: Cochlear implants (CI) with off-the-ear (OTE) and behind-the-ear (BTE) speech processors differ in user experience and audiological performance, impacting speech perception, comfort, and satisfaction.

Objectives: This systematic review explores audiological outcomes (speech perception in quiet and noise) and non-audiological factors (device handling, comfort, cosmetics, overall satisfaction) of OTE and BTE speech processors in CI recipients.

Methods: We conducted a systematic review following PRISMA-S guidelines, examining Medline, Embase, Cochrane Library, Scopus, and ProQuest Dissertations and Theses.

Introduction: People with schizophrenia spectrum disorders present with language dysfunctions, yet we know little about their use of reference markers (indefinite markers, definite markers, pronouns or names), a fundamental aspect of efficient speech production.

Methods: Twenty-five participants with a recent-onset schizophrenia spectrum disorder (SZ) and 25 healthy controls (HC) completed two referential communication tasks. The tasks involved presenting to an interaction partner a series of movie characters (character identification task) and movie scenes composed of six images (narration task).

Factors Influencing Adolescents' Knowledge, Practices, and Attitudes Towards Oral Health in the Rupa-Rupa District, Peru.

J Int Soc Prev Community Dent

December 2024

Scientific Research Department, Research Group in Dental Sciences, School of Dentistry, Universidad Científica del Sur, Lima, Perú.

Aim: This study aimed to identify factors associated with adolescents' knowledge, practices, and attitudes regarding oral health (KPA-OH) in the Rupa-Rupa district, a high jungle region of Peru.

Materials and Methods: An analytical study was conducted with a sample of 408 adolescents (aged 13-17 years) from seven public schools in the Rupa-Rupa district (elevation: 649 meters above sea level). The sample was stratified by sex, age, and school.