Objective: Voice disorders significantly compromise individuals' ability to speak in their daily lives. Without early diagnosis and treatment, these disorders may deteriorate drastically. Thus, automatic classification systems for home use are desirable for people who lack access to clinical disease assessments. However, the performance of such systems may be weakened by constrained resources and by the domain mismatch between clinical data and noisy real-world data.
Methods: This study develops a compact and domain-robust voice disorder classification system that classifies utterances as healthy, neoplasm, or benign structural disease. The proposed system uses a feature extractor composed of factorized convolutional neural networks and then applies domain adversarial training to reconcile the domain mismatch by extracting domain-invariant features.
Results: The results show that the unweighted average recall in the noisy real-world domain improved by 13% and remained at 80% in the clinic domain with only slight degradation. The domain mismatch was effectively eliminated. Moreover, the proposed system reduced the usage of both memory and computation by over 73.9%.
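For readers unfamiliar with the metric, unweighted average recall (UAR) is the mean of the per-class recalls, so each class counts equally regardless of how many samples it has. The following is a minimal sketch of that definition; the function name and the toy labels are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    correct = defaultdict(int)  # correctly predicted samples per true class
    total = defaultdict(int)    # number of samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels (illustration only, not study data).
y_true = ["healthy"] * 6 + ["neoplasm"] * 2 + ["structural"] * 2
y_pred = (["healthy"] * 6
          + ["neoplasm", "healthy"]
          + ["structural", "healthy"])

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case the plain accuracy is higher than the UAR because the majority class is classified perfectly, which is why UAR is often preferred when classes are imbalanced.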
Conclusion: By deploying factorized convolutional neural networks and domain adversarial training, domain-invariant features can be derived for voice disorder classification with limited resources. The promising results confirm that the proposed system can significantly reduce resource consumption and improve classification accuracy by considering the domain mismatch.
Significance: To the best of our knowledge, this is the first study that jointly considers real-world model compression and noise-robustness issues in voice disorder classification. The proposed system is intended for application to embedded systems with limited resources.
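The abstract does not say which factorization the authors use. As a rough illustration of why factorized convolutions save memory, the sketch below assumes one common form, the depthwise-separable factorization (a per-channel k×k convolution followed by a 1×1 pointwise convolution), and compares weight counts with a standard convolution. The layer sizes are hypothetical, biases are ignored, and the resulting ratio is not meant to reproduce the figure reported above.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (biases ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv plus a 1 x 1 pointwise conv.

    Assumption: this is only one possible factorization, chosen for
    illustration; the study may use a different one.
    """
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, for illustration only.
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
fact = depthwise_separable_params(c_in, c_out, k)
reduction = 1 - fact / std
print(f"standard: {std}, factorized: {fact}, reduction: {reduction:.1%}")
```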
DOI: http://dx.doi.org/10.1109/TBME.2023.3270532
Front Psychiatry
December 2024
Institute of Psychology, SWPS University, Warsaw, Poland.
Introduction: In recent years there has been a notable expansion of psychotherapeutic approaches to treat people experiencing auditory verbal hallucinations (AVH). While many psychotherapists conceptualize voices as "dissociative parts" and apply therapeutic techniques derived from the field of dissociation, research investigating AVH from this perspective is limited. Despite the acknowledgment that voices encountered in dissociative identity disorder (DID) often exhibit high complexity and autonomy, there is a critical need for assessment tools capable of exploring voice complexity across different clinical groups.
J Voice
December 2024
Carl von Ossietzky University Oldenburg, University Clinic for Visceral Surgery, Ammerländer Heerstraße 114-118, 26129 Oldenburg, Germany.
Objective: The care of patients with dysphonia will change due to the growing shortage of specialists, demographic change, and digitalization. To counteract the associated problems in patient care, the LAOLA app demonstrator is to be developed. In the future, patients will receive exercise videos for their training from their treating speech and language pathologist (SLP) via LAOLA.
J Voice
December 2024
Department of Speech-Language Pathology and the Graduate Program in Medical Sciences, Universidade de Brasília, Brasília, Distrito Federal, Brazil.
Objectives: To analyze the prevalence of pediatric voice disorders.
Study Design: Systematic review (SR) and meta-analysis.
Methods: The research question of this SR was "What is the prevalence of dysphonia in children?" An electronic search was performed using the Medical Literature Analysis and Retrieval System Online (Medline), Literatura Latino-Americana e do Caribe em Ciências da Saúde, EMBASE, Web of Science, and SCOPUS databases.
J Acoust Soc Am
December 2024
Department of Otorhinolaryngology and Head & Neck Surgery, Osaka University Graduate School of Medicine, Osaka 565-0871, Japan.
The fundamental frequency (fo) is pivotal for quantifying vocal-fold characteristics. However, the accuracy of fo estimation in hoarse voices is notably low, and no definitive algorithm for fo estimation has been previously established. In this study, we introduce an algorithm named "Spectral-based fo Estimator Emphasized by Domination and Sequence (SFEEDS)," which enhances the spectrum method, and conduct comparative analyses with conventional estimation methods.
J Eat Disord
December 2024
School of Social Sciences, Arts Design and Architecture, University of New South Wales, Sydney, Australia.
Purpose: To examine autonomy within treatment and recovery from longstanding and severe eating disorders (EDs).
Background: The typically early age of onset, high incidence, and prolonged duration of EDs impose a high personal, relational, and financial burden on people who experience them. Current treatment practices rely on the exertion of external control and influence, which has profound impacts on people living with EDs as well as on the relationship and interactions between them and their treating professionals.