Detection of Pathological Voice Using Cepstrum Vectors: A Deep Learning Approach.

J Voice

Department of Otolaryngology Head and Neck Surgery, Far Eastern Memorial Hospital, Taipei, Taiwan; Department of Special Education, University of Taipei, Taipei, Taiwan; Department of Otolaryngology Head and Neck Surgery, National Taiwan University College of Medicine, Taipei, Taiwan.

Published: September 2019

Objectives: Computerized detection of voice disorders has attracted considerable academic and clinical interest in the hope of providing an effective screening method for voice diseases before endoscopic confirmation. This study proposes a deep-learning-based approach to detect pathological voice and examines its performance and utility compared with other automatic classification algorithms.

Methods: This study retrospectively collected 60 normal voice samples and 402 pathological voice samples, covering 8 common clinical voice disorders, from the voice clinic of a tertiary teaching hospital. We extracted Mel frequency cepstral coefficients from 3-second samples of a sustained vowel. The performance of three machine learning algorithms, namely a deep neural network (DNN), a support vector machine, and a Gaussian mixture model, was evaluated using fivefold cross-validation. Cases from the voice disorder database of the Massachusetts Eye and Ear Infirmary (MEEI) were used to verify the performance of the classifiers.
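The feature-extraction step can be illustrated with a minimal NumPy sketch of Mel frequency cepstral coefficients (MFCCs). This is not the authors' pipeline: the abstract does not give the sampling rate, frame length, hop size, number of Mel filters, or number of coefficients, so the values below (16 kHz, 25 ms frames, 10 ms hop, 26 filters, 13 coefficients) are common defaults assumed for illustration, and the input is a synthetic harmonic tone standing in for a 3-second sustained vowel.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the Mel scale from 0 Hz to Nyquist.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mfcc(signal, sr=16000, frame_len=400, hop=160, n_fft=512,
         n_filters=26, n_ceps=13):
    # All default parameter values are assumptions, not taken from the paper.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum of each windowed frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # Log Mel-band energies; the small constant avoids log(0).
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # Unnormalized DCT-II across the Mel bands gives the cepstral coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_ceps)[:, None] * (2 * n + 1)
                   / (2 * n_filters))
    return log_mel @ basis.T

# Synthetic stand-in for a 3-second sustained vowel: 150 Hz plus 4 harmonics.
sr = 16000
t = np.arange(3 * sr) / sr
signal = sum(np.sin(2 * np.pi * 150 * h * t) / h for h in range(1, 6))
features = mfcc(signal, sr=sr)   # one row of coefficients per frame
```

Each row of `features` is the coefficient vector for one frame; how the study summarized frames into per-sample features is not stated in the abstract.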

Results: The DNN outperformed the Gaussian mixture model and the support vector machine. Its accuracy in detecting voice pathologies reached 94.26% in male subjects and 90.52% in female subjects, based on three representative Mel frequency cepstral coefficient features. When applied to the MEEI database for validation, the DNN also achieved a higher accuracy (99.32%) than the other two classification algorithms.
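As a rough illustration of how a fivefold cross-validated accuracy is obtained, the sketch below splits the pooled class counts reported in the Methods (60 normal, 402 pathological) into five stratified folds and scores a classifier on each held-out fold. The helper names, the placeholder feature matrix, and the `always_majority` stand-in classifier are hypothetical; the abstract does not describe the study's fold assignment or give per-sex sample counts.

```python
import numpy as np

def stratified_folds(labels, k=5, seed=0):
    # Shuffle each class separately and deal its members across k folds,
    # so every fold keeps roughly the overall class ratio.
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(k)]
    for cls in np.unique(labels):
        members = rng.permutation(np.flatnonzero(labels == cls))
        for i, chunk in enumerate(np.array_split(members, k)):
            folds[i].extend(chunk.tolist())
    return [np.array(sorted(f)) for f in folds]

def cross_validated_accuracy(X, y, fit_predict, k=5, seed=0):
    # Train on k-1 folds, test on the held-out fold, average the k accuracies.
    scores = []
    for test_idx in stratified_folds(y, k, seed):
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        pred = fit_predict(X[train_mask], y[train_mask], X[test_idx])
        scores.append(np.mean(pred == y[test_idx]))
    return float(np.mean(scores))

def always_majority(X_train, y_train, X_test):
    # Hypothetical stand-in classifier: always predicts the most common label.
    return np.full(len(X_test), np.bincount(y_train).argmax())

labels = np.array([0] * 60 + [1] * 402)   # 0 = normal, 1 = pathological
X = np.zeros((len(labels), 3))            # placeholder features, not real data
folds = stratified_folds(labels)
baseline = cross_validated_accuracy(X, labels, always_majority)
```

With these pooled counts the majority-label stand-in scores about 87%, which gives a point of reference for reading accuracy on an imbalanced sample; any real classifier can be passed in place of `always_majority`.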

Conclusions: By stacking several layers of neurons with optimized weights, the proposed DNN algorithm can fully utilize the acoustic features and efficiently differentiate between normal and pathological voice samples. Building on this pilot study, future research may explore further applications of DNNs from both laboratory and clinical perspectives.
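The idea of stacking layers can be sketched as a forward pass through a small fully connected network. Everything specific here is an assumption made for illustration: the layer sizes (13 inputs, two hidden layers of 64 units, one output), the ReLU and sigmoid activations, and the random, untrained weights. The abstract does not describe the actual architecture or training procedure.

```python
import numpy as np

def init_layers(sizes, seed=0):
    # One (weights, bias) pair per layer; random values, NOT trained weights.
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x, layers):
    # Hidden layers: affine transform followed by ReLU.
    h = x
    for W, b in layers[:-1]:
        h = np.maximum(0.0, h @ W + b)
    # Output layer: sigmoid maps the logit to a score between 0 and 1,
    # read here as the probability that a sample is pathological.
    W, b = layers[-1]
    return 1.0 / (1.0 + np.exp(-(h @ W + b)))

layers = init_layers([13, 64, 64, 1])                   # hypothetical sizes
batch = np.random.default_rng(1).normal(size=(4, 13))   # 4 fake feature vectors
probs = forward(batch, layers)
```

In a trained network the weights would be fitted to labeled samples, and the output would be thresholded (for example at 0.5) to give a normal or pathological decision.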

Source: http://dx.doi.org/10.1016/j.jvoice.2018.02.003
