Objective: Acoustic analysis of voice has the potential to expedite detection and diagnosis of voice disorders. Applying an image-based, neural-network approach to analyzing the acoustic signal may be an effective means for detecting and differentially diagnosing voice disorders. The purpose of this study is to provide a proof-of-concept that embedded data within human phonation can be accurately and efficiently decoded with deep learning neural network analysis to differentiate between normal and disordered voices.

Methods: Acoustic recordings from 10 vocally healthy speakers, as well as 70 patients with one of seven voice disorders (n = 10 per diagnosis), were acquired from a clinical database. Acoustic signals were converted into spectrograms and used to train a convolutional neural network developed with the Keras library. The same network architecture was trained separately for each of the seven diagnostic categories, yielding seven binary classification tasks (i.e., normal vs. disordered). All models were validated using 10-fold cross-validation.
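The data handling described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not report the spectrogram settings, the network layers, or how folds were drawn, so the frame length, hop size, sample rate, synthetic tone, and the helper names `spectrogram` and `stratified_folds` are all assumptions made for illustration. The Keras network itself is omitted. The sketch shows how a signal becomes a spectrogram and how one binary task (10 normal vs. 10 disordered recordings) could be divided into 10 stratified folds.

```python
import numpy as np


def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram from a Hann-windowed short-time Fourier transform.

    frame_len and hop are illustrative values, not taken from the study.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude).T  # shape: (frequency bins, time frames)


def stratified_folds(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs with each class spread evenly over k folds."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    fold_of = np.empty(len(labels), dtype=int)
    for cls in np.unique(labels):
        # Shuffle the members of this class, then deal them out fold by fold.
        idx = rng.permutation(np.flatnonzero(labels == cls))
        fold_of[idx] = np.arange(len(idx)) % k
    for fold in range(k):
        yield np.flatnonzero(fold_of != fold), np.flatnonzero(fold_of == fold)


# Synthetic stand-in for one recording: one second of a 220 Hz tone (assumed 16 kHz).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
spec = spectrogram(np.sin(2 * np.pi * 220 * t))

# One binary task as described in the Methods: 10 normal (0) vs. 10 disordered (1).
labels = [0] * 10 + [1] * 10
folds = list(stratified_folds(labels))
```

Under these assumptions, each fold trains on 18 recordings and tests on 2 (one per class), so a single fold's accuracy can only be 0%, 50%, or 100%, and the reported figures are averages over the 10 folds.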

Results: Average binary classification accuracies ranged from 58% to 90%. Models were most accurate in their classification of adductor spasmodic dysphonia, unilateral vocal fold paralysis, vocal fold polyp, polypoid corditis, and recurrent respiratory papillomatosis. Despite the small sample size, these findings are consistent with previously published data utilizing deep neural networks for classification of voice disorders.

Conclusion: Promising preliminary results support further study of deep neural networks for clinical detection and diagnosis of human voice disorders. Current models should be optimized with a larger sample size.

Level of Evidence: III.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6580072
DOI: http://dx.doi.org/10.1002/lio2.259


