Machine Learning-Based Estimation of Hoarseness Severity Using Acoustic Signals Recorded During High-Speed Videoendoscopy.

J Voice

Division of Phoniatrics and Pediatric Audiology at the Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91054 Erlangen, Germany.

Published: January 2025

Objectives: This study investigates the use of sustained phonations recorded during high-speed videoendoscopy (HSV) for machine learning-based assessment of hoarseness severity (H). The performance of this approach is compared with conventional recordings obtained during voice therapy to evaluate key differences and limitations of HSV-derived acoustic recordings.

Methods: A database of 617 voice recordings with a duration of 250 ms was gathered during HSV examination (HS). Two databases comprising 809 vowels recorded during voice therapy were used for comparison, examining recording durations of 1 second (VT-1) and 250 ms (VT-2). A total of 490 features were extracted, including perturbation and noise characteristics, spectral and cepstral coefficients, as well as features based on modulation spectrum, nonlinear dynamic analysis, entropy, and empirical mode decomposition. Model development focused on selecting a minimal-optimal feature subset and suitable classification algorithms. Recordings were classified into two groups of hoarseness based on auditory-perceptual ratings by experts, yielding a continuous hoarseness score ŷ. Model performance was evaluated based on classification accuracy, correlation between predicted scores ŷ ∈ [0,1] and subjective ratings H ∈ {0,1,2,3}, and correlation between the relative change in quantitative and subjective ratings.
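The evaluation described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the five features are random stand-ins for the selected acoustic features, the split of H into two groups (H ≥ 2 counted as the hoarser group) is an assumption the abstract does not state, and Pearson correlation is assumed because the abstract does not name the correlation coefficient. All function names are hypothetical.

```python
import math
import random

random.seed(0)  # reproducible synthetic data


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Fit logistic regression by batch gradient descent on the log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_scores(X, w, b):
    """Return continuous scores in [0, 1], playing the role of y-hat."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


# Synthetic stand-in data: ratings H in {0,1,2,3} and five noisy features.
H = [random.randint(0, 3) for _ in range(300)]
X = [[0.8 * h + random.gauss(0.0, 1.0) for _ in range(5)] for h in H]
y = [1 if h >= 2 else 0 for h in H]  # ASSUMED two-group split, not from the source

# Simple hold-out split: first 200 samples for training, last 100 for testing.
X_train, y_train = X[:200], y[:200]
X_test, y_test, H_test = X[200:], y[200:], H[200:]

w, b = fit_logistic(X_train, y_train)
y_hat = predict_scores(X_test, w, b)

# Classification accuracy at a 0.5 threshold, and correlation of y-hat with H.
accuracy = sum((s >= 0.5) == (t == 1) for s, t in zip(y_hat, y_test)) / len(y_test)
r_score = pearson(y_hat, H_test)
print(f"accuracy={accuracy:.3f}  corr(y_hat, H)={r_score:.3f}")
```

The point of the sketch is that a binary classifier still yields a graded output: the predicted probability serves as the continuous score, which can then be correlated with the four-level rating.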

Results: Logistic regression combined with five acoustic features achieved a classification accuracy of 0.863 (VT-1), 0.847 (VT-2), and 0.742 (HS) on the test sets. A correlation of 0.797 (VT-1), 0.763 (VT-2), and 0.637 (HS) was obtained between ŷ and H, respectively. For 21 test subjects with two recordings, the model yielded a correlation of 0.592 (VT-1), 0.486 (VT-2), and 0.088 (HS) between Δŷ and ΔH.
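The change-based metric in the last sentence can be sketched as follows. The values below are invented for illustration and are not data from the study; the change is assumed to be a plain difference (second recording minus first), and Pearson correlation is assumed, since the abstract specifies neither.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


# Hypothetical (first, second) recordings for five subjects.
y_hat_pairs = [(0.82, 0.41), (0.35, 0.30), (0.60, 0.75), (0.15, 0.10), (0.90, 0.55)]
H_pairs = [(3, 1), (1, 1), (2, 3), (0, 0), (3, 2)]

# ASSUMED definition of change: second recording minus first recording.
delta_y_hat = [second - first for first, second in y_hat_pairs]
delta_H = [second - first for first, second in H_pairs]

# A high value means the model tracks within-subject change in perceived hoarseness.
r_delta = pearson(delta_y_hat, delta_H)
print(f"corr(delta_y_hat, delta_H) = {r_delta:.3f}")
```

This metric is stricter than the correlation between ŷ and H, because it asks whether the score moves in the same direction as the rating for the same speaker, which is consistent with the lower values reported for all three databases.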

Conclusion: While acoustic signals recorded during HSV show potential for quantitative hoarseness assessment, they are less reliable than voice therapy recordings due to practical challenges associated with oral laryngeal examination. Addressing these limitations, for example, through the use of flexible nasal endoscopy, could improve the quality of HSV-derived acoustic recordings and voice assessments.


Source
http://dx.doi.org/10.1016/j.jvoice.2024.12.008


Similar Publications

Background: Physiotherapy has emerged as an important health strategy to deliver lifestyle, exercise and physical activity for people with knee osteoarthritis. However, little is known about the extent to which physiotherapists adhere to clinical practice guidelines. This study aimed to explore the perspectives and care patterns of physiotherapists in Sydney, Australia on their delivery of knee osteoarthritis care and how this aligns with the 2019 Osteoarthritis Research Society International (OARSI) guidelines.


While developments within psychoanalytic thought have expanded our understanding of the phenomenon of countertransference and its meaning, clinicians are often left with a sense that negative countertransference is a sign of a therapist's own "unfinished business." Within the context of clinical supervision, the model of putting countertransference reactions through The Therapist Translator is introduced as a means of exploring how to best give voice to and make use of the emotional responses that arise from within the intersubjective analytic field. In this article, the authors introduce a group process dynamic that utilizes the free association of multiple clinicians to assist in "translating" relatively unformulated material into conscious, linguistic information that can be incorporated into the treatment.


Purpose: The aim was to determine and compare the short-term effects of two intensive semi-occluded vocal tract (SOVT) programs, "straw phonation" (SP) and "resonant voice therapy" (RVT), on the phonation of children with vocal fold nodules.

Method: A pretest-posttest randomized controlled study design was used. Thirty children aged 6-12 years were randomly assigned to the SP group (n = 11), RVT group (n = 11), or control group receiving indirect treatment (n = 8) for their voice problems.


Amniotic Fluid as a Potential Treatment for Vocal Fold Scar in a Rabbit Model.

J Voice

January 2025

Department of Otolaryngology - Head and Neck Surgery, University of Utah, Salt Lake City, UT; Department of Surgery, University of Utah, Salt Lake City, UT.

Objectives/hypothesis: Vocal fold (VF) injury and chronic inflammation can progress to scarring, which is notoriously difficult to treat. Human amniotic fluid (AF) has potential for VF wound healing in a rabbit model, and we hypothesized that AF would demonstrate wound healing properties superior to hyaluronic acid (HA) over time.

Study Design: Randomized, controlled trial.

