Objective: To analyze the performance of 10 different machine learning (ML) classifiers for discrimination between dysphonic and non-dysphonic voices, using a variance threshold as a method for the selection and reduction of acoustic measurements used in the classifier.
Method: We analyzed voice samples from 435 individuals (337 female and 98 male) with a mean age of 41.07 ± 13.73 years, of whom 384 were dysphonic and 51 were non-dysphonic. From each sustained /ε/ vowel sample, 34 acoustic measurements were extracted, including traditional perturbation and noise measurements, cepstral/spectral measurements, and measurements based on nonlinear models. The variance threshold method was used to select the best set of acoustic measurements. We tested the performance of the selected set with 10 ML classifiers using precision, sensitivity, specificity, accuracy, and F1-score. The kappa coefficient was used to verify the reproducibility between the two datasets (training and testing).
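As an illustration of the selection step, variance-threshold filtering can be sketched as follows. This is a minimal sketch, not the authors' pipeline: the function `select_by_variance`, the feature names, the toy values, and the threshold of 0.004 are all hypothetical, since the abstract does not report the threshold or any preprocessing. It follows the common convention of keeping a measurement only when its variance across samples exceeds the threshold.

```python
from statistics import pvariance


def select_by_variance(rows, names, threshold):
    """Keep only the columns whose population variance exceeds `threshold`.

    rows: list of samples, each a list of acoustic measurements.
    names: one label per column.
    """
    columns = list(zip(*rows))  # transpose: one tuple per measurement
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    reduced = [[row[i] for i in keep] for row in rows]
    return reduced, [names[i] for i in keep]


# Hypothetical toy values for illustration only, not data from the study.
names = ["jitter", "shimmer", "cpps", "constant"]
rows = [
    [0.4, 3.1, 14.2, 1.0],
    [0.5, 3.0, 9.8, 1.0],
    [0.4, 3.2, 17.5, 1.0],
    [0.5, 3.1, 11.1, 1.0],
]

# The threshold is an assumption; the abstract does not state one.
reduced, kept = select_by_variance(rows, names, threshold=0.004)
print(kept)
```

Because variance depends on the unit of each measurement, a filter of this kind is usually applied to measurements on comparable scales; whether the study rescaled its measurements first is not stated in the abstract.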
Results: The naive Bayes (NB) and stochastic gradient descent classifier (SGDC) models performed best in terms of accuracy, AUC, sensitivity, and specificity with the reduced set of 15 acoustic measures, compared with the full set of 34. Their accuracies were 0.91 (SGDC) and 0.76 (NB). Both classifiers showed moderate agreement, with kappa values of 0.57 (SGDC) and 0.45 (NB).
Conclusion: Among the tested models, the NB and SGDC models performed better in discriminating between dysphonic and non-dysphonic voices from a set of 15 acoustic measures.
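The evaluation measures named in the abstract can all be derived from the four cells of a binary confusion matrix. The sketch below is illustrative only: `binary_metrics` is a hypothetical helper, and it computes Cohen's kappa between predicted and reference labels, which is an assumption because the abstract does not specify how kappa was calculated. The example input is a hypothetical baseline that labels every one of the 435 samples as dysphonic; it is not a result from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute the reported metrics plus Cohen's kappa from confusion-matrix counts.

    Assumes every denominator is non-zero; a production version would guard
    against empty classes.
    """
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    return {
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy,
        "f1": 2 * tp / (2 * tp + fp + fn),
        "kappa": (accuracy - expected) / (1 - expected),
    }


# Hypothetical baseline: all 384 dysphonic and all 51 non-dysphonic samples
# are labelled "dysphonic" (positive class).
baseline = binary_metrics(tp=384, fp=51, tn=0, fn=0)
for name, value in baseline.items():
    print(f"{name}: {value:.3f}")
```

With 384 of 435 samples in the dysphonic class, this baseline reaches an accuracy of about 0.88 while its specificity and kappa are both zero, which is why accuracy alone is a weak summary for a class distribution this imbalanced.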
DOI: http://dx.doi.org/10.1016/j.jvoice.2022.11.001
J Voice
January 2025
Department of Otolaryngology-Head and Neck Surgery, Boston Medical Center, Boston, MA; Boston University Chobanian and Avedisian School of Medicine, Boston, MA.
Introduction: Patient-reported outcome measures (PROMs) represent an important part of a comprehensive voice assessment for clinical care and research. Access to multilingual PROMs enables inclusion of information from diverse patient populations. This review compares available translated and validated PROMs for adult dysphonia.
J Voice
November 2024
School of Engineering Mathematics and Technology, University of Bristol, Bristol, United Kingdom.
Objective: To develop a multiparametric index based on machine learning (ML) to predict and classify the overall degree of vocal deviation (GG).
Method: The sample consisted of 300 dysphonic and non-dysphonic participants of both sexes. The two speech tasks were a sustained vowel [a] and connected speech (counting from 1 to 10).
Cureus
October 2024
Otolaryngology-Head and Neck Surgery Universitary Center, Hospital Universitario "Dr. José Eleuterio González", Monterrey, MEX.
Background: Acoustic vocal analysis provides objective and measurable values for various voice parameters, such as fundamental frequency (F0), shimmer, jitter, and the noise-to-harmonics ratio (NHR). In severely dysphonic patients, who present increased variability in glottic cycles and abnormalities in vocal intensity, conventional acoustic analysis is an unreliable predictor of dysphonia. The logarithmic transformation of the vocal spectrum (cepstrum) allows the signal to be captured without relying on recording technique, frequency, or vocal intensity.
Folia Phoniatr Logop
September 2024
Department of Communication Sciences, Universidad de los Andes, Santiago, Chile.
Introduction: The present study aimed to validate the Voice-Related Quality of Life (V-RQOL) vocal self-assessment questionnaire for Spanish.
Methods: The validation and the assessment of psychometric properties followed the criteria of the Scientific Advisory Committee of the Medical Outcomes Trust (SAC). The Spanish translation of the V-RQOL, linguistically and culturally adapted, was used.
Laryngoscope
December 2024
Division of Pediatric Otolaryngology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, U.S.A.
Objectives: Posterior glottic diastasis (PGD) is an underappreciated etiology of dysphonia in patients with prior airway reconstruction or prolonged intubation. In endoscopic posterior cricoid reduction (ePCR), cricoid is removed to minimize the posterior glottic gap. Dynamic voice computed tomography (DVCT) permits visualization of the posterior glottis, estimating the amount of cricoid to be removed.