Comparison of Acoustic Voice Features Derived From Mobile Devices and Studio Microphone Recordings.

J Voice

School of Electrical and Electronic Engineering, University College Dublin, Dublin, Ireland; Insight Centre for Data Analytics, University College Dublin, Dublin, Ireland.

Published: November 2022

Objectives/hypothesis: Improvements in mobile device technology offer new opportunities for remote monitoring of voice for home and clinical assessment. However, there is a need to establish equivalence between features derived from signals recorded on mobile devices and those recorded with gold-standard microphone-preamplifiers. In this study, acoustic voice features from Android smartphone, tablet, and microphone-preamplifier recordings were compared.

Methods: Data were recorded from 37 volunteers (20 female) with no history of speech disorder and six volunteers with Huntington's disease (HD) during sustained vowel (SV) phonation, reading passage (RP), and five syllable repetition (SR) tasks. The following features were estimated: fundamental frequency median and standard deviation (F0 and SD F0), harmonics-to-noise ratio (HNR), local jitter, relative average perturbation of jitter (RAP), five-point period perturbation quotient (PPQ5), difference of differences of amplitude and periods (DDA and DDP), shimmer, and amplitude perturbation quotients (APQ3, APQ5, and APQ11).
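The perturbation features listed above can be illustrated with a minimal sketch. It assumes the commonly used Praat-style definitions, in which each measure is a mean absolute deviation normalised by the mean of the sequence; the abstract does not state which software or exact formulas the authors used. The function names and the example period values are hypothetical.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean.

    Applied to glottal periods this gives local jitter; applied to
    peak amplitudes it gives local shimmer (assumed Praat-style definition).
    """
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def perturbation_quotient(values, window):
    """Mean absolute deviation of each value from its centred moving average.

    window=3 on periods gives RAP, window=5 gives PPQ5; on amplitudes,
    windows of 3, 5, and 11 give APQ3, APQ5, and APQ11.
    """
    half = window // 2
    deviations = [
        abs(values[i] - mean(values[i - half:i + half + 1]))
        for i in range(half, len(values) - half)
    ]
    return mean(deviations) / mean(values)


def difference_of_differences(values):
    """Mean absolute second difference, divided by the mean (DDP or DDA)."""
    second_diffs = [
        abs((values[i + 1] - values[i]) - (values[i] - values[i - 1]))
        for i in range(1, len(values) - 1)
    ]
    return mean(second_diffs) / mean(values)


# Hypothetical glottal periods in seconds, for illustration only.
periods = [0.010, 0.011, 0.010, 0.011, 0.010]
jitter = local_perturbation(periods)
rap = perturbation_quotient(periods, 3)
ddp = difference_of_differences(periods)
```

Under these definitions DDP is exactly three times RAP, and likewise DDA is three times APQ3, so each pair carries the same information.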

Results: Bland-Altman analysis revealed good agreement between microphone and mobile devices for fundamental frequency, jitter, RAP, PPQ5, and DDP during all tasks, and a bias for HNR, shimmer, and its variants (APQ3, APQ5, APQ11, and DDA). Significant differences were observed between devices for HNR, shimmer, and its variants for all tasks. High correlation was observed between devices for all features, except SD F0 for RP. Similar results were observed in the HD group for the SV and SR tasks. Biological sex had a significant effect on F0 and HNR during all tasks, and on jitter, RAP, PPQ5, DDP, and shimmer for RP and SR. No significant effect of age was observed.
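As a rough illustration of the Bland-Altman analysis mentioned above, the sketch below computes the bias (mean paired difference) and the limits of agreement for one feature measured on two devices. It assumes the conventional limits of bias ± 1.96 sample standard deviations; the abstract does not state how the authors computed theirs. The function name and the example values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(reference, test, z=1.96):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of (test - reference); the limits of agreement are
    bias +/- z * sample standard deviation of the differences
    (z=1.96 is the conventional choice, assumed here).
    """
    if len(reference) != len(test):
        raise ValueError("paired measurements must have equal length")
    diffs = [t - r for r, t in zip(reference, test)]
    bias = mean(diffs)
    spread = z * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical median F0 values in Hz, for illustration only.
microphone_f0 = [100.0, 110.0, 120.0, 130.0]
smartphone_f0 = [101.0, 109.0, 122.0, 130.0]
bias, lower, upper = bland_altman(microphone_f0, smartphone_f0)
```

A bias near zero with narrow limits suggests the two devices can be used interchangeably for that feature, whereas a systematic offset, such as the one reported for HNR and shimmer, appears as a bias away from zero.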

Conclusions: Mobile devices provided good agreement with state-of-the-art, high-quality microphones during structured speech tasks for features derived from frequency components of the audio recordings. Caution should be taken when estimating HNR, shimmer, and its variants from recordings made with mobile devices.

Source
http://dx.doi.org/10.1016/j.jvoice.2022.10.006
