Being able to estimate speech intelligibility without the need for listening tests would confer great benefits for a wide range of speech processing applications. Many attempts have therefore been made to introduce an objective, and ideally reference-free, measure for this purpose. Most works analyze speech intelligibility prediction (SIP) methods from a macroscopic point of view, averaging over longer time spans. This paper, in contrast, presents a theoretical framework for the microscopic evaluation of SIP methods. Within our framework, a Statistically estimated Accuracy based on Theory (StAT) is derived, which numerically quantifies the statistical limitations inherent in microscopic SIP. A state-of-the-art approach to microscopic SIP, namely, the use of automatic speech recognition (ASR) to directly predict listening test results, is evaluated within this framework. The practical results are in good agreement with the theory. As the final contribution, a fully blind DIscriminative Speech intelligibility Predictor (DISP) is introduced and is also evaluated within the StAT framework. It is shown that this novel, blind estimator can predict intelligibility as well as, and often even with better accuracy than, the non-blind ASR-based approach, and that its results are again in good agreement with its theoretically derived performance potential.
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10065470 | PMC |
http://dx.doi.org/10.1109/taslp.2022.3184888 | DOI Listing |
Int J Audiol
January 2025
Department of Otorhinolaryngology and Head & Neck Surgery, Leiden University Medical Center, Leiden, Netherlands.
Objective: Measuring listening effort using pupillometry is challenging in cochlear implant (CI) users. We assess three validated speech tests (Matrix, LIST, and DIN) to identify the optimal speech material for measuring peak-pupil-dilation (PPD) in CI users as a function of signal-to-noise ratio (SNR).
Design: Speech tests were administered in quiet and two noisy conditions, namely at the speech recognition threshold (0 dB re SRT), i.
IEEE/ACM Trans Audio Speech Lang Process
February 2024
CRSS: Center for Robust Speech Systems; Cochlear Implant Processing Laboratory (CILab), Department of Electrical and Computer Engineering, University of Texas at Dallas, USA.
The presence of background noise or competing talkers is one of the main communication challenges for cochlear implant (CI) users in speech understanding in naturalistic spaces. These external factors distort the time-frequency (T-F) content including magnitude spectrum and phase of speech signals. While most existing speech enhancement (SE) solutions focus solely on enhancing the magnitude response, recent research highlights the importance of phase in perceptual speech quality.
Natl J Maxillofac Surg
November 2024
Department of Oral and Maxillofacial Surgery, Sharad Pawar Dental College and Hospital, Wardha, Maharashtra, India.
Background: Lip carcinomas account for 23.6% to 30% of tumors affecting the oral cavity. Lip reconstruction aims to achieve a competent oral sphincter with a good cosmetic appearance while avoiding microstomia.
Perspect ASHA Spec Interest Groups
December 2024
DeVault Otologic Research Laboratory, Department of Otolaryngology-Head and Neck Surgery, Indiana University School of Medicine, Indianapolis.
Purpose: Cochlear implants (CIs) have improved the quality of life for many children with severe-to-profound sensorineural hearing loss. Despite the reported CI benefits of improved speech recognition, speech intelligibility, and spoken language processing, large individual differences in speech and language outcomes are still consistently reported in the literature. The enormous variability in CI outcomes has made it challenging to predict which children may be at high risk for limited benefits and how potential risk factors can be improved with interventions.
Hear Res
January 2025
Institute of Sound and Vibration Research, University of Southampton, Southampton, United Kingdom.
The cortical tracking of the acoustic envelope is a phenomenon where the brain's electrical activity, as recorded by electroencephalography (EEG) signals, fluctuates in accordance with changes in stimulus intensity (the acoustic envelope of the stimulus). Understanding speech in a noisy background is a key challenge for people with hearing impairments. Speech stimuli are therefore more ecologically valid than clicks, tone pips, or speech tokens (e.