Microscopic and Blind Prediction of Speech Intelligibility: Theory and Practice.

IEEE/ACM Trans Audio Speech Lang Process

Cognitive signal processing group, Electrical engineering department, Ruhr-Universität Bochum, Universitätsstraße 150, 44801 Bochum, NRW, Germany.

Published: June 2022

Being able to estimate speech intelligibility without the need for listening tests would confer great benefits for a wide range of speech processing applications. Many attempts have therefore been made to introduce an objective, and ideally reference-free, measure for this purpose. Most works analyze speech intelligibility prediction (SIP) methods from a macroscopic point of view, averaging over longer time spans. This paper, in contrast, presents a theoretical framework for the microscopic evaluation of SIP methods. Within this framework, a Statistically estimated Accuracy based on Theory (StAT) is derived, which numerically quantifies the statistical limitations inherent in microscopic SIP. A state-of-the-art approach to microscopic SIP, namely the use of automatic speech recognition (ASR) to directly predict listening test results, is evaluated within this framework. The practical results are in good agreement with the theory. As the final contribution, a fully blind DIscriminative Speech intelligibility Predictor (DISP) is introduced and also evaluated within the StAT framework. This novel, blind estimator is shown to predict intelligibility as well as, and often with better accuracy than, the non-blind ASR-based approach, and its results are again in good agreement with its theoretically derived performance potential.
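
The idea of a statistical limit on microscopic prediction can be made concrete with a toy model. The sketch below is not the paper's StAT derivation; it assumes, purely for illustration, that a listener's recognition of a single word is a Bernoulli trial with success probability `p_correct`, and that a predictor knows only this probability. Under that assumption, no predictor of the individual outcome can be right more often than max(p, 1 - p). The function names `accuracy_ceiling` and `simulate_accuracy` are hypothetical.

```python
import random


def accuracy_ceiling(p_correct: float) -> float:
    """Upper bound on per-word prediction accuracy in the toy Bernoulli model.

    Assumption (not from the paper): the listener recognizes the word with
    probability p_correct, and the predictor knows only p_correct.
    """
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must lie in [0, 1]")
    # Always guessing the more likely outcome is the best possible strategy.
    return max(p_correct, 1.0 - p_correct)


def simulate_accuracy(p_correct: float, n_trials: int = 100_000, seed: int = 0) -> float:
    """Empirically check the bound by simulating listener responses."""
    rng = random.Random(seed)
    # The optimal predictor always picks the more likely outcome.
    prediction = p_correct >= 0.5
    hits = 0
    for _ in range(n_trials):
        outcome = rng.random() < p_correct  # True means the word was recognized
        hits += outcome == prediction
    return hits / n_trials


if __name__ == "__main__":
    for p in (0.5, 0.7, 0.95):
        print(p, accuracy_ceiling(p), round(simulate_accuracy(p), 3))
```

In this toy model the ceiling is lowest (0.5) when the listener is at chance for a word and approaches 1 only when the outcome is nearly deterministic, which is one way to see why per-word prediction is harder to evaluate than averaged, macroscopic scores.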

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10065470
DOI: http://dx.doi.org/10.1109/taslp.2022.3184888

Publication Analysis

Top Keywords

speech intelligibility: 16
sip methods: 8
microscopic sip: 8
good agreement: 8
speech: 6
intelligibility: 5
microscopic: 4
microscopic blind: 4
blind prediction: 4
prediction speech: 4

Similar Publications

Objective: Measuring listening effort using pupillometry is challenging in cochlear implant (CI) users. We assess three validated speech tests (Matrix, LIST, and DIN) to identify the optimal speech material for measuring peak-pupil-dilation (PPD) in CI users as a function of signal-to-noise ratio (SNR).

Design: Speech tests were administered in quiet and two noisy conditions, namely at the speech recognition threshold (0 dB re SRT), i.

Speech Enhancement for Cochlear Implant Recipients using Deep Complex Convolution Transformer with Frequency Transformation.

IEEE/ACM Trans Audio Speech Lang Process

February 2024

CRSS: Center for Robust Speech Systems; Cochlear Implant Processing Laboratory (CILab), Department of Electrical and Computer Engineering, University of Texas at Dallas, USA.

The presence of background noise or competing talkers is one of the main communication challenges for cochlear implant (CI) users in speech understanding in naturalistic spaces. These external factors distort the time-frequency (T-F) content including magnitude spectrum and phase of speech signals. While most existing speech enhancement (SE) solutions focus solely on enhancing the magnitude response, recent research highlights the importance of phase in perceptual speech quality.

Background: Lip carcinomas account for 23.6% to 30% of tumors affecting the oral cavity. Lip reconstruction aims to achieve a competent oral sphincter with a good cosmetic appearance while avoiding microstomia.

Some Challenging Questions About Outcomes in Children With Cochlear Implants.

Perspect ASHA Spec Interest Groups

December 2024

DeVault Otologic Research Laboratory, Department of Otolaryngology-Head and Neck Surgery, Indiana University School of Medicine, Indianapolis.

Purpose: Cochlear implants (CIs) have improved the quality of life for many children with severe-to-profound sensorineural hearing loss. Despite the reported CI benefits of improved speech recognition, speech intelligibility, and spoken language processing, large individual differences in speech and language outcomes are still consistently reported in the literature. The enormous variability in CI outcomes has made it challenging to predict which children may be at high risk for limited benefits and how potential risk factors can be improved with interventions.

The cortical tracking of the acoustic envelope is a phenomenon where the brain's electrical activity, as recorded by electroencephalography (EEG) signals, fluctuates in accordance with changes in stimulus intensity (the acoustic envelope of the stimulus). Understanding speech in a noisy background is a key challenge for people with hearing impairments. Speech stimuli are therefore more ecologically valid than clicks, tone pips, or speech tokens (e.
