Predicting speech intelligibility based on a correlation metric in the envelope power spectrum domain.

J Acoust Soc Am

Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.

Published: October 2016

A speech intelligibility prediction model is proposed that combines the auditory processing front end of the multi-resolution speech-based envelope power spectrum model [mr-sEPSM; Jørgensen, Ewert, and Dau (2013). J. Acoust. Soc. Am. 134(1), 436-446] with a correlation back end inspired by the short-time objective intelligibility measure [STOI; Taal, Hendriks, Heusdens, and Jensen (2011). IEEE Trans. Audio Speech Lang. Process. 19(7), 2125-2136]. This "hybrid" model, named sEPSM^corr, is shown to account for the effects of stationary and fluctuating additive interferers as well as for the effects of non-linear distortions, such as spectral subtraction, phase jitter, and ideal time-frequency segregation (ITFS). The model shows a broader predictive range than both the original mr-sEPSM (which fails in the phase-jitter and ITFS conditions) and STOI (which fails to predict the influence of fluctuating interferers), albeit with lower accuracy than the source models in some individual conditions. Similar to other models that employ a short-term correlation-based back end, including STOI, the proposed model fails to account for the effects of room reverberation on speech intelligibility. Overall, the model might be valuable for evaluating the effects of a large range of interferers and distortions on speech intelligibility, including consequences of hearing impairment and hearing-instrument signal processing.
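To make the idea of a short-term correlation back end more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a front end has already reduced the clean and the degraded speech to two envelope sequences of equal length, and it simply averages the Pearson correlation between them over non-overlapping windows. The function names `pearson` and `short_time_correlation`, the window length, and the toy envelopes are all hypothetical; the abstract does not specify filterbanks, window durations, or how the correlation is mapped to intelligibility.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 when either sequence is constant, since the
    correlation is undefined in that case.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den if den > 0 else 0.0


def short_time_correlation(clean_env, degraded_env, win_len):
    """Mean Pearson correlation over non-overlapping windows.

    clean_env, degraded_env: envelope samples (assumed to come from
    some auditory front end that is not modelled here).
    win_len: window length in samples (an illustrative choice).
    """
    if len(clean_env) != len(degraded_env):
        raise ValueError("envelopes must have equal length")
    if win_len < 2 or win_len > len(clean_env):
        raise ValueError("win_len must be in [2, len(envelope)]")
    scores = []
    for start in range(0, len(clean_env) - win_len + 1, win_len):
        stop = start + win_len
        scores.append(pearson(clean_env[start:stop], degraded_env[start:stop]))
    return sum(scores) / len(scores)


# Toy envelopes: a 4 Hz modulation sampled at 200 Hz for one second.
clean = [1.0 + 0.5 * math.sin(2 * math.pi * 4 * i / 200) for i in range(200)]
flat = [1.0] * 200                    # modulation completely removed
inverted = [2.0 - c for c in clean]   # modulation phase-inverted

same_score = short_time_correlation(clean, clean, 50)
flat_score = short_time_correlation(clean, flat, 50)
inverted_score = short_time_correlation(clean, inverted, 50)
```

In this toy setting an unchanged envelope scores close to 1, a flattened envelope scores 0, and a phase-inverted envelope scores close to -1. A windowed measure of this kind compares envelope shape within each short segment and disregards overall level, which is one way to picture how it differs from a power-ratio metric.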


Source
http://dx.doi.org/10.1121/1.4964505


