Binaural speech intelligibility in rooms is a complex process that is affected by many factors including room acoustics, hearing loss, and hearing aid (HA) signal processing. Intelligibility is evaluated in this paper for a simulated room combined with a simulated hearing aid. The test conditions comprise three spatial configurations of the speech and noise sources, simulated anechoic and concert hall acoustics, three amounts of multitalker babble interference, the hearing status of the listeners, and three degrees of simulated HA processing provided to compensate for the noise and/or hearing loss.
When reproducing sounds over headphones, the simulated source can be externalized (i.e., perceived outside the head) or internalized (i.
Many objective measurements have been proposed to evaluate sound reproduction, but it is often difficult to link measured differences with the differences perceived by listeners. In the literature, the best correlations with perception were obtained for measures involving an auditory model. The present study investigated simpler measurements to highlight the signal processing steps required to make the link with perception.
Reverberation can have a strong detrimental effect on speech intelligibility in noise. Two main monaural effects were studied here: the temporal smearing of the target speech, which makes the speech less understandable, and the temporal smearing of the noise, which reduces the opportunity for listening in the masker dips. These phenomena have been shown to affect normal-hearing (NH) listeners.
This study investigated the role of harmonic cancellation in the intelligibility of speech in "cocktail party" situations. While there is evidence that harmonic cancellation plays a role in the segregation of simple harmonic sounds based on fundamental frequency (F0), its utility for mixtures of speech containing non-stationary F0s and unvoiced segments is unclear. Here we focused on the energetic masking of speech targets caused by competing speech maskers.
The aim of this study was to extend the harmonic-cancellation model proposed by Prud'homme et al. [J. Acoust.
This study investigated the effect of hearing loss on binaural unmasking (BU) for the intelligibility of speech in noise. Speech reception thresholds (SRTs) were measured with normal-hearing (NH) listeners and older mildly hearing-impaired (HI) listeners while varying the presentation level of the stimuli, reverberation, modulation of the noise masker, and spatial separation of the speech and noise sources. On average across conditions, the NH listeners benefited more (by 0.
This study aimed at predicting individual differences in speech reception thresholds (SRTs) in the presence of symmetrically placed competing talkers for young listeners with sensorineural hearing loss. An existing binaural model incorporating the individual audiogram was revised to handle severe hearing losses by (a) taking as input the target speech level at SRT in a given condition and (b) introducing a floor in the model to limit extreme negative better-ear signal-to-noise ratios. The floor value was first set using SRTs measured with stationary and modulated noises.
Neuroscientific and clinical studies on auditory perception often use headphones to limit sound interference. In these conditions, sounds are perceived as internalized because they lack the sound-attributes that normally occur with a sound produced from a point in space around the listener. Without the spatial attention mechanisms that occur with localized sounds, auditory functional assessments could thus be underestimated.
Older adults with hearing loss have greater difficulty recognizing target speech in multi-talker environments than young adults with normal hearing, especially when target and masker speech streams are perceptually similar. A difference in fundamental frequency (f0) contour depth is an effective stream segregation cue for young adults with normal hearing. This study examined whether older adults with varying degrees of sensorineural hearing loss are able to utilize differences in target/masker f0 contour depth to improve speech recognition in multi-talker listening.
A binaural model predicting speech intelligibility in envelope-modulated noise for normal-hearing (NH) and hearing-impaired listeners is proposed. The study shows the importance of considering an internal noise with two components relying on the individual audiogram and the level of the external stimuli. The model was optimized and verified using speech reception thresholds previously measured in three experiments involving NH and hearing-impaired listeners and sharing common methods.
This work aims to predict speech intelligibility against harmonic maskers. Unlike noise maskers, harmonic maskers (including speech) have a harmonic structure that may allow for a release from masking based on fundamental frequency (F0). Mechanisms, such as spectral glimpsing and harmonic cancellation, have been proposed to explain F0 segregation, but their relative contributions and ability to predict behavioral data have not been explored.
J Acoust Soc Am
September 2020
The present study investigated whether the perception of virtual auditory distance is binaural, monaural, or both. Listeners evaluated the distance of a frontal source of pink noise simulated in a room via headphones. Experiment 1 was performed with eyes closed in a soundproof booth.
Sound externalization, or the perception that a sound source is outside of the head, is an intriguing phenomenon that has long interested psychoacousticians. While previous reviews are available, the past few decades have produced a substantial amount of new data. In this review, we aim to synthesize those data and to summarize advances in our understanding of the phenomenon.
Listeners with sensorineural hearing loss routinely experience less spatial release from masking (SRM) in speech mixtures than listeners with normal hearing. Hearing-impaired listeners have also been shown to have degraded temporal fine structure (TFS) sensitivity, a consequence of which is degraded access to interaural time differences (ITDs) contained in the TFS. Since these "binaural TFS" cues are critical for spatial hearing, it has been hypothesized that degraded binaural TFS sensitivity accounts for the limited SRM experienced by hearing-impaired listeners.
When a target talker speaks in the presence of competing talkers, the listener must not only segregate the voices but also understand the target message based on a limited set of spectrotemporal regions ("glimpses") in which the target voice dominates the acoustic mixture. Here, the hypothesis that a broad audible bandwidth is more critical for these sparse representations of speech than it is for intact speech is tested. Listeners with normal hearing were presented with sentences that were either intact, or progressively "glimpsed" according to a competing two-talker masker presented at various levels.
Sounds presented over headphones are generally perceived as internalized, i.e., originating from a source inside the head.
Differences in spatial cues, including interaural time differences (ITDs), interaural level differences (ILDs) and spectral cues, can lead to stream segregation of alternating noise bursts. It is unknown how effective such cues are for streaming sounds with realistic spectro-temporal variations. In particular, it is not known whether the high-frequency spectral cues associated with elevation remain sufficiently robust under such conditions.
Differences in fundamental frequency (F0), modulations in the masker envelope, and differences in spatial location between a speech target and a masker can improve speech intelligibility in cocktail-party situations. These cues have been thoroughly investigated independently and associated with unmasking mechanisms: F0 segregation, temporal dip listening and spatial unmasking, respectively. Two experiments were conducted to examine whether F0 segregation interacts with spatial unmasking (experiment 1) or temporal modulations in the masker envelope (experiment 2) by measuring speech reception thresholds for a monotonized or an intonated voice against eight types of harmonic complex masker.
Differences in fundamental frequency (F0) between voiced sounds are known to be a strong cue for stream segregation. However, speech consists of both voiced and unvoiced sounds, and less is known about whether and how the unvoiced portions are segregated. This study measured listeners' ability to integrate or segregate sequences of consonant-vowel tokens, comprising a voiceless fricative and a vowel, as a function of the F0 difference between interleaved sequences of tokens.
The perceptual differences between the sound reproductions of headphones were investigated in a pair-wise comparison study. Two musical excerpts were reproduced over 21 headphones positioned on a mannequin and recorded. The recordings were then processed and reproduced over one set of headphones to listeners, who were asked to evaluate their perceived degree of dissimilarity.
A difference in fundamental frequency (ΔF0) and a difference in spatial location (ΔSL) are two cues known to provide masking releases when multiple speakers talk at once in a room. We examined situations in which reverberation should have no effect on the mechanisms underlying the releases from energetic masking produced by these two cues. Speech reception thresholds using both unpredictable target sentences and the coordinate response measure followed a similar pattern.
Adv Exp Med Biol
September 2016
The speech intelligibility index (SII) calculation is based on the assumption that the effective range of signal-to-noise ratio (SNR) regarding speech intelligibility is [-15 dB; +15 dB]. In a specific frequency band, speech intelligibility would remain constant by varying the SNRs above +15 dB or below -15 dB. These assumptions were tested in four experiments measuring speech reception thresholds (SRTs) with a speech target and speech-spectrum noise, while attenuating target or noise above or below 1400 Hz, with different levels of attenuation in order to test different SNRs in the two bands.
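The SNR-range assumption described above can be illustrated with a minimal sketch. It is not the full SII procedure: it only shows how a band SNR would be clipped to [-15 dB; +15 dB] and mapped linearly to a 0-to-1 audibility value, then combined across bands. The function names, the two-band split, and the importance weights are hypothetical and chosen for illustration only.

```python
def band_audibility(snr_db, floor_db=-15.0, ceil_db=15.0):
    """Map a band SNR (dB) to an audibility value in [0, 1].

    SNRs outside [floor_db, ceil_db] are clipped, so varying the SNR
    beyond either limit leaves the result unchanged (the assumption
    under test in the abstract above).
    """
    clipped = max(floor_db, min(ceil_db, snr_db))
    return (clipped - floor_db) / (ceil_db - floor_db)


def sii_sketch(band_snrs_db, band_importance):
    """Importance-weighted mean of band audibilities (illustrative only).

    band_importance holds hypothetical weights; they are normalized here
    so they do not need to sum to 1.
    """
    total = sum(band_importance)
    weighted = sum(w * band_audibility(s)
                   for s, w in zip(band_snrs_db, band_importance))
    return weighted / total


if __name__ == "__main__":
    # Two hypothetical bands (below / above 1400 Hz) with equal weights.
    print(sii_sketch([20.0, -5.0], [0.5, 0.5]))
```

Under this sketch, a band at +20 dB contributes exactly as much as one at +15 dB, and a band at -30 dB contributes nothing, which is the constancy that the four experiments were designed to test.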