In concurrent-speech recognition, performance is enhanced when either the glottal pulse rate (GPR) or the vocal tract length (VTL) of the target speaker differs from that of the distracter, but relatively little is known about the trading relationship between the two variables, or how they interact with other cues such as signal-to-noise ratio (SNR). This paper presents a study in which listeners were asked to identify a target syllable in the presence of a distracter syllable, with carefully matched temporal envelopes. The syllables varied in GPR and VTL over a large range, and they were presented at different SNRs. The results showed that performance is particularly sensitive to the combination of GPR and VTL when the SNR is 0 dB. Equal-performance contours showed that when there are no other cues, a two-semitone difference in GPR produced the same advantage in performance as a 20% difference in VTL. This corresponds to a trading relationship between GPR and VTL of 1.6. The results illustrate that the auditory system can use any combination of differences in GPR, VTL, and SNR to segregate competing speech signals.
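As a rough consistency check on the figures above (a sketch under stated assumptions, not the authors' own derivation): a two-semitone step corresponds to a frequency ratio of 2^(2/12), and if the trading value is assumed to be the ratio of the logarithmic VTL change to the logarithmic GPR change, the reported value of 1.6 follows.

```latex
% Assumption: the trading value is the ratio of log scale factors (VTL over GPR).
\begin{align*}
\text{GPR ratio for two semitones} &= 2^{2/12} \approx 1.122
  \quad (\text{about a } 12\% \text{ difference}) \\
\frac{\ln(1.20)}{\ln\!\left(2^{2/12}\right)} &\approx \frac{0.1823}{0.1155} \approx 1.58 \approx 1.6
\end{align*}
```

Taking the ratio of the percentage differences directly (20% over roughly 12.2%) also gives about 1.6, so the check does not depend strongly on which convention is assumed.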
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2833416 | PMC |
http://dx.doi.org/10.1121/1.3050321 | DOI Listing |
J Acoust Soc Am
October 2018
International Laboratory for Brain, Music and Sound Research (BRAMS), Department of Psychology, Université de Montréal, Pavillon 1420 Boulevard Mont-Royal, Outremont, Quebec, H2V 4P3, Canada.
Cocktail parties pose a difficult yet solvable problem for the auditory system. Previous work has shown that the cocktail-party problem is considerably easier when all sounds in the target stream are spoken by the same talker (the voice-continuity benefit). The present study investigated the contributions of two of the most salient voice features, glottal-pulse rate (GPR) and vocal-tract length (VTL), to the voice-continuity benefit.
J Acoust Soc Am
February 2009
Department of Physiology, Centre for the Neural Basis of Hearing, University of Cambridge, Cambridge, United Kingdom.
In concurrent-speech recognition, performance is enhanced when either the glottal pulse rate (GPR) or the vocal tract length (VTL) of the target speaker differs from that of the distracter, but relatively little is known about the trading relationship between the two variables, or how they interact with other cues such as signal-to-noise ratio (SNR). This paper presents a study in which listeners were asked to identify a target syllable in the presence of a distracter syllable, with carefully matched temporal envelopes. The syllables varied in GPR and VTL over a large range, and they were presented at different SNRs.
J Acoust Soc Am
December 2007
Centre for Neural Basis of Hearing, Department of Physiology, Development and Neuroscience, University of Cambridge and Department of Psychology, University of Hull, United Kingdom.
A recent study [Smith and Patterson, J. Acoust. Soc.
Neuroimage
August 2006
Wellcome Department of Imaging Neuroscience, Institute of Neurology, University College London, Queen Square, London WC1N 3BG, UK.
The length of a vocal tract is reflected in the sound it is producing. The length of the vocal tract is correlated with body size and humans are very good at making size judgments based on the acoustic effect of vocal tract length only. Here we investigate the underlying mechanism for processing this main auditory cue to size information in the human brain.
J Acoust Soc Am
November 2005
Centre for Neural Basis of Hearing, Department of Physiology, University of Cambridge, Downing Street, Cambridge CB2 3EG, United Kingdom.
Glottal-pulse rate (GPR) and vocal-tract length (VTL) are related to the size, sex, and age of the speaker but it is not clear how the two factors combine to influence our perception of speaker size, sex, and age. This paper describes experiments designed to measure the effect of the interaction of GPR and VTL upon judgements of speaker size, sex, and age. Vowels were scaled to represent people with a wide range of GPRs and VTLs, including many well beyond the normal range of the population, and listeners were asked to judge the size and sex/age of the speaker.