What happens to an acoustic signal after it enters the mind of a listener? Previous work has demonstrated that listeners maintain intermediate representations over time. However, the internal structure of such representations, be they the acoustic-phonetic signal itself or more general information about the probability of possible categories, remains underspecified. We present two experiments using a novel speaker-adaptation paradigm aimed at uncovering the format of speech representations. We exposed adult listeners (N = 297) to a speaker whose utterances contained acoustically ambiguous information concerning phones (and thus words), and we manipulated the temporal availability of disambiguating cues via visually presented text (presented before or after each utterance). Results from a traditional phoneme-categorization task showed that listeners adapted to a modified acoustic distribution when disambiguating text was provided before but not after the audio. These results support the position that speech representations consist of activation over categories and are inconsistent with direct maintenance of the acoustic-phonetic signal.
DOI: http://dx.doi.org/10.1177/0956797620968787
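To make the categorization analysis concrete, studies in this paradigm typically fit a logistic psychometric function to responses along an acoustic continuum and compare category boundaries across exposure conditions. The Python sketch below illustrates that comparison with invented data; the continuum, response proportions, and names are hypothetical, not taken from the study.

```python
# Illustrative sketch: fit logistic psychometric functions to
# phoneme-categorization responses and compare category boundaries.
# All data below are made up; only the shape of the analysis is real.
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    """Proportion of one category response per continuum step.
    x0 is the category boundary (50% point); k is the slope."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

steps = np.arange(1, 8)  # hypothetical 7-step phonetic continuum
# Hypothetical mean response proportions before and after exposure:
before = np.array([0.97, 0.93, 0.80, 0.52, 0.20, 0.08, 0.03])
after  = np.array([0.98, 0.95, 0.90, 0.75, 0.45, 0.15, 0.05])

(b0, _), _ = curve_fit(logistic, steps, before, p0=[4.0, -1.0])
(a0, _), _ = curve_fit(logistic, steps, after,  p0=[4.0, -1.0])
print(f"boundary shift: {a0 - b0:+.2f} continuum steps")
```

A boundary shift toward the exposure-consistent category would indicate adaptation of the kind the abstract describes.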
Forensic Sci Int
October 2024
Department of English and Communication, Hong Kong Polytechnic University, Hong Kong.
A growing number of studies in forensic voice comparison have explored how elements of phonetic analysis and automatic speaker recognition systems may be integrated for optimal speaker discrimination performance. However, few studies have investigated the evidential value of long-term speech features using forensically relevant speech data. This paper reports an empirical validation study that assesses the evidential strength of the following long-term features: fundamental frequency (F0), formant distributions, laryngeal voice quality, mel-frequency cepstral coefficients (MFCCs), and combinations thereof.
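As a rough illustration of what extracting such long-term features can look like, the sketch below computes utterance-level F0 and MFCC statistics with librosa. The file name and parameter choices are assumptions for demonstration, not the paper's actual pipeline; in a real validation study these statistics would feed a likelihood-ratio scoring model.

```python
# Minimal sketch of long-term feature extraction for speaker comparison.
# Parameters are illustrative defaults, not the study's settings.
import numpy as np
import librosa

def long_term_features(path):
    y, sr = librosa.load(path, sr=16000)
    # Long-term F0 distribution via probabilistic YIN (NaN = unvoiced).
    f0, _, _ = librosa.pyin(y, fmin=60, fmax=400, sr=sr)
    f0 = f0[~np.isnan(f0)]
    # Long-term MFCC distribution: per-coefficient means and SDs.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return {
        "f0_mean": float(np.mean(f0)),
        "f0_sd": float(np.std(f0)),
        "mfcc_mean": mfcc.mean(axis=1),
        "mfcc_sd": mfcc.std(axis=1),
    }

feats = long_term_features("recording.wav")  # hypothetical file
```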
Brain Res
December 2024
Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA; Program in Neuroscience, Indiana University, Bloomington, IN, USA; Cognitive Science Program, Indiana University, Bloomington, IN, USA.
Atten Percept Psychophys
July 2024
University of Connecticut, Storrs, CT, USA.
How listeners weight a wide variety of information to interpret ambiguities in the speech signal is a question of interest in speech perception, particularly for understanding how listeners process speech in the context of phrases or sentences. Dominant views of cue use for language comprehension posit that listeners integrate multiple sources of information to interpret ambiguities in the speech signal. Here, we study how semantic context, sentence rate, and vowel length jointly influence identification of word-final stops.
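One common way to estimate how listeners weight such cues is a logistic regression predicting categorization responses from each cue, with the fitted coefficients serving as cue weights. The sketch below simulates that analysis; the predictors, underlying weights, and data are invented for illustration and are not the study's model.

```python
# Schematic cue-weighting analysis: logistic regression predicting
# "voiced" responses from several cues. Data are simulated, not real.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
vowel_len = rng.normal(0.0, 1.0, n)   # z-scored vowel duration
rate = rng.normal(0.0, 1.0, n)        # z-scored sentence rate
context = rng.integers(0, 2, n)       # semantic context: biases voiced (1) or not (0)

# Simulate responses with unequal underlying cue weights.
true_logit = 1.5 * vowel_len - 0.6 * rate + 0.8 * context
resp = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(int)

X = np.column_stack([vowel_len, rate, context])
model = LogisticRegression().fit(X, resp)
for name, w in zip(["vowel_len", "rate", "context"], model.coef_[0]):
    print(f"{name:>10}: weight {w:+.2f}")
```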
Acoustic information in speech changes continuously, yet listeners form discrete perceptual categories to ease the demands of perception. Being a more continuous/gradient as opposed to a discrete/categorical listener may be further advantageous for understanding speech in noise by increasing perceptual flexibility and resolving ambiguity. The degree to which a listener's responses to a continuum of speech sounds are categorical versus continuous can be quantified using visual analog scaling (VAS) during speech labeling tasks.
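As an illustration of how VAS ratings can index gradiency, one approach fits a logistic function to ratings along a continuum and uses the fitted slope: steeper slopes suggest a more categorical listener, shallower slopes a more gradient one. The ratings below are invented for demonstration.

```python
# Illustrative gradiency index from visual analog scale (VAS) ratings:
# fit a logistic across the continuum; the slope indexes categoricity.
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

steps = np.arange(1, 8)
categorical_vas = np.array([0.02, 0.03, 0.05, 0.50, 0.95, 0.97, 0.98])
gradient_vas    = np.array([0.10, 0.22, 0.35, 0.50, 0.65, 0.78, 0.90])

for label, vas in [("categorical", categorical_vas), ("gradient", gradient_vas)]:
    (_, k), _ = curve_fit(logistic, steps, vas, p0=[4.0, 1.0])
    print(f"{label:>11}: slope {k:.2f}")
```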
J Speech Lang Hear Res
January 2024
Department of Psychological and Brain Sciences, University of Iowa, Iowa City.
Purpose: Talkers adapt their speech according to the demands of their listeners and the communicative context, enhancing the properties of the signal (pitch, intensity) and/or properties of the code (enhancement of phonemic contrasts). This study asked how mothers adapt their child-directed speech (CDS) in ways that might serve the immediate goal of increasing intelligibility, as well as the long-term goal of supporting speech and language development in their children.
Method: Mothers (N = 28) participated in a real-time interactive speech production/perception paradigm, in which mothers instructed their young (3- to 5-year-old) children, or an adult listener, to select the picture corresponding to a target word.
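In related work, enhancement of phonemic contrasts is often operationalized as expansion of the vowel space defined by the corner vowels; whether this study used that measure is an assumption here. A minimal sketch of the measure, with made-up formant values:

```python
# Sketch: vowel space area as one operationalization of phonemic-contrast
# enhancement, computed as the shoelace area of the corner vowels /i a u/
# in F1-F2 space. Formant values below are invented.
import numpy as np

def vowel_space_area(formants):
    """Polygon area (Hz^2) from ordered (F1, F2) corner-vowel points."""
    f1 = np.array([p[0] for p in formants], dtype=float)
    f2 = np.array([p[1] for p in formants], dtype=float)
    # Shoelace formula over the closed polygon.
    return 0.5 * abs(np.dot(f1, np.roll(f2, 1)) - np.dot(f2, np.roll(f1, 1)))

adult_directed = [(300, 2300), (800, 1300), (320, 900)]   # /i/, /a/, /u/
child_directed = [(280, 2500), (900, 1250), (300, 850)]

ratio = vowel_space_area(child_directed) / vowel_space_area(adult_directed)
print(f"CDS/ADS vowel space ratio: {ratio:.2f}")  # >1 suggests expansion
```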