What happens to an acoustic signal after it enters the mind of a listener? Previous work has demonstrated that listeners maintain intermediate representations over time. However, the internal structure of such representations (be they the acoustic-phonetic signal itself or more general information about the probability of possible categories) remains underspecified. We present two experiments using a novel speaker-adaptation paradigm aimed at uncovering the format of speech representations. We exposed adult listeners (N = 297) to a speaker whose utterances contained acoustically ambiguous information concerning phones (and thus words), and we manipulated the temporal availability of disambiguating cues via visually presented text (presented before or after each utterance). Results from a traditional phoneme-categorization task showed that listeners adapted to a modified acoustic distribution when disambiguating text was provided before, but not after, the audio. These results support the position that speech representations consist of activation over categories and are inconsistent with direct maintenance of the acoustic-phonetic signal.
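For concreteness, a minimal sketch of how adaptation in such a paradigm is commonly quantified: fit a logistic psychometric function to phoneme-categorization responses along an acoustic continuum and compare the 50% category boundary across exposure conditions. The continuum, response proportions, and condition labels below are invented for illustration and are not the article's data or analysis code.

```python
# Hypothetical sketch: estimate the category boundary from phoneme-categorization
# responses along an acoustic continuum (e.g., /s/-/sh/), per exposure condition.
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, boundary, slope):
    """Psychometric function: P(respond "sh") as a function of continuum step."""
    return 1.0 / (1.0 + np.exp(-slope * (x - boundary)))

def category_boundary(steps, prop_sh):
    """Fit the logistic and return the 50% crossover (the category boundary)."""
    (boundary, slope), _ = curve_fit(logistic, steps, prop_sh, p0=[np.mean(steps), 1.0])
    return boundary

# Invented example data: proportion of "sh" responses at each continuum step.
steps = np.arange(1, 8)
text_before = np.array([0.02, 0.05, 0.15, 0.55, 0.85, 0.95, 0.99])  # boundary shifted
text_after  = np.array([0.02, 0.04, 0.10, 0.35, 0.80, 0.95, 0.99])  # baseline-like

shift = category_boundary(steps, text_before) - category_boundary(steps, text_after)
print(f"Boundary shift (before minus after): {shift:.2f} continuum steps")
```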

Source
http://dx.doi.org/10.1177/0956797620968787

Publication Analysis

Top Keywords

acoustic-phonetic signal (8), speech representations (8), hear immediacy (4), immediacy linguistic (4), linguistic computation (4), computation representation (4), representation speech (4), speech acoustic (4), acoustic signal (4), signal enters (4)

Similar Publications

A growing number of studies in forensic voice comparison have explored how elements of phonetic analysis and automatic speaker recognition systems may be integrated for optimal speaker discrimination performance. However, few studies have investigated the evidential value of long-term speech features using forensically relevant speech data. This paper reports an empirical validation study that assesses the evidential strength of the following long-term features: fundamental frequency (F0), formant distributions, laryngeal voice quality, mel-frequency cepstral coefficients (MFCCs), and combinations thereof.
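As a rough illustration of what a long-term feature profile might look like (not the paper's validation pipeline), the sketch below summarises mean F0 and mean MFCCs per recording and compares two recordings with a plain distance score. The file names, frequency ranges, and use of librosa are assumptions, and a real forensic comparison would report calibrated likelihood ratios against a relevant reference population rather than a raw distance.

```python
# Hypothetical sketch: summarise long-term speaker features (mean F0 and mean MFCCs)
# for two recordings and compare them with a simple distance score.
import numpy as np
import librosa

def long_term_features(path, n_mfcc=13):
    y, sr = librosa.load(path, sr=None)
    # Long-term F0: mean of voiced pYIN estimates over the whole recording.
    f0, voiced, _ = librosa.pyin(y, fmin=60.0, fmax=400.0, sr=sr)
    mean_f0 = np.nanmean(np.where(voiced, f0, np.nan))
    # Long-term cepstral profile: MFCC means across all frames.
    mfcc_means = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)
    return np.concatenate([[mean_f0], mfcc_means])

# Placeholder file names for illustration only.
questioned = long_term_features("questioned_sample.wav")
known = long_term_features("known_speaker.wav")
print("Distance between long-term profiles:", np.linalg.norm(questioned - known))
```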

Functional benefits of continuous vs. categorical listening strategies on the neural encoding and perception of noise-degraded speech.

Brain Res

December 2024

Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA; Program in Neuroscience, Indiana University, Bloomington, IN, USA; Cognitive Science Program, Indiana University, Bloomington, IN, USA.

Article Synopsis
  • Speech perception involves categorizing continuous acoustic signals into distinct categories, which helps listeners manage complex auditory information.
  • Listeners classified vowels along an acoustic continuum in both clear and noisy environments, revealing differences in their perceptual strategies, with visual analog scaling (VAS) enhancing flexibility.
  • Findings indicate that a more gradient approach to listening, characterized by smoother discrimination of speech sounds, improves performance in noise, supported by neural data showing stronger responses in areas linked to auditory processing.

How listeners weight a wide variety of information to interpret ambiguities in the speech signal is a question of interest in speech perception, particularly for understanding how listeners process speech in the context of phrases or sentences. Dominant views of cue use for language comprehension posit that listeners integrate multiple sources of information to interpret ambiguities in the speech signal. Here, we study how semantic context, sentence rate, and vowel length all influence identification of word-final stops.
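A common way to estimate such cue weights (sketched here with invented data and variable names, not the authors' analysis) is a logistic regression predicting the voicing judgment from the cues of interest; each coefficient indexes that cue's influence, in the units of that cue.

```python
# Hypothetical sketch: weight semantic context, speech rate, and vowel duration
# as predictors of word-final stop identification ("d" vs. "t") via logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500

# Invented predictors: biasing context (+1 favors "d", -1 favors "t"),
# speech rate (syllables/s), and vowel duration (ms).
context = rng.choice([-1.0, 1.0], size=n)
rate = rng.uniform(3.0, 7.0, size=n)
vowel_ms = rng.uniform(80.0, 220.0, size=n)

# Simulated responses: longer vowels and "d"-biasing contexts favor "d" responses.
logit = 0.8 * context + 0.03 * (vowel_ms - 150) - 0.3 * (rate - 5)
resp_d = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([context, rate, vowel_ms])
model = LogisticRegression().fit(X, resp_d)
# Raw coefficients depend on each cue's scale; standardize predictors to compare them.
print("Cue weights (context, rate, vowel duration):", model.coef_[0])
```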

Acoustic information in speech changes continuously, yet listeners form discrete perceptual categories to ease the demands of perception. Being a more continuous/gradient as opposed to a discrete/categorical listener may be further advantageous for understanding speech in noise by increasing perceptual flexibility and resolving ambiguity. The degree to which a listener's responses to a continuum of speech sounds are categorical versus continuous can be quantified using visual analog scaling (VAS) during speech labeling tasks.
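As an illustration of one way VAS responses could be reduced to a single gradiency index (the data and the slope-based measure below are assumptions, not necessarily the study's metric), a logistic fit across the continuum yields a steep slope for a categorical listener and a shallow slope for a gradient one.

```python
# Hypothetical sketch: index how categorical vs. gradient a listener's VAS
# responses are via the slope of a logistic fit across a speech continuum.
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, midpoint, slope):
    return 1.0 / (1.0 + np.exp(-slope * (x - midpoint)))

def gradiency_slope(steps, vas_ratings):
    """Fit a logistic to mean VAS ratings (0-1); a larger slope = more categorical."""
    (_, slope), _ = curve_fit(logistic, steps, vas_ratings, p0=[np.mean(steps), 1.0])
    return slope

# Invented mean VAS ratings (0 = endpoint A, 1 = endpoint B) at each continuum step.
steps = np.arange(1, 8)
categorical_listener = np.array([0.03, 0.05, 0.10, 0.50, 0.90, 0.95, 0.97])
gradient_listener    = np.array([0.15, 0.28, 0.40, 0.50, 0.62, 0.75, 0.85])

print("categorical listener slope:", round(gradiency_slope(steps, categorical_listener), 2))
print("gradient listener slope:   ", round(gradiency_slope(steps, gradient_listener), 2))
```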

Purpose: Talkers adapt their speech according to the demands of their listeners and the communicative context, enhancing the properties of the signal (pitch, intensity) and/or properties of the code (enhancement of phonemic contrasts). This study asked how mothers adapt their child-directed speech (CDS) in ways that might serve the immediate goals of increasing intelligibility, as well as long-term goals of supporting speech and language development in their children.

Method: Mothers (N = 28) participated in a real-time interactive speech production/perception paradigm, in which mothers instructed their young (3- to 5-year-old) children, or an adult listener, to select the picture corresponding to a target word.
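For concreteness, a sketch with invented per-utterance measurements of the kind of signal-level comparison described above: mean F0, F0 range, and intensity in child-directed versus adult-directed productions of the same target words.

```python
# Hypothetical sketch: compare signal properties of child-directed (CDS) vs.
# adult-directed (ADS) productions using invented per-utterance measurements.
import numpy as np

rng = np.random.default_rng(1)
n_utts = 40

# Invented per-utterance acoustics: mean F0 (Hz), F0 range (semitones), intensity (dB).
cds = {"f0_mean": rng.normal(260, 20, n_utts),
       "f0_range": rng.normal(12, 2, n_utts),
       "intensity": rng.normal(68, 3, n_utts)}
ads = {"f0_mean": rng.normal(210, 15, n_utts),
       "f0_range": rng.normal(7, 2, n_utts),
       "intensity": rng.normal(64, 3, n_utts)}

# Signal enhancement as the CDS-minus-ADS difference in each property.
for prop in cds:
    enhancement = cds[prop].mean() - ads[prop].mean()
    print(f"{prop}: CDS - ADS difference = {enhancement:.1f}")
```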
