During a conversation, we hear the sound of the talker as well as the intended message. Traditional models of speech perception posit that acoustic details of a talker's voice are not encoded with the message whereas more recent models propose that talker identity is automatically encoded. When shadowing speech, listeners often fail to detect a change in talker identity. The present study was designed to investigate whether talker changes would be detected when listeners are actively engaged in a normal conversation, and visual information about the speaker is absent. Participants were called on the phone, and during the conversation the experimenter was surreptitiously replaced by another talker. Participants rarely noticed the change. However, when explicitly monitoring for a change, detection increased. Voice memory tests suggested that participants remembered only coarse information about both voices, rather than fine details. This suggests that although listeners are capable of change detection, voice information is not continuously monitored at a fine-grain level of acoustic representation during natural conversation and is not automatically encoded. Conversational expectations may shape the way we direct attention to voice characteristics and perceive differences in voice.
Download full-text PDF |
Source |
---|---|
http://dx.doi.org/10.1080/17470218.2011.570353 | DOI Listing |
Humans and other animals develop remarkable behavioral specializations for identifying, differentiating, and acting on classes of ecologically important signals. Ultimately, this expertise is flexible enough to support diverse perceptual judgments: a voice, for example, simultaneously conveys what a talker says as well as myriad cues about her identity and state. Mature perception across complex signals thus involves both discovering and learning regularities that best inform diverse perceptual judgments, and weighting this information flexibly as task demands change.
View Article and Find Full Text PDFJ Vis Exp
August 2024
Institute of Linguistics, Shanghai International Studies University; Key Laboratory of Language Science and Multilingual Artificial Intelligence, Shanghai International Studies University;
Recognizing familiar speakers from vocal streams is a fundamental aspect of human verbal communication. However, it remains unclear how listeners can still discern the speaker's identity in expressive speech. This study develops a memorization-based individual speaker identity recognition approach and an accompanying electroencephalogram (EEG) data analysis pipeline, which monitors how listeners recognize familiar speakers and tell unfamiliar ones apart.
View Article and Find Full Text PDFJASA Express Lett
June 2024
Department of Psychology, University of Toronto Scarborough, Toronto, Ontario,
Singing is socially important but constrains voice acoustics, potentially masking certain aspects of vocal identity. Little is known about how well listeners extract talker details from sung speech or identify talkers across the sung and spoken modalities. Here, listeners (n = 149) were trained to recognize sung or spoken voices and then tested on their identification of these voices in both modalities.
View Article and Find Full Text PDFFront Neurosci
May 2024
Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States.
The prevalence of synthetic talking faces in both commercial and academic environments is increasing as the technology to generate them grows more powerful and available. While it has long been known that seeing the face of the talker improves human perception of speech-in-noise, recent studies have shown that synthetic talking faces generated by deep neural networks (DNNs) are also able to improve human perception of speech-in-noise. However, in previous studies the benefit provided by DNN synthetic faces was only about half that of real human talkers.
View Article and Find Full Text PDFCortex
July 2024
School of Foreign Languages, University of Electronic Science and Technology of China, Chengdu, China; The Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for NeuroInformation, University of Electronic Science and Technology of China, Chengdu, China. Electronic address:
Recognizing talkers' identity via speech is an important social skill in interpersonal interaction. Behavioral evidence has shown that listeners can identify better the voices of their native language than those of a non-native language, which is known as the language familiarity effect (LFE). However, its underlying neural mechanisms remain unclear.
View Article and Find Full Text PDFEnter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!