Temporal recalibration during asynchronous audiovisual speech perception.

Exp Brain Res

Crossmodal Research Laboratory, Department of Experimental Psychology, University of Oxford, South Parks Road, Oxford, OX1 3UD, UK.

Published: July 2007

AI Article Synopsis

  • The study examined how monitoring an asynchronous audiovisual speech stream influences people's ability to judge the temporal order of the sounds and visual gestures in vowel-consonant-vowel (VCV) speech video clips.
  • Participants judged whether the speech sound or the visual gesture occurred first; half of them also monitored an additional audiovisual word stream that was presented either in synchrony or with the audio lagging by 300 ms.
  • Monitoring the asynchronous speech stream produced a measurable shift in participants' judgments, suggesting that adapting to lagging audio can alter the temporal perception of more complex audiovisual speech.

Article Abstract

We investigated the consequences of monitoring an asynchronous audiovisual speech stream on the temporal perception of simultaneously presented vowel-consonant-vowel (VCV) audiovisual speech video clips. Participants made temporal order judgments (TOJs) regarding whether the speech-sound or the visual-speech gesture occurred first, for video clips presented at various stimulus onset asynchronies. Throughout the experiment, half of the participants also monitored a continuous stream of words presented audiovisually, superimposed over the VCV video clips. The continuous (adapting) speech stream could either be presented in synchrony, or else with the auditory stream lagging by 300 ms. A significant shift in the point of subjective simultaneity (13 ms in the direction of the adapting stimulus) was observed in the TOJ task when participants monitored the asynchronous speech stream. This result suggests that the consequences of adapting to asynchronous speech extend beyond the case of simple audiovisual stimuli (as recently demonstrated by Navarra et al. in Cogn Brain Res 25:499-507, 2005) and can even affect the perception of more complex speech stimuli.
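For readers unfamiliar with how a point of subjective simultaneity (PSS) and its shift are obtained from TOJ data, the sketch below illustrates the standard approach: fit a cumulative Gaussian psychometric function to the proportion of "vision first" responses across SOAs, then read off the SOA at which that proportion reaches 50%. This is a minimal illustration in Python, not the authors' analysis code; the SOA values and response proportions are invented for demonstration.

    # Hypothetical sketch: estimating a PSS from temporal order judgment data.
    # Sign convention here: positive SOA = auditory signal lags the visual signal.
    import numpy as np
    from scipy.optimize import curve_fit
    from scipy.stats import norm

    def psychometric(soa, pss, sigma):
        # Probability of responding "vision first" as a function of SOA.
        # The PSS is the SOA at the 50% point; sigma captures the slope
        # (and relates to the just noticeable difference).
        return norm.cdf(soa, loc=pss, scale=sigma)

    # Invented data: SOAs in ms and proportion of "vision first" responses.
    soas = np.array([-300.0, -200.0, -100.0, 0.0, 100.0, 200.0, 300.0])
    p_vision_first = np.array([0.05, 0.15, 0.35, 0.55, 0.80, 0.93, 0.98])

    (pss, sigma), _ = curve_fit(psychometric, soas, p_vision_first, p0=[0.0, 100.0])
    print(f"PSS = {pss:.1f} ms, sigma = {sigma:.1f} ms")

    # Temporal recalibration shows up as a difference in PSS between the
    # synchronous-adaptation and asynchronous-adaptation conditions; the
    # paper reports a shift of about 13 ms toward the adapting asynchrony.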

Source
http://dx.doi.org/10.1007/s00221-007-0918-z

Publication Analysis

Top Keywords

audiovisual speech: 12
speech stream: 12
video clips: 12
asynchronous audiovisual: 8
participants monitored: 8
stream presented: 8
asynchronous speech: 8
speech: 7
stream: 5
temporal recalibration: 4

Similar Publications

Significance: Decoding naturalistic content from brain activity has important neuroscience and clinical implications. Information about visual scenes and intelligible speech has been decoded from cortical activity using functional magnetic resonance imaging (fMRI) and electrocorticography, but widespread applications are limited by the logistics of these technologies.

Aim: High-density diffuse optical tomography (HD-DOT) offers image quality approaching that of fMRI but with the silent, open scanning environment afforded by optical methods, thus opening the door to more naturalistic research and applications.

Communication comprises a wealth of multimodal signals (e.g., gestures, eye gaze, intonation) in addition to speech, and there is growing interest in the study of multimodal language among psychologists, linguists, neuroscientists, and computer scientists.

Observing the lip movements of a speaker facilitates speech understanding, especially in challenging listening situations. Converging evidence from neuroscientific studies shows stronger neural responses to audiovisual stimuli than to audio-only stimuli. However, the interindividual variability in this contribution of lip-movement information, and its consequences for behavior, remain unknown.

When listening to speech under adverse conditions, listeners compensate using neurocognitive resources. A clinically relevant form of adverse listening is listening through a cochlear implant (CI), which provides a spectrally degraded signal. CI listening is often simulated through noise-vocoding.

Objectives: To investigate the influence of frequency-specific audibility on audiovisual benefit in children, this study examined the impact of high- and low-pass acoustic filtering on auditory-only and audiovisual word and sentence recognition in children with typical hearing. Previous studies show that visual speech provides greater access to consonant place of articulation than to other consonant features, and that low-pass filtering has a strong impact on the perception of acoustic consonant place of articulation. This suggests that visual speech may be particularly useful when acoustic speech is low-pass filtered, because it provides complementary information about consonant place of articulation.
