AI Article Synopsis

  • Watching a speaker's facial movements greatly improves speech comprehension, especially in background noise.
  • Contrary to what one would expect, visual cues help most at intermediate noise levels rather than at the highest ones, a pattern explained by a Bayesian model of how we integrate sensory information.
  • The model also predicts that when the auditory and visual stimuli differ slightly, the proportion of reported words matching the visual stimulus first rises and then falls as sound quality improves; a behavioral experiment confirmed this prediction.

Article Abstract

Watching a speaker's facial movements can dramatically enhance our ability to comprehend words, especially in noisy environments. From a general doctrine of combining information from different sensory modalities (the principle of inverse effectiveness), one would expect that the visual signals would be most effective at the highest levels of auditory noise. In contrast, we find, in accord with a recent paper, that visual information improves performance more at intermediate levels of auditory noise than at the highest levels, and we show that a novel visual stimulus containing only temporal information does the same. We present a Bayesian model of optimal cue integration that can explain these conflicts. In this model, words are regarded as points in a multidimensional space and word recognition is a probabilistic inference process. When the dimensionality of the feature space is low, the Bayesian model predicts inverse effectiveness; when the dimensionality is high, the enhancement is maximal at intermediate auditory noise levels. When the auditory and visual stimuli differ slightly in high noise, the model makes a counterintuitive prediction: as sound quality increases, the proportion of reported words corresponding to the visual stimulus should first increase and then decrease. We confirm this prediction in a behavioral experiment. We conclude that auditory-visual speech perception obeys the same notion of optimality previously observed only for simple multisensory stimuli.
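
The model described in this abstract lends itself to a short simulation. Below is a minimal sketch, not the authors' code: it assumes words are random points in a D-dimensional feature space, that auditory and visual observations are the true word corrupted by independent Gaussian noise, and that the observer reports the word with the highest posterior under a uniform prior (so log-likelihoods from the two cues simply add). Lexicon size and all noise levels are illustrative choices.

```python
# Sketch of Bayesian cue integration for word recognition.
# Words are points in a D-dimensional feature space; each cue is the
# true word plus Gaussian noise; recognition is MAP inference.
import numpy as np

rng = np.random.default_rng(0)

def recognition_rate(words, sigma_a, sigma_v=None, n_trials=2000):
    """Fraction of trials where the max-posterior word is the true word."""
    n_words, D = words.shape
    correct = 0
    for _ in range(n_trials):
        true = rng.integers(n_words)
        x_a = words[true] + sigma_a * rng.standard_normal(D)  # auditory cue
        # Gaussian log-likelihood of each candidate word given the cue.
        loglik = -np.sum((words - x_a) ** 2, axis=1) / (2 * sigma_a ** 2)
        if sigma_v is not None:                                # add visual cue
            x_v = words[true] + sigma_v * rng.standard_normal(D)
            loglik -= np.sum((words - x_v) ** 2, axis=1) / (2 * sigma_v ** 2)
        correct += np.argmax(loglik) == true
    return correct / n_trials

# Visual enhancement (AV minus A-only accuracy) across auditory noise levels,
# for a low- and a high-dimensional feature space.
for D in (2, 20):
    words = rng.standard_normal((100, D))      # toy 100-word lexicon
    for sigma_a in (0.5, 1.0, 2.0, 4.0):
        gain = (recognition_rate(words, sigma_a, sigma_v=2.0)
                - recognition_rate(words, sigma_a))
        print(f"D={D:2d} sigma_a={sigma_a}: visual enhancement = {gain:+.3f}")
```

Under the abstract's account, sweeping the auditory noise level in such a setup should show enhancement growing with noise when D is low (inverse effectiveness) but peaking at intermediate noise when D is high.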

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2645675
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0004638

Publication Analysis

Top Keywords

levels auditory (12)
auditory noise (12)
word recognition (8)
feature space (8)
inverse effectiveness (8)
highest levels (8)
visual stimulus (8)
Bayesian model (8)
noise (5)
visual (5)

Similar Publications

Individual differences in how the brain responds to novelty are present from infancy. A common method of studying novelty processing is through event-related potentials (ERPs). While ERPs offer millisecond temporal precision, their spatial resolution remains poor, especially in infancy.

Extended homogeneous field correction method based on oblique projection in OPM-MEG.

Neuroimage

January 2025

Key Laboratory of Ultra-Weak Magnetic Field Measurement Technology, Ministry of Education, School of Instrumentation and Optoelectronic Engineering, Beihang University, 100191, Beijing, China; Hangzhou Institute of Extremely-Weak Magnetic Field Major National Science and Technology Infrastructure, Hangzhou, 310051, China; State Key Laboratory of Traditional Chinese Medicine Syndrome/Health Construction Center, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, 510120, China; Hefei National Laboratory, Hefei, 230088, China. Electronic address:

Optically pumped magnetometer-based magnetoencephalography (OPM-MEG) is a novel non-invasive functional imaging technique that offers more flexible sensor configurations and wearability; however, this also increases the need for environmental noise suppression. Subspace projection algorithms are widely used in MEG to suppress noise. However, in OPM-MEG systems with a limited number of channels, subspace projection methods that rely on spatial oversampling exhibit reduced performance.
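
The subspace-projection idea mentioned here can be illustrated with standard signal-space projection (SSP), which removes interference by projecting the data onto the orthogonal complement of an estimated noise subspace; the paper's extended oblique-projection method belongs to this family but is not shown here. A minimal sketch, assuming the noise subspace is spanned by reference field patterns U (e.g., from empty-room recordings); all dimensions and amplitudes are toy values:

```python
# Signal-space projection: remove noise components lying in span(U).
import numpy as np

def ssp_projector(U):
    """Orthogonal projector onto the complement of the noise subspace span(U)."""
    Q, _ = np.linalg.qr(U)                  # orthonormal basis of span(U)
    return np.eye(U.shape[0]) - Q @ Q.T     # P = I - Q Q^T

rng = np.random.default_rng(1)
n_channels, n_noise, n_samples = 32, 3, 1000
U = rng.standard_normal((n_channels, n_noise))          # noise field patterns
brain = 0.1 * rng.standard_normal((n_channels, n_samples))
noise = U @ (5.0 * rng.standard_normal((n_noise, n_samples)))
data = brain + noise
clean = ssp_projector(U) @ data             # interference is projected out
print(f"std before: {data.std():.2f}, after: {clean.std():.2f}")
```

With few channels, every projected-out dimension also discards part of the brain signal, which is the limitation motivating the oblique-projection extension described in the paper.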

Genetic and audiological determinants of hearing loss in high-risk neonates.

Braz J Otorhinolaryngol

January 2025

Shanghai Jiao Tong University, School of Medicine, Hainan Branch of Shanghai Children's Medical Center, Department of Otorhinolaryngology, Sanya, China; Shanghai Jiao Tong University, School of Medicine, Shanghai Children's Medical Center, Department of Otorhinolaryngology, Shanghai, China. Electronic address:

Objective: We aimed to investigate the correlation between prevalent risk factors for high-risk neonates in the neonatal intensive care unit and their hearing loss, and to examine the audiological features and genetic profiles associated with different deafness mutations in our tertiary referral center. This research seeks to deepen our understanding of the etiology behind congenital hearing loss.

Methods: We conducted initial hearing screenings, including automated auditory brainstem response, distortion product otoacoustic emission, and acoustic immittance, on 443 high-risk neonates within 7 days of birth, with retesting at 42 days after birth when necessary.

Impacts of hazardous noise levels on hearing loss and tinnitus in dental professionals.

J Occup Med Toxicol

January 2025

School of Health Sciences, Department of Audiology, University of the Pacific, San Francisco, California, USA.

Background: Hazardous noise exposure is an important health concern in many workplaces and is one of the most common work-related injuries in the United States. Dental professionals are frequently exposed to high levels of occupational noise in their daily work environment. This noise is generated by dental handpieces and equipment such as drills, suction devices, and ultrasonic scalers.

The visual environment of sign language users is markedly distinct in its spatiotemporal parameters compared to that of non-signers. Although the importance of temporal and spectral resolution in the auditory modality for language development is well established, the spectrotemporal parameters of visual attention necessary for sign language comprehension remain less understood. This study investigates visual temporal resolution in learners of American Sign Language (ASL) at various stages of acquisition to determine how experience with sign language affects perceptual sampling.
