Objectives: In noisy environments, listeners benefit from both hearing and seeing a talker, demonstrating that audiovisual (AV) cues enhance speech-in-noise (SIN) recognition. Here, we examined the relative contributions of auditory and visual cues to SIN perception and the strategies listeners use to decipher speech in the presence of noise interference.

Design: Normal-hearing listeners (n = 22) performed an open-set speech recognition task while viewing audiovisual TIMIT sentences presented under different combinations of signal degradation including visual (AVn), audio (AnV), or multimodal (AnVn) noise. Acoustic and visual noises were matched in physical signal-to-noise ratio. Eyetracking monitored participants' gaze to different parts of a talker's face during SIN perception.
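To make the noise manipulation concrete: matching acoustic noise to a physical signal-to-noise ratio amounts to scaling a masker's power relative to the sentence's power before mixing. The abstract does not give the mixing procedure, so the Python sketch below is only a generic illustration of that power-ratio computation; the function name and the white-noise masker are assumptions, and an analogous intensity-ratio scaling would be applied to the video frames for the visual channel.

import numpy as np

def add_noise_at_snr(signal: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale `noise` so the mixture has the requested SNR (in dB), then add it to `signal`."""
    noise = noise[:len(signal)]                      # trim masker to the sentence length
    p_signal = np.mean(signal ** 2)                  # mean power of the clean sentence
    p_noise = np.mean(noise ** 2)                    # mean power of the unscaled masker
    target_p_noise = p_signal / (10 ** (snr_db / 10))
    return signal + noise * np.sqrt(target_p_noise / p_noise)

# Example: degrade a 1-s, 16-kHz sentence-length waveform with white noise at 0 dB SNR.
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)                   # stand-in for a TIMIT sentence waveform
noisy = add_noise_at_snr(clean, rng.standard_normal(16000), snr_db=0.0)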

Results: As expected, behavioral performance for clean sentence recognition was better for A-only and AV speech than for V-only speech. Similarly, with noise in the auditory channel (AnV and AnVn speech), performance was aided by the addition of visual cues from the talker regardless of whether the visual channel contained noise, confirming a multimodal benefit to SIN recognition. The addition of visual noise (AVn) obscuring the talker's face had little effect on speech recognition by itself. Listeners' eye-gaze fixations were biased toward the eyes (and away from the mouth) whenever the auditory channel was compromised. Fixating on the eyes was negatively associated with SIN recognition performance. Gaze to the mouth versus the eyes also depended on the gender of the talker.

Conclusions: Collectively, the results suggest listeners (1) rely more heavily on the auditory than on the visual channel when seeing and hearing speech and (2) shift their visual strategy from viewing the mouth to viewing the eyes of the talker under signal degradation, which negatively affects speech perception.


Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6939137 (PMC)
http://dx.doi.org/10.1097/AUD.0000000000000755 (DOI)

Publication Analysis

Top Keywords

speech recognition (12); sin recognition (12); speech (8); visual (8); auditory visual (8); visual cues (8); speech noise (8); talker's face (8); auditory channel (8); addition visual (8)

Similar Publications

An End-To-End Speech Recognition Model for the North Shaanxi Dialect: Design and Evaluation.

Sensors (Basel)

January 2025

SHCCIG Yubei Coal Industry Co., Ltd., Xi'an 710900, China.

The coal mining industry in Northern Shaanxi is robust, with a prevalent use of the local dialect, known as "Shapu", characterized by a distinct Northern Shaanxi accent. This study addresses the practical need for speech recognition in this dialect. We propose an end-to-end speech recognition model for the North Shaanxi dialect, leveraging the Conformer architecture.
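This snippet names the Conformer architecture but gives no implementation detail. As a hedged, generic sketch (not the authors' model), a Conformer acoustic encoder can be instantiated with torchaudio's stock module; the hyperparameters, feature dimension, and output vocabulary size below are placeholders.

import torch
import torchaudio

# Stock Conformer encoder from torchaudio; all hyperparameters here are illustrative
# placeholders, not the configuration reported for the North Shaanxi dialect model.
encoder = torchaudio.models.Conformer(
    input_dim=80,                      # e.g., 80-dim log-mel filterbank features
    num_heads=4,
    ffn_dim=256,
    num_layers=8,
    depthwise_conv_kernel_size=31,
)

features = torch.randn(2, 120, 80)     # (batch, frames, feature_dim) dummy features
lengths = torch.tensor([120, 95])      # valid frame counts per utterance
encoded, out_lengths = encoder(features, lengths)

# An end-to-end recipe would typically project the encoder output onto the token
# inventory (here a made-up size) and train with a CTC or attention-based loss.
ctc_head = torch.nn.Linear(80, 500)
logits = ctc_head(encoded)             # (batch, frames, vocab_size)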


Efficacy of Hearing Aids in Patients with Hearing Difficulties in Noise: Focus on Hidden Hearing Loss.

J Clin Med

January 2025

Assistance Publique-Hôpitaux de Paris, Hôpital Bicêtre, Service d'Oto-Rhino-Laryngologie, 78 Rue du Général Leclerc, 94270 Le Kremlin-Bicêtre, France.

Hearing aids (HAs) have been used for standard high-frequency hearing loss and tinnitus, but their effects on speech intelligibility in noise (SIN) in people with normal hearing, including those with hidden hearing loss (HHL), have been little explored. In a prospective cohort study, we included patients who experience poor SIN but have a normal pure-tone average in quiet conditions or only slight HL. We used open-fit HAs.


Objective: The integration of physical therapy (PT), occupational therapy (OT), and speech therapy (ST) into a triple therapy approach has gained recognition in the rehabilitation of patients. The integration of PT-OT-ST triple therapy with accelerated recovery strategies in pulmonary rehabilitation for elderly mechanically ventilated patients is anticipated to overcome the limitations of traditional rehabilitation approaches.

Methods: By applying stringent inclusion and exclusion criteria, a total of 60 elderly patients over 60 years old requiring mechanical ventilation were selected.


This article explores the existing research evidence on the potential effectiveness of lipreading as a communication strategy to enhance speech recognition in individuals with hearing impairment. A scoping review was conducted, involving a search of six electronic databases (MEDLINE, Embase, Web of Science, Engineering Village, CINAHL, and PsycINFO) for research papers published between January 2013 and June 2023. This study included original research papers with full texts available in English, covering all study designs: qualitative, quantitative, and mixed methods.


Visual Impairment Spatial Awareness System for Indoor Navigation and Daily Activities.

J Imaging

January 2025

Department of Electrical and Computer Engineering, Illinois Institute of Technology, Chicago, IL 60616, USA.

The integration of artificial intelligence into daily life significantly enhances the autonomy and quality of life of visually impaired individuals. This paper introduces the Visual Impairment Spatial Awareness (VISA) system, designed to holistically assist visually impaired users in indoor activities through a structured, multi-level approach. At the foundational level, the system employs augmented reality (AR) markers for indoor positioning, neural networks for advanced object detection and tracking, and depth information for precise object localization.
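The abstract mentions combining detections with depth information for object localization. As a generic illustration only (not the VISA system's code; the camera intrinsics and pixel values below are made up), a detected pixel with a metric depth reading can be back-projected into 3D camera coordinates with the pinhole model.

import numpy as np

def pixel_to_camera_xyz(u: float, v: float, depth_m: float,
                        fx: float, fy: float, cx: float, cy: float) -> np.ndarray:
    """Back-project pixel (u, v) with metric depth into 3D camera coordinates (pinhole model)."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.array([x, y, depth_m])

# Example: center of a detected object's bounding box, 2.4 m from the camera,
# using placeholder intrinsics for a 640x480 depth sensor.
xyz = pixel_to_camera_xyz(u=320.0, v=240.0, depth_m=2.4,
                          fx=600.0, fy=600.0, cx=320.0, cy=240.0)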
