Neural oscillation is attracting attention as an underlying mechanism for speech recognition. Speech intelligibility is enhanced by the synchronization of speech rhythms and slow neural oscillation, which is typically observed as human scalp electroencephalography (EEG). In addition to the effect of neural oscillation, it has been proposed that speech recognition is enhanced by the identification of a speaker's motor signals, which are used for speech production. To verify the relationship between the effect of neural oscillation and motor cortical activity, we measured scalp EEG, and simultaneous EEG and functional magnetic resonance imaging (fMRI) during a speech recognition task in which participants were required to recognize spoken words embedded in noise sound. We proposed an index to quantitatively evaluate the EEG phase effect on behavioral performance. The results showed that the delta and theta EEG phase before speech inputs modulated the participant's response time when conducting speech recognition tasks. The simultaneous EEG-fMRI experiment showed that slow EEG activity was correlated with motor cortical activity. These results suggested that the effect of the slow oscillatory phase was associated with the activity of the motor cortex during speech recognition.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5554004PMC
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0183146PLOS

Publication Analysis

Top Keywords

speech recognition
20
neural oscillation
16
motor cortical
12
cortical activity
12
speech
10
slow oscillatory
8
oscillatory phase
8
speech intelligibility
8
eeg phase
8
eeg
6

Similar Publications

Objective: Measuring listening effort using pupillometry is challenging in cochlear implant (CI) users. We assess three validated speech tests (Matrix, LIST, and DIN) to identify the optimal speech material for measuring peak-pupil-dilation (PPD) in CI users as a function of signal-to-noise ratio (SNR).

Design: Speech tests were administered in quiet and two noisy conditions, namely at the speech recognition threshold (0 dB re SRT), i.

View Article and Find Full Text PDF

Tibetan-Chinese speech-to-speech translation based on discrete units.

Sci Rep

January 2025

Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance of MOE, Minzu University of China, Beijing, 100081, China.

Speech-to-speech translation (S2ST) has evolved from cascade systems which integrate Automatic Speech Recognition (ASR), Machine Translation (MT), and Text-to-Speech (TTS), to end-to-end models. This evolution has been driven by advancements in model performance and the expansion of cross-lingual speech datasets. Despite the paucity of research on Tibetan speech translation, this paper endeavors to tackle the challenge of Tibetan-to-Chinese direct speech-to-speech translation within the multi-task learning framework, employing self-supervised learning (SSL) and sequence-to-sequence model training.

View Article and Find Full Text PDF

Some Challenging Questions About Outcomes in Children With Cochlear Implants.

Perspect ASHA Spec Interest Groups

December 2024

DeVault Otologic Research Laboratory, Department of Otolaryngology-Head and Neck Surgery, Indiana University School of Medicine, Indianapolis.

Purpose: Cochlear implants (CIs) have improved the quality of life for many children with severe-to-profound sensorineural hearing loss. Despite the reported CI benefits of improved speech recognition, speech intelligibility, and spoken language processing, large individual differences in speech and language outcomes are still consistently reported in the literature. The enormous variability in CI outcomes has made it challenging to predict which children may be at high risk for limited benefits and how potential risk factors can be improved with interventions.

View Article and Find Full Text PDF

Introduction: It is still under debate whether and how semantic content will modulate the emotional prosody perception in children with autism spectrum disorder (ASD). The current study aimed to investigate the issue using two experiments by systematically manipulating semantic information in Chinese disyllabic words.

Method: The present study explored the potential modulation of semantic content complexity on emotional prosody perception in Mandarin-speaking children with ASD.

View Article and Find Full Text PDF

Artificial intelligence (AI) scribe applications in the healthcare community are in the early adoption phase and offer unprecedented efficiency for medical documentation. They typically use an application programming interface with a large language model (LLM), for example, generative pretrained transformer 4. They use automatic speech recognition on the physician-patient interaction, generating a full medical note for the encounter, together with a draft follow-up e-mail for the patient and, often, recommendations, all within seconds or minutes.

View Article and Find Full Text PDF

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!