Towards Naturalistic Speech Decoding from Intracranial Brain Data.

Julia Berezutskaya, Luca Ambrogioni, Nick F Ramsey, Marcel A J van Gerven

Annu Int Conf IEEE Eng Med Biol Soc

Published: July 2022

Speech decoding from brain activity can enable the development of brain-computer interfaces (BCIs) that restore naturalistic communication in paralyzed patients. Previous work has focused on developing decoding models from isolated speech data with a clean background and multiple repetitions of the material. In this study, we describe a novel approach to speech decoding that relies on a generative adversarial neural network (GAN) to reconstruct speech from brain data recorded during a naturalistic speech listening task (watching a movie). We compared the GAN-based approach, in which reconstruction was done from the compressed latent representation of sound decoded from the brain, with several baseline models that reconstructed the sound spectrogram directly. We show that the novel approach provides more accurate reconstructions than the baselines. These results underscore the potential of GAN models for speech decoding in naturalistic noisy environments and for the further advancement of BCIs for naturalistic communication.

Clinical Relevance: This study presents a novel speech decoding paradigm that combines advances in deep learning, speech synthesis and neural engineering, and has the potential to advance the field of BCI for severely paralyzed individuals.

DOI: http://dx.doi.org/10.1109/EMBC48229.2022.9871301

Top Keywords: speech decoding; naturalistic speech; brain data; speech; naturalistic communication; novel approach; decoding; naturalistic; decoding intracranial; brain

Similar Publications

Improving auditory attention decoding by classifying intracranial responses to glimpsed and masked acoustic events.

Imaging Neurosci (Camb)

April 2024

Department of Electrical Engineering, Columbia University, New York, NY, United States.

Vinay S Raghavan, James O'Sullivan, Jose Herrero, Stephan Bickel, Ashesh D Mehta

Listeners with hearing loss have trouble following a conversation in multitalker environments. While modern hearing aids can generally amplify speech, these devices are unable to tune into a target speaker without first knowing to which speaker a user aims to attend. Brain-controlled hearing aids have been proposed using auditory attention decoding (AAD) methods, but current methods use the same model to compare the speech stimulus and neural response, regardless of the dynamic overlap between talkers which is known to influence neural encoding.


Beyond Averaging: A Transformer Approach to Decoding Event Related Brain Potentials.

Neuroimage

January 2025

Department of Computer Science, University of Innsbruck, Technikerstrasse 21a, Innsbruck, 6020, Austria.

Philipp Zelger, Manuel Arnold, Sonja Rossi, Josef Seebacher, Franz Muigg

The objective of this study is to assess the potential of a transformer-based deep learning approach applied to event-related brain potentials (ERPs) derived from electroencephalographic (EEG) data. Traditional methods involve averaging the EEG signal of multiple trials to extract valuable neural signals from the high noise content of EEG data. However, this averaging technique may conceal relevant information.


Ultra high density imaging arrays in diffuse optical tomography for human brain mapping improve image quality and decoding performance.

Sci Rep

January 2025

Mallinckrodt Institute of Radiology, Washington University School of Medicine, 4515 McKinley Ave., St. Louis, MO, 63110, USA.

Zachary E Markow, Jason W Trobaugh, Edward J Richter, Kalyan Tripathy, Sean M Rafferty

Functional magnetic resonance imaging (fMRI) has dramatically advanced non-invasive human brain mapping and decoding. Functional near-infrared spectroscopy (fNIRS) and high-density diffuse optical tomography (HD-DOT) non-invasively measure blood oxygen fluctuations related to brain activity, like fMRI, at the brain surface, using more lightweight equipment that circumvents ergonomic and logistical limitations of fMRI. HD-DOT grids have smaller inter-optode spacing (~13 mm) than sparse fNIRS (~30 mm) and therefore provide higher image quality, with spatial resolution ~1/2 that of fMRI, when using the several source-detector distances (13-40 mm) afforded by the HD-DOT grid.


An End-To-End Speech Recognition Model for the North Shaanxi Dialect: Design and Evaluation.

Sensors (Basel)

January 2025

SHCCIG Yubei Coal Industry Co., Ltd., Xi'an 710900, China.

Yi Qin, Feifan Yu

The coal mining industry in Northern Shaanxi is robust, with a prevalent use of the local dialect, known as "Shapu", characterized by a distinct Northern Shaanxi accent. This study addresses the practical need for speech recognition in this dialect. We propose an end-to-end speech recognition model for the North Shaanxi dialect, leveraging the Conformer architecture.


Multisensory naturalistic decoding with high-density diffuse optical tomography.

Neurophotonics

January 2025

Washington University School of Medicine, Mallinckrodt Institute of Radiology, St. Louis, Missouri, United States.

Kalyan Tripathy, Zachary E Markow, Morgan Fogarty, Mariel L Schroeder, Alexa M Svoboda

Significance: Decoding naturalistic content from brain activity has important neuroscience and clinical implications. Information about visual scenes and intelligible speech has been decoded from cortical activity using functional magnetic resonance imaging (fMRI) and electrocorticography, but widespread applications are limited by the logistics of these technologies.

Aim: High-density diffuse optical tomography (HD-DOT) offers image quality approaching that of fMRI but with the silent, open scanning environment afforded by optical methods, thus opening the door to more naturalistic research and applications.
