Silent speech interfaces (SSI) enable the generation of audible speech or readable text without vocalization. Electromyography (EMG), one of the possible source signals for SSI, is particularly advantageous for individuals with vocal organ injuries. In this work, we propose emg2vec, a self-pretraining framework for EMG-based SSI that covers both EMG-to-speech and EMG-to-text conversion. Our experiments show that, compared to training the models from scratch, self-pretraining reduces the downstream speech recognition word error rate (WER) by a relative 7.32% when the entire labeled dataset is used and by 5.18% when only 20% of the labeled data is used for supervised training. Speech synthesis also improves, but only by 2.91% when 20% of the training data is used.
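
The relative WER figures quoted in the abstract can be illustrated with a minimal sketch. It assumes the standard word-level edit-distance definition of WER and the usual definition of relative improvement, (baseline - new) / baseline. The function names and the example values are hypothetical and are not taken from the paper, which does not state its absolute WERs here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the processed reference prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction in percent (positive means the new model is better)."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer * 100.0


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.3f}")

    # Hypothetical absolute WERs, chosen only to show the arithmetic.
    print(f"Relative improvement: {relative_improvement(0.40, 0.30):.2f}%")
```

Under this definition, a relative improvement of 7.32% means the pretrained model's WER is 92.68% of the from-scratch model's WER, whatever the absolute values are.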

Source
http://dx.doi.org/10.1109/EMBC53108.2024.10781736

Similar Publications

Rhinoplasty is the most common cosmetic surgery procedure in Iran. One of the complications of this procedure that has been less considered is the probable effect of rhinoplasty on voice. This study aimed to assess the influence of rhinoplasty on acoustic characteristics of resonance and sound production.

Pausing patterns in English school-age children with a history of late talking: Frequent pauses and prolonged response delays.

J Commun Disord

March 2025

Institute of Language Sciences, Shanghai International Studies University, China; Speech-Language-Hearing Center, School of Foreign Languages, Shanghai Jiao Tong University, China; National Research Centre for Language and Well-Being, Shanghai, China.

Introduction: This study explored silent pause patterns, their interaction with filled pauses, and response delays in five-year-old children who were previously identified as late talkers in their conversations with adults.

Methods: We analyzed 73 child-adult conversations (36 with a late-talking history, 37 typically developing) from the CHILDES Clinical English Ellis Weismer Corpus at age five across three temporal stages. Using Praat, we identified and classified silent pauses (> 250 ms) by duration and position and annotated them across three tiers: silent pause categories, pauses near filled pauses, and response delays.

Electromyography-to-Speech (ETS) conversion has demonstrated its potential for silent speech interfaces by generating audible speech from Electromyography (EMG) signals during silent articulations. ETS models usually consist of an EMG encoder which converts EMG signals to acoustic speech features, and a vocoder which then synthesises the speech signals. Due to an inadequate amount of available data and noisy signals, the synthesised speech often exhibits a low level of naturalness.

Impact of Transcutaneous Electrical Stimulation on Oral Moisture in Older Adults with and without Xerostomia: A Pilot Study.

Folia Phoniatr Logop

February 2025

Swallowing Physiology and Rehabilitation Research Laboratory, Speech Pathology and Audiology Program, Kent State University, Kent, Ohio, USA.

Introduction: Xerostomia, or dry mouth, is a prevalent and distressing oral health condition in older adults that is associated with reduced swallow frequency, thereby increasing the risk of dysphagia and aspiration pneumonia in this cohort. This pseudo-experimental study investigated the association between transcutaneous electrical stimulation (TES) and changes in perceived oral moisture, as well as the function of major and minor salivary glands in two groups of older adults, including those with and without xerostomia.

Methods: Ten older adults with self-reported xerostomia and 7 control participants were exposed to two conditions: no TES and motor TES.
