Silent speech interfaces (SSIs) enable the generation of audible speech or readable text without vocalization. Electromyography (EMG), one of the possible source signals for SSIs, is particularly well suited to individuals with vocal organ injuries. In this work, we propose a self-pretraining framework, emg2vec, for EMG-based SSIs, covering both EMG-to-speech and EMG-to-text conversion. Our experiments show that, compared to training the models from scratch, self-pretraining reduces the downstream speech recognition word error rate (WER) by a relative 7.32% when the entire labeled dataset is used and by a relative 5.18% when only 20% of the labeled data is used for supervised training. Speech synthesis also improves, but only by 2.91% when 20% of the training data is used.
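The relative WER figures quoted above can be read with the following minimal sketch. It assumes the standard word-level edit-distance definition of WER and a relative improvement of the form (baseline - improved) / baseline; the abstract does not state the exact scoring procedure, and the function names and sample sentences are hypothetical, not taken from the paper.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative reduction of an error metric, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    print(wer("the cat sat down", "the dog sat down"))
    # Hypothetical WER values, not results from the paper.
    print(relative_improvement(0.50, 0.40))
```

Under this reading, a relative improvement of 7.32% means the pretrained model's WER is about 92.68% of the from-scratch WER, not that the WER dropped by 7.32 absolute points.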
Source (DOI): http://dx.doi.org/10.1109/EMBC53108.2024.10781736
Indian J Otolaryngol Head Neck Surg
January 2025
Sinus and Surgical Endoscopic Research Center, Mashhad University of Medical Sciences, Mashhad, Iran.
Rhinoplasty is the most common cosmetic surgery procedure in Iran. One of the complications of this procedure that has been less considered is the probable effect of rhinoplasty on voice. This study aimed to assess the influence of rhinoplasty on acoustic characteristics of resonance and sound production.
J Commun Disord
March 2025
Institute of Language Sciences, Shanghai International Studies University, China; Speech-Language-Hearing Center, School of Foreign Languages, Shanghai Jiao Tong University, China; National Research Centre for Language and Well-Being, Shanghai, China.
Introduction: This study explored silent pause patterns, their interaction with filled pauses, and response delays in five-year-old children who were previously identified as late talkers in their conversations with adults.
Methods: We analyzed 73 child-adult conversations (36 with a late-talking history, 37 typically developing) from the CHILDES Clinical English Ellis Weismer Corpus at age five across three temporal stages. Using Praat, we identified and classified silent pauses (> 250 ms) by duration and position and annotated them across three tiers: silent pause categories, pauses near filled pauses, and response delays.
Annu Int Conf IEEE Eng Med Biol Soc
July 2024
Silent speech interfaces (SSI) enable the generation of audio speech or readable texts without vocalization. Electromyography (EMG), being one of the possible source signals of SSI, demonstrates its superiority, particularly for individuals with vocal organ injuries. In this work, we propose a self-pretraining framework, i.
Annu Int Conf IEEE Eng Med Biol Soc
July 2024
Electromyography-to-Speech (ETS) conversion has demonstrated its potential for silent speech interfaces by generating audible speech from Electromyography (EMG) signals during silent articulations. ETS models usually consist of an EMG encoder which converts EMG signals to acoustic speech features, and a vocoder which then synthesises the speech signals. Due to an inadequate amount of available data and noisy signals, the synthesised speech often exhibits a low level of naturalness.
Folia Phoniatr Logop
February 2025
Swallowing Physiology and Rehabilitation Research Laboratory, Speech Pathology and Audiology Program, Kent State University, Kent, Ohio, USA.
Introduction: Xerostomia, or dry mouth, is a prevalent and distressing oral health condition in older adults that is associated with reduced swallow frequency, thereby increasing the risk of dysphagia and aspiration pneumonia in this cohort. This pseudo-experimental study investigated the association between transcutaneous electrical stimulation (TES) and changes in perceived oral moisture, as well as the function of major and minor salivary glands in two groups of older adults, including those with and without xerostomia.
Methods: Ten older adults with self-reported xerostomia and 7 control participants were exposed to two conditions: no TES and motor TES.