Speech impairment resulting from laryngectomy causes severe physiological and psychological distress to laryngectomees. In clinical practice, the articulatory organs of the upper vocal tract function normally in most laryngectomees. Reconstructing speech from articulatory information therefore has significant potential to support effective speech rehabilitation in these patients. We first built a Mandarin corpus by simultaneously recording dynamic ultrasound images of tongue motion and the corresponding speech waveforms. We then used an autoencoder to extract deep representations from the ultrasound images. Building on these representations, we established a speech waveform generation model based on generative adversarial networks and conducted both objective and subjective evaluations to assess the quality of the reconstructed speech. The results show that the phoneme accuracy of the reconstructed speech reaches 72.43%, and the accuracy of Mandarin tones reaches 76.10%. The mel-spectrograms and fundamental frequency contours of the reconstructed speech show a high degree of similarity to those of the original speech. In addition, subjective listening evaluations affirm the acceptability of the reconstructed speech (mean opinion score > 6). The method presented in this paper makes it possible to reconstruct tonal Mandarin speech from dynamic tongue motion ultrasound images. Future research should address the specific conditions of laryngectomees, improve and optimize model performance, expand the training datasets, and enhance the quality of the reconstructed speech.
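
The abstract reports phoneme and tone accuracy but does not say how they were computed. As an illustration only, the sketch below uses one common definition, which is an assumption here and not necessarily the authors' scoring: accuracy = 1 - (edit distance between reference and recognized label sequences) / (reference length). The function names `edit_distance` and `label_accuracy` and the example syllables are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two label sequences (lists of strings)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference label
                curr[j - 1] + 1,     # insertion of a hypothesis label
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def label_accuracy(ref, hyp):
    """Accuracy as 1 - errors / reference length, floored at 0.

    Assumed definition for illustration; the paper may score differently.
    """
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


# Hypothetical example: one substituted phoneme out of four.
ref_phonemes = ["n", "i", "h", "ao"]
hyp_phonemes = ["n", "i", "h", "a"]
phoneme_acc = label_accuracy(ref_phonemes, hyp_phonemes)

# Hypothetical example: one wrong tone out of two syllables.
ref_tones = ["3", "3"]
hyp_tones = ["2", "3"]
tone_acc = label_accuracy(ref_tones, hyp_tones)
```

The same function serves for phonemes and tones because both are scored as label sequences; only the label inventory differs.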

Source
http://dx.doi.org/10.1109/EMBC53108.2024.10781847

Publication Analysis

Top Keywords

reconstructed speech: 20
ultrasound images: 16
speech: 13
tongue motion: 12
motion ultrasound: 12
mandarin speech: 8
generative adversarial: 8
adversarial networks: 8
dynamic tongue: 8
speech waveform: 8

Similar Publications

Background: The choice between free flaps and locoregional flaps for soft tissue reconstruction in oral cavity cancer patients is critical for determining long-term functional and oncological outcomes. This systematic review evaluates the efficacy of these reconstructive techniques, focusing on survival, recurrence, quality of life (QoL), and functional parameters such as speech, swallowing, and the need for gastrostomy or tracheostomy.

Methods: A systematic review adhering to PRISMA guidelines was conducted using PubMed, Scopus, Cochrane, and EBSCO databases.

Pharyngeal wall motion is a key component of velopharyngeal closure, essential for normal speech production. This study investigated changes in lateral pharyngeal wall motion in patients with cleft palate who required secondary surgery to correct velopharyngeal dysfunction. A retrospective review was conducted at a tertiary pediatric hospital, including 20 patients who underwent secondary procedures between 2015 and 2021.

Maxillary defects resulting from oncologic resection pose significant challenges for oral rehabilitation, affecting function, aesthetics, and quality of life. Traditional implant-based solutions are often unfeasible due to insufficient bone volume, necessitating alternative approaches. This case report presents a 54-year-old male who underwent a total maxillectomy for palatal squamous cell carcinoma, followed by chemoradiotherapy.

Speech Acoustic Analysis in Adult Patients With Cleft Palate After Cleft Palate Repair and Speech Therapy.

J Craniofac Surg

August 2024

Department of Oral and Maxillofacial Surgery, Shanghai Ninth People's Hospital, College of Stomatology, Shanghai Jiao Tong University School of Medicine, National Clinical Research Center for Oral Diseases, Shanghai Key Laboratory of Stomatology and Shanghai Research Institute of Stomatology, Shanghai, China.

Objective: This study aims to evaluate the enhancement of speech functionality in adult patients with cleft palate through acoustic analysis, assessing pronunciation level improvements before and after palatopharyngoplasty and speech treatment. The findings aim to provide an objective assessment of the treatment efficacy for older patients with cleft palate.

Participants And Intervention: The study involved acoustic comparisons encompassing vowel formants, voice onset time (VOT) of consonant syllables, syllable duration, and voice characteristic analysis.
