Most silent speech research depends on continuous observation of tongue movements, which requires extracting tongue contours from ultrasound. Extracting ultrasonic tongue contours precisely and in real time is a major challenge. To address it, this work introduces DAFT-Net, a novel end-to-end lightweight network for ultrasonic tongue contour extraction. DAFT-Net integrates the Convolutional Block Attention Module (CBAM) and the Attention Gate (AG) module with entropy-based optimization strategies to form a comprehensive attention mechanism with dual functionality. This mechanism replaces the traditional skip connection architecture and enhances feature representation, using entropy and information-theoretic measures to ensure efficient and precise feature selection. In addition, the U-Net encoder and decoder layers are streamlined to reduce computational demands; information theory guides this reduction so that the network retains its ability to capture and use critical information. Ablation studies confirm the efficacy of the integrated attention module and its components. Comparative analysis on the NS, TGU, and TIMIT datasets shows that DAFT-Net extracts relevant features efficiently and significantly reduces extraction time. These findings demonstrate the practical advantages of applying entropy and information theory principles to medical image segmentation networks, improving their performance and paving the way for real-world applications.
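
The abstract does not give the internals of the attention module, so the following is only a minimal sketch of the general idea of gating a skip connection. It assumes the commonly used additive attention gate formulation (a ReLU over summed linear projections of the encoder and decoder features, followed by a sigmoid that yields a per-position weight). The function names, the weights `w_x`, `w_g`, `psi`, and the toy inputs are all hypothetical and are not taken from DAFT-Net; the CBAM and entropy-based parts are not modelled here.

```python
import math


def sigmoid(z):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def attention_gate(skip, gate, w_x, w_g, psi):
    """Additive attention gate over a skip connection (illustrative only).

    skip: encoder features, one channel vector per spatial position
    gate: decoder (gating) features at the same positions
    w_x, w_g: hypothetical projection matrices into a shared hidden space
    psi: hypothetical vector that collapses the hidden space to one score
    Returns the re-weighted skip features and the attention weights.
    """
    gated, alphas = [], []
    for x, g in zip(skip, gate):
        # Project both inputs, add them, and apply ReLU.
        hidden = [max(0.0, a + b)
                  for a, b in zip(matvec(w_x, x), matvec(w_g, g))]
        # Collapse to a scalar score and squash it into (0, 1).
        alpha = sigmoid(sum(p * h for p, h in zip(psi, hidden)))
        alphas.append(alpha)
        # Scale every channel at this position by the attention weight.
        gated.append([alpha * xi for xi in x])
    return gated, alphas


# Toy example: two spatial positions, two channels each (made-up values).
skip = [[1.0, 2.0], [0.5, -1.0]]
gate = [[0.0, 1.0], [2.0, 0.0]]
w_x = [[0.1, 0.2], [0.0, -0.3]]
w_g = [[0.5, 0.0], [0.0, 0.4]]
psi = [1.0, -1.0]

gated, alphas = attention_gate(skip, gate, w_x, w_g, psi)
print(alphas)
print(gated)
```

In a plain U-Net the encoder features `skip` would be concatenated into the decoder unchanged; here each position is first scaled by a weight between 0 and 1, so the decoder signal decides which encoder positions pass through. A real implementation would use convolutions in a deep learning framework rather than Python lists.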

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11202898
DOI: http://dx.doi.org/10.3390/e26060482

Publication Analysis

Top Keywords

tongue contour: 8
contour extraction: 8
tongue contours: 8
ultrasonic tongue: 8
attention module: 8
attention: 5
tongue: 5
daft-net: 4
daft-net dual: 4
dual attention: 4

Similar Publications

Objective: The purpose of this study was to investigate the technical feasibility of integrating the quantitative maps available from SyntheticMR into the head and neck adaptive radiation oncology workflow. While SyntheticMR has been investigated for diagnostic applications, no studies have investigated its feasibility and potential for MR-Simulation or MR-Linac workflow. Demonstrating the feasibility of using this technique will facilitate rapid quantitative biomarker extraction which can be leveraged to guide adaptive radiation therapy decision making.

An artificial intelligence (AI) model was designed to assist pathologists in diagnosing and quantifying structural changes in tongue lesions induced by chemical carcinogens. Using a tongue cancer model induced by 4-nitroquinoline-N-oxide and treated with β-elemene, a total of 183 digital pathology slides were processed. The Segment Anything Model (SAM) was employed for initial segmentation, followed by conventional algorithms for more detailed segmentation.

The goal of this article is to illustrate the use of MRI for exploring bi- and multi-lingual articulatory strategies. One male and one female speaker recorded sets of static midsagittal MRIs of the whole vocal tract, producing vowels as well as consonants in various vowel contexts in either the male's two or the female's three languages. Both speakers were native speakers of English (American and Australian English, respectively), and both were fluent L2 speakers of French.

Foreign language acquisition of perceptually similar segments: evidence from Lower Sorbian.

Open Res Eur

February 2024

Leibniz-Zentrum Allgemeine Sprachwissenschaft, Berlin, 10117, Germany.

Lower Sorbian is a moribund language spoken in Eastern Germany that features a three-way sibilant contrast, /s, ʂ, ɕ/. The vast majority of L1 speakers are above eighty years of age and virtually no young Sorbians learn Lower Sorbian as their first language. There are language revitalization programs in place, but this means that virtually all Lower Sorbian speakers are L2 learners whose first language is German.
