MPSA-Conformer-CTC/Attention: A High-Accuracy, Low-Complexity End-to-End Approach for Tibetan Speech Recognition.

Sensors (Basel)

School of Mechanical and Electrical Engineering, Huainan Normal University, Huainan 232001, China.

Published: October 2024

This study addresses the challenges of low accuracy and high computational demands in Tibetan speech recognition by investigating the application of end-to-end networks. We propose a decoding strategy that integrates Connectionist Temporal Classification (CTC) and Attention mechanisms, capitalizing on the benefits of automatic alignment and attention weight extraction. The Conformer architecture is utilized as the encoder, leading to the development of the Conformer-CTC/Attention model. This model first extracts global features from the speech signal using the Conformer, then decodes these features jointly through the CTC and Attention mechanisms. To mitigate convergence issues during training, particularly with longer input feature sequences, we introduce a Probabilistic Sparse Attention mechanism within the joint CTC/Attention framework. Additionally, we implement a maximum entropy optimization algorithm for CTC, effectively addressing challenges such as increased path counts, spike distributions, and local optima during training. We designate the proposed method as the MaxEnt-Optimized Probabilistic Sparse Attention Conformer-CTC/Attention Model (MPSA-Conformer-CTC/Attention). Experimental results indicate that our improved model achieves word error rate reductions of 10.68% and 9.57% on self-constructed and open-source Tibetan datasets, respectively, compared to the baseline model. Furthermore, the enhanced model not only reduces memory consumption and training time but also improves generalization capability and accuracy.
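To make the joint training objective described above concrete, the following is a minimal PyTorch-style sketch, not the authors' implementation: the function name joint_ctc_attention_loss, the interpolation weight lambda_ctc, and the entropy weight beta_ent are illustrative assumptions. It combines a CTC loss with an attention (cross-entropy) loss and adds a maximum-entropy term on the frame-level CTC posteriors, one common way to realize the kind of entropy regularization the abstract reports; the probabilistic sparse attention itself is not shown.

import torch.nn.functional as F

def joint_ctc_attention_loss(ctc_logits, att_log_probs, targets,
                             input_lengths, target_lengths,
                             lambda_ctc=0.3, beta_ent=0.1):
    # CTC branch: log-softmax over the vocabulary; ctc_logits has shape
    # (T, B, V). Blank index 0 is an assumption, not from the paper.
    ctc_log_probs = F.log_softmax(ctc_logits, dim=-1)
    ctc_loss = F.ctc_loss(ctc_log_probs, targets, input_lengths,
                          target_lengths, blank=0, zero_infinity=True)

    # Attention branch: per-token cross-entropy from decoder log-probs of
    # shape (B, U, V) against padded targets of shape (B, U). A single
    # padded target tensor is reused for both branches for brevity;
    # padding index -1 is also an assumption.
    att_loss = F.nll_loss(att_log_probs.transpose(1, 2), targets,
                          ignore_index=-1)

    # Entropy of the frame-level CTC posteriors; subtracting it from the
    # loss (i.e., maximizing entropy) spreads probability mass and works
    # against the spiky CTC distributions mentioned in the abstract.
    probs = ctc_log_probs.exp()
    frame_entropy = -(probs * ctc_log_probs).sum(dim=-1).mean()

    return (lambda_ctc * ctc_loss
            + (1.0 - lambda_ctc) * att_loss
            - beta_ent * frame_entropy)

In this sketch, lambda_ctc plays the usual hybrid CTC/Attention role of balancing the alignment-free CTC branch against the autoregressive attention branch, while beta_ent controls how strongly over-confident frame posteriors are penalized.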


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11548342
DOI: http://dx.doi.org/10.3390/s24216824

Publication Analysis

Top Keywords

tibetan speech (8)
speech recognition (8)
ctc attention (8)
attention mechanisms (8)
conformer-ctc/attention model (8)
probabilistic sparse (8)
sparse attention (8)
model (6)
attention (5)
mpsa-conformer-ctc/attention high-accuracy (4)

