Automatic pronunciation error detection in non-native speech: the case of vowel errors in Dutch.

J Acoust Soc Am

Centre for Language and Speech Technology, Radboud University Nijmegen, Erasmusplein 1, 6525HT, Nijmegen, The Netherlands.

Published: August 2013

This research is aimed at analyzing and improving automatic pronunciation error detection in a second language. Dutch vowels spoken by adult non-native learners of Dutch are used as a test case. A first study of Dutch pronunciation by L2 learners with different L1s revealed that vowel pronunciation errors are relatively frequent and often involve subtle acoustic differences between the realization and the target sound. In a second study, automatic pronunciation error detection experiments were conducted to compare existing measures with a metric that takes the observed error patterns into account in order to capture the relevant acoustic differences. The results of the two studies show that error patterns carry information that can be usefully employed in weighted automatic measures of pronunciation quality. In addition, combining such a weighted metric with existing measures improves the equal error rate by 6.1 percentage points, from 0.297 for the Goodness of Pronunciation (GOP) algorithm to 0.236.
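The two quantities named in the abstract, the GOP score and the equal error rate, can be sketched as follows. This is a minimal illustration assuming per-frame phone log-likelihoods from some acoustic model; the function names, input shapes, and threshold sweep are hypothetical, not the paper's actual implementation:

```python
import numpy as np

def gop_score(frame_log_likes, target_idx):
    """Goodness of Pronunciation (standard formulation): the absolute
    log-likelihood ratio between the target phone and the best-scoring
    phone, normalised by the number of frames in the segment."""
    # frame_log_likes: array of shape (n_frames, n_phones) with per-frame
    # log-likelihoods for each phone model (hypothetical input format)
    target_ll = frame_log_likes[:, target_idx].sum()
    best_ll = frame_log_likes.max(axis=1).sum()
    return abs(target_ll - best_ll) / frame_log_likes.shape[0]

def equal_error_rate(error_scores, correct_scores):
    """EER: the operating point where the false-acceptance rate (errors
    missed) and false-rejection rate (correct segments flagged) coincide,
    approximated here by sweeping all observed score thresholds."""
    thresholds = sorted(set(error_scores) | set(correct_scores))
    best = (1.0, 0.0)
    for t in thresholds:
        far = sum(s >= t for s in correct_scores) / len(correct_scores)
        frr = sum(s < t for s in error_scores) / len(error_scores)
        if abs(far - frr) < abs(best[0] - best[1]):
            best = (far, frr)
    return (best[0] + best[1]) / 2
```

A lower GOP score indicates that the target phone model fits the realization about as well as any competing phone, i.e. an acceptable pronunciation; segments whose score exceeds a threshold are flagged as errors, and the EER summarises detector quality independently of any one threshold choice.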


Source: http://dx.doi.org/10.1121/1.4813304

Publication Analysis

Top Keywords: automatic pronunciation (12), pronunciation error (12), error detection (12), acoustic differences (8), existing measures (8), error patterns (8), error (6), pronunciation (6), automatic (4), detection non-native (4)

Similar Publications

Background: Cochlear implants (CIs) have the potential to facilitate auditory restoration in deaf children and contribute to the maturation of the auditory cortex. The type of CI may impact hearing rehabilitation outcomes in these children. We aimed to study central auditory processing activation patterns during speech perception in Mandarin-speaking pediatric CI recipients with different device characteristics.


Variability in speech pronunciation is widely observed across different linguistic backgrounds, which impacts modern automatic speech recognition performance. Here, we evaluate the performance of a self-supervised speech model in phoneme recognition using direct articulatory evidence. Findings indicate significant differences in phoneme recognition, especially in front vowels, between American English and Indian English speakers.


Verifying the speaker of a speech fragment can be crucial in attributing a crime to a suspect. The question can be addressed given disputed and reference speech material, adopting the recommended and scientifically accepted likelihood ratio framework for reporting evidential strength in court. In forensic practice, usually, auditory and acoustic analyses are performed to carry out such a verification task considering a diversity of features, such as language competence, pronunciation, or other linguistic features.
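The likelihood ratio framework mentioned here compares how probable the observed evidence is under the same-speaker hypothesis versus the different-speaker hypothesis. A toy one-dimensional Gaussian sketch, with all distribution parameters hypothetical, illustrates the idea:

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a univariate normal distribution at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def likelihood_ratio(feature, same_mu, same_sigma, diff_mu, diff_sigma):
    """LR = P(evidence | same speaker) / P(evidence | different speakers).
    Values above 1 support the same-speaker hypothesis; below 1, the
    different-speaker hypothesis. Here each hypothesis is modelled as a
    Gaussian over a single acoustic feature (a deliberate simplification)."""
    return (gaussian_pdf(feature, same_mu, same_sigma)
            / gaussian_pdf(feature, diff_mu, diff_sigma))
```

In practice forensic systems work with many features and calibrated score distributions, but the reported quantity has this same ratio form.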


MKELM based multi-classification model for foreign accent identification.

Heliyon

August 2024

Department of Information Engineering, Electronics and Telecommunication, Sapienza University of Rome, Rome, 00184, Italy.

The automatic identification of foreign accents can play a crucial role in various speech systems, including speaker identification, e-learning, telephone banking, and more. Additionally, it can greatly enhance the robustness of Automatic Speech Recognition (ASR) systems. Non-native accents in speech signals are characterized by distinct pronunciations, prosody, and voice characteristics of the speaker.


Automatic speech recognition (ASR) for the diagnosis of pronunciation of speech sound disorders in Korean children.

Clin Linguist Phon

August 2024

Department of Rehabilitation Medicine, Incheon St.Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea.

Article Synopsis
  • The study develops an automatic speech recognition (ASR) model specifically to diagnose pronunciation problems in children with speech sound disorders (SSDs), aiming to replace manual transcription methods.
  • The researchers fine-tuned the wav2vec2.0 XLS-R model to better recognize the way children with SSDs pronounce words, achieving a Phoneme Error Rate (PER) of only 10%.
  • In comparison, a leading ASR model called Whisper struggled with this task, showing a much higher PER of about 50%, highlighting the need for more specialized ASR approaches in clinical settings.
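The Phoneme Error Rate reported in the synopsis is conventionally the Levenshtein edit distance between the reference and hypothesized phoneme sequences divided by the reference length. A minimal sketch of that metric (a generic illustration, not the study's code):

```python
def phoneme_error_rate(ref, hyp):
    """PER = edit distance(ref, hyp) / len(ref), where the edit distance
    counts phoneme substitutions, insertions, and deletions."""
    m, n = len(ref), len(hyp)
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i          # delete all i reference phonemes
    for j in range(n + 1):
        d[0][j] = j          # insert all j hypothesis phonemes
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # match/substitution
    return d[m][n] / m
```

For example, a hypothesis that drops one phoneme from a three-phoneme reference yields a PER of 1/3; the 10% and 50% figures above are this quantity averaged over a test set.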
