Domain Generalization for Language-Independent Automatic Speech Recognition.

Front Artif Intell

Department of Electrical and Computer Engineering (ECE), Beckman Institute, University of Illinois, Urbana, IL, United States.

Published: May 2022

A language-independent automatic speech recognizer (ASR) is one that can be used for phonetic transcription in languages other than those on which it was trained. Language-independent ASR is difficult to train because different languages implement phones differently: even when phonemes in two languages are written with the same symbols of the International Phonetic Alphabet, they are differentiated by different distributions of language-dependent redundant articulatory features. This article demonstrates that the goal of language independence may be approximated in different ways, depending on the size of the training set, the presence vs. absence of familial relationships between the training and test languages, and the method used to implement phone recognition or classification. When the training set contains many languages, and when every language in the test set is related to (shares a language family with) a language in the training set, language-independent ASR may be trained using an empirical risk minimization strategy (e.g., using connectionist temporal classification without extra regularizers). When the training set is limited to a small number of languages from one language family, however, and the test languages are not from that family, the best performance is achieved by using domain-invariant representation learning strategies. Two representation learning strategies are tested in this article: invariant risk minimization and regret minimization. We find that invariant risk minimization is better at the task of phone token classification (given known segment boundary times), while regret minimization is better at the task of phone token recognition.
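The three training objectives named in the abstract can be contrasted in a small sketch. This is an illustration under stated assumptions, not the authors' implementation: each training language is treated as one environment, a phone classifier is assumed to have already produced per-token logits, the invariant risk minimization term uses the commonly cited squared-gradient penalty with respect to a dummy scale fixed at 1, and regret is simplified to the gap between a model's risk on a language and a given per-language oracle risk. All function names, the `oracle_risks` argument, and the toy data are hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of phone logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Negative log-probability of the correct phone label.
    return -math.log(softmax(logits)[label])

def risk(examples):
    # Mean cross-entropy over (logits, label) pairs from one language.
    return sum(cross_entropy(z, y) for z, y in examples) / len(examples)

def irm_penalty(examples):
    # Assumed penalty form: squared gradient of the risk with respect to
    # a scalar w that multiplies the logits, evaluated at w = 1.
    # For cross-entropy, d/dw CE(w*z, y) at w = 1 is sum_k p_k*z_k - z_y.
    grads = []
    for z, y in examples:
        p = softmax(z)
        grads.append(sum(pk * zk for pk, zk in zip(p, z)) - z[y])
    g = sum(grads) / len(grads)
    return g * g

def erm_objective(envs):
    # Empirical risk minimization: average risk over training languages.
    return sum(risk(ex) for ex in envs.values()) / len(envs)

def irm_objective(envs, lam):
    # Average risk plus a weighted invariance penalty per language.
    return sum(risk(ex) + lam * irm_penalty(ex) for ex in envs.values()) / len(envs)

def worst_case_regret(envs, oracle_risks):
    # Simplified regret: largest gap between the model's risk on a
    # language and a (hypothetical) oracle risk for that language.
    return max(risk(ex) - oracle_risks[name] for name, ex in envs.items())

# Toy data: two hypothetical languages, two phone classes.
envs = {
    "lang_a": [([2.0, 0.0], 0), ([0.0, 1.0], 1)],
    "lang_b": [([0.5, 0.5], 0)],
}
```

With `lam` set to zero the penalized objective reduces to the plain average risk, which mirrors the abstract's point that extra regularization is one option among several rather than a requirement in every training condition.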

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9133481
DOI: http://dx.doi.org/10.3389/frai.2022.806274
