Speech Recognition for the iCub Platform.

Bertrand Higy, Alessio Mereta, Giorgio Metta, Leonardo Badino

Front Robot AI

Center for Translational Neurophysiology of Speech and Communication, Istituto Italiano di Tecnologia, Ferrara, Italy.

Published: February 2018

This paper describes open-source software (available at https://github.com/robotology/natural-speech) to build automatic speech recognition (ASR) systems and run them within the YARP platform. The toolkit is designed (i) to allow non-ASR experts to easily create their own ASR system and run it on iCub and (ii) to build deep learning-based models that specifically address the main challenges an ASR system faces in the context of verbal human-iCub interactions. The toolkit consists mostly of Python and C++ code and shell scripts integrated into YARP. As an additional contribution, a second codebase (written in Matlab) is provided for more expert ASR users who want to experiment with bio-inspired and developmental learning-inspired ASR systems. Specifically, we provide code for two distinct kinds of speech recognition: "articulatory" and "unsupervised" speech recognition. The first is largely inspired by influential neurobiological theories of speech perception, which assume that speech perception is mediated by motor cortex activity. Our articulatory systems have been shown to outperform strong deep learning-based baselines. The second type, the "unsupervised" systems, does not use any supervised information (unlike most ASR systems, including our articulatory systems). To some extent, these systems mimic an infant who has to discover the basic speech units of a language by herself. In addition, we provide resources consisting of pre-trained deep learning models for ASR and a 2.5-h speech dataset of spoken commands, the VoCub dataset, which can be used to adapt an ASR system to the typical acoustic environments in which iCub operates.

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7805979
DOI: http://dx.doi.org/10.3389/frobt.2018.00010

Publication Analysis

Top Keywords

speech recognition

asr systems

asr system

speech

asr

deep learning-based

speech perception

articulatory systems

systems

recognition icub

Similar Publications

Understanding Parental Perspectives on Childhood Hearing Impairment and Timely Interventions.

Cureus

January 2025

College of Medicine, Department of Otolaryngology - Head and Neck Surgery, University of Jeddah, Jeddah, SAU.

Nada Alharbi Daniyah Baqalaqil Hams Alharthi Nouf Almalki Samar Altoukhi

Objectives: Hearing impairment during childhood is a widespread health issue. Prompt recognition and timely intervention are vital for the advancement of language skills. Insufficient parental knowledge can lead to a delay in diagnosing and treating a condition, which can have a negative impact on academic performance.

Hearing Performance and Soft-Tissue Outcomes of Minimally Invasive Ponto Surgery and Local Anesthesia in Children with Unilateral Craniofacial Malformation.

Int Arch Otorhinolaryngol

January 2025

School of Medical Sciences, Santa Casa de São Paulo, São Paulo, SP, Brazil.

Andrea Caruso Leone Arthur Menino Castilho Fabiana Danieli Daniela Bortoloti Calil Katia de Almeida

Minimally invasive Ponto surgery (MIPS) enables the installation of percutaneous bone-anchored hearing implants (BAHIs) with a drill guide through a hole punch incision. Despite being well established for adults, there is a lack of studies in the literature regarding its use in pediatric patients. The aim of the present study was to investigate the hearing performance and soft-tissue outcomes of the use of MIPS under local anesthesia in children with unilateral craniofacial malformation (UCM).

Effects of Interaural Latency and Frequency Mismatch on Speech Recognition for Bimodal Cochlear Implant Users.

Laryngoscope

January 2025

Department of Otolaryngology/Head & Neck Surgery, University of North Carolina School of Medicine, Chapel Hill, North Carolina, U.S.A.

Margaret T Dillon Emily Buss Margaret E Richter Kevin D Brown

Objectives: Bimodal cochlear implant (CI) users vary in speech recognition outcomes. This variability may be influenced partly by the CI and contralateral hearing aid (HA) programming procedures, which can result in mismatches in latency and frequency. We assessed the performance of bimodal listeners when latency mismatches were corrected and analyzed how frequency mismatches influenced outcomes.

Automatic speech recognition predicts contemporaneous earthquake fault displacement.

Nat Commun

January 2025

Los Alamos National Laboratory, EES-17 National Security Earth Science, Los Alamos, NM, 87545, USA.

Christopher W Johnson Kun Wang Paul A Johnson

Significant progress has been made in probing the state of an earthquake fault by applying machine learning to continuous seismic waveforms. The breakthroughs were originally obtained from laboratory shear experiments and numerical simulations of fault shear, then successfully extended to slow-slipping faults. Here we apply the Wav2Vec-2.

Accuracy and variability in clinical predictions of speech recognition outcomes for cochlear implant users.

Int J Audiol

January 2025

Department of Otolaryngology, Vanderbilt University Medical Center, Nashville, TN, USA.

Valeriy Shafiro Michael S Harris Berenice Ramirez Liping Du Aaron C Moberly

Objectives: An improvement in speech perception is a major well-documented benefit of cochlear implantation (CI), which is commonly discussed with CI candidates to set expectations. However, a large variability exists in speech perception outcomes. We evaluated the accuracy of clinical predictions of post-CI speech perception scores.
