Cochlea-inspired speech recognition interface.

Mladen Russo, Maja Stella, Marjan Sikora, Matko Šarić

Med Biol Eng Comput

Laboratory for Smart Environment Technologies, FESB - University of Split, Split, Croatia.

Published: June 2019

Automatic speech recognition (ASR) technology provides a natural interface for human-machine interaction. Typical ASR systems can achieve high performance in quiet environments but, unlike humans, perform poorly in real-world situations. To better simulate the human auditory periphery and improve the performance in realistic noisy scenarios, we propose two models of speech recognition front-ends based on a biophysical cochlear model. The first front-end is based on the method of signal reconstruction from a basilar membrane response. When applied to noisy speech, this method results in improved signal quality. This method can be used as a preprocessing step in a standard ASR system and can also be used as a noise reduction technique for other applications. The second front-end we propose is based on the construction of speech recognition coefficients directly from a basilar membrane response. Experimental results using a continuous-density hidden Markov model (HMM) recognizer demonstrate significant improvement in performance compared to standard Mel-frequency cepstral coefficients (MFCC) in various types of noisy conditions.

Graphical abstract: Speech recognition model based on cochlear front-end.
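The abstract does not specify the biophysical cochlear model or how the coefficients are built from the basilar membrane response. As a rough illustration only of the general idea behind the second front-end (per-channel responses turned into frame-level recognition coefficients), the sketch below substitutes a simple bank of two-pole resonators for the cochlear model, then takes log frame energies and a DCT, in the same spirit as an MFCC pipeline. All function names, the resonator design, the bandwidth, and the frame sizes are assumptions for illustration, not the authors' method.

```python
import math

def resonator_bank(signal, sample_rate, center_freqs, bandwidth_hz=100.0):
    """Filter `signal` with one two-pole resonator per center frequency.

    Assumption: a crude stand-in for basilar membrane channels,
    NOT the biophysical cochlear model used in the paper.
    """
    outputs = []
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius
    for fc in center_freqs:
        theta = 2.0 * math.pi * fc / sample_rate         # pole angle
        a1, a2 = -2.0 * r * math.cos(theta), r * r
        gain = 1.0 - r                                   # rough level normalization
        y1 = y2 = 0.0
        channel = []
        for x in signal:
            y = gain * x - a1 * y1 - a2 * y2
            channel.append(y)
            y2, y1 = y1, y
        outputs.append(channel)
    return outputs

def frame_log_energies(channels, frame_len, hop):
    """Return, for each frame, the log mean energy of every channel."""
    n = len(channels[0])
    frames = []
    for start in range(0, n - frame_len + 1, hop):
        frames.append([
            math.log(sum(v * v for v in ch[start:start + frame_len]) / frame_len + 1e-10)
            for ch in channels
        ])
    return frames

def dct2(vector, num_coeffs):
    """Unnormalized DCT-II, used to decorrelate the channel log energies."""
    n = len(vector)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(vector))
        for k in range(num_coeffs)
    ]

def cochlear_like_features(signal, sample_rate, center_freqs, frame_len, hop, num_coeffs):
    """Hypothetical feature extractor: filterbank, log energy, DCT."""
    channels = resonator_bank(signal, sample_rate, center_freqs)
    return [dct2(f, num_coeffs) for f in frame_log_energies(channels, frame_len, hop)]

# Usage: a 1 kHz test tone, 0.25 s at 8 kHz, four illustrative channels.
sr = 8000
tone = [math.sin(2.0 * math.pi * 1000.0 * t / sr) for t in range(sr // 4)]
freqs = [250.0, 500.0, 1000.0, 2000.0]
feats = cochlear_like_features(tone, sr, freqs, frame_len=200, hop=80, num_coeffs=3)
```

Each entry of `feats` is one frame's coefficient vector, which is the kind of input a continuous-density HMM recognizer would consume in place of MFCCs. A realistic front-end would use many more channels and a physiologically motivated frequency spacing.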

DOI: http://dx.doi.org/10.1007/s11517-019-01963-6

