Speech recognition in adverse conditions by humans and machines.

JASA Express Lett

Department of Computational Linguistics, University of Zurich, Andreasstrasse 15, Zurich 8050,

Published: November 2024

In the development of automatic speech recognition systems, achieving human-like performance has been a long-held goal. Recent releases of large spoken language models have claimed to achieve such performance, although direct comparison to humans has been severely limited. The present study tested L1 British English listeners against two automatic speech recognition systems (wav2vec 2.0 and Whisper, base and large sizes) in adverse listening conditions: speech-shaped noise and pub noise, at different signal-to-noise ratios, and recordings produced with or without face masks. Humans maintained the advantage against all systems, except for Whisper large, which outperformed humans in every condition but pub noise.
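The abstract does not say how the noisy stimuli were built. As a minimal sketch of the usual definition of signal-to-noise ratio, and not the authors' procedure, the snippet below scales a noise signal so that the speech-to-noise power ratio matches a target in dB before adding it to the speech. The function names (`mean_power`, `mix_at_snr`, `measured_snr_db`) and the plain lists of samples are illustrative assumptions.

```python
import math


def mean_power(samples):
    """Average power (mean squared amplitude) of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def measured_snr_db(speech, noise):
    """Speech-to-noise power ratio in decibels."""
    return 10 * math.log10(mean_power(speech) / mean_power(noise))


def mix_at_snr(speech, noise, target_snr_db):
    """Scale `noise` to reach `target_snr_db` relative to `speech`, then mix.

    Hypothetical helper for illustration; returns (mixture, scaled_noise).
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]  # trim noise to the speech duration

    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")

    # SNR_dB = 10 * log10(p_speech / (gain**2 * p_noise)); solve for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (target_snr_db / 10)))

    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise
```

A lower target (for example a negative value in dB) gives a larger noise gain and therefore a harder listening condition, for human listeners and recognition systems alike.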

Source
http://dx.doi.org/10.1121/10.0032473

