Acoustic compression in Zoom audio does not compromise voice recognition performance.

Sci Rep

Department of Computational Linguistics, University of Zurich, Andreasstrasse 15, 8050, Zurich, Switzerland.

Published: October 2023

Human voice recognition over telephone channels typically yields lower accuracy than recognition from higher-quality audio recorded in a studio environment. Here, we investigated the extent to which audio in video conferencing, which is subject to various lossy compression mechanisms, affects human voice recognition performance. Voice recognition performance was tested in an old-new recognition task under three audio conditions (telephone, Zoom, studio) across all matched combinations (familiarization and test with the same audio condition) and mismatched combinations (familiarization and test with different audio conditions). Participants were familiarized with female voices presented in either studio-quality (N = 22), Zoom-quality (N = 21), or telephone-quality (N = 20) stimuli. Subsequently, all listeners performed an identical voice recognition test containing a balanced stimulus set from all three conditions. Results revealed that voice recognition performance (d') in Zoom audio was not significantly different from that in studio audio, but listeners performed significantly better in both Zoom and studio audio than in telephone audio. This suggests that the signal processing of the speech codec used by Zoom provides information as relevant for voice recognition as studio audio does. Interestingly, listeners familiarized with voices via Zoom audio showed a trend towards better recognition performance in the test (p = 0.056) than listeners familiarized with studio audio. We discuss future directions according to which a possible advantage of Zoom audio for voice recognition might be related to some of the speech coding mechanisms used by Zoom.
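For readers unfamiliar with the sensitivity index d' reported above, the following minimal sketch shows how it is commonly computed for an old-new recognition task, as the difference between the z-transformed hit rate and false-alarm rate. It is an illustration under stated assumptions, not the authors' analysis code: the function name `d_prime`, the log-linear correction (adding 0.5 to each cell), and the example counts are assumptions that do not come from the abstract.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Return d' = z(hit rate) - z(false-alarm rate) for an old-new task.

    Assumption: a log-linear correction (0.5 added to each cell) is applied
    so that rates of exactly 0 or 1 do not produce infinite z-scores. The
    abstract does not state which correction, if any, the study used.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for one listener (not data from the study):
# 18 of 20 "old" voices recognized, 4 of 20 "new" voices falsely accepted.
if __name__ == "__main__":
    print(round(d_prime(18, 2, 4, 16), 2))
```

A listener who responds "old" equally often to old and new voices obtains d' = 0, and larger values indicate better discrimination between familiarized and unfamiliar voices.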

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10618539
DOI: http://dx.doi.org/10.1038/s41598-023-45971-x

Publication Analysis

Top Keywords: voice recognition, recognition performance, zoom audio, studio audio, audio, recognition, zoom, voice, human voice, audio conditions

Similar Publications

A High Index of Awareness About the Inherent Complications of Thoracic Segmental Spinal Anesthesia: A Case of Mastectomy With Bronchiectasis Under Thoracic Segmental Spinal Anesthesia.

Cureus

December 2024

Urology, Datta Meghe Medical College, Datta Meghe Institute of Higher Education and Research (Deemed to be University), Nagpur, IND.

Kirti Gujarkar Mahatme Pratibha U Deshmukh Anjali Borkar Nandkishor J Bankar Prajwal Mahatme

General anesthesia is the gold standard for breast cancer surgeries. Considering the nature of the surgery and its associated pain, various regional techniques are used as an adjunct to general anesthesia. Regional anesthesia as the sole anesthetic technique for breast cancer surgery is an emerging approach, especially in high-risk patients, given the risk-benefit ratio. Options include regional blocks such as the pectoralis major block, pectoralis minor block, and erector spinae block, with thoracic segmental spinal anesthesia being the most recent.

A Prototype "Smart" 3-Dimensionally Printed Model Showcasing Interactivity: Implementing Voice Command for the Ventricular and Cisternal Systems.

J Comput Assist Tomogr

January 2025

Department of Radiology, George Washington University Hospital, Washington, DC.

Cullen Fleming Navid Mostaghni Iman Elsayed Sabrina Hsiao Raheleh Taghvaei

The next step in the evolution of static 3-dimensionally (3D) printed models may be the creation of "smart" models, where subcomponents can be seamlessly interacted with through a feedback mechanism, with potential applications in trainee education and patient counseling. Considering the complexity of the ventricular and cisternal systems, they were chosen for segmentation, using Materialize InPrint with outward hollowing using 2.5-mm wall thickness.

Pulmonary Embolism: Highlighting Iron Deficiency Anemia as a Contributing Factor in the Development of Pulmonary Embolism-A Case Report.

Clin Case Rep

February 2025

Department of Public Health, Atish Dipankar University of Science and Technology, Dhaka, Bangladesh.

Fnu Yogeeta Muskan Devi Sameer Abdul Rauf Fnu Tooba Khadija Asif Sumar

This case emphasizes iron deficiency anemia (IDA) as a potential risk factor for pulmonary embolism (PE), especially in patients with type 2 diabetes. Early recognition and management of PE and IDA are crucial. Further research is needed to clarify the mechanisms linking IDA to thrombosis and improve prevention strategies.

Polymer-Layered Optical Wearable (PLOW) for Healthcare Applications: Temperature and Stretching Monitoring.

ACS Appl Mater Interfaces

January 2025

Nanophotonics and Plasmonics Laboratory, School of Basic Sciences, Indian Institute of Technology Bhubaneswar, Bhubaneswar, Odisha 752050, India.

Pratik Mishra Devendra Nath Goswami Santosh Kumar Rajan Jha

Thermal and stretching characteristics are crucial variables in healthcare, robotics, and human-machine interaction applications. Here, we present a single-mode fiber-based, balloon-shaped, single- and dual polymer-layered optical wearable (PLOW) system that can sense both temperature and stretching. These two types of PLOWs are compared in terms of their detection performance across all criteria.

An End-To-End Speech Recognition Model for the North Shaanxi Dialect: Design and Evaluation.

Sensors (Basel)

January 2025

SHCCIG Yubei Coal Industry Co., Ltd., Xi'an 710900, China.

Yi Qin Feifan Yu

The coal mining industry in Northern Shaanxi is robust, with a prevalent use of the local dialect, known as "Shapu", characterized by a distinct Northern Shaanxi accent. This study addresses the practical need for speech recognition in this dialect. We propose an end-to-end speech recognition model for the North Shaanxi dialect, leveraging the Conformer architecture.
