Exploring Practical Metrics to Support Automatic Speech Recognition Evaluations.

Stud Health Technol Inform

Yunjia Li, Habitat Learn, UK.

Published: August 2023

Recent studies evaluating the text output of automatic speech recognition have shown that word error rate, which counts how many mistakes appear in English transcripts, does not necessarily help developers of automatic transcriptions or captions. Confidence about the type of errors being made remains low because speech-to-text errors are not always recorded with a note detailing the reason for the error. In higher education, students who require captions and transcriptions have found that some academic lecture transcripts are littered with word errors, so comprehension drops, and those with cognitive, physical and sensory disabilities are particularly affected. Despite the considerable improvements in conversational automatic speech recognition generally, academic settings tend to include numerous domain-specific terms, and lecturers may be non-native speakers coping with recording technology in noisy conditions. This paper discusses how additional metrics are used to capture these issues and feed back into the machine learning process, enabling higher-quality output and more inclusive practices for those using virtual conferencing systems. The process goes beyond what is expressed and examines paralinguistic aspects such as timing, intonation, voice quality and speech understanding.
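
The abstract does not say how word error rate is computed. As a minimal sketch, assuming the conventional definition (substitutions, deletions and insertions from a word-level edit distance, divided by the number of reference words), the Python below shows why the metric alone says little about the kind of error. The function name `word_error_rate` and the example sentences are illustrative assumptions, not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative (made-up) lecture sentence and two hypothetical transcripts.
reference = "the mitochondria produce energy"
minor_error = "a mitochondria produce energy"   # function word wrong
domain_error = "the mighty produce energy"      # domain term lost

wer_minor = word_error_rate(reference, minor_error)
wer_domain = word_error_rate(reference, domain_error)
print(wer_minor, wer_domain)
```

Both transcripts receive the same score under this definition, although only the second loses the domain-specific term a student would need, which is the kind of gap that additional metrics are meant to expose.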

Source
http://dx.doi.org/10.3233/SHTI230636
