Recent studies evaluating the quality of the text output of automatic speech recognition have shown that using word error rate to count mistakes in English does not necessarily help developers of automatic transcriptions or captions. Confidence about the types of errors being made remains low because mistranscriptions from speech to text are not always recorded with a note detailing the reason for the error. In higher education, students who require captions and transcriptions have found that some academic lecture transcripts are littered with word errors, which lowers comprehension and particularly affects those with cognitive, physical and sensory disabilities. Despite considerable improvements in general conversational automatic speech recognition, academic settings tend to include numerous domain-specific terms, and lecturers may be non-native speakers coping with recording technology in noisy situations. This paper discusses how additional metrics are used to capture issues and feed back into the machine learning process to enable enhanced quality of output and more inclusive practices for those using virtual conferencing systems. The process goes beyond what is expressed and examines paralinguistic aspects such as timing, intonation, voice quality and speech understanding.
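For readers unfamiliar with the metric, word error rate is conventionally computed as (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The following is a minimal sketch of that standard calculation, not the method used in this paper; the function name `word_error_rate`, the lowercase whitespace tokenisation, and the example sentences are all assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> dict:
    """Word-level Levenshtein alignment with a per-type error breakdown.

    Hypothetical helper for illustration only. Tokenisation is a plain
    lowercase whitespace split, which is an assumption; real evaluations
    usually normalise punctuation and numbers as well.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # dp[i][j] = (total_cost, substitutions, deletions, insertions)
    # for aligning ref[:i] with hyp[:j].
    dp = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)  # every reference word deleted
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)  # every hypothesis word inserted

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]  # match, no extra cost
                continue
            c, s, d, n = dp[i - 1][j - 1]
            substitution = (c + 1, s + 1, d, n)
            c, s, d, n = dp[i - 1][j]
            deletion = (c + 1, s, d + 1, n)
            c, s, d, n = dp[i][j - 1]
            insertion = (c + 1, s, d, n + 1)
            # Tuples compare on total cost first, so this picks a cheapest path.
            dp[i][j] = min(substitution, deletion, insertion)

    cost, subs, dels, ins = dp[-1][-1]
    return {
        "wer": cost / len(ref),
        "substitutions": subs,
        "deletions": dels,
        "insertions": ins,
    }


# Invented example: a domain-specific term split into two common words.
result = word_error_rate(
    "the patient has parkinson disease",
    "the patient has parking son disease",
)
print(result)
```

Note that the score weights every word equally and says nothing about why an error occurred: a mistranscribed domain term counts the same as a dropped article, which is the limitation the abstract describes.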
DOI: http://dx.doi.org/10.3233/SHTI230636
Am J Speech Lang Pathol
January 2025
Good Samaritan Medical Center Foundation, Lafayette, CO.
Purpose: The aim of this study was to gauge the impacts of cognitive empathy training experiential learning on traumatic brain injury (TBI) knowledge, awareness, confidence, and empathy in a pilot study of speech-language pathology graduate students.
Method: A descriptive quasi-experimental convergent parallel mixed methods design intervention pilot study (QUAL + QUANT) was conducted with a diverse convenience sample of 19 first- and second-year speech-language pathology graduate students who engaged in a half-day TBI point-of-view simulation. The simulation was co-constructed through a participatory design with those living with TBI based on Kolb's experiential learning model and followed the recommendations for point-of-view simulation ethics.
Diagnostics (Basel)
December 2024
GITA Lab., Faculty of Engineering, University of Antioquia, Medellín 050010, Colombia.
Background/objectives: Parkinson's disease (PD) affects more than 6 million people worldwide. Its accurate diagnosis and monitoring are key factors to reduce its economic burden. Typical approaches consider either speech signals or video recordings of the face to automatically model abnormal patterns in PD patients.
Noise Health
January 2025
Department of EICU, Wenzhou Central Hospital; The Dingli Clinical College of Wenzhou Medical University, Wenzhou, Zhejiang Province, China.
Purpose: This study aimed to assess the levels and sources of noise in the emergency intensive care unit (EICU) of an emergency department and investigate their effects on the sleep quality of conscious patients.
Methods: A study was conducted on patients admitted to the EICU from December 2020 to December 2023. They were categorised according to their sleep quality with the Pittsburgh Sleep Quality Index.
Jpn J Nurs Sci
January 2025
Department of Palliative Nursing, Health Sciences, Tohoku University Graduate School of Medicine, Sendai, Miyagi, Japan.
Aim: Patient-reported outcome measures (PROMs) are increasingly used in palliative care to evaluate patients' symptoms and conditions. Healthcare providers often collect PROMs through conversations. However, the manual entry of these data into electronic medical records can be burdensome for healthcare providers.
Brain Sci
December 2024
School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA 19104, USA.
Background: Cognitive impairment poses a significant global health challenge, emphasizing the critical need for early detection and intervention. Traditional diagnostics like neuroimaging and clinical evaluations are often subjective, costly, and inaccessible, especially in resource-poor settings. Previous research has focused on speech analysis primarily conducted using English data, leaving multilingual settings unexplored.