The human voice stands out for its rich information transmission capabilities. However, voice communication is susceptible to interference from noisy environments and obstacles. Here, we propose a wearable, wireless, flexible skin-attached acoustic sensor (SAAS) capable of capturing the vibrations of vocal organs and skin movements, thereby enabling voice recognition and human-machine interaction (HMI) in harsh acoustic environments. The system utilizes a piezoelectric micromachined ultrasonic transducer (PMUT), which features high sensitivity (-198 dB), wide bandwidth (10 Hz-20 kHz), and excellent flatness (±0.5 dB). Flexible packaging enhances comfort and adaptability during wear, while integration with the Residual Network (ResNet) architecture significantly improves the classification of laryngeal speech features, achieving an accuracy exceeding 96%. Furthermore, we demonstrate SAAS's data collection and intelligent classification capabilities in multiple HMI scenarios. Finally, using a deep learning model, the speech recognition system recognized everyday sentences spoken by participants with 99.8% accuracy. With advantages including a simple fabrication process, stable performance, easy integration, and low cost, SAAS presents a compelling solution for applications in voice control, HMI, and wearable electronics.
DOI: http://dx.doi.org/10.1038/s41467-025-57629-5
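The abstract above pairs the PMUT-based sensor with a ResNet classifier for laryngeal speech features. A minimal sketch of that kind of pipeline, assuming the sensor signal has already been converted to single-channel log-mel spectrograms and that a stock ResNet-18 backbone stands in for the authors' network, might look like this in PyTorch; the class count and input shape are illustrative assumptions, not the published configuration.

```python
# Hedged sketch: classifying spectrograms of skin-attached sensor signals with a
# small ResNet. NUM_CLASSES and the input shape are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.models import resnet18

NUM_CLASSES = 10  # assumed number of speech/command classes

class SpectrogramResNet(nn.Module):
    def __init__(self, num_classes: int = NUM_CLASSES):
        super().__init__()
        self.backbone = resnet18(weights=None)
        # Sensor spectrograms are single-channel, so adapt the first conv layer.
        self.backbone.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2,
                                        padding=3, bias=False)
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, n_mels, time) log-mel spectrogram of the sensor signal
        return self.backbone(x)

if __name__ == "__main__":
    model = SpectrogramResNet()
    dummy = torch.randn(4, 1, 64, 128)  # four fake spectrogram frames
    print(model(dummy).shape)           # -> torch.Size([4, 10])
```

Swapping the first convolution and the final fully connected layer is the usual way to reuse an ImageNet-style residual backbone on single-channel audio features.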
J Med Internet Res
March 2025
Westmead Applied Research Centre, Faculty of Medicine and Health, The University of Sydney, Westmead, Australia.
Background: Conversational artificial intelligence (AI) allows for engaging interactions; however, its acceptability, barriers, and enablers for supporting patients with atrial fibrillation (AF) are unknown.
Objective: This work stems from the Coordinating Health care with AI-supported Technology for patients with AF (CHAT-AF) trial and aims to explore patient perspectives on receiving support from a conversational AI program.
Methods: Patients with AF who were recruited for a randomized controlled trial and received the intervention were approached for semistructured interviews, with participants selected through purposive sampling.
IEEE Trans Vis Comput Graph
March 2025
Trust in agents within Virtual Reality is becoming increasingly important, as they provide advice and influence people's decision-making. However, previous studies show that encountering speech recognition errors can reduce users' trust in agents. Such errors lead users to ignore the agent's advice and make suboptimal decisions.
Brain Inj
March 2025
Interdisciplinary Health Sciences & Sociology, Oakland University, Rochester, Michigan, USA.
Objective: To synthesize requirements and recommendations addressing sport-related concussion (SRC).
Design: Qualitative study.
Setting: Scholastic and non-scholastic athletic programs.
Sci Rep
March 2025
Basque Center on Cognition, Brain and Language, Paseo Mikeletegi 69, Donostia-San Sebastián, 20009, Spain.
Learning to read affects speech perception. For example, the ability of listeners to recognize consistently spelled words faster than inconsistently spelled words is a robust finding called the Orthographic Consistency Effect (OCE). Previous studies located the OCE at the rime level and focused on languages with opaque orthographies.
Cochlear Implants Int
March 2025
Department of Speech Language Pathology & Audiology, Towson University, Towson, MD, USA.
Objective: To determine how the presentation of unprocessed speech, either ipsilaterally (to simulate electro-acoustic stimulation, EAS) or contralaterally (to simulate bimodal stimulation), alongside vocoder-processed speech affects the efficiency of spoken word processing.
Method: Gated word recognition was performed under four listening conditions: full-spectrum speech, vocoder-processed speech, electro-acoustic stimulation (EAS), and bimodal stimulation. In the EAS condition, low-frequency unprocessed speech and high-frequency vocoder-processed speech were presented to the same ear, while in the bimodal condition, full-spectrum speech was presented to one ear and vocoder-processed speech to the other.
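The EAS and bimodal conditions above hinge on combining unprocessed speech with vocoder-processed speech. As a hedged illustration only (not the study's actual stimulus pipeline), the sketch below builds a simple noise vocoder and an EAS-style mix of unprocessed low-frequency speech with vocoded high frequencies; the channel count, band edges, and 600 Hz crossover are assumptions chosen for the example.

```python
# Hedged sketch of a noise vocoder and an EAS-style mix. Parameters are
# illustrative assumptions; assumes a sampling rate of at least 16 kHz.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(signal, fs, n_channels=8, f_lo=600.0, f_hi=7000.0):
    """Replace temporal fine structure with envelope-modulated noise in log-spaced bands."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    out = np.zeros_like(signal)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, signal)
        envelope = np.abs(hilbert(band))                     # band envelope
        carrier = sosfiltfilt(sos, np.random.randn(len(signal)))  # band-limited noise
        out += envelope * carrier
    return out

def simulate_eas(signal, fs, crossover=600.0):
    """EAS-like stimulus: unprocessed low frequencies plus vocoded high frequencies."""
    sos_lp = butter(4, crossover, btype="lowpass", fs=fs, output="sos")
    low = sosfiltfilt(sos_lp, signal)
    high_vocoded = noise_vocode(signal, fs, f_lo=crossover)
    return low + high_vocoded
```

A bimodal condition would instead route the unprocessed signal to one ear and the fully vocoded signal to the other, rather than summing them in a single channel.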