Natural and artificial audition can in principle acquire different solutions to a given problem. The constraints of the task, however, can nudge the cognitive science and engineering of audition to converge qualitatively, suggesting that closer mutual examination could enrich both artificial hearing systems and process models of the mind and brain. Speech recognition - an area ripe for such exploration - is inherently robust in humans to a number of transformations at various spectrotemporal granularities. To what extent are these robustness profiles accounted for by high-performing neural network systems? We bring together experiments in speech recognition under a single synthesis framework to evaluate state-of-the-art neural networks as stimulus-computable, optimized observers. In a series of experiments, we (1) clarify how influential speech manipulations in the literature relate to each other and to natural speech, (2) show the granularities at which machines exhibit out-of-distribution robustness, reproducing classical perceptual phenomena in humans, (3) identify the specific conditions where model predictions of human performance differ, and (4) demonstrate a crucial failure of all artificial systems to perceptually recover where humans do, suggesting alternative directions for theory and model building. These findings encourage a tighter synergy between the cognitive science and engineering of audition.
DOI: http://dx.doi.org/10.1016/j.neunet.2023.02.032
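As an illustration of the kind of robustness probe the abstract describes, the sketch below applies a simple out-of-distribution manipulation (uniform time compression) to a speech waveform, transcribes it with a recognizer, and scores word error rate against the reference transcript. The `transcribe` callable is a placeholder for whatever model is under test, and time compression is only an illustrative choice, not the paper's actual stimulus set.

```python
# Minimal sketch of a machine robustness probe: manipulate a waveform,
# transcribe it with some pretrained ASR model, and score word error rate.
# `transcribe` is a placeholder, not the paper's evaluation pipeline.
import numpy as np

def time_compress(waveform: np.ndarray, factor: float) -> np.ndarray:
    """Uniformly compress a mono waveform by `factor` (>1 speeds it up)."""
    n_out = int(len(waveform) / factor)
    old_idx = np.linspace(0, len(waveform) - 1, num=n_out)
    return np.interp(old_idx, np.arange(len(waveform)), waveform)

def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance, normalized by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    d = np.zeros((len(ref) + 1, len(hyp) + 1), dtype=int)
    d[:, 0] = np.arange(len(ref) + 1)
    d[0, :] = np.arange(len(hyp) + 1)
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1, j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i, j] = min(sub, d[i - 1, j] + 1, d[i, j - 1] + 1)
    return d[len(ref), len(hyp)] / max(len(ref), 1)

def robustness_curve(waveform, reference, transcribe, factors=(1.0, 2.0, 3.0)):
    """WER as a function of compression factor for one recognizer."""
    return {f: word_error_rate(reference, transcribe(time_compress(waveform, f)))
            for f in factors}
```

Sweeping the manipulation strength yields a model robustness curve that can then be set against human intelligibility data for the same stimuli.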
Vestn Oftalmol
December 2024
N.A. Semashko National Research Institute of Public Health, Moscow, Russia.
This article presents a biography of Sergey Selivanovich Golovin, a prominent Russian ophthalmologist of the first quarter of the 20th century. The work is based on archival research and analysis of published materials, and it characterizes the career of S.S. Golovin.
Am J Otolaryngol
December 2024
Department of Otorhinolaryngology Head and Neck Surgery, Tianjin First Central Hospital, Tianjin 300192, China; Institute of Otolaryngology of Tianjin, Tianjin, China; Key Laboratory of Auditory Speech and Balance Medicine, Tianjin, China; Key Clinical Discipline of Tianjin (Otolaryngology), Tianjin, China; Otolaryngology Clinical Quality Control Centre, Tianjin, China.
Purpose: To design and implement a deep learning model that automatically classifies laryngoscope images and assists clinicians in diagnosing laryngeal diseases.
Materials and Methods: The experiment was based on 3057 images (normal, glottic cancer, granuloma, Reinke's edema, vocal cord cyst, leukoplakia, nodules, and polyps) from the Laryngoscope8 dataset. A classification model based on deep neural networks was developed and tested.
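For orientation only, a minimal sketch of what such an eight-class laryngoscope image classifier could look like, assuming a fine-tuned ImageNet-pretrained backbone in PyTorch; the architecture, preprocessing, and hyperparameters are illustrative assumptions, not the authors' published model.

```python
# Illustrative 8-class laryngoscope image classifier (normal, glottic cancer,
# granuloma, Reinke's edema, vocal cord cyst, leukoplakia, nodules, polyps).
from torch import nn
from torchvision import models, transforms

NUM_CLASSES = 8

def build_model() -> nn.Module:
    # Start from an ImageNet-pretrained backbone and replace the classifier head.
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)
    return model

# Typical preprocessing for an ImageNet-pretrained backbone.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def train_step(model, images, labels, optimizer, criterion=nn.CrossEntropyLoss()):
    """One supervised update on a batch of labeled laryngoscope images."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

A standard optimizer (e.g., `torch.optim.Adam(model.parameters(), lr=1e-4)`) and a labeled `DataLoader` over the image set would drive `train_step`; the learning rate is, again, only a placeholder.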
Vestn Otorinolaringol
December 2024
St. Petersburg Research Institute of Ear, Throat, Nose and Speech, St. Petersburg, Russia.
Central auditory disorders (CSD) are impairments in the processing of auditory stimuli, including speech, above the level of the cochlear nuclei of the brainstem, and are manifested mainly as difficulty with speech recognition, especially in noisy environments. Children with this pathology are more likely to have behavioral problems, impaired auditory, linguistic, and cognitive development, and, in particular, difficulties with learning at school.
Objective: To analyze the literature data on the epidemiology of central auditory disorders in school-age children.
Audiol Res
December 2024
Division of Audiology, Department of Otolaryngology-Head and Neck Surgery, Mayo Clinic, Rochester, MN 55902, USA.
Background/objectives: Adult hearing-impaired patients qualifying for cochlear implants typically exhibit less than 60% sentence recognition under best-aided hearing aid conditions, in either quiet or noisy environments, with speech and noise presented through a single speaker. This study examines the influence of deep neural network-based (DNN-based) noise reduction on cochlear implant evaluation.
Methods: Speech perception was assessed using AzBio sentences in both quiet and noisy conditions (multi-talker babble) at 5 and 10 dB signal-to-noise ratios (SNRs) through one loudspeaker.
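For reference, presenting sentences in multi-talker babble at a prescribed SNR reduces to scaling the noise so that the speech-to-noise power ratio matches the target. A minimal sketch follows; the signal names are placeholders, not the study's materials.

```python
# Mix speech with babble noise at a target signal-to-noise ratio in dB.
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Return speech + noise, with noise scaled to the requested SNR."""
    noise = noise[:len(speech)]                  # match lengths
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2) + 1e-12        # avoid division by zero
    # Scale so that 10*log10(p_speech / (scale**2 * p_noise)) == snr_db.
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + scale * noise
```

Assuming both signals share a sampling rate, `mix_at_snr(sentence, babble, 5.0)` and `mix_at_snr(sentence, babble, 10.0)` would produce the two noise conditions described above.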
Audiol Res
December 2024
Audiology, Primary Care Department, AUSL of Modena, 41100 Modena, Italy.
Hearing loss is a highly prevalent condition in the world population, with emotional, social, and economic costs. In recent years it has been clearly established that the lack of physiological binaural hearing alters sound localization and reduces speech recognition in noise and reverberation. This study aims to explore the psycho-social profile of adult workers affected by single-sided deafness (SSD), without other major medical conditions or otological symptoms, through comparison with normal-hearing subjects.