The effects of different acoustic representations and normalizations were compared for classifiers predicting perception of children's rhotic versus derhotic /ɹ/. Formant and Mel-frequency cepstral coefficient (MFCC) representations for 350 speakers were z-standardized, either relative to values in the same utterance or relative to age-and-sex data for typical /ɹ/. Statistical modeling indicated that age-and-sex normalization significantly increased classifier performance. Clinically interpretable formants performed similarly to MFCCs and were endorsed for deep neural network engineering, achieving a mean test-participant-specific F1-score of 0.81 after personalization and replication (σx = 0.10, med = 0.83, n = 48). Shapley additive explanations analysis indicated that the third formant most influenced fully rhotic predictions.
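The two normalization schemes named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. The function names, the `REFERENCE_NORMS` table and its values, and the use of the population standard deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev

# Hypothetical reference norms (Hz) for typical /ɹ/, keyed by (age, sex).
# Each entry maps a feature name to (mean, standard deviation).
# The numbers are placeholders for illustration, not values from the study.
REFERENCE_NORMS = {
    (8, "F"): {"F3": (2200.0, 250.0)},
    (8, "M"): {"F3": (2100.0, 240.0)},
}


def z_within_utterance(values):
    """Z-standardize values relative to the mean and SD of the same utterance."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD; an assumption for this sketch
    if sigma == 0:
        # A constant track carries no within-utterance variation.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def z_age_sex(values, age, sex, feature, norms=REFERENCE_NORMS):
    """Z-standardize values relative to age-and-sex reference data."""
    mu, sigma = norms[(age, sex)][feature]
    return [(v - mu) / sigma for v in values]


# Example: a short, made-up F3 track (Hz) from one utterance.
f3_track = [2300.0, 2450.0, 2600.0]
utterance_z = z_within_utterance(f3_track)
age_sex_z = z_age_sex(f3_track, age=8, sex="F", feature="F3")
```

The within-utterance scores always center on zero, so they discard how far a child's formants sit from typical values. The age-and-sex scores keep that offset, which may be one reason the abstract reports better classifier performance with that scheme.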

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11522988
DOI: http://dx.doi.org/10.1121/10.0024632

Similar Publications

Introduction: The study of attention has been pivotal in advancing our comprehension of cognition. The goal of this study is to investigate which EEG data representations or features are most closely linked to attention, and to what extent they can handle the cross-subject variability.

Methods: We explore the features obtained from the univariate time series from a single EEG channel, such as time domain features and recurrence plots, as well as representations obtained directly from the multivariate time series, such as global field power or functional brain networks.

Phantom perceptions like tinnitus occur without any identifiable environmental or bodily source. The mechanisms and key drivers behind tinnitus are poorly understood. The dominant framework, suggesting that tinnitus results from neural hyperactivity in the auditory pathway following hearing damage, has been difficult to investigate in humans and has reached explanatory limits.

Antarctic krill vertical migrations modulate seasonal carbon export.

Science

January 2025

Australian Antarctic Program Partnership, Institute for Marine and Antarctic Studies, University of Tasmania, nipaluna/Hobart, Tasmania, Australia.

Vertical migrations by marine organisms contribute to carbon export by consumption of surface phytoplankton followed by defecation in the deep ocean. However, biogeochemical models lack observational data, leading to oversimplified representation of carbon cycling by migrating organisms, such as Antarctic krill. Using a numerical model informed by 1 year of acoustic observations in the East Antarctic, we estimated the total particulate organic carbon (POC) flux from krill fecal pellets to be 9.

Perceptual learning of modulation filtered speech.

J Exp Psychol Hum Percept Perform

January 2025

School of Psychology, University of Sussex.

Human listeners have a remarkable capacity to adapt to severe distortions of the speech signal. Previous work indicates that perceptual learning of degraded speech reflects changes to sublexical representations, though the precise format of these representations has not yet been established. Inspired by the neurophysiology of auditory cortex, we hypothesized that perceptual learning involves changes to perceptual representations that are tuned to acoustic modulations of the speech signal.

Mode-informed complex-valued neural processes for matched field processing.

J Acoust Soc Am

January 2025

School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, People's Republic of China.

A complex-valued neural process method, combined with modal depth functions (MDFs) of the ocean waveguide, is proposed to reconstruct the acoustic field. Neural networks are used to describe complex Gaussian processes, modeling the distribution of the acoustic field at different depths. The network parameters are optimized through a meta-learning strategy, preventing overfitting under small sample conditions (sample size equals the number of array elements) and mitigating the slow reconstruction speed of Gaussian processes (GPs), while denoising and interpolating sparsely distributed acoustic field data, generating dense field data for virtual receiver arrays.
