The amount of information contained in speech signals is a fundamental concern of speech-based technologies and is particularly relevant in speech perception. Measuring the mutual information of actual speech signals is non-trivial, and quantitative measurements have not been extensively conducted to date. Recent advancements in machine learning have made it possible to measure mutual information directly from data. This study used neural estimators of mutual information to estimate the information content in speech signals. The high-dimensional speech signal was divided into segments and then compressed using a Mel-scale filter bank, which approximates the non-linear frequency perception of the human ear. The filter bank outputs were then truncated based on the dynamic range of the auditory system. This data compression preserved a significant amount of information from the original high-dimensional speech signal. The amount of information varied with the category of speech sound, with relatively higher mutual information in vowels than in consonants. Furthermore, the information available in the speech signals, as processed by the auditory model, decreased as the dynamic range was reduced.
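The compression steps described in the abstract (Mel-scale filtering of a segment, then truncation to a limited dynamic range) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the common `2595 * log10(1 + f / 700)` Mel formula, the triangular filter shape, and all numeric settings (20 filters, 16 kHz sampling, a 40 dB range) are assumptions chosen for the example.

```python
import math

def hz_to_mel(f_hz):
    # Common Mel-scale formula (an assumption; the paper may use another variant).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filter_bank(n_filters, n_bins, sample_rate):
    """Triangular filters over n_bins frequencies spaced linearly from 0 to Nyquist."""
    nyquist = sample_rate / 2.0
    mel_max = hz_to_mel(nyquist)
    # n_filters + 2 edge frequencies, equally spaced on the Mel scale.
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [nyquist * k / (n_bins - 1) for k in range(n_bins)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def compress_frame(power_spectrum, bank, dynamic_range_db):
    """Mel-band energies in dB, floored at dynamic_range_db below the frame maximum."""
    eps = 1e-12  # avoids log10(0) for empty bands
    energies_db = [
        10.0 * math.log10(sum(w * p for w, p in zip(row, power_spectrum)) + eps)
        for row in bank
    ]
    floor_db = max(energies_db) - dynamic_range_db
    return [max(e, floor_db) for e in energies_db]

# Hypothetical usage: one frame with a single strong spectral peak.
bank = mel_filter_bank(n_filters=20, n_bins=129, sample_rate=16000)
frame = [1e-8] * 129
frame[20] = 1.0
features = compress_frame(frame, bank, dynamic_range_db=40.0)
```

Lowering `dynamic_range_db` raises the floor and flattens more of the weaker bands, which is the kind of truncation the abstract associates with reduced available information. The resulting low-dimensional feature vectors are what a mutual information estimator would then be trained on.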
DOI: http://dx.doi.org/10.1121/10.0034854
J Acoust Soc Am
January 2025
Department of Electronics Engineering, Pusan National University, Busan, South Korea.
Front Mol Neurosci
January 2025
Department of Otorhinolaryngology, The Affiliated Changsha Central Hospital, Hengyang Medical School, University of South China, Changsha, China.
Auditory neuropathy spectrum disorder (ANSD) is an auditory dysfunction disorder characterized by impaired speech comprehension. Its etiology is complex and can be broadly categorized into genetic and non-genetic factors. mutation is identified as a causative factor in ANSD.
Front Hum Neurosci
January 2025
Center for Ear-EEG, Department of Electrical and Computer Engineering, Aarhus University, Aarhus, Denmark.
The recent progress in auditory attention decoding (AAD) methods is based on algorithms that find a relation between the audio envelope and the neurophysiological response. The most popular approach is based on the reconstruction of the audio envelope from electroencephalogram (EEG) signals. These methods are primarily based on the exogenous response driven by the physical characteristics of the stimuli.
Int J Audiol
January 2025
Department of Otorhinolaryngology and Head & Neck Surgery, Leiden University Medical Center, Leiden, Netherlands.
Objective: Measuring listening effort using pupillometry is challenging in cochlear implant (CI) users. We assess three validated speech tests (Matrix, LIST, and DIN) to identify the optimal speech material for measuring peak-pupil-dilation (PPD) in CI users as a function of signal-to-noise ratio (SNR).
Design: Speech tests were administered in quiet and two noisy conditions, namely at the speech recognition threshold (0 dB re SRT), i.
Mamm Genome
January 2025
The Gene Therapy Research Institute, CTNNB1 Foundation, Ljubljana, 1000, Slovenia.
CTNNB1 syndrome is a rare neurodevelopmental disorder, affecting children worldwide with a prevalence of 2.6-3.2 per 100,000 births and often misdiagnosed as cerebral palsy.