Voice or speaker recognition is critical in a wide variety of social contexts. In this study, we investigated the contributions of acoustic, phonological, lexical, and semantic information toward voice recognition. Native English speaking participants were trained to recognize five speakers in five conditions: non-speech, Mandarin, German, pseudo-English, and English. We showed that voice recognition significantly improved as more information became available, from purely acoustic features in non-speech to additional phonological information varying in familiarity. Moreover, we found that the recognition performance is transferable between training and testing in phonologically familiar conditions (German, pseudo-English, and English), but not in unfamiliar (Mandarin) or non-speech conditions. These results provide evidence suggesting that bottom-up acoustic analysis and top-down influence from phonological processing collaboratively govern voice recognition.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4473599 | PMC |
http://dx.doi.org/10.1038/srep11475 | DOI Listing |
Sensors (Basel)
January 2025
SHCCIG Yubei Coal Industry Co., Ltd., Xi'an 710900, China.
The coal mining industry in Northern Shaanxi is robust, with a prevalent use of the local dialect, known as "Shapu", characterized by a distinct Northern Shaanxi accent. This study addresses the practical need for speech recognition in this dialect. We propose an end-to-end speech recognition model for the North Shaanxi dialect, leveraging the Conformer architecture.
View Article and Find Full Text PDFJ Acoust Soc Am
January 2025
USC Viterbi School of Engineering, University of Southern California, Los Angeles, California 90089-1455, USA.
Voice quality serves as a rich source of information about speakers, providing listeners with impressions of identity, emotional state, age, sex, reproductive fitness, and other biologically and socially salient characteristics. Understanding how this information is transmitted, accessed, and exploited requires knowledge of the psychoacoustic dimensions along which voices vary, an area that remains largely unexplored. Recent studies of English speakers have shown that two factors related to speaker size and arousal consistently emerge as the most important determinants of quality, regardless of who is speaking.
View Article and Find Full Text PDFEval Program Plann
January 2025
Past Chair of The Coalition for Juvenile Justice, USA. Electronic address:
Growing recognition of the serious consequences of racism in the United States encourage organizations and agencies to move toward an anti-racist approach. Implicit racial bias trainings are one piece of the approach toward this end. While many such trainings are being conducted nationwide, it remains unclear what practices best encourage participation, engagement, and desired outcomes following implicit racial bias trainings.
View Article and Find Full Text PDFPeerJ
January 2025
Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong, Hong Kong SAR, China.
Recognition plays a key role in the social lives of gregarious species, enabling animals to distinguish among social partners and tailor their behaviour accordingly. As domesticated animals regularly interact with humans, as well as members of their own species, we might expect mechanisms used to discriminate between conspecifics to also apply to humans. Given that goats can combine visual and vocal cues to recognise one another, we investigated whether this cross-modal recognition extends to discriminating among familiar humans.
View Article and Find Full Text PDFLight Sci Appl
January 2025
Spin-Optics laboratory, St. Petersburg State University, St. Petersburg, 198504, Russia.
We introduce a novel neuromorphic network architecture based on a lattice of exciton-polariton condensates, intricately interconnected and energized through nonresonant optical pumping. The network employs a binary framework, where each neuron, facilitated by the spatial coherence of pairwise coupled condensates, performs binary operations. This coherence, emerging from the ballistic propagation of polaritons, ensures efficient, network-wide communication.
View Article and Find Full Text PDFEnter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!