Before they even talk, infants become sensitive to the speech sounds of their native language and recognize the auditory form of an increasing number of words. Traditionally, these early perceptual changes are attributed to an emerging knowledge of linguistic categories such as phonemes or words. However, there is growing skepticism surrounding this interpretation due to limited evidence of category knowledge in infants.
Infants learn their native language(s) at an amazing speed. Before they even talk, their perception adapts to the language(s) they hear. However, the mechanisms responsible for this perceptual attunement and the circumstances in which it takes place remain unclear.
Cold Spring Harb Mol Case Stud, December 2023
Diffuse large B-cell lymphoma (DLBCL) is a heterogeneous group of lymphoid malignancies. Based on gene expression profiling, it has been subdivided into germinal center (GC)-derived and activated B-cell (ABC) types. Advances in molecular methodologies have further refined the subclassification of DLBCL, based on recurrent genetic abnormalities.
There is a current 'theory crisis' in language acquisition research, resulting from fragmentation both across theoretical approaches and across the linguistic levels studied. We identify a need for integrative approaches that go beyond these limitations, and propose to analyse the strengths and weaknesses of current theoretical approaches to language acquisition. In particular, we advocate that language learning simulations, if they integrate realistic input and multiple levels of language, have the potential to contribute significantly to our understanding of language acquisition.
We introduce Shennong, a Python toolbox and command-line utility for audio speech feature extraction. It implements a wide range of well-established, state-of-the-art algorithms: spectro-temporal filters such as Mel-Frequency Cepstral Filterbanks or Predictive Linear Filters, pre-trained neural networks, pitch estimators, speaker normalization methods, and post-processing algorithms. Shennong is an open-source, reliable, and extensible framework built on top of the popular Kaldi speech processing library.
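As a rough illustration of the kind of spectro-temporal features such a toolbox extracts, here is a minimal MFCC computation; it uses librosa rather than Shennong's own API, and the audio file name is a placeholder:

```python
# Illustrative MFCC extraction with librosa (not Shennong's API);
# "utterance.wav" is a placeholder path.
import librosa

waveform, sample_rate = librosa.load("utterance.wav", sr=None)  # keep native sample rate
mfcc = librosa.feature.mfcc(y=waveform, sr=sample_rate, n_mfcc=13)
print(mfcc.shape)  # (13 coefficients, n_frames)
```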
JASA Express Lett, November 2021
This study aims to quantify the effect of several information sources (acoustic cues, higher-level linguistic information, and knowledge of the language's prosodic system) on the perception of prosodic boundaries. An experiment investigating the identification of prosodic boundaries in Japanese was conducted with native and non-native participants. It revealed that non-native speakers, as well as native speakers with access only to acoustic information, can recognize boundaries better than chance.
Patients with Huntington's disease suffer from disturbances in the perception of emotions; they do not correctly read the bodily, vocal, and facial expressions of others. With regard to the expression of emotions, it has been shown that they are impaired in expressing emotions through the face, but until now little research has been conducted on their ability to express emotions through spoken language. To better understand emotion production in both voice and language in Huntington's disease (HD), we tested 115 individuals in a single-centre prospective observational follow-up study: 68 patients (HD), 22 participants carrying the mutant HD gene without any motor symptoms (pre-manifest HD), and 25 controls.
Objectives: Using brief samples of speech recordings, we aimed to predict, through machine learning, clinical performance in Huntington's disease (HD), an inherited neurodegenerative disease (NDD).
Methods: We collected and analyzed 126 samples of audio recordings of both forward and backward counting from 103 Huntington's disease gene carriers [87 manifest and 16 premanifest; mean age 50.6 (SD 11.
Early changes in infants' ability to perceive native and nonnative speech sound contrasts are typically attributed to their developing knowledge of phonetic categories. We critically examine this hypothesis and argue that there is little direct evidence of category knowledge in infancy. We then propose an alternative account in which infants' perception changes because they are learning a perceptual space that is appropriate to represent speech, without yet carving up that space into phonetic categories.
Infants come to learn several hundred word forms by two years of age, and it is possible that this involves carving these forms out of continuous speech. It has been proposed that the task is facilitated by the presence of prosodic boundaries. We revisit this claim by running computational models of word segmentation, with and without prosodic information, on a corpus of infant-directed speech.
Deep learning models have become potential candidates for auditory neuroscience research, thanks to their recent successes on a variety of auditory tasks, yet these models often lack the interpretability needed to fully understand the exact computations they perform. Here, we propose a parametrized neural network layer that computes specific spectro-temporal modulations based on Gabor filters (learnable spectro-temporal filters, STRFs) and is fully interpretable. We evaluated this layer on speech activity detection, speaker verification, urban sound classification, and zebra finch call type classification.
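A minimal sketch of what such a parametrized layer might look like, assuming a PyTorch implementation operating on log-mel spectrograms; the kernel size and initial modulation values are arbitrary choices, not taken from the paper:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaborSTRF(nn.Module):
    """Single learnable spectro-temporal Gabor filter (hypothetical sketch).

    The kernel is a 2D Gabor over (frequency, time) parametrized by a spectral
    modulation scale, a temporal modulation rate, and a phase, so the learned
    parameters remain directly interpretable.
    """

    def __init__(self, kernel_size=(9, 25)):  # (freq bins, time frames), arbitrary
        super().__init__()
        self.kernel_size = kernel_size
        # Interpretable parameters (cycles per bin / per frame), arbitrary init.
        self.spectral_mod = nn.Parameter(torch.tensor(0.10))
        self.temporal_mod = nn.Parameter(torch.tensor(0.05))
        self.phase = nn.Parameter(torch.tensor(0.0))

    def kernel(self):
        kf, kt = self.kernel_size
        f = torch.arange(kf, dtype=torch.float32) - kf // 2
        t = torch.arange(kt, dtype=torch.float32) - kt // 2
        ff, tt = torch.meshgrid(f, t, indexing="ij")
        # Gaussian envelope times a sinusoidal carrier = Gabor kernel.
        envelope = torch.exp(-(ff ** 2) / (2 * (kf / 4) ** 2) - (tt ** 2) / (2 * (kt / 4) ** 2))
        carrier = torch.cos(2 * math.pi * (self.spectral_mod * ff + self.temporal_mod * tt) + self.phase)
        return (envelope * carrier).unsqueeze(0).unsqueeze(0)  # (out=1, in=1, kf, kt)

    def forward(self, spectrogram):
        # spectrogram: (batch, 1, n_freq_bins, n_frames)
        return F.conv2d(spectrogram, self.kernel(), padding="same")

# Example: filter a random log-mel spectrogram with 64 bins and 200 frames.
layer = GaborSTRF()
out = layer(torch.randn(2, 1, 64, 200))  # same shape as the input
```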
Theories and data on language acquisition suggest that learners draw on a range of cues, from information about structure found in the linguistic signal itself to information gleaned from the environmental context or from social interaction. We propose a blueprint for computational models of the early language learner (SCALa, for Socio-Computational Architecture of Language Acquisition) that makes explicit the connection between the kinds of information available to the social learner and the computational mechanisms required to extract language-relevant information and learn from it. SCALa integrates a range of views on language acquisition, further allowing us to make precise recommendations for future large-scale empirical research.
IEEE Trans Pattern Anal Mach Intell, September 2022
In order to reach human performance on complex visual tasks, artificial systems need to incorporate a significant amount of understanding of the world in terms of macroscopic objects, movements, forces, etc. Inspired by work on intuitive physics in infants, we propose an evaluation benchmark that diagnoses how much a given system understands about physics by testing whether it can tell apart well-matched videos of possible versus impossible events constructed with a game engine. The test requires systems to compute a physical plausibility score over an entire video.
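As a rough sketch of how such pairwise scoring could be evaluated (the function name and chance-level example below are illustrative assumptions, not the benchmark's exact protocol), one can compute the fraction of matched pairs in which the impossible video receives at least as high a plausibility score as its possible counterpart:

```python
import numpy as np

def relative_error_rate(scores_possible, scores_impossible):
    """Fraction of matched pairs where the impossible video is judged at least
    as plausible as its matched possible video (hypothetical metric in the
    spirit of the benchmark described above; lower is better)."""
    p = np.asarray(scores_possible, dtype=float)
    i = np.asarray(scores_impossible, dtype=float)
    return float(np.mean(i >= p))

# Example with a toy model that assigns random plausibility scores.
rng = np.random.default_rng(0)
possible = rng.random(100)
impossible = rng.random(100)
print(relative_error_rate(possible, impossible))  # around 0.5 for a chance-level model
```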
A prominent hypothesis holds that by speaking to infants in infant-directed speech (IDS), as opposed to adult-directed speech (ADS), parents help them learn phonetic categories. Specifically, two characteristics of IDS have been claimed to facilitate learning: hyperarticulation, which makes the categories more separable, and variability, which makes generalization more robust. Here, we test the separability and robustness of vowel category learning on acoustic representations of speech uttered by Japanese adults in ADS, IDS (addressed to 18- to 24-month-olds), or read speech (RS).
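As a toy illustration of how category separability might be quantified on acoustic representations (not the paper's actual analysis), one can measure the cross-validated accuracy of a simple linear classifier on labeled vowel tokens; the two-dimensional features below are random stand-ins for, e.g., formant measurements:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy data standing in for acoustic vowel representations:
# one row per vowel token, with the vowel identity as the label.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=m, scale=1.0, size=(50, 2)) for m in ([0, 0], [3, 0], [0, 3])])
y = np.repeat(["a", "i", "u"], 50)

# Separability proxy: cross-validated accuracy of a linear classifier.
separability = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
print(f"vowel separability (CV accuracy): {separability:.2f}")
```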
Proc Natl Acad Sci U S A, March 2021
Words categorize the semantic fields they refer to in ways that maximize communication accuracy while minimizing complexity. Focusing on the well-studied color domain, we show that artificial neural networks trained with deep-learning techniques to play a discrimination game develop communication systems whose distribution on the accuracy/complexity plane closely matches that of human languages. The observed variation among emergent color-naming systems is explained by different degrees of discriminative need, of the sort that might also characterize different human communities.
Before they even speak, infants become attuned to the sounds of the language(s) they hear, processing native phonetic contrasts more easily than nonnative ones. For example, between 6 to 8 mo and 10 to 12 mo, infants learning American English get better at distinguishing English [ɹ] and [l], as in "rock" vs. "lock," relative to infants learning Japanese.
The genetic and molecular abnormalities underlying histological transformation (HT) of nodal marginal zone lymphoma (NMZL) to diffuse large B-cell lymphoma (DLBCL) are not well known. While the 20q12 region is commonly deleted in myelodysplastic syndrome, del(20q12) has not previously been associated with DLBCL. We recently described a case of DLBCL harboring del(20q12) in a patient with a history of MZL involving lymph nodes and skin.
Previous computational modeling suggests it is much easier to segment words from child-directed speech (CDS) than adult-directed speech (ADS). However, this conclusion is based on data collected in the laboratory, with CDS from play sessions and ADS between a parent and an experimenter, which may not be representative of ecologically collected CDS and ADS. Fully naturalistic ADS and CDS collected with a nonintrusive recording device as the child went about her day were analyzed with a diverse set of algorithms.
A basic task in first language acquisition likely involves discovering the boundaries between words or morphemes in input where these basic units are not overtly segmented. A number of unsupervised learning algorithms have been proposed over the last 20 years for this purpose; some have been implemented computationally, but their results remain difficult to compare across papers. We created a tool that is open source, enables reproducible results, and encourages cumulative science in this domain.
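For illustration only (this is a generic sketch, not the tool described above), here is a minimal transitional-probability word segmenter of the kind such comparisons typically include; the syllabified toy corpus and the threshold value are made up:

```python
from collections import Counter

def segment_by_transitional_probability(utterances, threshold=0.75):
    """Toy unsupervised segmentation: insert a word boundary wherever the
    forward transitional probability P(next | current) drops below `threshold`.
    `utterances` is a list of non-empty lists of syllables, with no boundaries.
    """
    # Estimate forward transitional probabilities from the corpus.
    unigrams, bigrams = Counter(), Counter()
    for utt in utterances:
        unigrams.update(utt)
        bigrams.update(zip(utt, utt[1:]))

    segmented = []
    for utt in utterances:
        words, current = [], [utt[0]]
        for a, b in zip(utt, utt[1:]):
            if bigrams[(a, b)] / unigrams[a] < threshold:
                words.append(current)   # low TP: close the current word
                current = []
            current.append(b)
        words.append(current)
        segmented.append(words)
    return segmented

# Example: "ba-by" and "dog-gy"; low-TP transitions fall between words.
corpus = [["ba", "by", "dog", "gy"], ["dog", "gy", "ba", "by"], ["ba", "by", "ba", "by"]]
print(segment_by_transitional_probability(corpus))
```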
Aphasia is a devastating brain disorder that is detrimental to medical care and social interaction. The early diagnosis of language disorders and accurate identification of patient-specific deficits are crucial for patients' care, as aphasia rehabilitation is more effective when focused on patient-specific language deficits. We developed the Core Assessment of Language Processing (CALAP), a new scale combining screening and detailed evaluation to rapidly diagnose and identify patient-specific language deficits.
Theories of cross-linguistic phonetic category perception posit that listeners perceive foreign sounds by mapping them onto their native phonetic categories, but, until now, no effective way to implement this mapping has been proposed. In this paper, Automatic Speech Recognition systems trained on continuous speech corpora are used to provide a fully specified mapping between foreign sounds and native categories. The authors show how the machine ABX evaluation method can be used to compare predictions from the resulting quantitative models with empirically attested effects in human cross-linguistic phonetic category perception.
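The machine ABX score itself can be sketched compactly: given tokens of two categories and a distance over their representations, count how often a token X of one category lies closer to another token of its own category (A) than to a token of the other category (B). The helper below is a generic sketch of that computation, not the authors' exact pipeline; the toy "phone embeddings" are random stand-ins:

```python
import numpy as np

def abx_score(a_items, b_items, distance):
    """Generic machine ABX discrimination score: for each triplet (A, B, X)
    with A and X from category a and B from category b, count how often X is
    closer to A than to B. 1.0 means perfect discrimination, 0.5 is chance."""
    correct, total = 0, 0
    for i, a in enumerate(a_items):
        for x in a_items[:i] + a_items[i + 1:]:   # X is same-category as A, X != A
            for b in b_items:
                correct += distance(a, x) < distance(b, x)
                total += 1
    return correct / total

# Example with 2D vectors standing in for phone embeddings.
rng = np.random.default_rng(0)
cat_r = [rng.normal([0, 0], 1.0) for _ in range(10)]   # e.g., tokens of one category
cat_l = [rng.normal([3, 0], 1.0) for _ in range(10)]   # e.g., tokens of another category
euclid = lambda u, v: float(np.linalg.norm(u - v))
print(abx_score(cat_r, cat_l, euclid))
```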
We investigate whether infant-directed speech (IDS) could facilitate word form learning when compared to adult-directed speech (ADS). To study this, we examine the distribution of word forms at two levels, acoustic and phonological, using a large database of spontaneous speech in Japanese. At the acoustic level we show that, as has been documented before for phonemes, the realizations of words are more variable and less discriminable in IDS than in ADS.
Spectacular progress in the information processing sciences (machine learning, wearable sensors) promises to revolutionize the study of cognitive development. Here, we analyse the conditions under which 'reverse engineering' language development, i.e.
This article provides an estimate of how frequently, and from whom, children aged 0-11 years (Ns between 9 and 24) receive one-on-one verbal input among Tsimane forager-horticulturalists of lowland Bolivia. Analyses of systematic daytime behavioral observations reveal that less than 1 min per daylight hour is spent talking to children younger than 4 years of age, four times less than the estimates for others present at the same time and place. Adults provide the majority of the input at 0-3 years of age but not afterward.
This study aims to quantify the relative contributions of phonetic categories and acoustic detail to phonotactically induced perceptual vowel epenthesis in Japanese listeners. A vowel identification task tested whether a vowel was perceived within illegal consonant clusters and, if so, which vowel was heard. Cross-spliced stimuli were used in which the vowel coarticulation present in the cluster did not match the quality of the flanking vowel.