Publications by authors named "Baayen R"

Using distributional semantics, we show that English nominal pluralization exhibits semantic clusters. For instance, the change in semantic space from singulars to plurals differs depending on whether a word denotes, e.g.

Word frequency is a strong predictor in most lexical processing tasks. Thus, any model of word recognition needs to account for how word frequency effects arise. The Discriminative Lexicon Model (DLM) models lexical processing with mappings between words' forms and their meanings.

Grounding language in vision is an active field of research seeking to construct cognitively plausible word and sentence representations by incorporating perceptual knowledge from vision into text-based representations. Despite many attempts at language grounding, achieving an optimal equilibrium between textual representations of the language and our embodied experiences remains an open field. Some common concerns are the following.

Trial-to-trial effects have been found in a number of studies, indicating that processing a stimulus influences responses in subsequent trials. A special case is priming effects, which have been modelled successfully with error-driven learning (Marsolek, 2008), implying that participants are continuously learning during experiments. This study investigates whether trial-to-trial learning can be detected in an unprimed lexical decision experiment.

This study addresses a series of methodological questions that arise when modeling inflectional morphology with Linear Discriminative Learning. Taking the semi-productive German noun system as example, we illustrate how decisions made about the representation of form and meaning influence model performance. We clarify that for modeling frequency effects in learning, it is essential to make use of incremental learning rather than the end-state of learning.

Pseudowords have long served as key tools in psycholinguistic investigations of the lexicon. A common assumption underlying the use of pseudowords is that they are devoid of meaning: Comparing words and pseudowords may then shed light on how meaningful linguistic elements are processed differently from meaningless sound strings. However, pseudowords may in fact carry meaning.

Using computational simulations, this work demonstrates that it is possible to learn a systematic relation between words' sound and their meanings. The sound-meaning relation was learned from a corpus of phonologically transcribed child-directed speech by using the linear discriminative learning (LDL) framework (Baayen, Chuang, Shafaei-Bajestan, & Blevins, 2019), which implements linear mappings between words' form vectors and semantic vectors. Presented with the form vectors of 16 nonwords, taken from a study on word learning (Fitneva, Christiansen, & Monaghan, 2009), the network generated the estimated semantic vectors of the nonwords.
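
The linear mapping between form vectors and semantic vectors described above can be illustrated with a minimal sketch. It is not the authors' implementation: the matrices `C` and `S`, the helper `fit_linear_mapping`, and the toy values are all assumptions made for illustration, and the mapping is estimated here with an ordinary least-squares solution via the pseudoinverse.

```python
import numpy as np

def fit_linear_mapping(C, S):
    """Return the least-squares solution F of C @ F ≈ S (Moore-Penrose pseudoinverse)."""
    return np.linalg.pinv(C) @ S

# Toy data, invented for illustration: 3 words, 4 binary form cues, 2 semantic dimensions.
C = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1]], dtype=float)
S = np.array([[0.9, 0.1],
              [0.2, 0.8],
              [0.5, 0.5]])

F = fit_linear_mapping(C, S)

# Estimated semantic vector for an unseen form (a hypothetical nonword).
nonword = np.array([1, 0, 0, 1], dtype=float)
s_hat = nonword @ F
print(s_hat)
```

Because the rows of the toy `C` are linearly independent, the fitted mapping reproduces the training semantic vectors exactly; with real data the fit would generally be approximate.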

This article provides a tutorial for analyzing pupillometric data. Pupil dilation has become increasingly popular in psychological and psycholinguistic research as a measure to trace language processing. However, there is no general consensus about procedures to analyze the data, with most studies analyzing extracted features from the pupil dilation data instead of analyzing the pupil dilation trajectories directly.

We present the Chinese Lexical Database (CLD): a large-scale lexical database for simplified Chinese. The CLD provides a wealth of lexical information for 3913 one-character words, 34,233 two-character words, 7143 three-character words, and 3355 four-character words, and is publicly available through http://www.chineselexicaldatabase.

Estonian is a morphologically rich Finno-Ugric language with nominal paradigms that have at least 28 different inflected forms but sometimes more than 40. For languages with rich inflection, it has been argued that whole-word frequency, as a diagnostic of whole-word representations, should not be predictive for lexical processing. We report a lexical decision experiment, showing that response latencies decrease both with frequency of the inflected form and its inflectional paradigm size.

Sound units play a pivotal role in cognitive models of auditory comprehension. The general consensus is that during perception listeners break down speech into auditory words and subsequently phones. Indeed, cognitive speech recognition is typically taken to be computationally intractable without phones.

The goal of the present study is to understand the role orthographic and semantic information play in the behavior of skilled readers. Reading latencies from a self-paced sentence reading experiment in which Russian near-synonymous verbs were manipulated appear well-predicted by a combination of bottom-up sublexical letter triplets (trigraphs) and top-down semantic generalizations, modeled using the Naive Discrimination Learner. The results reveal a complex interplay of bottom-up and top-down support from orthography and semantics to the target verbs, whereby activations from orthography only are modulated by individual differences.

Corpus surveys have shown that the exact forms with which idioms are realized are subject to variation. We report a rating experiment showing that such alternative realizations have varying degrees of acceptability. Idiom variation challenges processing theories associating idioms with fixed multi-word form units (Bobrow & Bell, 1973), fixed configurations of words (Cacciari & Tabossi, 1988), or fixed superlemmas (Sprenger, Levelt, & Kempen, 2006), as they do not explain how it can be that speakers produce variant forms that listeners can still make sense of.

In this study we present a novel set of discrimination-based indicators of language processing derived from Naive Discriminative Learning (ndl) theory. We compare the effectiveness of these new measures with classical lexical-distributional measures (in particular, frequency counts and form similarity measures) to predict lexical decision latencies when a complete morphological segmentation of masked primes is or is not possible. Data derive from a re-analysis of a large subset of decision latencies from the English Lexicon Project, as well as from the results of two new masked priming studies.

The processing of English noun-noun compounds (NNCs) was investigated to identify the extent and nature of differences between the performance of native speakers of English and advanced Spanish and German non-native speakers of English. The study sought to establish whether the word order of the equivalent structure in the non-native speakers' mother tongue (L1) had an influence on their processing of NNCs in their second language (L2), and whether this influence was due to differences in grammatical representation (i.e.

We considered the role of orthography and task-related processing mechanisms in the activation of morphologically related complex words during bilingual word processing. So far, it has only been shown that such morphologically related words (i.e.

Arnon and Snider (2010; More than words: Frequency effects for multi-word phrases. Journal of Memory and Language, 62, 67-82) documented frequency effects for compositional four-grams independently of the frequencies of lower-order n-grams.

Article Synopsis
  • The study introduces a new method, using naive discriminative learning (NDL), to measure pronunciation distances, which can be important for various areas of linguistics.
  • NDL offers a more advanced and flexible approach compared to the traditional Levenshtein algorithm, as it is informed by cognitive theories and can produce asymmetrical distances.
  • Validation efforts showed that NDL distances correlated well with native speakers' perceptions of accented speech, achieving correlation rates between 0.7 and 0.8, and can also integrate additional acoustic features beyond just sound segments.
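
For orientation, the traditional Levenshtein baseline mentioned in the synopsis can be sketched as follows. This is the textbook edit-distance algorithm with unit costs over plain characters, not the NDL method and not the study's own code; dialectometric work typically operates on phonetic segments with refined costs, which this sketch does not attempt. The function name is an assumption.

```python
def levenshtein(a, b):
    """Minimum number of single-symbol insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

# With unit costs the distance is the same in both directions.
print(levenshtein("kitten", "sitting"), levenshtein("sitting", "kitten"))
```

With unit costs the measure is symmetric by construction, which is the property the synopsis contrasts with the asymmetrical distances NDL can produce.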

Dutch children, from the second and fourth grade of primary school, were each given a visual lexical decision test on 210 Dutch monomorphemic words. After removing words not recognized by a majority of the younger group, (lexical) decisions were analysed by mixed-model regression methods to see whether morphological Family Size influenced decision times over and above several other covariates. The effect of morphological Family Size on decision time was mixed: larger families led to significantly faster decision times for the second graders but not for the fourth graders.

Two auditory lexical decision experiments document for morphologically complex words two points at which the probability of a target word given the evidence shifts dramatically. The first point is reached when morphologically unrelated competitors are no longer compatible with the evidence. Adapting terminology from Marslen-Wilson (1984), we refer to this as the word's initial uniqueness point (UP1).

Visual emotionally charged stimuli have been shown to elicit early electrophysiological responses (e.g., Ihssen, Heim, & Keil, 2007; Schupp, Junghöfer, Weike, & Hamm, 2003; Stolarova, Keil, & Moratti, 2006).

Inserting a hyphen in Dutch and Finnish compounds is most often illegal given spelling conventions. However, the current two eye movement experiments on triconstituent Dutch compounds like voetbalbond "footballassociation" (Experiment 1) and triconstituent Finnish compounds like lentokenttätaksi "airporttaxi" (Experiment 2) show that inserting a hyphen at constituent boundaries does not have to be detrimental to compound processing. In fact, when hyphens were inserted at the major constituent boundary (voetbal-bond "football-association"; lentokenttä-taksi "airport-taxi"), processing of the first part (voetbal "football"; lentokenttä "airport") turns out to be faster when it is followed by a hyphen than when it is legally concatenated.

In this study we examine linguistic variation and its dependence on both social and geographic factors. We follow dialectometry in applying a quantitative methodology and focusing on dialect distances, and social dialectology in the choice of factors we examine in building a model to predict word pronunciation distances from the standard Dutch language to 424 Dutch dialects. We combine linear mixed-effects regression modeling with generalized additive modeling to predict the pronunciation distance of 559 words.

A 2-layer symbolic network model based on the equilibrium equations of the Rescorla-Wagner model (Danks, 2003) is proposed. The study first presents 2 experiments in Serbian, which reveal for sentential reading the inflectional paradigmatic effects previously observed by Milin, Filipović Đurđević, and Moscoso del Prado Martín (2009) for unprimed lexical decision. The empirical results are successfully modeled without having to assume separate representations for inflections or data structures such as inflectional paradigms.
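
As background for the Rescorla-Wagner model referred to above, the following is a minimal sketch of its incremental, trial-by-trial update rule in a two-layer cue-to-outcome network. It does not implement the equilibrium equations of Danks (2003) that the study builds on; the function name, the single combined learning rate `rate`, the maximum association strength `lam`, and the toy learning events are assumptions for illustration.

```python
import numpy as np

def rescorla_wagner(events, n_cues, n_outcomes, rate=0.01, lam=1.0):
    """Learn cue-to-outcome weights from (cue indices, outcome indices) events.

    Assumes the cue indices within one event are unique.
    """
    W = np.zeros((n_cues, n_outcomes))
    for cues, outcomes in events:
        cues = list(cues)
        target = np.zeros(n_outcomes)
        target[list(outcomes)] = lam       # present outcomes are driven towards lam
        activation = W[cues].sum(axis=0)   # summed support from the active cues
        W[cues] += rate * (target - activation)  # error-driven update, active cues only
    return W

# Toy events, invented for illustration: cue 1 occurs with both outcomes.
events = [([0, 1], [0]), ([1, 2], [1])] * 2000
W = rescorla_wagner(events, n_cues=3, n_outcomes=2, rate=0.1)
print(W.round(3))
```

In this toy run the shared cue ends up with a weaker association to each outcome than the cue that occurs with only one of them, which is the discriminative behaviour the rule is known for.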

We constructed a two-locus database, comprising partial translation elongation factor (EF-1alpha) gene sequences and nearly full-length sequences of the nuclear ribosomal intergenic spacer region (IGS rDNA) for 850 isolates spanning the phylogenetic breadth of the Fusarium oxysporum species complex (FOSC). Of the 850 isolates typed, 101 EF-1alpha, 203 IGS rDNA, and 256 two-locus sequence types (STs) were differentiated. Analysis of the combined dataset suggests that two-thirds of the STs might be associated with a single host plant.
