Multi-LEX: A database of multi-word frequencies for French and English.

Behav Res Methods

Laboratoire de Psychologie Cognitive (UMR7290), CNRS & Aix-Marseille Université Case D, 3, place Victor HUGO, 13331, Marseille Cedex 3, France.

Published: December 2023

Written word frequency is a key variable in many psycholinguistic studies and is central to explaining visual word recognition. Methodological advances in single-word frequency estimation have helped to uncover novel language-related cognitive processes, fostering new ideas and studies. To support and promote research on a related emerging topic, visual multi-word recognition, we extracted a selection of millions of multi-word sequences from the exhaustive Google Ngram datasets and computed their associated frequency estimates. Each sequence is presented with part-of-speech information for every individual word. We carried out an online behavioral investigation that used the French 4-gram lexicon in a grammatical decision task. The results show an item-level frequency effect for word sequences. Moreover, the proposed datasets proved useful during the stimulus selection phase, allowing more precise control of multi-word characteristics.
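As an illustration of what computing a frequency estimate for a multi-word sequence can involve, the following is a minimal sketch, not the authors' pipeline. It assumes tab-separated records of the form `ngram, year, match_count, volume_count`, and part-of-speech tags appended to words as `word_TAG`. The function names, the sample records, and the corpus size are all hypothetical.

```python
from collections import defaultdict


def aggregate_counts(lines):
    """Sum match counts over all years for each n-gram.

    Assumed record layout (tab-separated): ngram, year, match_count, volume_count.
    """
    totals = defaultdict(int)
    for line in lines:
        ngram, _year, match_count, _volumes = line.rstrip("\n").split("\t")
        totals[ngram] += int(match_count)
    return dict(totals)


def per_million(totals, corpus_size):
    """Convert raw counts to occurrences per million tokens."""
    return {ngram: count * 1_000_000 / corpus_size for ngram, count in totals.items()}


def split_pos(ngram):
    """Split 'word_TAG' tokens into (word, tag) pairs; tag is None if absent."""
    pairs = []
    for token in ngram.split(" "):
        word, sep, tag = token.rpartition("_")
        if sep and word and tag.isupper():
            pairs.append((word, tag))
        else:
            pairs.append((token, None))
    return pairs


# Hypothetical sample records, invented for illustration only.
sample = [
    "il y a un\t2000\t120\t80",
    "il y a un\t2001\t80\t60",
    "de temps en temps\t2000\t50\t40",
]
totals = aggregate_counts(sample)
# corpus_size is a made-up figure, not a property of any real dataset.
frequencies = per_million(totals, corpus_size=2_000_000)
```

Summing over years before normalising gives one estimate per sequence. A real pipeline would also need to choose which years and which sequences to retain, and the abstract does not specify those choices.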

Source
http://dx.doi.org/10.3758/s13428-022-02018-9
