We use the Google Ngram database, a corpus of 5,195,769 digitized books containing ~4% of all books ever published, to test three ideas that are hypothesized to account for linguistic generalizations: verbal semantics, pre-emption and skew. Using 828,813 tokens of un-forms as a test case for these mechanisms, we found verbal semantics was a good predictor of the frequency of un-forms in the English language over the past 200 years-both in terms of how the frequency changed over time and their frequency rank. We did not find strong evidence for the direct competition of un-forms and their top pre-emptors, however the skew of the un-construction competitors was inversely correlated with the acceptability of the un-form. We suggest a cognitive explanation for this, namely, that the more the set of relevant pre-emptors is skewed then the more easily it is retrieved from memory. This suggests that it is not just the frequency of pre-emptive forms that must be taken into account when trying to explain usage patterns but their skew as well.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3872292 | PMC |
http://dx.doi.org/10.3389/fpsyg.2013.00989 | DOI Listing |
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!