The principal components of meaning, revisited.

Psychon Bull Rev

Department of Psychology, University of Alberta, P220 Biological Sciences Building, Edmonton, AB, T6G 2E9, Canada.

Published: August 2024

AI Article Synopsis

  • Osgood, Suci, and Tannenbaum pioneered the search for the principal components of semantics by applying dimensional reduction to a high-dimensional model built from human judgments of word relatedness; modern word-embedding models, which are built from patterns of words, can be reduced in the same way.
  • Hollis and Westbury characterized the first eight principal components of a word-embedding model by correlating them with well-known lexical measures, and found some clear differences in how the components could be interpreted.
  • The present study extends that analysis to a larger word-embedding matrix, combining subjective inspection with quantitative prediction of principal component values. Several semantic and word class measures strongly predict early component values and generalize to embeddings built from other corpora, a pattern consistent with Wittgenstein's argument that social interaction grounds linguistic meaning.

Article Abstract

Osgood, Suci, and Tannenbaum were the first to attempt to identify the principal components of semantics by applying dimensional reduction to a high-dimensional model of semantics constructed from human judgments of word relatedness. Modern word-embedding models analyze patterns of words to construct higher-dimensional models of semantics that can similarly be subjected to dimensional reduction. Hollis and Westbury characterized the first eight principal components (PCs) of a word-embedding model by correlating them with several well-known lexical measures, such as logged word frequency, age of acquisition, valence, arousal, dominance, and concreteness. Their results showed some clear differentiation of interpretation between the PCs. Here, we extend this work by analyzing a larger word-embedding matrix using semantic measures initially derived from subjective inspection of the PCs. We then use quantitative analysis to confirm the utility of these subjective measures for predicting PC values, and we cross-validate them on two word-embedding matrices developed on distinct corpora. Several semantic and word class measures are strongly predictive of early PC values, including first-person and second-person verbs, personal relevance of abstract and concrete words, affect terms, and names of places and people. The predictors of the lowest-magnitude PCs generalized well to word-embedding matrices constructed from separate corpora, including matrices constructed using different word-embedding methods. The predictive categories we describe are consistent with Wittgenstein's argument that an autonomous level of social interaction grounds linguistic meaning.
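
The general procedure the abstract describes (reduce an embedding matrix to its first eight PCs, then correlate each PC with a lexical measure) can be illustrated with a minimal sketch. This is not the authors' code or data: the embedding matrix and the "log frequency" measure below are synthetic, the sizes (500 words, 50 dimensions) are arbitrary, and the helper names `principal_components` and `correlate_with_measure` are hypothetical. PCA is computed here via singular value decomposition of the mean-centered matrix, which is one standard way to do it.

```python
import numpy as np


def principal_components(embeddings, n_components=8):
    """Return PC scores (words x n_components) and explained-variance ratios."""
    # Center each embedding dimension so the SVD yields principal components.
    centered = embeddings - embeddings.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


def correlate_with_measure(scores, measure):
    """Pearson correlation between each PC's scores and one lexical measure."""
    return np.array(
        [np.corrcoef(scores[:, k], measure)[0, 1] for k in range(scores.shape[1])]
    )


# Synthetic stand-in for a word-embedding matrix (assumption: 500 words x 50 dims).
rng = np.random.default_rng(0)
n_words, n_dims = 500, 50
embeddings = rng.normal(size=(n_words, n_dims))

# Plant one strong latent dimension so that the first PC has something to recover.
latent = rng.normal(size=n_words)
direction = rng.normal(size=n_dims)
embeddings += 3.0 * np.outer(latent, direction)

# Synthetic "lexical measure": a noisy copy of the planted latent variable.
log_frequency = latent + 0.1 * rng.normal(size=n_words)

scores, explained = principal_components(embeddings, n_components=8)
r = correlate_with_measure(scores, log_frequency)

for k, (var_ratio, corr) in enumerate(zip(explained, r), start=1):
    print(f"PC{k}: explained variance = {var_ratio:.3f}, r with measure = {corr:+.3f}")
```

In this toy setup the planted dimension dominates the variance, so the first PC correlates strongly with the synthetic measure while the later PCs do not. With real embeddings and real lexical norms, the correlations would be spread across several PCs, which is the kind of differentiation the abstract reports.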

Source
http://dx.doi.org/10.3758/s13423-024-02551-y
