Determining the extent to which the perceptual world can be recovered from language is a longstanding problem in philosophy and cognitive science. We show that state-of-the-art large language models can unlock new insights into this problem by providing a lower bound on the amount of perceptual information that can be extracted from language. Specifically, we elicit pairwise similarity judgments from GPT models across six psychophysical datasets. We show that the judgments are significantly correlated with human data across all domains, recovering well-known representations like the color wheel and pitch spiral. Surprisingly, we find that co-training a model (GPT-4) on vision and language does not necessarily lead to improvements specific to the visual modality, and that the model's predictions are highly correlated with human data irrespective of whether it receives direct visual input or purely textual descriptors. To study the impact of specific languages, we also apply the models to a multilingual color-naming task. We find that GPT-4 replicates cross-linguistic variation in English and Russian, illuminating the interaction of language and perception.
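
To make the comparison concrete, the following is a minimal sketch of how model-elicited pairwise similarity judgments might be compared with human judgments. It assumes both are stored as symmetric similarity matrices over the same stimuli and uses a plain Pearson correlation over the unique pairs; this is an illustrative choice, not necessarily the statistic used in the study. The function names and the toy ratings are hypothetical and are not data from the paper.

```python
from itertools import combinations
from math import sqrt


def upper_triangle(matrix):
    """Return the strictly upper-triangular entries (one per unique pair)."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def similarity_agreement(model_sim, human_sim):
    """Correlate model and human similarity ratings over unique stimulus pairs."""
    return pearson(upper_triangle(model_sim), upper_triangle(human_sim))


# Toy, made-up ratings for three color terms (NOT data from the study).
stimuli = ["red", "orange", "blue"]
human_sim = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.3],
    [0.2, 0.3, 1.0],
]
model_sim = [
    [1.0, 0.7, 0.1],
    [0.7, 1.0, 0.3],
    [0.1, 0.3, 1.0],
]

r = similarity_agreement(model_sim, human_sim)
print(f"Model-human correlation over {len(stimuli)} stimuli: r = {r:.3f}")
```

Only the upper triangle is used so that each stimulus pair is counted once and the trivial self-similarity entries on the diagonal do not inflate the correlation.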

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11399123
DOI: http://dx.doi.org/10.1038/s41598-024-72071-1

Publication Analysis

Top Keywords

large language: 8
language models: 8
human data: 8
language: 5
models: 4
models predict: 4
predict human: 4
human sensory: 4
sensory judgments: 4
judgments modalities: 4

Similar Publications

Background: Head and neck cancer (HNC) is amongst the 10 most common cancers worldwide and has a major effect on patients' quality of life. Given the complexity of this unique group of patients, a multidisciplinary team approach is preferable. Amongst the debilitating sequels of HNC and/or its treatment, swallowing, speech and voice impairments are prevalent and require the involvement of speech-language pathologists (SLPs).

Moving beyond word frequency based on tally counting: AI-generated familiarity estimates of words and phrases are an interesting additional index of language knowledge.

Behav Res Methods

December 2024

ETSI de Telecomunicación, Universidad Politécnica de Madrid, Avenida Complutense, 30, 28040, Madrid, Spain.

This study investigates the potential of large language models (LLMs) to estimate the familiarity of words and multi-word expressions (MWEs). We validated LLM estimates for isolated words using existing human familiarity ratings and found strong correlations. LLM familiarity estimates performed even better in predicting lexical decision and naming performance in megastudies than the best available word frequency measures.

Despite being largely spoken and studied by language and cognitive scientists, Italian lacks large resources of language processing data. The Italian Crowdsourcing Project (ICP) is a dataset of word recognition times and accuracy including responses to 130,465 words, which makes it the largest dataset of its kind item-wise. The data were collected in an online word knowledge task in which over 156,000 native speakers of Italian took part.

Background: While large language models like ChatGPT-4 have demonstrated competency in English, their performance for minority groups speaking underrepresented languages, as well as their ability to adapt to specific socio-cultural nuances and regional cuisines, such as those in Central Asia (e.g., Kazakhstan), still requires further investigation.

Background: The number of emergency department (ED) visits has been steadily increasing globally. Artificial intelligence (AI) technologies, including generative AI models based on large language models (LLMs), have shown promise in improving triage accuracy. This study evaluates the performance of ChatGPT and Copilot in triage at a high-volume urban hospital, hypothesizing that these tools can match trained physicians' accuracy and reduce human bias amidst ED crowding challenges.