When processing language, the brain is thought to deploy specialized computations to construct meaning from complex linguistic structures. Recently, artificial neural networks based on the Transformer architecture have revolutionized the field of natural language processing. Transformers integrate contextual information across words via structured circuit computations. Prior work has focused on the internal representations ("embeddings") generated by these circuits. In this paper, we instead analyze the circuit computations directly: we deconstruct these computations into the functionally-specialized "transformations" that integrate contextual information across words. Using functional MRI data acquired while participants listened to naturalistic stories, we first verify that the transformations account for considerable variance in brain activity across the cortical language network. We then demonstrate that the emergent computations performed by individual, functionally-specialized "attention heads" differentially predict brain activity in specific cortical regions. These heads fall along gradients corresponding to different layers and context lengths in a low-dimensional cortical space.
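The distinction the abstract draws between "embeddings" and per-head "transformations" can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' pipeline: it assumes one bidirectional self-attention layer with random weights, and it assumes a "transformation" is the attention-weighted sum of value vectors produced by each head before the output projection. The names `head_transformations`, `Wq`, `Wk`, `Wv`, and `Wo` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def head_transformations(X, Wq, Wk, Wv, n_heads):
    """Per-head attention outputs, shape (n_heads, n_tokens, d_head).

    Assumption: these per-head outputs stand in for the "transformations"
    named in the abstract; no attention mask is applied (bidirectional).
    """
    n_tokens, d_model = X.shape
    d_head = d_model // n_heads

    def split(M):
        # (n_tokens, d_model) -> (n_heads, n_tokens, d_head)
        return M.reshape(n_tokens, n_heads, d_head).transpose(1, 0, 2)

    Q, K, V = split(X @ Wq), split(X @ Wk), split(X @ Wv)
    # Scaled dot-product scores: (n_heads, n_tokens, n_tokens)
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)
    # Each head mixes value vectors across tokens (context integration).
    return weights @ V

# Toy input: 6 tokens, model width 16, 4 heads (all values are made up).
rng = np.random.default_rng(0)
n_tokens, d_model, n_heads = 6, 16, 4
X = rng.standard_normal((n_tokens, d_model))
Wq, Wk, Wv, Wo = (rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(4))

Z = head_transformations(X, Wq, Wk, Wv, n_heads)

# Concatenate heads per token, then project and add the residual.
# The result plays the role of the layer's "embedding" in this sketch.
concat = Z.transpose(1, 0, 2).reshape(n_tokens, d_model)
embedding = X + concat @ Wo
```

In this sketch, `Z[h]` isolates what head `h` contributes for every token, whereas `embedding` blends all heads with the residual stream. Features like `Z[h]` are the kind of head-wise quantity one could feed into a regression against brain activity, under the assumptions stated above.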

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11217339
DOI: http://dx.doi.org/10.1038/s41467-024-49173-5

Publication Analysis

Top Keywords

integrate contextual: 8
circuit computations: 8
brain activity: 8
computations: 5
shared functional: 4
functional specialization: 4
specialization transformer-based: 4
language: 4
transformer-based language: 4
language models: 4
