A One-Shot Shift from Explore to Exploit in Monkey Prefrontal Cortex.

J Neurosci

MRC Cognition and Brain Sciences Unit, University of Cambridge, CB2 7EF, Cambridge, United Kingdom

Published: January 2022

Much animal learning is slow, with cumulative changes in behavior driven by reward prediction errors. When the abstract structure of a problem is known, however, both animals and formal learning models can rapidly attach new items to their roles within this structure, sometimes in a single trial. Frontal cortex is likely to play a key role in this process. To examine information seeking and use in a known problem structure, we trained monkeys in an explore/exploit task, requiring the animal first to test objects for their association with reward, then, once rewarded objects were found, to reselect them on further trials for further rewards. Many cells in the frontal cortex showed an explore/exploit preference aligned with one-shot learning in the monkeys' behavior: the population switched from an explore state to an exploit state after a single trial of learning but partially maintained the explore state if an error indicated that learning had failed. The binary switch from explore to exploit was not explained by continuous changes linked to expectancy or prediction error. Explore/exploit preferences were independent for two stages of the trial: object selection and receipt of feedback. Within an established task structure, frontal activity may control the separate processes of explore and exploit, switching in one trial between the two.

Much animal learning is slow, with cumulative changes in behavior driven by reward prediction errors. When the abstract structure of a problem is known, however, both animals and formal learning models can rapidly attach new items to their roles within this structure. To address transitions in neural activity during one-shot learning, we trained monkeys in an explore/exploit task using familiar objects and a highly familiar task structure. When learning was rapid, many frontal neurons showed a binary, one-shot switch between explore and exploit. Within an established task structure, frontal activity may control the separate operations of exploring alternative objects to establish their current role, then exploiting this knowledge for further reward.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8802942
DOI: http://dx.doi.org/10.1523/JNEUROSCI.1338-21.2021

Publication Analysis

Top Keywords

explore exploit: 16
task structure: 12
learning: 9
animal learning: 8
learning slow: 8
slow cumulative: 8
cumulative changes: 8
changes behavior: 8
behavior driven: 8
driven reward: 8

Similar Publications

Antibody-drug conjugates (ADCs) have emerged as a promising strategy in targeted cancer therapy, enabling the precise delivery of cytotoxic agents to tumor sites while minimizing systemic toxicity. However, traditional ADCs face significant limitations, including restricted drug loading capacity, where an optimal drug-to-antibody ratio (DAR) is crucial; low DARs may lead to insufficient potency, while high DARs can cause rapid clearance and increased toxicity. Additionally, ADCs often suffer from instability in circulation due to the potential for premature release of cytotoxic agents, resulting in off-target effects and reduced therapeutic efficacy.

The dung beetle algorithm is an intelligent optimization algorithm with strong exploitation ability. However, owing to the high randomness of its parameters, premature convergence, and other factors, its exploration and exploitation abilities are imbalanced, and it easily becomes trapped in local optima. The purpose of this study is to improve the optimization performance of the dung beetle algorithm and to explore its value in engineering applications.

Generative large language models (LLMs) like ChatGPT can quickly produce informative essays on various topics. However, the information generated cannot be fully trusted as artificial intelligence (AI) can make factual mistakes. This poses challenges for using such tools in college classrooms.

Polysaccharides, found in all living species, exhibit diverse biochemical structures and play crucial roles in microorganisms, animals, and plants in defending against pathogens, environmental stress, and climate change. Microbial exopolysaccharides are essential for cell adhesion and stress resilience, and using them has notable advantages over synthetic polysaccharides. Exopolysaccharides have versatile structures and physicochemical properties and are used in food systems, therapeutics, cosmetics, agriculture, and polymer industries.

Enterococcus faecalis is a multi-drug-resistant human pathogen that is found in a variety of environments and is challenging to treat. Under stress conditions, some bacteria regulate intracellular polyamine concentrations via polyamine acetyltransferases to reduce their toxicity. The E.
