Our ability to make decisions is predicated upon our knowledge of the outcomes of the actions available to us. Reinforcement learning theory posits that actions followed by a reward or punishment acquire value through the computation of prediction errors: discrepancies between the predicted and the actual reward. A multitude of neuroimaging studies have demonstrated that rewards and punishments evoke neural responses that appear to reflect reinforcement learning prediction errors [e.g., Krigolson, O. E., Pierce, L. J., Holroyd, C. B., & Tanaka, J. W. Learning to become an expert: Reinforcement learning and the acquisition of perceptual expertise. Journal of Cognitive Neuroscience, 21, 1833-1840, 2009; Bayer, H. M., & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron, 47, 129-141, 2005; O'Doherty, J. P. Reward representations and reward-related learning in the human brain: Insights from neuroimaging. Current Opinion in Neurobiology, 14, 769-776, 2004; Holroyd, C. B., & Coles, M. G. H. The neural basis of human error processing: Reinforcement learning, dopamine, and the error-related negativity. Psychological Review, 109, 679-709, 2002]. Here, we used the event-related potential (ERP) technique to demonstrate not only that rewards elicit a neural response akin to a prediction error but also that, with learning, this signal rapidly diminishes and propagates back to the time of choice presentation. Specifically, in a simple, learnable gambling task, we show that novel rewards elicited a feedback error-related negativity that rapidly decreased in amplitude with learning. Furthermore, we demonstrate the existence of a reward positivity at choice presentation, a previously unreported ERP component with a timing and topography similar to those of the feedback error-related negativity, which increased in amplitude with learning. The pattern of results we observed mirrored the output of a computational model that we implemented to compute reward prediction errors and the changes in amplitude of these prediction errors at the time of choice presentation and reward delivery. Our results provide further support that the computations that underlie human learning and decision-making follow reinforcement learning principles.
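To make the abstract's computational claim concrete, the sketch below (Python; the learning rate, reward probability, and trial count are illustrative assumptions, not the authors' reported model parameters) applies a Rescorla-Wagner style delta-rule update to a single gambling option and tracks two quantities: the reward prediction error at feedback, which shrinks as learning proceeds, and the learned value already available at choice presentation, which grows.

import numpy as np

np.random.seed(0)

alpha = 0.3           # learning rate (assumed for illustration)
p_win = 0.8           # reward probability of the chosen option (assumed)
n_trials = 60

V = 0.0               # learned value of the chosen option
pe_at_feedback = []   # prediction error when the reward is delivered
value_at_choice = []  # reward expectation already present at choice onset

for t in range(n_trials):
    value_at_choice.append(V)             # signal that propagates to choice presentation
    r = float(np.random.rand() < p_win)   # stochastic reward outcome
    delta = r - V                         # reward prediction error at feedback
    pe_at_feedback.append(delta)
    V += alpha * delta                    # delta-rule value update

print(f"mean |PE| at feedback, early trials: {np.mean(np.abs(pe_at_feedback[:10])):.2f}, "
      f"late trials: {np.mean(np.abs(pe_at_feedback[-10:])):.2f}")
print(f"value at choice onset, early trials: {np.mean(value_at_choice[:10]):.2f}, "
      f"late trials: {np.mean(value_at_choice[-10:]):.2f}")

Running the sketch prints early- versus late-trial averages, illustrating the reciprocal pattern the abstract describes: a diminishing error signal at reward delivery and a growing reward-related signal at choice presentation.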

Source
http://dx.doi.org/10.1162/jocn_a_00509


