Our ability to make decisions is predicated upon our knowledge of the outcomes of the actions available to us. Reinforcement learning theory posits that actions followed by a reward or punishment acquire value through the computation of prediction errors: discrepancies between the predicted and the actual reward. A multitude of neuroimaging studies have demonstrated that rewards and punishments evoke neural responses that appear to reflect reinforcement learning prediction errors [e.g., Krigolson, O. E., Pierce, L. J., Holroyd, C. B., & Tanaka, J. W. Learning to become an expert: Reinforcement learning and the acquisition of perceptual expertise. Journal of Cognitive Neuroscience, 21, 1833-1840, 2009; Bayer, H. M., & Glimcher, P. W. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron, 47, 129-141, 2005; O'Doherty, J. P. Reward representations and reward-related learning in the human brain: Insights from neuroimaging. Current Opinion in Neurobiology, 14, 769-776, 2004; Holroyd, C. B., & Coles, M. G. H. The neural basis of human error processing: Reinforcement learning, dopamine, and the error-related negativity. Psychological Review, 109, 679-709, 2002]. Here, we used the event-related brain potential (ERP) technique to demonstrate that rewards not only elicit a neural response akin to a prediction error but also that, with learning, this signal rapidly diminishes and propagates back to the time of choice presentation. Specifically, in a simple, learnable gambling task, we show that novel rewards elicited a feedback error-related negativity that rapidly decreased in amplitude with learning. Furthermore, we demonstrate the existence of a reward positivity at choice presentation, a previously unreported ERP component with a timing and topography similar to those of the feedback error-related negativity, and one that increased in amplitude with learning. The pattern of results we observed mirrored the output of a computational model that we implemented to compute reward prediction errors and the changes in amplitude of these prediction errors at the time of choice presentation and reward delivery. Our results provide further evidence that the computations that underlie human learning and decision-making follow reinforcement learning principles.
DOI: http://dx.doi.org/10.1162/jocn_a_00509
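The learning dynamics summarized in the abstract can be made concrete with a small simulation. The sketch below is a minimal, hypothetical Rescorla-Wagner style value learner for a two-option gambling task, written in Python; it is not the authors' computational model, and the learning rate, reward probabilities, and exploration rate are illustrative assumptions. With learning, the prediction error computed at reward delivery shrinks while the value signal already available at choice presentation grows, which is the qualitative pattern reported for the feedback error-related negativity and the choice-locked reward positivity.

import random

random.seed(0)

ALPHA = 0.3                          # assumed learning rate (not from the study)
REWARD_PROB = {"A": 0.8, "B": 0.2}   # assumed reward probabilities per option
EPSILON = 0.1                        # assumed exploration rate
values = {"A": 0.0, "B": 0.0}        # learned expected value of each option

for trial in range(1, 101):
    # Mostly exploit the currently higher-valued option, occasionally explore.
    if random.random() < EPSILON:
        choice = random.choice(["A", "B"])
    else:
        choice = max(values, key=values.get)
    reward = 1.0 if random.random() < REWARD_PROB[choice] else 0.0

    # Prediction error at reward delivery: its magnitude shrinks as the
    # chosen option's value is learned (cf. the declining feedback signal).
    feedback_pe = reward - values[choice]

    # Value signal available when the options appear: it grows with learning
    # (cf. the reported positivity at choice presentation).
    choice_signal = values[choice]

    values[choice] += ALPHA * feedback_pe

    if trial % 20 == 0:
        print(f"trial {trial:3d}: choice={choice}  "
              f"choice_signal={choice_signal:+.2f}  feedback_PE={feedback_pe:+.2f}")

Averaged over trials, the feedback prediction error in this sketch decays toward zero for the better option, while that option's learned value, the quantity available at choice onset, climbs toward its reward probability.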
Similar publications:

Constraints, November 2024. Polytechnique Montréal, Montreal, Canada.
Constraint programming is known for being an efficient approach to solving combinatorial problems. Important design choices in a solver are the heuristics, designed to lead the search to the best solutions in a minimum amount of time. However, developing these heuristics is a time-consuming process that requires problem-specific expertise.
Front Neurorobot, January 2025. Hebi Institute of Engineering and Technology, Henan Polytechnic University, Hebi, Henan, China.
Introduction: Path planning in complex and dynamic environments poses a significant challenge in the field of mobile robotics. Traditional path planning methods such as genetic algorithms, Dijkstra's algorithm, and Floyd's algorithm typically rely on deterministic search strategies, which can lead to local optima and lack global search capabilities in dynamic settings. These methods have high computational costs and are not efficient for real-time applications.
Front Comput Neurosci, January 2025. Center for Synaptic Brain Dysfunctions, Institute for Basic Science, Daejeon, Republic of Korea.
Memory consolidation refers to the process of converting temporary memories into long-lasting ones. It is widely accepted that new experiences are initially stored in the hippocampus as rapid associative memories, which then undergo a consolidation process to establish more permanent traces in other regions of the brain. Over the past two decades, studies in humans and animals have demonstrated that the hippocampus is crucial not only for memory but also for imagination and future planning, with the CA3 region playing a pivotal role in generating novel activity patterns.
Eur J Neurosci, January 2025. Department of Psychology and Cognitive Sciences (DiPSCo), University of Trento, Trento, Italy.
The Dark Triad (DT), encompassing narcissism, Machiavellianism and psychopathy traits, poses significant societal challenges. Understanding the neural underpinnings of these traits is crucial for developing effective interventions and preventive strategies. Our study aimed to unveil the neural substrates of the DT by examining brain scans from 201 individuals (mean age: 32 ...).
BMC Neurosci, January 2025. National Brain Research Centre, Manesar, Gurugram, 122052, Haryana, India.
Delta-opioid receptors (δ-ORs) are known to be involved in associative learning and modulating motivational states. We wanted to study if they were also involved in naturally-occurring reinforcement learning behaviors such as vocal learning, using the zebra finch model system. Zebra finches learn to vocalize early in development and song learning in males is affected by factors such as the social environment and internal reward, both of which are modulated by endogenous opioids.