Adolescence is a period of life characterised by changes in learning and decision-making. Learning and decision-making do not rely on a unitary system, but instead require the coordination of different cognitive processes that can be mathematically formalised as dissociable computational modules. Here, we aimed to trace the developmental time-course of the computational modules responsible for learning from reward or punishment, and learning from counterfactual feedback. Adolescents and adults carried out a novel reinforcement learning paradigm in which participants learned the association between cues and probabilistic outcomes, where the outcomes differed in valence (reward versus punishment) and feedback was either partial or complete (either the outcome of the chosen option only, or the outcomes of both the chosen and unchosen option, were displayed). Computational strategies changed during development: whereas adolescents' behaviour was better explained by a basic reinforcement learning algorithm, adults' behaviour integrated increasingly complex computational features, namely a counterfactual learning module (enabling enhanced performance in the presence of complete feedback) and a value contextualisation module (enabling symmetrical reward and punishment learning). Unlike adults, adolescent performance did not benefit from counterfactual (complete) feedback. In addition, while adults learned symmetrically from both reward and punishment, adolescents learned from reward but were less likely to learn from punishment. This tendency to rely on rewards and not to consider alternative consequences of actions might contribute to our understanding of decision-making in adolescence.
Full-text sources:
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4920542
DOI: http://dx.doi.org/10.1371/journal.pcbi.1004953
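The three modules named in the abstract map naturally onto components of a trial-by-trial learning model. The sketch below is a minimal illustration, not the authors' fitted model: the Rescorla-Wagner-style update, the single shared learning rate, and the running context average used for contextualisation are simplifying assumptions.

```python
def update_values(Q, context_value, chosen, unchosen,
                  outcome_chosen, outcome_unchosen=None,
                  alpha=0.3, alpha_context=0.1):
    """One trial of an illustrative learner (assumed form, not the paper's).

    Q                : dict mapping option -> learned value
    context_value    : running estimate of the context's average outcome
                       (the 'value contextualisation' module)
    outcome_unchosen : None under partial feedback; the forgone outcome
                       under complete feedback
    """
    # Contextualisation: learn outcomes relative to the context average,
    # which makes reward and punishment learning symmetrical.
    context_value += alpha_context * (outcome_chosen - context_value)
    relative = outcome_chosen - context_value

    # Factual update of the chosen option (the basic RL module).
    Q[chosen] += alpha * (relative - Q[chosen])

    # Counterfactual module: with complete feedback, also update
    # the unchosen option from the outcome it would have produced.
    if outcome_unchosen is not None:
        relative_cf = outcome_unchosen - context_value
        Q[unchosen] += alpha * (relative_cf - Q[unchosen])

    return Q, context_value

# Example trial with complete feedback: chose A (reward), saw B (punishment).
Q, ctx = update_values({"A": 0.0, "B": 0.0}, 0.0, chosen="A", unchosen="B",
                       outcome_chosen=1.0, outcome_unchosen=-1.0)
```

Comparing fits of such a model with and without the counterfactual and contextualisation terms is the kind of model comparison the abstract reports across age groups.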
IJCAI (U S)
August 2024
Department of Computer Science, Harvard University.
The escalating prevalence of cannabis use, and of associated cannabis use disorder (CUD), poses a significant global public health challenge. Given the notably wide treatment gap, especially among emerging adults (EAs; ages 18-25), addressing cannabis use and CUD remains a pivotal objective of the United Nations 2030 Agenda for Sustainable Development Goals (SDGs). In this work, we develop an online reinforcement learning (RL) algorithm called reBandit, to be deployed in a mobile health study to deliver personalized interventions aimed at reducing cannabis use among EAs.
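The snippet does not describe reBandit's internals, so the following is only a generic illustration of an online RL (bandit) loop for intervention selection: Beta-Bernoulli Thompson sampling, with the arm names and the binary engagement reward invented for the example.

```python
import random

class ThompsonBandit:
    """Illustrative online bandit: Beta-Bernoulli Thompson sampling.

    Each arm is a candidate intervention message; the reward is 1 if the
    participant engages (a stand-in outcome invented for this sketch).
    """

    def __init__(self, arms):
        self.successes = {a: 1 for a in arms}  # Beta(1, 1) priors
        self.failures = {a: 1 for a in arms}

    def select(self):
        # Sample a plausible success rate per arm; act on the best draw.
        draws = {a: random.betavariate(self.successes[a], self.failures[a])
                 for a in self.successes}
        return max(draws, key=draws.get)

    def update(self, arm, reward):
        # Online update: posterior counts change after every decision.
        if reward:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1

bandit = ThompsonBandit(["no_message", "low_effort_tip", "peer_story"])
arm = bandit.select()
bandit.update(arm, reward=1)
```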
Sci Rep
December 2024
Department of Information Security, School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, 632014, Tamil Nadu, India.
In Internet of Things (IoT) networks, the Medium Access Control (MAC) layer protocol must be matched to the service characteristics and requirements of each application. In this paper, we propose an Energy Efficient and Group Priority MAC (EEGP-MAC) protocol using a Hybrid Q-Learning Honey Badger Algorithm (QL-HBA) for IoT networks. The algorithm employs reinforcement learning agents that act on the network environment by selecting among predefined actions and tasks.
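The snippet leaves the learning formulation unspecified; as a hedged sketch of the Q-learning component only, the following uses traffic-priority groups as states, duty-cycle levels as actions, and an energy-aware reward, all of which are assumptions for illustration rather than the EEGP-MAC design.

```python
import random
from collections import defaultdict

# Hypothetical state/action design: states are traffic-priority groups,
# actions are radio duty-cycle levels. Both are invented for this sketch.
STATES = ["high_priority", "medium_priority", "low_priority"]
ACTIONS = [0.1, 0.5, 1.0]  # fraction of time the radio stays awake

Q = defaultdict(float)  # Q[(state, action)]

def choose_action(state, epsilon=0.1):
    # Epsilon-greedy exploration over the duty-cycle actions.
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state, alpha=0.2, gamma=0.9):
    # Standard one-step Q-learning update.
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next
                                   - Q[(state, action)])

def reward_fn(delivered, energy_used, w=0.5):
    # Invented energy-aware reward: trade delivery off against energy.
    return delivered - w * energy_used

s = "high_priority"
a = choose_action(s)
q_update(s, a, reward_fn(delivered=1.0, energy_used=a), next_state=s)
```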
Sci Rep
December 2024
National University of Defense Technology, Changsha, Hunan, China.
In-band full-duplex communication has the potential to double wireless channel capacity. However, efficiently converting the full-duplex gain at the physical layer into network-level throughput improvement remains a challenge, especially in dynamic communication environments. This paper presents a reinforcement learning-based full-duplex (RLFD) medium access control (MAC) protocol for wireless local-area networks (WLANs) with full-duplex access points.
Sci Rep
December 2024
Department of Biosciences, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, 602 105, India.
The chimp optimization algorithm (CHOA) is a recently developed nature-inspired technique that mimics the swarm intelligence of chimpanzee colonies. However, the original CHOA suffers from slow convergence and a tendency to become trapped in local optima on multidimensional problems. To address these limitations, we propose TASR-CHOA, a twofold adaptive stochastic reinforced variant.
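TASR-CHOA's adaptive and stochastic modifications are not described in the snippet; for orientation, here is a simplified single-dimension sketch of the base CHOA position update, with the chaotic coefficient replaced by a uniform random draw, which is a simplification relative to the published algorithm.

```python
import random

def chimp_step(x_chimp, x_prey, t, t_max):
    """One simplified CHOA position update for a single dimension.

    x_chimp : current position of a chimp (candidate solution)
    x_prey  : best solution found so far (the 'prey')
    The chaotic coefficient m is replaced here by a uniform random
    value, a simplification of the published algorithm.
    """
    f = 2.5 * (1 - t / t_max)          # driving coefficient, decays to 0
    a = 2 * f * random.random() - f    # exploration/exploitation balance
    c = 2 * random.random()
    m = random.random()                # stand-in for the chaotic map
    d = abs(c * x_prey - m * x_chimp)  # distance to the prey
    return x_prey - a * d
```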