Dogs and laboratory mice are commonly trained to perform complex tasks by guiding them through a curriculum of simpler tasks ('shaping'). What are the principles behind effective shaping strategies? Here, we propose a machine learning framework for shaping animal behavior, where an autonomous teacher agent decides its student's task based on the student's transcript of successes and failures on previously assigned tasks. Using autonomous teachers that plan a curriculum in a common sequence learning task, we show that near-optimal shaping algorithms adaptively alternate between simpler and harder tasks to carefully balance reinforcement and extinction. Based on this intuition, we derive an adaptive shaping heuristic with minimal parameters, which we show is near-optimal on the sequence learning task and robustly trains deep reinforcement learning agents on navigation tasks that involve sparse, delayed rewards. Extensions to continuous curricula are explored. Our work provides a starting point towards a general computational framework for shaping animal behavior.
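The adaptive heuristic described above alternates between simpler and harder tasks based on the student's transcript. A minimal sketch of such a teacher is given below; the class name, threshold parameters, and promotion/demotion rule are illustrative assumptions, not the paper's actual algorithm.

```python
# Hypothetical sketch of an adaptive shaping teacher (not the paper's exact
# heuristic): promote the student to a harder task when the recent success
# rate is high, demote when it is low, balancing reinforcement (frequent
# successes) against extinction (prolonged failure).

from collections import deque

class AdaptiveShapingTeacher:
    """Assigns task difficulty levels 0..n_levels-1 from a success transcript."""

    def __init__(self, n_levels, window=10, promote=0.8, demote=0.3):
        self.n_levels = n_levels
        self.level = 0
        self.window = window                 # number of recent trials considered
        self.promote = promote               # success rate needed to advance
        self.demote = demote                 # success rate triggering fallback
        self.transcript = deque(maxlen=window)

    def record(self, success):
        """Log one trial outcome (True = success) on the current task."""
        self.transcript.append(bool(success))

    def next_task(self):
        """Choose the next task level from the recent success rate."""
        if len(self.transcript) < self.window:
            return self.level                # not enough evidence yet
        rate = sum(self.transcript) / len(self.transcript)
        if rate >= self.promote and self.level < self.n_levels - 1:
            self.level += 1
            self.transcript.clear()          # fresh transcript on the new task
        elif rate <= self.demote and self.level > 0:
            self.level -= 1
            self.transcript.clear()
        return self.level
```

A student that masters level 0 is promoted; if it then fails repeatedly, the teacher falls back to the easier task rather than letting the behavior extinguish.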
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10723287
DOI: http://dx.doi.org/10.1101/2023.12.03.569774