This paper focuses on a class of reinforcement learning problems where significant events are rare and limited to a single positive reward per episode. A typical example is that of an agent who has to choose a partner to cooperate with, while a large number of partners are simply not interested in cooperating, regardless of what the agent has to offer. We address this problem in a continuous state and action space with two different kinds of search methods: a gradient policy search method and a direct policy search method using an evolution strategy.
In this paper, we present an implementation of social learning for swarm robotics. We consider social learning as a distributed online reinforcement learning method applied to a collective of robots where sensing, acting and coordination are performed on a local basis. While some issues are specific to artificial systems, such as the general objective of learning efficient (and ideally, optimal) behavioural strategies to fulfill a task defined by a supervisor, some other issues are shared with social learning in natural systems.