Curiosity driven reinforcement learning for motion planning on humanoids.

Front Neurorobot

Dalle Molle Institute for Artificial Intelligence Lugano, Switzerland ; Facoltà di Scienze Informatiche, Università della Svizzera Italiana Lugano, Switzerland ; Dipartimento Tecnologie Innovative, Scuola Universitaria Professionale della Svizzera Italiana Manno, Switzerland.

Published: January 2014

AI Article Synopsis

  • Previous research on artificial curiosity (AC) primarily revolves around basic theories and is often limited to simple experimental scenarios.
  • The study introduces a curious agent embodied in the complex iCub humanoid robot, utilizing a novel reinforcement learning framework that combines both reactive control and high-level curiosity-driven exploration.
  • This is the first instance of an embodied curious agent capable of real-time motion planning, demonstrating intelligent exploration by learning Markov models and showing interest in physical constraints and environmental objects.

Article Abstract

Most previous work on artificial curiosity (AC) and intrinsic motivation focuses on basic concepts and theory. Experimental results are generally limited to toy scenarios, such as navigation in a simulated maze, or control of a simple mechanical system with one or two degrees of freedom. To study AC in a more realistic setting, we embody a curious agent in the complex iCub humanoid robot. Our novel reinforcement learning (RL) framework consists of a state-of-the-art, low-level, reactive control layer, which controls the iCub while respecting constraints, and a high-level curious agent, which explores the iCub's state-action space by maximizing information gain, learning a world model from experience while controlling the actual iCub hardware in real time. To the best of our knowledge, this is the first embodied, curious agent for real-time motion planning on a humanoid. We demonstrate that it can learn compact Markov models to represent large regions of the iCub's configuration space, and that the iCub explores intelligently, showing interest in its physical constraints as well as in objects it finds in its environment.
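The idea of exploring by information gain while learning a Markov model can be illustrated with a small tabular sketch. This is not the authors' implementation: the class name `CuriousMarkovAgent`, the Dirichlet-style pseudo-counts, the KL-divergence reward, and the greedy action rule are all assumptions chosen for illustration, and the discrete states stand in for regions of a robot's configuration space.

```python
import numpy as np


class CuriousMarkovAgent:
    """Hypothetical tabular agent: learns P(s' | s, a) from counts and
    treats the information gained by each observation as intrinsic reward."""

    def __init__(self, n_states, n_actions, prior=1.0):
        # Pseudo-counts (assumed uniform prior) keep every estimate well defined.
        self.counts = np.full((n_states, n_actions, n_states), prior)
        # Optimistic initial interest so untried actions look attractive.
        self.interest = np.ones((n_states, n_actions))

    def predict(self, s, a):
        """Current estimate of the next-state distribution for (s, a)."""
        c = self.counts[s, a]
        return c / c.sum()

    def update(self, s, a, s_next, rate=0.5):
        """Record one transition and return its information gain in nats."""
        before = self.predict(s, a)
        self.counts[s, a, s_next] += 1.0
        after = self.predict(s, a)
        # KL(after || before): how much the model changed because of this sample.
        gain = float(np.sum(after * np.log(after / before)))
        # Exponential moving average of the gain observed for (s, a).
        self.interest[s, a] += rate * (gain - self.interest[s, a])
        return gain

    def act(self, s):
        """Greedily pick the action currently expected to be most informative."""
        return int(np.argmax(self.interest[s]))


if __name__ == "__main__":
    # Toy deterministic world (an assumption): action a moves s to (s + a) % 4.
    agent = CuriousMarkovAgent(n_states=4, n_actions=2)
    s = 0
    for _ in range(50):
        a = agent.act(s)
        s_next = (s + a) % 4
        agent.update(s, a, s_next)
        s = s_next
    print(agent.predict(0, 1))
```

In this sketch the gain for a repeated, predictable transition shrinks with every visit, so the agent loses interest in what it already models well and drifts toward state-action pairs it has sampled less often.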

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3881010
DOI: http://dx.doi.org/10.3389/fnbot.2013.00025

Publication Analysis

Top Keywords

  • curious agent: 12
  • reinforcement learning: 8
  • motion planning: 8
  • curiosity driven: 4
  • driven reinforcement: 4
  • learning motion: 4
  • planning humanoids: 4
  • humanoids previous: 4
  • previous work: 4
  • work artificial: 4

Similar Publications

Fluid preserved animal specimens in the collections of natural history museums constitute an invaluable archive of past and present animal diversity. Well-preserved specimens have a shelf-life spanning centuries and are widely used for e.g.

Introduction: Children are naturally curious and often have limited self-control, leading them to imitate both safe and dangerous actions. This study aimed to investigate whether dangerous cues could effectively inhibit children's imitation of hazardous behaviors and to compare the effectiveness of picture cues versus word cues in reducing this imitation.

Methods: Seventy-six children were divided into two groups: one group received picture cues, and the other received word cues.

Background: Family health history (FHx) is an important predictor of a person's genetic risk but is not collected by many adults in the United States.

Objective: This study aims to test and compare the usability, engagement, and report usefulness of 2 web-based methods to collect FHx.

Methods: This mixed methods study compared FHx data collection using a flow-based chatbot (KIT; the curious interactive test) and a form-based method.

Article Synopsis

  • The patient experienced swelling and pain in the knee, initially misdiagnosed, and underwent multiple tests before the correct treatment revealed an infection resistant to standard antibiotics.
  • The case highlights the challenges of diagnosing infections in patients on immunosuppressive medications and raises questions about the management of these individuals following treatment.

A coverage assumption is critical with policy gradient methods, because while the objective function is insensitive to updates in unlikely states, the agent may need improvements in those states to reach a nearly optimal payoff. However, this assumption can be unfeasible in certain environments, for instance in online learning, or when restarts are possible only from a fixed initial state. In these cases, classical policy gradient algorithms like REINFORCE can have poor convergence properties and sample efficiency.
