Training agents via deep reinforcement learning with sparse rewards for robotic control tasks in a vast state space is a major challenge, due to the rarity of successful experience. To address this problem, two recent breakthrough methods, hindsight experience replay (HER) and aggressive rewards to counter bias in HER (ARCHER), reuse unsuccessful experiences by treating them as successful experiences that achieve different goals, that is, hindsight experiences. In these methods, hindsight experience is used at a fixed sampling rate during training. However, this usage of hindsight experience introduces bias, due to a distinct optimal policy, and does not allow the hindsight experience to take variable importance at different stages of training. In this article, we investigate the impact of a variable sampling rate, representing the variable rate of hindsight experience, on training performance and propose a sampling rate decay strategy that decreases the number of hindsight experiences as training proceeds. The proposed method is validated on three robotic control tasks included in the OpenAI Gym suite. The experimental results demonstrate that the proposed method achieves improved training performance and increased convergence speed over HER and ARCHER on two of the three tasks, and comparable training performance and convergence speed on the third.
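The idea of a decaying hindsight sampling rate can be illustrated with a minimal sketch. The abstract does not state the decay schedule, so the linear schedule, the starting rate of 0.8, the scalar goals, and the names `hindsight_rate`, `sparse_reward`, and `sample_batch` are all assumptions made for illustration, not the authors' implementation. Relabeling here picks an achieved goal from the same or a later step of the episode, in the style of HER's "future" strategy.

```python
import random


def hindsight_rate(epoch, total_epochs, start=0.8, end=0.0):
    """Fraction of a batch to relabel with hindsight goals at a given epoch.

    Assumption: a linear decay from `start` to `end`; the source only says
    that the number of hindsight experiences decreases as training proceeds.
    """
    if total_epochs <= 1:
        return end
    progress = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return start + (end - start) * progress


def sparse_reward(achieved_goal, goal, tol=0.05):
    """Sparse reward: 0 when the goal is reached within `tol`, otherwise -1."""
    return 0.0 if abs(achieved_goal - goal) <= tol else -1.0


def sample_batch(episodes, batch_size, rate, rng=random):
    """Sample transitions, relabeling each with probability `rate`.

    `episodes` is a list of episodes; each episode is a list of dicts with
    an "achieved_goal" and a "goal" entry (scalar goals, for simplicity).
    """
    batch = []
    for _ in range(batch_size):
        episode = rng.choice(episodes)
        t = rng.randrange(len(episode))
        transition = dict(episode[t])  # copy so the stored episode is untouched
        if rng.random() < rate:
            # Hindsight: pretend a goal achieved at step t or later was intended.
            future = rng.randrange(t, len(episode))
            transition["goal"] = episode[future]["achieved_goal"]
            transition["hindsight"] = True
        else:
            transition["hindsight"] = False
        transition["reward"] = sparse_reward(
            transition["achieved_goal"], transition["goal"]
        )
        batch.append(transition)
    return batch
```

With `rate` fixed at a constant, the sampler behaves like a fixed-rate scheme; passing `hindsight_rate(epoch, total_epochs)` instead makes hindsight samples rarer as training proceeds.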
DOI: http://dx.doi.org/10.1109/TCYB.2020.2990722
Cortex
January 2025
Department of Psychology, University of Illinois Urbana-Champaign, United States; Neuroscience Program, University of Illinois Urbana-Champaign, United States.
While pre-verbal infants may be sensitive to others' mental states, they are not able to accurately answer questions about them until several years later, an ability referred to as having a theory of mind. Here we ask whether infant social-cognitive sensitivity is subserved by the same brain mechanisms as those that support theory of mind in childhood. To do so, we explored the relationship between functional sensitivity of the right temporal-parietal junction to mental state processing in infancy, a region known to underlie theory of mind in older children, and explicit theory of mind reasoning in the same group several years later.
Prehosp Disaster Med
January 2025
Assistant Professor, Department of Internal Medicine, UT Southwestern Medical Center, Statistician/Section Chief of Analytics, Research Service, VA North Texas HCS, Dallas, Texas, USA.
Introduction: Terrorism and trauma survivors often experience changes in biomarkers of autonomic, inflammatory and hypothalamic-pituitary-adrenal (HPA) axis assessed at various times. Research suggests interactions of these systems in chronic stress.
Study Objective: This unprecedented retrospective study explores long-term stress biomarkers in three systems in terrorism survivors.
Sci Rep
November 2024
The College of Shipbuilding Engineering, Harbin Engineering University, Harbin, 150001, China.
Agile and adaptive maneuvers such as fall recovery, high-speed turning, and sprinting in the wild are challenging for legged systems. We propose Curricular Hindsight Reinforcement Learning (CHRL), which learns an end-to-end tracking controller that achieves powerful agility and adaptation for the legged robot. The two key components are (i) a novel automatic curriculum strategy on task difficulty and (ii) a Hindsight Experience Replay strategy adapted to legged locomotion tasks.
Neural Netw
January 2025
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, China.
Patient Educ Couns
January 2025
Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy; Applied Research Division for Cognitive and Psychological Science, IEO European Institute of Oncology IRCCS, Milan, Italy.