Reinforcement learning in environments with many action-state pairs is challenging because of the number of episodes needed to thoroughly search the policy space. Most conventional heuristics address this search problem in a stochastic manner, which can leave large portions of the policy space unvisited during the early training stages. In this paper, we propose an uncertainty-based, information-theoretic approach for performing guided stochastic searches that more effectively cover the policy space. Our approach is based on the value of information, a criterion that provides the optimal tradeoff between expected costs and the granularity of the search process. The value of information yields a stochastic routine for choosing actions during learning that can explore the policy space in a coarse-to-fine manner. We augment this criterion with a state-transition uncertainty factor, which guides the search process into previously unexplored regions of the policy space. We evaluate the uncertainty-based value-of-information policies on the games Centipede and Crossy Road. Our results indicate that our approach yields better-performing policies in fewer episodes than stochastic exploration strategies. We show that the training rate for our approach can be further improved by using the policy cross entropy to guide our criterion's hyperparameter selection.
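
The abstract does not state the action-selection rule itself, so the following is only an illustrative sketch of the general idea: a stochastic policy whose temperature controls how coarse or fine the search is, with an added bonus that favors rarely tried actions. The softmax form, the count-based bonus, and the names `exploration_probs`, `choose_action`, `visit_counts`, `temperature`, and `bonus_weight` are all assumptions made for illustration, not the paper's formulation.

```python
import math
import random


def exploration_probs(q_values, visit_counts, temperature, bonus_weight=1.0):
    """Softmax over action values plus a novelty bonus (illustrative only).

    A high temperature gives a near-uniform, coarse search; a low temperature
    concentrates probability on the best-scoring action, a fine search.
    """
    # Hypothetical uncertainty term: actions tried less often score higher.
    scores = [
        (q + bonus_weight / math.sqrt(1 + n)) / temperature
        for q, n in zip(q_values, visit_counts)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def choose_action(q_values, visit_counts, temperature, rng=random):
    """Sample an action index from the exploration distribution."""
    probs = exploration_probs(q_values, visit_counts, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Lowering `temperature` over the course of training would move such a policy from broad to focused exploration; how the paper actually schedules its hyperparameter (it mentions the policy cross entropy as a guide) is not specified in the abstract.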


Source
http://dx.doi.org/10.1109/TNNLS.2018.2812709

