The goal of reinforcement learning (RL) is to let an agent learn an optimal control policy in an unknown environment so that future expected rewards are maximized. The model-free RL approach directly learns the policy based on data samples. Although using many samples tends to improve the accuracy of policy learning, collecting a large number of samples is often expensive in practice. On the other hand, the model-based RL approach first estimates the transition model of the environment and then learns the policy based on the estimated transition model. Thus, if the transition model is accurately learned from a small amount of data, the model-based approach is a promising alternative to the model-free approach. In this paper, we propose a novel model-based RL method by combining a recently proposed model-free policy search method called policy gradients with parameter-based exploration and the state-of-the-art transition model estimator called least-squares conditional density estimation. Through experiments, we demonstrate the practical usefulness of the proposed method.
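
As a rough illustration of the two-stage idea described above (fit a transition model from a small batch of samples, then search for a policy inside that model), the following is a minimal toy sketch and not the paper's implementation. It assumes a hypothetical 1-D environment (`true_step`), swaps least-squares conditional density estimation for a plain linear least-squares fit of the mean transition, and uses a simplified form of policy gradients with parameter-based exploration that adapts only the mean of the policy-parameter distribution while keeping its standard deviation fixed. All function names and constants are illustrative assumptions.

```python
import random


def true_step(s, a, rng):
    """Hypothetical 'real' environment: a noisy 1-D integrator."""
    return s + a + rng.gauss(0.0, 0.05)


def fit_linear_model(samples):
    """Least-squares fit of s' ~ w_s*s + w_a*a (a stand-in for LSCDE)."""
    # Entries of the 2x2 normal equations, solved in closed form.
    sss = sum(s * s for s, a, n in samples)
    saa = sum(a * a for s, a, n in samples)
    ssa = sum(s * a for s, a, n in samples)
    ssn = sum(s * n for s, a, n in samples)
    san = sum(a * n for s, a, n in samples)
    det = sss * saa - ssa * ssa
    w_s = (ssn * saa - san * ssa) / det
    w_a = (san * sss - ssn * ssa) / det
    return w_s, w_a


def model_return(theta, model, horizon=3, s0=1.0):
    """Roll out the linear policy a = theta*s inside the learned model."""
    w_s, w_a = model
    s, total = s0, 0.0
    for _ in range(horizon):
        a = theta * s
        s = w_s * s + w_a * a   # predicted next state (mean only)
        total += -(s * s)       # reward: stay close to the origin
    return total


def pgpe(model, iters=300, n_samples=20, lr=0.02, tau=0.1, seed=1):
    """Simplified PGPE: explore in parameter space, update the mean only."""
    rng = random.Random(seed)
    eta = 0.0  # mean of the Gaussian over the policy parameter theta
    for _ in range(iters):
        thetas = [rng.gauss(eta, tau) for _ in range(n_samples)]
        returns = [model_return(t, model) for t in thetas]
        baseline = sum(returns) / n_samples  # variance-reducing baseline
        grad_eta = sum(
            (r - baseline) * (t - eta) / tau**2
            for t, r in zip(thetas, returns)
        ) / n_samples
        eta += lr * grad_eta  # gradient ascent on the expected return
    return eta


# Stage 1: collect a small batch of transitions from the real environment.
rng = random.Random(0)
samples = []
for _ in range(200):
    s, a = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
    samples.append((s, a, true_step(s, a, rng)))

# Stage 2: estimate the transition model, then learn the policy inside it.
model = fit_linear_model(samples)
eta = pgpe(model)
print("estimated (w_s, w_a):", model)
print("learned policy-parameter mean:", eta)
```

In this toy setting the best linear policy drives the state to the origin in one step, so the learned mean should settle close to -1. All policy rollouts happen inside the fitted model, which is where the sample savings of a model-based approach would come from; the real environment is queried only for the initial 200 transitions.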

Source
http://dx.doi.org/10.1016/j.neunet.2014.06.006
