Actor-critic methods based on the policy gradient (PG-based AC) have been widely studied for solving learning control problems. To increase the data efficiency of learning prediction in the critic of PG-based AC, recent studies have examined how to use recursive least-squares temporal difference (RLS-TD) algorithms for policy evaluation. In such settings, the RLS-TD critic evaluates an unknown mixed policy generated by a series of different actors, rather than one fixed policy generated by the current actor. As a result, this AC framework with an RLS-TD critic cannot be proved to converge to the optimal fixed point of the learning problem. To address this problem, this paper proposes a new AC framework named critic-iteration PG (CIPG), which learns the state-value function of the current policy in an on-policy way and performs gradient ascent in the direction that improves the discounted total reward. During each iteration, CIPG keeps the policy parameters fixed and evaluates the resulting fixed policy with a regularized RLS-TD critic. Our convergence analysis extends the previous convergence analysis of PG with function approximation to the case of an RLS-TD critic. The simulation results demonstrate that the regularization term in the critic of CIPG is undamped during the learning process, and that CIPG has better learning efficiency and a faster convergence rate than conventional AC learning control methods.
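
To make the critic step concrete, the following is a minimal sketch of recursive least-squares TD(0) evaluation of one fixed policy with linear features. It is not the paper's implementation: the function name `rls_td_evaluate`, the parameter `delta` (used here to initialize the inverse matrix as `I / delta`, which acts as a regularizer), and the two-state toy chain are all assumptions chosen for illustration.

```python
import numpy as np


def rls_td_evaluate(transitions, n_features, gamma=0.9, delta=1e-3):
    """Estimate linear value-function weights for one fixed policy with RLS-TD(0).

    transitions: iterable of (phi, reward, phi_next) tuples collected while
                 following a single, unchanging policy (hypothetical format).
    delta:       assumed regularization constant; P starts as I / delta.
    """
    theta = np.zeros(n_features)       # value-function weights
    P = np.eye(n_features) / delta     # running inverse of (delta*I + sum phi d^T)
    for phi, reward, phi_next in transitions:
        d = phi - gamma * phi_next                 # temporal-difference feature
        k = P @ phi / (1.0 + d @ P @ phi)          # gain vector (Sherman-Morrison)
        theta = theta + k * (reward - d @ theta)   # correct weights by the TD error
        P = P - np.outer(k, d @ P)                 # rank-one update of the inverse
    return theta


# Toy fixed policy: a deterministic two-state cycle with one-hot features.
# Leaving state 0 yields reward 1, leaving state 1 yields reward 0.
features = np.eye(2)
transitions = []
state = 0
for _ in range(200):
    next_state = 1 - state
    reward = 1.0 if state == 0 else 0.0
    transitions.append((features[state], reward, features[next_state]))
    state = next_state

theta = rls_td_evaluate(transitions, n_features=2)
```

Because the policy is held fixed while the transitions are collected, the weights approach the value function of that single policy (here V(0) = 1 / (1 - 0.81) and V(1) = 0.9 V(0)), which is the setting the abstract contrasts with evaluating a mixture of changing actors.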


Source
http://dx.doi.org/10.1109/TNNLS.2018.2808203


