Model-Free λ-Policy Iteration for Discrete-Time Linear Quadratic Regulation

This article presents a model-free λ-policy iteration (λ-PI) for the discrete-time linear quadratic regulation (LQR) problem. To iteratively solve the algebraic Riccati equation that arises in LQR, we define two novel matrix operators, named the weighted Bellman operator and the composite Bellman operator. The λ-PI algorithm is first designed as a recursion with the weighted Bellman operator, and it is then shown to be equivalent to a fixed-point iteration with the composite Bellman operator. The contraction and monotonicity properties of the composite Bellman operator guarantee the convergence of the λ-PI algorithm. In contrast to the PI algorithm, λ-PI does not require an admissible initial policy, and it converges faster than the value iteration (VI) algorithm. A model-free extension of the λ-PI algorithm is developed using the off-policy reinforcement learning technique. It is also shown that the off-policy variants of the λ-PI algorithm are robust against probing noise. Finally, simulation examples validate the efficacy of the λ-PI algorithm.
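The abstract does not spell out the weighted Bellman operator, so the following is only a minimal model-based sketch under a stated assumption: it uses the classical λ-PI weighting, in which the next value estimate is the geometric average (1 - λ) Σ λ^m T_K^(m+1) P of multi-step policy Bellman operators. It is restricted to a scalar system x_{t+1} = a x_t + b u_t with stage cost q x^2 + r u^2 for brevity. The function name `lambda_pi_scalar` and all parameter names are hypothetical and are not taken from the paper.

```python
def lambda_pi_scalar(a, b, q, r, lam, p0=0.0, iters=50):
    """Classical lambda-policy iteration for a scalar discrete-time LQR.

    Assumed weighting (not confirmed by the abstract):
        p_next = (1 - lam) * sum_{m>=0} lam**m * T_k^(m+1)(p),
    where T_k(x) = s + a_k**2 * x is the Bellman operator of the greedy
    gain k, with s = q + r*k**2 and a_k = a - b*k.
    Because T_k is affine, the infinite sum has the closed form
        p_next = (s + (1 - lam) * a_k**2 * p) / (1 - lam * a_k**2),
    valid whenever lam * a_k**2 < 1.
    """
    p = p0
    for _ in range(iters):
        k = a * b * p / (r + b * b * p)   # greedy (policy improvement) gain
        a_k = a - b * k                   # closed-loop dynamics under k
        s = q + r * k * k                 # stage cost under k
        if lam * a_k * a_k >= 1.0:
            # The weighted series diverges; lam = 1 (pure PI) hits this
            # case when the current policy is not stabilizing.
            raise ValueError("lam * a_k**2 >= 1: weighted sum diverges")
        p = (s + (1.0 - lam) * a_k * a_k * p) / (1.0 - lam * a_k * a_k)
    return p


if __name__ == "__main__":
    # Open-loop unstable plant; p0 = 0 gives the non-stabilizing gain k = 0.
    print(lambda_pi_scalar(a=1.2, b=1.0, q=1.0, r=1.0, lam=0.5))
```

In this sketch, λ = 0 reduces the update to value iteration and λ = 1 reduces it to policy evaluation followed by improvement, which fails from the zero gain because that policy is not stabilizing. An intermediate λ still starts from p0 = 0 here because λ·a_k² < 1, which is consistent with the abstract's claim that no admissible initial policy is needed.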

Source: http://dx.doi.org/10.1109/TNNLS.2021.3098985

Publication Analysis

Top Keywords

bellman operator: 20
λ-pi algorithm: 20
composite bellman: 12
discrete-time linear: 8
linear quadratic: 8
quadratic regulation: 8
weighted bellman: 8
λ-pi: 7
algorithm: 7
bellman: 5
