In this article, a model-free Q-learning algorithm is proposed to solve the tracking problem for linear discrete-time systems with completely unknown dynamics. To eliminate tracking errors, a performance index for the Q-learning approach is formulated that transforms the tracking problem into a regulation problem. In contrast to existing adaptive dynamic programming (ADP) methods and Q-learning approaches, the proposed performance index augments the quadratic form of the control input with a product term composed of a gain matrix and the reference trajectory. Moreover, without requiring any prior knowledge of the dynamics of the controlled system or the command generator, the control policy is obtained by an iterative technique that relies only on online measurements of the system state, the control input, and the reference trajectory. In each iteration, the desired control input is updated according to criteria derived from a precondition on the controlled system and the reference trajectory, which guarantees that the resulting control policy eliminates tracking errors in theory. Furthermore, to obtain the optimal control policy from less data, an off-policy scheme is incorporated into the proposed algorithm. Finally, the effectiveness of the proposed algorithm is verified by a numerical simulation.
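
To make the data-driven structure described in the abstract concrete, the following is a minimal Python/NumPy sketch of off-policy Q-learning for linear-quadratic tracking on an augmented state z = [x; r]. It is not the article's algorithm: the plant (A, B), command generator (F), weights (Qc, Rc), discount factor, and the conventional quadratic tracking cost are illustrative assumptions, and the sketch omits the article's modified performance index with the gain-matrix/reference product term. The plant and generator matrices are used only to simulate data; the learner itself sees only measured states, inputs, references, and stage costs.

```python
import numpy as np

# Illustrative plant and command generator (used ONLY to simulate data;
# the learner never reads A, B, or F directly). All matrices are
# hypothetical placeholders, not taken from the article.
A = np.array([[0.95, 0.10], [0.0, 0.80]])
B = np.array([[0.0], [0.1]])
F = np.array([[np.cos(0.1), np.sin(0.1)],
              [-np.sin(0.1), np.cos(0.1)]])   # reference generator r_{k+1} = F r_k
Qc, Rc = np.eye(2), 0.1 * np.eye(1)           # tracking-error and input weights (assumed)
gamma = 0.95                                  # discount factor (assumed)

nx, nu, nr = 2, 1, 2
nz = nx + nr                                  # augmented state z = [x; r]
ns = nz + nu                                  # dimension of s = [z; u]

def step_cost(x, u, r):
    """Conventional quadratic tracking cost (a stand-in for the article's index)."""
    e = x - r
    return float(e @ Qc @ e + u @ Rc @ u)

def quad_features(s):
    """Upper-triangular features of s s^T (off-diagonals doubled), so that
    Q(z, u) = s^T H s = w^T phi(s) with w holding the entries of symmetric H."""
    feats = []
    for i in range(len(s)):
        for j in range(i, len(s)):
            feats.append((1.0 if i == j else 2.0) * s[i] * s[j])
    return np.array(feats)

def weights_to_H(w, n):
    """Rebuild the symmetric Q-function matrix H from the weight vector."""
    H, k = np.zeros((n, n)), 0
    for i in range(n):
        for j in range(i, n):
            H[i, j] = H[j, i] = w[k]
            k += 1
    return H

# Collect one batch of exploratory (off-policy) data with a random behaviour policy.
rng = np.random.default_rng(0)
x, r = np.zeros(nx), np.array([1.0, 0.0])
data = []
for _ in range(400):
    u = rng.normal(scale=2.0, size=nu)
    c = step_cost(x, u, r)
    x_next, r_next = A @ x + B @ u, F @ r
    data.append((np.concatenate([x, r]), u, c, np.concatenate([x_next, r_next])))
    x, r = x_next, r_next

# Off-policy Q-learning policy iteration on the augmented state.
K = np.zeros((nu, nz))                        # initial target policy u = -K z
for it in range(20):
    Phi, y = [], []
    for z, u, c, z_next in data:
        s = np.concatenate([z, u])
        s_next = np.concatenate([z_next, -K @ z_next])   # target policy at next state
        Phi.append(quad_features(s) - gamma * quad_features(s_next))
        y.append(c)
    w, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
    H = weights_to_H(w, ns)
    K_new = np.linalg.solve(H[nz:, nz:], H[nz:, :nz])    # greedy policy u = -H_uu^{-1} H_uz z
    if np.linalg.norm(K_new - K) < 1e-8:
        break
    K = K_new

print("learned feedback/feedforward gain K =\n", K)
```

Each iteration evaluates the current target policy by least squares over the same exploratory batch (the off-policy element), then improves it from the blocks of the learned Q-matrix, mirroring the evaluate-and-improve cycle sketched in the abstract.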

Source: http://dx.doi.org/10.1109/TNNLS.2022.3195357

