IEEE Trans Neural Netw Learn Syst
June 2022
In current matrix factorization recommendation approaches, the item and user latent factor vectors have the same dimension, so the linear dot product is used as the interaction function between the user and the item to predict ratings. However, the relationship between real users and items is not entirely linear, and existing matrix factorization recommendation models also face the challenge of data sparsity.
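The dot-product interaction described above can be sketched in a few lines. This is only an illustration of the baseline the abstract criticizes, not the model the authors propose; the function name `predict_rating` and the latent factor values are hypothetical.

```python
def predict_rating(user_vec, item_vec):
    """Predict a rating as the dot product of two latent factor vectors.

    The linear interaction only makes sense when both vectors have the
    same dimension, which is the restriction the abstract points out.
    """
    if len(user_vec) != len(item_vec):
        raise ValueError("user and item latent vectors must have the same dimension")
    return sum(u * v for u, v in zip(user_vec, item_vec))


# Hypothetical 3-dimensional latent factors (illustrative values only).
user = [1.0, 0.5, -2.0]
item = [2.0, 4.0, 0.5]
rating = predict_rating(user, item)  # 1*2 + 0.5*4 + (-2)*0.5 = 3.0
```

Because the prediction is a plain weighted sum, it cannot represent nonlinear user-item relationships, which is the limitation the abstract raises.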
IEEE Trans Neural Netw Learn Syst
April 2022
In this article, we propose a novel semicentralized deep deterministic policy gradient (SCDDPG) algorithm for cooperative multiagent games. Specifically, we design a two-level actor-critic structure to help the agents interact and cooperate in StarCraft combat. A local actor-critic structure is established for each kind of agent, using the partially observable information received from the environment.
In this paper, we study the constrained optimization problem of a class of uncertain nonlinear interconnected systems. First, we prove that the solution of the constrained optimization problem can be obtained through solving an array of optimal control problems of constrained auxiliary subsystems. Then, under the framework of approximate dynamic programming, we present a simultaneous policy iteration (SPI) algorithm to solve the Hamilton-Jacobi-Bellman equations corresponding to the constrained auxiliary subsystems.
IEEE Trans Cybern
November 2019
An adaptive dynamic programming controller usually needs a long training period because samples are discarded after a single use, which makes data usage relatively inefficient. Prioritized experience replay (ER) promotes important experiences and learns the control process more efficiently. This paper proposes integrating the efficient learning capability of a prioritized ER design into heuristic dynamic programming (HDP).
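A minimal sketch of prioritized experience replay is given here to show how stored samples can be reused rather than discarded. The abstract does not state which priority scheme the paper uses; this sketch assumes the common proportional scheme, in which the priority is the magnitude of a temporal-difference error raised to an exponent. The class name, `alpha`, and `eps` are hypothetical.

```python
import random


class PrioritizedReplayBuffer:
    """Fixed-size buffer that samples transitions in proportion to priority."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha  # assumed exponent: 0 gives uniform sampling
        self.eps = eps      # keeps every stored priority strictly positive
        self.transitions = []
        self.priorities = []

    def add(self, transition, td_error):
        # Drop the oldest sample once the buffer is full.
        if len(self.transitions) >= self.capacity:
            self.transitions.pop(0)
            self.priorities.pop(0)
        self.transitions.append(transition)
        self.priorities.append((abs(td_error) + self.eps) ** self.alpha)

    def sample(self, batch_size, rng=random):
        # Sampling with replacement, weighted by the stored priorities.
        return rng.choices(self.transitions, weights=self.priorities, k=batch_size)


buffer = PrioritizedReplayBuffer(capacity=2)
buffer.add("a", td_error=0.1)
buffer.add("b", td_error=2.0)
buffer.add("c", td_error=0.5)  # evicts "a"
batch = buffer.sample(5, rng=random.Random(0))
```

Transitions with larger errors are drawn more often, so each stored sample can contribute to several updates.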
IEEE Trans Cybern
May 2018
In this paper, we present a new model-free globalized dual heuristic dynamic programming (GDHP) approach for discrete-time nonlinear zero-sum game problems. First, an online learning algorithm based on the GDHP method is proposed to solve the Hamilton-Jacobi-Isaacs equation associated with the optimal regulation control problem. By shifting the definition of the performance index backward by one step, the proposed method relaxes the requirement for system dynamics or an identifier.
IEEE Trans Neural Netw Learn Syst
August 2017
In this paper, an event-triggered near-optimal control structure is developed for nonlinear continuous-time systems with control constraints. Because of the saturating actuators, a nonquadratic cost function is introduced and the Hamilton-Jacobi-Bellman (HJB) equation for constrained nonlinear continuous-time systems is formulated. To solve the HJB equation, an actor-critic framework is presented.
IEEE Trans Cybern
October 2017
A goal representation globalized dual heuristic dynamic programming (Gr-GDHP) method is proposed in this paper. A goal neural network is integrated into the traditional GDHP method, providing an internal reinforcement signal and its derivatives to help the control and learning process. With the proposed architecture, it is shown that the internal reinforcement signal and its derivatives can adjust themselves online over time, rather than being a fixed or predefined function as in the literature.
IEEE Trans Neural Netw Learn Syst
July 2017
This paper presents the design of a novel adaptive event-triggered control method based on the heuristic dynamic programming (HDP) technique for nonlinear discrete-time systems with unknown system dynamics. In the proposed method, the control law is only updated when the event-triggered condition is violated. Compared with the periodic updates in the traditional adaptive dynamic programming (ADP) control, the proposed method can reduce the computation and transmission cost.
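The update rule described above can be sketched as follows. The abstract does not give the triggering condition, so this sketch assumes a simple one: the condition counts as violated when the gap between the current state and the last sampled state exceeds a fixed threshold. The function name, the scalar state, and the threshold value are hypothetical.

```python
def event_triggered_updates(states, threshold):
    """Return the time indices at which the control law is recomputed.

    Assumed triggering condition (not taken from the paper): an update is
    triggered when |x_k - x_sampled| exceeds `threshold`.
    """
    if not states:
        return []
    sampled = states[0]   # the controller is initialized at the first state
    update_steps = [0]
    for k in range(1, len(states)):
        gap = abs(states[k] - sampled)
        if gap > threshold:
            sampled = states[k]      # resample the state
            update_steps.append(k)   # recompute the control law here
        # otherwise the previous control input is held unchanged
    return update_steps


trajectory = [0.0, 0.05, 0.2, 0.25, 0.6]
steps = event_triggered_updates(trajectory, threshold=0.1)
print(steps)  # [0, 2, 4]
```

A periodic scheme would update at all five steps; here only three updates occur, which is the source of the computation and transmission savings.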
IEEE Trans Cybern
March 2017
This paper proposes a novel event-triggered adaptive dynamic programming (ADP) control method for nonlinear continuous-time systems with unknown internal states. Compared with the traditional ADP design with a fixed sample period, the event-triggered method samples the state and updates the controller only when necessary. Therefore, the computation cost and transmission load are reduced.
IEEE Trans Neural Netw Learn Syst
December 2016
Goal representation heuristic dynamic programming (GrHDP) control design has been developed in recent years. The control performance of this design has been demonstrated in several case studies, and it has also been shown to be applicable to industrial-scale complex control problems. In this paper, we develop the theoretical analysis for the GrHDP design under certain conditions.
Model-based dual heuristic dynamic programming (MB-DHP) is a popular approach for approximating optimal solutions in control problems. However, it usually requires offline training of the model network, which results in extra computational cost. In this brief, we propose a model-free DHP (MF-DHP) design based on a finite-difference technique.
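As a generic illustration of a finite-difference technique, the sketch below estimates a derivative from function values alone, without an analytic or learned model of the derivative. The abstract does not say how the paper applies finite differences within DHP, so this is not the authors' scheme; the central-difference formula, the function name, and the step size `h` are assumptions made for illustration.

```python
def central_difference(f, x, h=1e-5):
    """Approximate df/dx at x using the central-difference formula."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def square(x):
    # Illustrative test function whose exact derivative is 2 * x.
    return x * x


estimate = central_difference(square, 3.0)  # close to the exact value 6.0
```

The estimate uses only evaluations of the function, which is the general idea behind replacing a trained model network with derivative estimates taken from observed values.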
IEEE Trans Neural Netw Learn Syst
December 2014
In this paper, we develop and analyze an optimal control method for a class of discrete-time nonlinear Markov jump systems (MJSs) with unknown system dynamics. Specifically, an identifier is established for the unknown systems to approximate system states, and an optimal control approach for nonlinear MJSs is developed to solve the Hamilton-Jacobi-Bellman equation based on the adaptive dynamic programming technique. We also develop detailed stability analysis of the control approach, including the convergence of the performance index function for nonlinear MJSs and the existence of the corresponding admissible control.