IEEE Trans Neural Netw Learn Syst
April 2024
In this article, we investigate the Nash-seeking problem for a set of agents playing an infinite network aggregative Markov game. In particular, we focus on a noncooperative framework in which each agent selfishly aims to maximize its long-term average reward without explicit information about the model of the environment dynamics or its own reward function. The main contribution of this article is a continuous multiagent reinforcement learning (MARL) algorithm for the Nash-seeking problem in infinite dynamic games, with a convergence guarantee.