This paper presents a theoretical framework for probably approximately correct (PAC) multi-agent reinforcement learning (MARL) algorithms for Markov games. Using the idea of delayed Q-learning, the paper extends the well-known Nash Q-learning algorithm to build a new PAC MARL algorithm for general-sum Markov games. In addition to guiding the design of a provably PAC MARL algorithm, the framework enables checking whether an arbitrary MARL algorithm is PAC. Comparative numerical results demonstrate the algorithm's performance and robustness.
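To make the "delayed" update idea concrete, the following is a minimal single-agent sketch and not the paper's multi-agent algorithm. It shows only the core rule of delayed Q-learning: targets for a state-action pair are accumulated over `m` samples, and the optimistic estimate is lowered only when the averaged target falls well below it. The class name `DelayedQ`, its parameters (`m`, `eps1`, `r_max`), and the omission of the LEARN flags and timestamps used in the full algorithm are all assumptions made for illustration. A Nash Q-learning variant would presumably replace the `max` in `value` with the equilibrium value of a stage game, which is not shown here.

```python
from collections import defaultdict


class DelayedQ:
    """Simplified single-agent delayed Q-learning update (illustrative only)."""

    def __init__(self, n_actions, gamma=0.9, m=5, eps1=0.1, r_max=1.0):
        self.n_actions = n_actions
        self.gamma = gamma
        self.m = m          # samples collected before an update is attempted
        self.eps1 = eps1    # update threshold and optimism bonus
        q_init = r_max / (1.0 - gamma)          # optimistic initial value
        self.q = defaultdict(lambda: q_init)    # Q(s, a)
        self.acc = defaultdict(float)           # accumulated targets
        self.count = defaultdict(int)           # samples since last attempt

    def value(self, state):
        # Greedy value of a state under the current estimates.
        return max(self.q[(state, a)] for a in range(self.n_actions))

    def observe(self, state, action, reward, next_state):
        """Record one transition; return True if Q(state, action) changed."""
        key = (state, action)
        self.acc[key] += reward + self.gamma * self.value(next_state)
        self.count[key] += 1
        if self.count[key] < self.m:
            return False    # the update is delayed until m samples arrive
        target = self.acc[key] / self.m
        updated = False
        # Lower the estimate only if the averaged target is clearly below it.
        if self.q[key] - target >= 2 * self.eps1:
            self.q[key] = target + self.eps1
            updated = True
        self.acc[key] = 0.0
        self.count[key] = 0
        return updated


agent = DelayedQ(n_actions=2, gamma=0.5, m=2, eps1=0.1, r_max=1.0)
first = agent.observe("s", 0, 0.0, "t")    # first sample: no update yet
second = agent.observe("s", 0, 0.0, "t")   # second sample: update attempted
```

Because estimates start optimistic and can only decrease in batched steps, the number of successful updates is bounded, which is the property PAC-style analyses of this update rule rely on.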
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10617487 | PMC |
http://dx.doi.org/10.1109/tac.2022.3219340 | DOI Listing |
Theory Biosci
January 2025
Faculty of Science and Engineering, Department of Biosciences, Swansea University, Singleton Park, Swansea, SA2 8PP, UK.
Although traditional evolutionary game theory is a powerful tool for modeling ecological interactions, it still leaves considerable room for improvement in the context of population dynamics. One of the current challenges is to devise a cohesive theoretical framework for ecological games with density-dependent (or concentration-dependent) evolution, especially one defined by individual-level events. In this work, I use the notation of reaction networks as a foundation to propose a framework and show that classic two-strategy games are a particular case of the theory.
PLoS One
September 2024
Business School, Shaoxing University, Shaoxing, Zhejiang, China.
This paper systematically analyzes the spatiotemporal evolution trends and macroeconomic driving factors of farmland transfer at the provincial level in China since 2005, aiming to offer a new perspective for understanding the dynamic mechanisms of China's farmland transfer. Through the integrated use of kernel density estimation, the Markov model, and panel quantile regression methods, this study finds the following: (1) Farmland transfer rates across Chinese provinces show an overall upward trend, but regional differences exhibit a "U-shaped" evolution characterized by initially narrowing and then widening; (2) although provinces have relatively stable farmland transfer levels, there is potential for dynamic transitions; (3) factors such as per capita arable land, farmers' disposable income, the social security level, the urban‒rural income gap, the urbanization rate, government intervention, and the marketization level significantly promote farmland transfer, while inclusive finance inhibits transfer, and agricultural mechanization level and population aging have heterogeneous impacts. Therefore, to achieve convergence of low farmland transfer regions to medium levels while promoting medium-level regions to higher levels, it is recommended that the government increase support for agricultural mechanization, increase farmers' income and social security levels, and optimize marketization processes and government intervention strategies.
This article investigates the problem of robust decentralized load frequency control (LFC) in multiarea interconnected power systems in the presence of external disturbances and stochastic abrupt variations, such as component failures and different load demands. To capture the different operating conditions of the load in the multiarea power systems, a Markov superposition technique is employed to model the system's component matrices. To find the Nash equilibrium solution of the zero-sum differential game problem, an improved online adaptive dynamic programming (ADP) algorithm based on the experience replay technique (ERT) is developed, which addresses the nonlinear coupling difficulties encountered in solving the game algebraic Riccati equations (Game AREs).
Chaos
June 2024
School of Mathematical Sciences, University of Electronic Science and Technology of China, Chengdu 611731, China.
Spatial evolutionary games provide a valuable framework for elucidating the emergence and maintenance of cooperative behaviors. However, most previous studies assume that individuals are profiteers and neglect to consider the effects of memory. To bridge this gap, in this paper, we propose a memory-based spatial evolutionary game with dynamic interaction between learners and profiteers.