Neural Q-learning for discrete-time nonlinear zero-sum games with adjustable convergence rate.

Neural Netw

Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China; Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing 100124, China; Beijing Laboratory of Smart Environmental Protection, Beijing University of Technology, Beijing 100124, China. Electronic address:

Published: July 2024

In this paper, an adjustable Q-learning scheme is developed to solve the discrete-time nonlinear zero-sum game problem and to accelerate the convergence of the iterative Q-function sequence. First, the monotonicity and convergence of the iterative Q-function sequence are analyzed under some conditions. Moreover, by employing neural networks, the model-free tracking control problem for zero-sum games can be addressed. Second, two practical algorithms are designed that guarantee convergence with accelerated learning. In the first algorithm, an adjustable acceleration phase is added to the Q-learning iteration process and can be adaptively terminated with a convergence guarantee. In the second algorithm, a novel acceleration function is developed that adjusts the relaxation factor to ensure convergence. Finally, a simulation example with a practical physical background demonstrates the performance of the developed algorithm implemented with neural networks.
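The role of a relaxation factor in the Q-function iteration can be illustrated with a minimal tabular sketch. This is not the paper's neural-network algorithm. It assumes a small finite game with a hypothetical cost table `cost` and transition table `nxt`, a discount factor `gamma` (added here only to make convergence easy to guarantee), and a fixed relaxation factor `omega` rather than the adaptive schemes the abstract describes. The controller `u` minimizes and the disturbance `w` maximizes.

```python
import numpy as np

def relaxed_minimax_q_iteration(cost, nxt, gamma=0.5, omega=1.0,
                                tol=1e-10, max_iter=10_000):
    """Relaxed Q-iteration for a finite zero-sum game (illustrative only).

    cost[x, u, w]: stage cost for state x, control u, disturbance w.
    nxt[x, u, w]:  index of the successor state.
    omega:         fixed relaxation factor; omega = 1 is plain Q-iteration.
    """
    Q = np.zeros_like(cost, dtype=float)
    for i in range(1, max_iter + 1):
        # Game value per state: max over disturbance w, then min over control u.
        V = Q.max(axis=2).min(axis=1)
        # Minimax Bellman target evaluated at the successor states.
        target = cost + gamma * V[nxt]
        # Relaxed update: move from Q toward the target by a factor omega.
        Q_new = Q + omega * (target - Q)
        if np.max(np.abs(Q_new - Q)) < tol:
            return Q_new, i
        Q = Q_new
    return Q, max_iter

# Hypothetical random game with 4 states, 3 controls, 2 disturbances.
rng = np.random.default_rng(0)
n_x, n_u, n_w = 4, 3, 2
cost = rng.uniform(0.0, 1.0, size=(n_x, n_u, n_w))
nxt = rng.integers(0, n_x, size=(n_x, n_u, n_w))

Q_plain, it_plain = relaxed_minimax_q_iteration(cost, nxt, omega=1.0)
Q_relax, it_relax = relaxed_minimax_q_iteration(cost, nxt, omega=1.2)
print(it_plain, it_relax, np.max(np.abs(Q_plain - Q_relax)))
```

In this discounted setting the relaxed update remains a sup-norm contraction for any `omega` below `2 / (1 + gamma)`, so both runs reach the same fixed point. Whether a given `omega` speeds up or slows down convergence depends on the problem, which is why a scheme that adjusts the factor during iteration, as the abstract describes, is of interest.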

Source
http://dx.doi.org/10.1016/j.neunet.2024.106274

Publication Analysis

Top Keywords

discrete-time nonlinear: 8
nonlinear zero-sum: 8
zero-sum games: 8
convergence rate: 8
iterative q-function: 8
q-function sequence: 8
neural networks: 8
convergence: 6
neural q-learning: 4
q-learning discrete-time: 4

Similar Publications

Frequency-domain-based nonlinear normalized iterative learning control for three-dimensional ball screw drive systems.

ISA Trans

December 2024

College of Information Science and Engineering, Huaqiao University, Xiamen, 361002, China. Electronic address:

Iterative learning control (ILC) is a well-established method for achieving precise tracking in repetitive tasks. However, most ILC algorithms rely on a nominal plant model, making them susceptible to model mismatches. This paper introduces a novel normalization concept, developed from a frequency-domain perspective using a data-driven approach, thus eliminating the need for system model information.

Adaptive discrete-time neural prescribed performance control: A safe control approach.

Neural Netw

December 2024

Air and Missile Defense College, Air Force Engineering University, Xi'an, 710051, Shaanxi, China. Electronic address:

Most existing results on prescribed performance control (PPC) subject to input saturation and initial condition limitations focus on continuous-time nonlinear systems. For discrete-time nonlinear systems, this article constructs a novel adaptive switching control strategy that circumvents the singularity problem arising when PPC undergoes input saturation, while relaxing the initial-condition requirements of the PPC framework. The main design steps and characteristics include: (1) by devising a new discrete-time global finite-time performance function (DTGFTPF), the constructed performance boundary is shown to be insensitive to arbitrary initial values of the system; (2) a discrete-time adaptive finite-time prescribed performance controller (DTAFPPC) and a discrete-time adaptive backstepping controller (DTABC) are constructed simultaneously.

Dispersal induced catastrophic bifurcations, Arnold tongues, shrimp structures, and stock patterns in an ecological system.

Chaos

December 2024

Differential Equations, Modeling and Simulation Group, Department of Mathematics, Indian Institute of Technology Indore, Khandwa Road, Simrol, Indore 453552, Madhya Pradesh, India.

Article Synopsis
  • The paper analyzes a discrete-time predator-prey model in a two-patch environment, focusing on prey and predator dispersal using logistic growth and Holling type-II responses.
  • The study identifies multiple equilibria and their stability, revealing phenomena like bifurcations, chaos, and a unique 10-period orbit with closed invariant curves.
  • It challenges common assumptions by demonstrating how dispersal can sometimes stabilize coexisting populations and explores complex dynamics in a two-parameter space, contributing to the understanding of predator-prey interactions.

This paper explores a discrete-time system derived from the well-known continuous-time Rosenzweig-MacArthur model using the piecewise constant argument. Examining the impact of increasing carrying capacity and harvesting efforts, we uncover intricate phenomena, such as periodicity, quasiperiodicity, period-doubling, period-bubbling, and chaos. Our analysis reveals that increasing the carrying capacity of prey species can lead to both system stabilization and destabilization.

Optimizing EPR pulses for broadband excitation and refocusing.

J Magn Reson

December 2024

Department of Physics, University of Maryland Baltimore County, Baltimore, MD 21250, USA. Electronic address:

In this paper, we numerically optimize broadband pulse shapes that maximize Hahn echo amplitudes. Pulses are parameterized as neural networks (NN), nonlinear amplitude limited Fourier series (FS), and discrete time series (DT). These are compared to an optimized choice of the conventional hyperbolic secant (HS) pulse shape.
