This article investigates a data-efficient methodology for enhancing the adaptability of reinforcement learning (RL) techniques to environmental changes, by which a joint control protocol for multiagent systems (MASs) is learned using only data. Thus, all followers are able to synchronize with the leader and minimize their individual performance indices. To this end, an optimal synchronization problem of heterogeneous MASs is first formulated, and then an arbitration RL mechanism is developed to address key challenges faced by current RL techniques, namely, insufficient data and environmental changes.
IEEE Trans Cybern
November 2024
This article is concerned with the problem of reference tracking for lower-triangular nonlinear systems with a chain of odd powers. Contrary to most of the related studies, this work focuses on the case where neither the odd powers nor their bounds are known. This renders the majority of the existing methods for stability analysis and control design for odd-power systems infeasible.
This article studies the containment control problem of nonlinear multiagent systems (MASs) subject to communication link faults and dead-zone inputs. When an unknown fault occurs in the communication link, the Laplacian matrix is no longer constant and each follower agent cannot be informed of the global information simultaneously. To deal with this problem, an adaptive compensating estimator is constructed to estimate the signal spanned by the leaders.
IEEE Trans Neural Netw Learn Syst
July 2024
In mineral processing, the dynamic nature of industrial data poses challenges for decision-makers in accurately assessing current production statuses. To enhance the decision-making process, it is crucial to predict comprehensive production indices (CPIs), which are influenced by both human operators and industrial processes, and demonstrate a strong dual-scale property. To improve the accuracy of CPIs' prediction, we introduce the high-frequency (HF) unit and low-frequency (LF) unit within our proposed dual-scale deep learning (DL) network.
In this article, a method for dynamic performance monitoring and adaptive self-tuning of parameters for actual PID control systems of industrial processes in virtual reality scenes is proposed. This method combines a digital twin model of the PID control process, based on system identification and adaptive deep learning, and an intelligent PID tuning algorithm, based on reinforcement learning, with the virtual reality and immersive interaction of the industrial metaverse. An industrial metaverse-based intelligent PID tuning system is proposed by combining the above method with the end-edge-cloud collaboration technology of the Industrial Internet.
This article concerns nonlinear model predictive control (MPC) with guaranteed feasibility of inequality path constraints (PCs). For MPC with PCs, the existing methods, such as direct multiple shooting, cannot guarantee feasibility of PCs because the PCs are enforced at finitely many time points only. Therefore, this article presents a novel MPC framework that is capable of not only achieving stability control but also guaranteeing feasibility of PCs during the rolling optimization stages of MPC.
Efficient monitoring of production performance is crucial for ensuring safe operations and enhancing the economic benefits of the Iron and Steel Corporation. Although basic modeling algorithms and visualization diagrams are available in many scientific platforms and industrial applications, there is still a lack of customized research in production performance monitoring. Therefore, this article proposes an interactive visual analytics approach for monitoring the heavy-plate production process (iHPPPVis).
This article focuses on distributed nonconvex optimization by exchanging information between agents to minimize the average of local nonconvex cost functions. The communication channel between agents is normally constrained by limited bandwidth, and the gradient information is typically unavailable. To overcome these limitations, we propose a quantized distributed zeroth-order algorithm, which integrates the deterministic gradient estimator, the standard uniform quantizer, and the distributed gradient tracking algorithm.
Bilevel optimization is a special type of optimization in which one problem is embedded within another. A bilevel optimization problem (BLOP) in which both levels are multiobjective is usually called a multiobjective BLOP (MBLOP). Its expensive computation and nested structure make it challenging to solve.
To address the operation optimization of the wastewater treatment process (WWTP) with nonstationary time-varying dynamics and complex multiple constraints, this article proposes a novel adaptive constraint penalty decomposed multiobjective evolutionary algorithm with synthetical distance (SD)-based cross-generation crossover. First, the concept of spatial SD is presented to comprehensively evaluate the similarity of individual solutions in terms of both distance and angle, and the individual information between two adjacent generations is used to enhance the diversity of individuals and accelerate the convergence of the algorithm. Second, to handle the complex multiple constraints arising in the operation optimization of WWTP, an adaptive penalty algorithm is adopted to penalize the individual solutions that violate the constraints, so as to improve the efficiency and success rate of constraint handling.
IEEE Trans Cybern
August 2024
Electric-powered wheelchairs play a vital role in ensuring accessibility for individuals with mobility impairments. The design of controllers for tracking tasks must prioritize the safety of wheelchair operation across various scenarios and for a diverse range of users. In this study, we propose a safety-oriented speed tracking control algorithm for wheelchair systems that accounts for external disturbances and uncertain parameters at the dynamic level.
IEEE Trans Neural Netw Learn Syst
November 2024
During the fused magnesia production process (FMPP), there is a demand peak phenomenon in which the demand first rises and then falls. Once the demand exceeds its limit value, the power will be cut off. To avoid mistaken power cutoffs caused by demand peaks, the demand peak needs to be forecast, so multistep demand forecasting is required.
Existing work on offline data-driven optimization mainly focuses on problems in static environments, and little attention has been paid to problems in dynamic environments. Offline data-driven optimization in dynamic environments is a challenging problem because the distribution of collected data varies over time, requiring the surrogate models and optimal solutions to be tracked over time. This paper proposes a knowledge-transfer-based data-driven optimization algorithm to address these issues.
This article studies the trajectory imitation control problem of linear systems suffering external disturbances and develops a data-driven static output feedback (OPFB) control-based inverse reinforcement learning (RL) approach. An Expert-Learner structure is considered where the learner aims to imitate the expert's trajectory. Using only the measured input and output data of the expert and of the learner itself, the learner computes the policy of the expert by reconstructing its unknown value function weights and thus imitates its optimally operating trajectory.
IEEE Trans Neural Netw Learn Syst
June 2024
This article is concerned with the fast and accurate trajectory tracking control problem for a class of underactuated surface vehicles under model uncertainties and environmental disturbances. A novel neural network (NN)-based prescribed performance control strategy is proposed to solve the problem. In the control design, a new type of performance function is constructed, which provides a straightforward way to predefine the settling time and accuracy.
In this article, we develop two invariance principles for nonlinear discrete-time switched systems based on multiple Lyapunov functions and multiple weak Lyapunov functions, respectively, which allow the first differences of multiple weak Lyapunov functions to be positive on certain sets. It is shown that the solution of the system is attracted to the largest weakly invariant set in a certain specific region. Then, based on the invariance principle developed and geometrical dissipativity, we obtain the generalized output synchronization for discrete-time dynamical networks with nonidentical nodes by an appropriate switching among several communication topologies.
This article investigates the fault-tolerant coordinated tracking control problem for networked fixed-wing unmanned aerial vehicles (UAVs) against faults and communication delays. By supplementing the commonly used Gaussian functions in the fuzzy neural networks (FNNs) with sine-cosine functions and constructing two kinds of recurrent loops within the FNN architecture, double recurrent perturbation FNNs are cleverly designed to learn the unknown terms containing faults and uncertainties. Then, adaptive laws are designed for double recurrent perturbation FNNs.
IEEE Trans Neural Netw Learn Syst
March 2024
In this article, a model-free Q-learning algorithm is proposed to solve the tracking problem of linear discrete-time systems with completely unknown system dynamics. To eliminate tracking errors, a performance index of the Q-learning approach is formulated, which can transform the tracking problem into a regulation one. Compared with the existing adaptive dynamic programming (ADP) methods and Q-learning approaches, the proposed performance index adds a product term composed of a gain matrix and the reference tracking trajectory to the control input quadratic form.
In this article, the event-triggered output regulation problem (EORP) under denial-of-service (DoS) attacks is considered for networked switched systems (NSSs) with unstable switching dynamics (USDs). The USDs here refer to the unsolvable output regulation of each subsystem and the destabilization at partial switching instants, which indicates that the Lyapunov function does not decrease monotonically in activation intervals of each subsystem and increases at partial switching instants. First, long-duration DoS attacks (LDDAs) are considered, where LDDAs imply that their duration may be longer than the total dwell time (DT) of several adjacent activated subsystems.
IEEE Trans Neural Netw Learn Syst
February 2024
This article proposes a data-driven inverse reinforcement learning (RL) control algorithm for nonzero-sum multiplayer games in linear continuous-time differential dynamical systems. The inverse RL problem in the games is solved by a learner reconstructing the unknown cost functions of the expert players from the expert's demonstrated optimal state and control input trajectories. The learner thus obtains the same control feedback gains and trajectories as the expert, using only data along system trajectories without knowing the system dynamics.
IEEE Trans Cybern
September 2023
This article investigates the sensitivity analysis (SA) of high-dimensional data to identify the effects of process variables on the output quantity of interest (QoI) in industrial soft sensor modeling. The computational cost of SA for high-dimensional data is high, and the models available for SA techniques usually have limited generalization capacity. Therefore, we propose a novel high-dimensional data global SA (GSA) approach based on a deep soft sensor model to address these issues.
This article proposes a multiobjective operation optimization method based on reinforcement self-learning and knowledge guidance for quality assurance and consumption reduction of the wastewater treatment process (WWTP) with nonstationary time-varying dynamics. First, operation optimization models are developed using an online sequential random vector functional-link (OS-RVFL) neural network, which can realize online sequential learning of model parameters. Then, a knowledge base is established to store typical optimization cases that guide subsequent optimizations.
IEEE Trans Image Process
November 2021
Typical learning-based light field reconstruction methods require constructing a large receptive field by deepening their networks to capture correspondences between input views. In this paper, we propose a spatial-angular attention network to perceive non-local correspondences in the light field and reconstruct a high angular resolution light field in an end-to-end manner. Motivated by the non-local attention mechanism (Wang et al.
IEEE Trans Neural Netw Learn Syst
July 2023
This article develops two novel output feedback (OPFB) Q-learning algorithms, on-policy Q-learning and off-policy Q-learning, to solve the H∞ static OPFB control problem of linear discrete-time (DT) systems. The primary contribution of the proposed algorithms lies in a newly developed OPFB control algorithm form for completely unknown systems. Under the premise of satisfying disturbance attenuation conditions, the conditions for the existence of the optimal OPFB solution are given.
IEEE Trans Neural Netw Learn Syst
August 2023
This article proposes new inverse reinforcement learning (RL) algorithms to solve our defined Adversarial Apprentice Games for nonlinear learner and expert systems. The games are solved by a learner extracting the unknown cost function of an expert from the expert's demonstrated behaviors. We first develop a model-based inverse RL algorithm that consists of two learning stages: an optimal control learning and a second learning based on inverse optimal control.