Learning distributed cooperative policies for large-scale multirobot systems remains a challenging task in multiagent reinforcement learning (MARL). In this work, we model the interactions among the robots as a graph and propose a novel off-policy actor-critic MARL algorithm that trains distributed coordination policies on the graph by leveraging the information-extraction ability of graph neural networks (GNNs). First, a new type of Gaussian policy parameterized by GNNs is designed for distributed decision-making in continuous action spaces. Second, a scalable centralized value function network is designed based on a novel GNN-based value function decomposition technique. Then, based on the designed actor and critic networks, a GNN-based MARL algorithm named graph soft actor-critic (G-SAC) is proposed and used to train the distributed policies in an effective and centralized fashion. Finally, two custom multirobot coordination environments are built, in which simulations empirically demonstrate the sample efficiency and scalability of G-SAC as well as the strong zero-shot generalization ability of the trained policy in large-scale multirobot coordination problems.
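As a rough illustration of what a GNN-parameterized Gaussian policy for distributed decision-making could look like, the sketch below runs one round of mean aggregation over each robot's graph neighborhood and then samples a tanh-squashed Gaussian action. This is not the G-SAC architecture from the paper: the function names, the single aggregation step, the linear heads `W_mu` and `W_log_std`, and the log-std clipping range are all assumptions made for illustration.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def neighbor_mean(obs, neighbors, i):
    """Average the observations of robot i and its graph neighbors."""
    group = [i] + list(neighbors[i])
    dim = len(obs[i])
    return [sum(obs[j][k] for j in group) / len(group) for k in range(dim)]


def gnn_gaussian_policy(obs, neighbors, W_mu, W_log_std, rng):
    """Sample one tanh-squashed Gaussian action per robot.

    The same weights are shared by every robot, and each robot only
    reads its own and its neighbors' observations (hypothetical design).
    """
    actions = []
    for i in range(len(obs)):
        h = neighbor_mean(obs, neighbors, i)  # one round of message passing
        mu = matvec(W_mu, h)
        # Clip log-std to an assumed range to keep the variance bounded.
        log_std = [max(-5.0, min(2.0, s)) for s in matvec(W_log_std, h)]
        action = [
            math.tanh(m + math.exp(s) * rng.gauss(0.0, 1.0))
            for m, s in zip(mu, log_std)
        ]
        actions.append(action)
    return actions


# Three robots on a line graph 0 - 1 - 2, with 2-D observations and actions.
rng = random.Random(0)
W_mu = [[0.5, -0.2], [0.1, 0.3]]
W_log_std = [[-0.1, 0.0], [0.0, -0.1]]
obs = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
actions = gnn_gaussian_policy(obs, neighbors, W_mu, W_log_std, rng)
```

Because the weights are shared and each robot uses only local graph information, the same `W_mu` and `W_log_std` can be applied unchanged to a graph with more robots, which is the structural property that makes zero-shot transfer to larger teams possible in principle.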
DOI: http://dx.doi.org/10.1109/TNNLS.2023.3329530
Sensors (Basel)
October 2024
Department of Computer Engineering, College of Engineering, Al Yamamah University, Riyadh 11512, Saudi Arabia.
This paper presents a comprehensive framework for mission planning and execution with a heterogeneous multi-robot system, specifically designed to coordinate unmanned ground vehicles (UGVs) and unmanned aerial vehicles (UAVs) in dynamic and unstructured environments. The proposed architecture evaluates the mission requirements, allocates tasks, and optimizes resource usage based on the capabilities of the available robots. It then executes the mission utilizing a decentralized control strategy that enables the robots to adapt to environmental changes and maintain formation stability in both 2D and 3D spaces.
iScience
May 2024
School of Aerospace Science and Technology, Xidian University, Xi'an 710126, China.
This article designs and implements a fast and high-precision multi-robot environment modeling method based on bidirectional filtering and scene identification. To solve the problem of feature tracking failure caused by large-angle rotation, a bidirectional filtering mechanism is introduced to improve the error-matching elimination algorithm. A global key frame database for multiple robots is proposed based on a pretraining dictionary to convert images into bag-of-words vectors.
ISA Trans
June 2024
Northwestern Polytechnical University, 127 Youyi Road, Xi'an, 710072, Shaanxi, China.
This paper investigates the approximate optimal coordination for nonlinear uncertain second-order multi-robot systems with guaranteed safety (collision avoidance). By constructing novel local error signals, the collision-free control objective is formulated as a coordination optimization problem for nominal multi-robot systems. Based on the approximate dynamic programming technique, the optimal value functions and control policies are learned by simplified critic-only neural networks (NNs). Then, the approximated optimal controllers are redesigned using an adaptive law to handle the effects of the robots' uncertain dynamics.
Biomimetics (Basel)
February 2024
Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China.
In recent years, an increasing number of studies have focused on exploring the principles and mechanisms underlying the emergence of collective intelligence in biological populations, aiming to provide insights for human society and the engineering field. Pigeon flock behavior garners significant attention as a subject of study. Collective homing flight is a commonly observed behavioral pattern in pigeon flocks.
Front Neurorobot
January 2024
School of Business, Wuchang University of Technology, Wuhan, Hubei, China.
Introduction: In the field of logistics warehousing robots, collaborative operation and coordinated control have always been challenging issues. Although deep learning and reinforcement learning methods have made some progress in solving these problems, current research still has shortcomings. In particular, research on adaptive sensing and real-time decision-making of multi-robot swarms has not yet received sufficient attention.