Communication-based multiagent reinforcement learning (MARL) has shown promising results in promoting cooperation by enabling agents to exchange information. However, existing methods have limitations in large-scale multiagent systems due to high information redundancy, and they tend to overlook the unstable training process caused by an online-trained communication protocol. In this work, we propose a novel method called neighboring variational information flow (NVIF), which enhances communication among neighboring agents by providing them with the maximum information set (MIS), which contains more information than existing methods provide. NVIF compresses the MIS into a compact latent state while adopting neighboring communication. To stabilize the overall training process, we introduce a two-stage training mechanism. We first pretrain the NVIF module using a randomly sampled offline dataset to create a task-agnostic and stable communication protocol, and then use the pretrained protocol to perform online policy training with RL algorithms. Our theoretical analysis indicates that NVIF-proximal policy optimization (PPO), which combines NVIF with PPO, has the potential to promote cooperation with agent-specific rewards. Experimental results demonstrate the superiority of our method in both heterogeneous and homogeneous settings. Additional experimental results also demonstrate the potential of our method for multitask learning.
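As a rough illustration of why neighbor-only communication can still spread information widely, the sketch below tracks which agents' observations each agent could hold after a number of synchronous exchange rounds. It is not the NVIF module: the function name `neighbor_rounds` and the four-agent line topology are hypothetical, and the learned variational compression described in the abstract is left out entirely.

```python
def neighbor_rounds(adjacency, rounds):
    """Return, per agent, the set of agents whose information it holds
    after `rounds` rounds of neighbor-only message exchange."""
    n = len(adjacency)
    # Each agent starts out knowing only its own observation.
    known = [{i} for i in range(n)]
    for _ in range(rounds):
        # Messages are built from the previous round's state (synchronous update).
        previous = [set(s) for s in known]
        for i in range(n):
            for j in range(n):
                if adjacency[i][j]:
                    known[i] |= previous[j]
    return known


# Hypothetical topology (an assumption for illustration): four agents on a line, 0 - 1 - 2 - 3.
line = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

after_one = neighbor_rounds(line, 1)
after_three = neighbor_rounds(line, 3)
print(sorted(after_one[0]))    # agents whose information agent 0 holds after 1 round
print(sorted(after_three[0]))  # agents whose information agent 0 holds after 3 rounds
```

In this toy setting an agent's information set grows by one hop per round, which is the intuition behind relaying a compact latent state between neighbors rather than broadcasting raw observations to every agent.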
DOI: http://dx.doi.org/10.1109/TNNLS.2023.3309608
J Phys Chem B
December 2024
Department of Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States.
Machine learning methods have been important in the study of phase transitions. Unsupervised methods are particularly attractive because they do not require prior knowledge of the existence of a phase transition. In this work we focus on the constant magnetization Ising model in two (2D) and three (3D) dimensions.
Comput Biol Med
January 2025
Institute for Digital Medicine, University Hospital Bonn, Venusberg-Campus 1, Bonn, 53127, North Rhine-Westphalia, Germany.
Wearable technology enables the unsupervised recording of electrocardiogram (ECG) signals. Analyzing these high-dimensional ECG data poses challenges regarding statistical approaches and explainability. This work investigates the feasibility of medically explainable anomaly detection through disentangled representational learning of ECGs and personalization to mitigate inter-subject variations.
Chaos
November 2024
College of Automation and Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing 210023, China.
In this paper, the state estimation problem of physical plants with unknown system dynamics is revisited from the perspective of limited output measurement, which applies to plants that are high-dimensional, cover a wide area, and are scattered. Given this, a network of sensors is used to carry out the measurement, with each sensor accessing only partial outputs of the target system, and a novel model-free state estimation approach, named distributed stochastic variational inference state estimation, is proposed. The key idea of this method is to compensate for the impact of local output measurements by adding nearest-neighbor rule-based information interaction among estimators to complete the state estimation.
Int Dent J
October 2024
Carlos-M. Ardila. DDS. Periodontist. Ph.D in Epidemiology. Postdoc in Bioethics. Titular Professor. Universidad de Antioquia U de A, Medellín, Colombia. Biomedical Stomatology Research Group, Universidad de Antioquia U de A, Medellín, Colombia.
Biology (Basel)
September 2024
School of Science, Dalian Minzu University, Dalian 116600, China.
Single-cell RNA sequencing (scRNA-seq) is now a successful technology for identifying cell heterogeneity, revealing new cell subpopulations, and predicting developmental trajectories. A crucial component in scRNA-seq is the precise identification of cell subsets. Although many unsupervised clustering methods have been developed for clustering cell subpopulations, the performance of these methods is prone to be affected by dropout, high dimensionality, and technical noise.