Strangeness-driven exploration in multi-agent reinforcement learning.

Neural Netw

Future Convergence Engineering, Department of Computer Science and Engineering, Korea University of Technology and Education, Cheonan, 31253, Republic of Korea.

Published: April 2024

This study introduces a novel exploration method for multi-agent reinforcement learning (MARL) based on centralized training and decentralized execution (CTDE). The method uses the concept of strangeness, which is determined by evaluating (1) the unfamiliarity of the observations an agent encounters and (2) the unfamiliarity of the entire state the agents visit. An exploration bonus derived from strangeness is combined with the extrinsic reward obtained from the environment to form a mixed reward, which is then used to train CTDE-based MARL algorithms. A separate action-value function is also proposed to prevent a high exploration bonus from overwhelming the sensitivity to extrinsic rewards during MARL training; this separate function is used to design the behavioral policy that generates transitions. The proposed method is largely unaffected by the stochastic transitions commonly observed in MARL tasks and improves the stability of CTDE-based MARL algorithms when they are used with an exploration method. Didactic examples and experiments illustrate the advantages of the approach: the method substantially improves the performance of CTDE-based MARL algorithms and outperforms state-of-the-art MARL baselines on challenging tasks in the StarCraft II micromanagement benchmark.
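
The mixed reward described in the abstract can be illustrated with a small sketch. The abstract does not say how strangeness is computed or how the bonus is weighted, so everything specific here is an assumption for illustration only: a count-based novelty score (1/sqrt(visit count)) stands in for the paper's strangeness measure, and the names `CountNovelty`, `StrangenessBonus`, and `beta` are hypothetical. The separate action-value function for the behavioral policy is not covered.

```python
import math
from collections import Counter


class CountNovelty:
    """Stand-in unfamiliarity score: 1/sqrt(visit count).

    ASSUMPTION: this is a simple count-based proxy, not the
    strangeness measure used in the paper.
    """

    def __init__(self):
        self.counts = Counter()

    def score(self, key):
        # Record the visit, then return a score that decays with familiarity.
        self.counts[key] += 1
        return 1.0 / math.sqrt(self.counts[key])


class StrangenessBonus:
    """Combines per-agent observation novelty with global-state novelty."""

    def __init__(self, n_agents, beta=0.1):
        self.obs_novelty = [CountNovelty() for _ in range(n_agents)]
        self.state_novelty = CountNovelty()
        self.beta = beta  # hypothetical weight on the exploration bonus

    def mixed_reward(self, extrinsic, observations, state):
        # (1) unfamiliarity of each agent's own observation, averaged
        obs_scores = [n.score(o) for n, o in zip(self.obs_novelty, observations)]
        obs_term = sum(obs_scores) / len(obs_scores)
        # (2) unfamiliarity of the entire state the agents visit
        state_term = self.state_novelty.score(state)
        # Mixed reward = extrinsic reward + weighted exploration bonus
        return extrinsic + self.beta * (obs_term + state_term)
```

Under these assumptions, revisiting the same observations and state yields a smaller bonus each time, so the mixed reward drifts back toward the extrinsic reward as the agents become familiar with a region.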

Source: http://dx.doi.org/10.1016/j.neunet.2024.106149
