Federated Reinforcement Learning for Training Control Policies on Multiple IoT Devices.

Hyun-Kyo Lim, Ju-Bong Kim, Joo-Seong Heo, Youn-Hee Han

Sensors (Basel)

Department of Computer Science Engineering, Korea University of Technology and Education, Cheonan 31253, Korea.

Published: March 2020

Reinforcement learning has recently been studied in various fields and is also used to optimally control IoT devices, which extend Internet connectivity beyond the usual standard devices. In this paper, we allow multiple reinforcement learning agents to learn optimal control policies on their own IoT devices, which are of the same type but have slightly different dynamics. For such devices, there is no guarantee that an agent that interacts with only one IoT device and learns its optimal control policy will also control another IoT device well. Therefore, we may need to apply reinforcement learning to each IoT device independently, which is costly and time-consuming. To solve this problem, we propose a new federated reinforcement learning architecture in which each agent, working on its own IoT device, shares its learning experience (i.e., the gradient of the loss function) with the other agents and transfers mature policy model parameters to them. The other agents then accelerate their learning by using the mature parameters. We incorporate the actor-critic proximal policy optimization (Actor-Critic PPO) algorithm into each agent in the proposed collaborative architecture and propose an efficient procedure for gradient sharing and model transfer. Using multiple rotary inverted pendulum devices interconnected via a network switch, we demonstrate that the proposed federated reinforcement learning scheme effectively facilitates the learning process for multiple IoT devices and that learning becomes faster as more agents are involved.
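The two mechanisms named in the abstract, gradient sharing and model transfer, can be illustrated with a toy sketch. This is not the authors' procedure: a quadratic loss stands in for the Actor-Critic PPO loss, and every name here (`average_gradients`, `apply_gradient`, `transfer_mature_params`, `toy_gradient`, `run_demo`, the learning rate, the per-device targets, the scoring rule) is a hypothetical chosen for illustration.

```python
from typing import List

Vector = List[float]


def average_gradients(gradients: List[Vector]) -> Vector:
    """Gradient sharing: element-wise mean of the gradients from all workers."""
    n = len(gradients)
    dim = len(gradients[0])
    return [sum(g[i] for g in gradients) / n for i in range(dim)]


def apply_gradient(params: Vector, grad: Vector, lr: float) -> Vector:
    """One plain gradient-descent step (assumed optimizer, for illustration)."""
    return [p - lr * g for p, g in zip(params, grad)]


def transfer_mature_params(all_params: List[Vector], scores: List[float]) -> List[Vector]:
    """Model transfer: copy the best-scoring worker's parameters to every worker."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return [list(all_params[best]) for _ in all_params]


def toy_gradient(params: Vector, target: Vector) -> Vector:
    """Gradient of the stand-in loss 0.5 * ||params - target||^2."""
    return [p - t for p, t in zip(params, target)]


def run_demo(rounds: int = 50, lr: float = 0.1) -> List[Vector]:
    # Three devices of the same type with slightly different dynamics,
    # modelled here as slightly different optimal parameters (an assumption).
    targets = [[1.0, 2.0], [1.1, 1.9], [0.9, 2.1]]
    params = [[0.0, 0.0], [0.5, 0.5], [-0.5, 1.0]]

    for _ in range(rounds):
        # Each worker computes a gradient on its own device...
        grads = [toy_gradient(p, t) for p, t in zip(params, targets)]
        # ...the gradients are shared and averaged...
        shared = average_gradients(grads)
        # ...and every worker applies the shared gradient locally.
        params = [apply_gradient(p, shared, lr) for p in params]

    # Score each worker by negative local loss, then transfer the best model.
    scores = [
        -0.5 * sum((p_i - t_i) ** 2 for p_i, t_i in zip(p, t))
        for p, t in zip(params, targets)
    ]
    return transfer_mature_params(params, scores)


if __name__ == "__main__":
    for worker_params in run_demo():
        print(worker_params)
```

In this sketch the averaging step plays the role of a central coordinator; how the paper actually schedules sharing and decides when a model is mature is not specified in the abstract.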

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7085801
DOI: http://dx.doi.org/10.3390/s20051359
