Stochastic calculus-guided reinforcement learning: A probabilistic framework for optimal decision-making.

MethodsX

Department of Computer Science and Engineering, GITAM School of Technology, GITAM University, Bengaluru, India.

Published: June 2024

Stochastic Calculus-guided Reinforcement Learning (SCRL) is a new approach to decision-making under uncertainty. It applies mathematical principles from stochastic calculus to make better choices in complex situations. In empirical tests, SCRL adapted well and outperformed traditional Stochastic Reinforcement Learning (SRL) methods. SCRL had a lower dispersion value (63.49) than SRL (65.96), indicating less variation in its results. SCRL also carried lower risk than SRL over both the short and the long term: its short-term risk value was 0.64 and its long-term risk value was 0.78, whereas SRL's were 18.64 and 10.41, respectively. Lower risk values are better because they indicate a smaller chance of something going wrong. Further metrics, namely training rewards, learning progress, and rolling averages, were also compared between SRL and SCRL, and the study found that SCRL outperformed SRL. Overall, SCRL makes decisions under uncertainty with less risk than the SRL methods, which makes it useful for real-world situations where decisions must be made carefully.
• By leveraging mathematical principles derived from stochastic calculus, SCRL offers a robust framework for making informed choices and enhancing performance in complex scenarios.
• In comparison to traditional SRL methods, SCRL demonstrates superior adaptability and efficacy, as evidenced by empirical tests.
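The abstract names a dispersion value and rolling averages of training rewards without defining them. The following is a minimal sketch of how such metrics are commonly computed, assuming that "dispersion" means the population standard deviation of per-episode rewards and that the rolling average is a fixed-window moving mean. The function names and the reward values are hypothetical and are not taken from the study.

```python
from statistics import pstdev


def rolling_average(values, window):
    """Return the mean of each consecutive `window`-length slice of `values`."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def dispersion(values):
    """Population standard deviation, used here as an assumed stand-in
    for the 'dispersion value' mentioned in the abstract."""
    return pstdev(values)


# Hypothetical per-episode rewards, invented purely for illustration.
rewards_a = [10.0, 12.0, 11.0, 13.0, 12.0]
rewards_b = [5.0, 18.0, 9.0, 20.0, 6.0]

print(rolling_average(rewards_a, 3))
print(dispersion(rewards_a), dispersion(rewards_b))
```

Under these assumptions, a lower dispersion means that episode rewards cluster more tightly around their mean, which matches the abstract's reading of "less variation in its results".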

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11223108 (PMC)
http://dx.doi.org/10.1016/j.mex.2024.102790 (DOI)

Similar Publications

Regulation of Dopamine Release by Tonic Activity Patterns in the Striatal Brain Slice.

ACS Chem Neurosci

January 2025

Departments of Psychiatry and Neurology, Division of Molecular Therapeutics, New York State Psychiatric Institute, Columbia University Medical Center, New York, New York 10032, United States.

Voluntary movement, motivation, and reinforcement learning depend on the activity of ventral midbrain neurons, which extend axons to release dopamine (DA) in the striatum. These neurons exhibit two patterns of action potential activity: low-frequency tonic activity that is intrinsically generated and superimposed high-frequency phasic bursts that are driven by synaptic inputs. Ex vivo acute striatal brain preparations are widely employed to study the regulation of evoked DA release but exhibit very different DA release kinetics than in vivo recordings.

Background: Postpartum is a critical period to interrupt weight gain across the lifespan, decrease weight-related risk in future pregnancies, promote healthy behaviors that are often adopted during pregnancy, and improve long-term health. Because the postpartum period is marked by unique challenges to a person's ability to prioritize healthy behaviors, a multi-level/domain approach to intervention beyond the individual-level factors of diet and activity is needed.

Objectives: The purpose of this study was to understand postpartum people's perceptions about the relationship between their social networks and support, and their health behaviors and weight.

Analysis of Autonomous Penetration Testing Through Reinforcement Learning and Recommender Systems.

Sensors (Basel)

January 2025

Group of Analysis, Security and Systems (GASS), Department of Software Engineering and Artificial Intelligence (DISIA), Faculty of Computer Science and Engineering, Office 431, Universidad Complutense de Madrid (UCM), Calle Profesor José García Santesmases, 9, Ciudad Universitaria, 28040 Madrid, Spain.

Conducting penetration testing (pentesting) in cybersecurity is a crucial turning point for identifying vulnerabilities within the framework of Information Technology (IT), where real malicious offensive behavior is simulated to identify potential weaknesses and strengthen preventive controls. Given the complexity of the tests, time constraints, and the specialized level of expertise required for pentesting, analysis and exploitation tools are commonly used. Although useful, these tools often introduce uncertainty in findings, resulting in high rates of false positives.

Autonomous driving has demonstrated impressive driving capabilities, with behavior decision-making playing a crucial role as a bridge between perception and control. Imitation Learning (IL) and Reinforcement Learning (RL) have introduced innovative approaches to behavior decision-making in autonomous driving, but challenges remain. On one hand, RL's policy networks often lack sufficient reasoning ability to make optimal decisions in highly complex and stochastic environments.

Task Offloading with LLM-Enhanced Multi-Agent Reinforcement Learning in UAV-Assisted Edge Computing.

Sensors (Basel)

December 2024

School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044, China.

Unmanned aerial vehicles (UAVs) furnished with computational servers enable user equipment (UE) to offload complex computational tasks, thereby addressing the limitations of edge computing in remote or resource-constrained environments. The application of value decomposition algorithms for UAV trajectory planning has drawn considerable research attention. However, existing value decomposition algorithms commonly encounter obstacles in effectively associating local observations with the global state of UAV clusters, which hinders their task-solving capabilities and gives rise to reduced task completion rates and prolonged convergence times.
