There is growing interest in using reinforcement learning (RL) to personalize sequences of treatments in digital health to support users in adopting healthier behaviors. Such sequential decision-making problems involve decisions about when to treat and how to treat based on the user's context (e.g., prior activity level, location). Online RL is a promising data-driven approach for this problem, as it learns from each user's historical responses and uses that knowledge to personalize these decisions. However, to decide whether the RL algorithm should be included in an "optimized" intervention for real-world deployment, we must assess the evidence in the data that the RL algorithm is actually personalizing the treatments to its users. Due to the stochasticity in the RL algorithm, one may get a false impression that it is learning in certain states and using this learning to provide specific treatments. We use a working definition of personalization and introduce a resampling-based methodology for investigating whether the personalization exhibited by the RL algorithm is an artifact of the RL algorithm's stochasticity. We illustrate our methodology with a case study analyzing data from a physical activity clinical trial called HeartSteps, which included the use of an online RL algorithm. We demonstrate how our approach enhances data-driven truth-in-advertising of algorithm personalization both across all users and within specific users in the study.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11364365 | PMC |
| http://dx.doi.org/10.1007/s10994-024-06526-x | DOI Listing |
NPJ Digit Med
January 2025
Graduate School of Data Science, Seoul National University, Seoul, Republic of Korea.
Polysomnography (PSG) is crucial for diagnosing sleep disorders, but manual scoring of PSG is time-consuming and subjective, leading to high variability. While machine-learning models have improved PSG scoring, their clinical use is hindered by their 'black-box' nature. In this study, we present SleepXViT, an automatic sleep staging system using a Vision Transformer (ViT) that provides intuitive, consistent explanations by mimicking human 'visual scoring'.
Transl Psychiatry
January 2025
Division of Psychology, Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden.
Background: Alcohol use disorder (AUD) is associated with deficits in social cognition and behavior, but why these deficits are acquired is unknown. We hypothesized that a reduced association between actions and outcomes for others, i.e.
Sci Rep
January 2025
School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore, India.
This study introduces a novel ensemble learning technique, the Multi-Armed Bandit Ensemble (MAB-Ensemble), designed for lane detection in road images intended for autonomous vehicles. The proposed MAB-Ensemble technique draws on multi-armed bandit optimization to facilitate efficient model selection for lane segmentation. The TuSimple benchmark dataset is used for training, validating, and testing the proposed and existing lane detection techniques.
ISA Trans
January 2025
Toronto Metropolitan University, Toronto, Canada.
This research introduces an innovative approach to optimal control for a class of linear systems with input saturation. It leverages the synergy of Takagi-Sugeno (T-S) fuzzy models and reinforcement learning (RL) techniques. To enhance interpretability and analytical accessibility, our approach applies T-S models to approximate the value function and generate optimal control laws while incorporating prior knowledge.
Brain Res
January 2025
Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada; Alberta Machine Intelligence Institute, Edmonton, Alberta, Canada; Canada Institute for Advanced Research (CIFAR) AI Chair, Canada.
Humans are excellent at modifying their behaviour depending on context. For example, humans will change how they explore when losses are possible compared to when they are not. However, it remains unclear what specific cognitive and neural processes are modulated when exploring in different contexts.