Did we personalize? Assessing personalization by an online reinforcement learning algorithm using resampling.

Susobhan Ghosh Raphael Kim Prasidh Chhabria Raaz Dwivedi Predrag Klasnja Peng Liao Kelly Zhang Susan Murphy

Mach Learn

Department of Computer Science, Harvard University.

Published: July 2024

There is growing interest in using reinforcement learning (RL) to personalize sequences of treatments in digital health to support users in adopting healthier behaviors. Such sequential decision-making problems involve decisions about when to treat and how to treat based on the user's context (e.g., prior activity level, location). Online RL is a promising data-driven approach for this problem, as it learns from each user's historical responses and uses that knowledge to personalize these decisions. However, to decide whether the RL algorithm should be included in an "optimized" intervention for real-world deployment, we must assess the evidence in the data that the RL algorithm is actually personalizing the treatments to its users. Because the RL algorithm is stochastic, one may get a false impression that it is learning in certain states and using this learning to provide specific treatments. We use a working definition of personalization and introduce a resampling-based methodology for investigating whether the personalization exhibited by the RL algorithm is an artifact of the algorithm's stochasticity. We illustrate our methodology with a case study analyzing data from a physical activity clinical trial called HeartSteps, which included the use of an online RL algorithm. We demonstrate how our approach enhances data-driven truth-in-advertising of algorithm personalization, both across all users and within specific users in the study.
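
The resampling idea in the abstract can be illustrated with a toy sketch. This is not the authors' procedure or the HeartSteps algorithm: it assumes a hypothetical two-armed Gaussian Thompson sampler, a hypothetical summary statistic (how often the treatment arm is chosen late in the run), and a null model in which treatment has no effect. The function names `run_bandit` and `resampling_check` are invented for illustration. The sketch re-runs the stochastic algorithm many times under the null and asks how often it appears to favor treatment at least as strongly as in the observed run.

```python
import random


def run_bandit(effect, n_steps, rng):
    """Toy two-armed Gaussian Thompson sampler (an assumption, not the
    trial's algorithm). Returns the fraction of the second half of the
    decisions in which the treatment arm (arm 1) was chosen."""
    counts = [0, 0]      # number of pulls per arm
    sums = [0.0, 0.0]    # sum of rewards per arm
    chosen = []
    for _ in range(n_steps):
        # With a N(0, 1) prior and unit-variance noise, the posterior of
        # each arm mean is N(sum / (n + 1), 1 / (n + 1)).
        draws = [
            rng.gauss(sums[a] / (counts[a] + 1), (1.0 / (counts[a] + 1)) ** 0.5)
            for a in (0, 1)
        ]
        action = 1 if draws[1] > draws[0] else 0
        # Hypothetical reward model: treatment shifts the mean by `effect`.
        reward = rng.gauss(effect if action == 1 else 0.0, 1.0)
        counts[action] += 1
        sums[action] += reward
        chosen.append(action)
    tail = chosen[n_steps // 2:]
    return sum(tail) / len(tail)


def resampling_check(observed_stat, n_steps, n_resamples=500, seed=0):
    """Re-run the algorithm under a no-effect null and return the share of
    resampled runs whose statistic is at least as large as the observed one
    (with the usual +1 correction so the result is never exactly zero)."""
    rng = random.Random(seed)
    null_stats = [run_bandit(0.0, n_steps, rng) for _ in range(n_resamples)]
    exceed = sum(s >= observed_stat for s in null_stats)
    return (exceed + 1) / (n_resamples + 1)


if __name__ == "__main__":
    # One simulated "observed" run with a hypothetical true effect of 0.5.
    observed = run_bandit(0.5, 200, random.Random(1))
    print("observed statistic:", observed)
    print("share of null runs at least as extreme:", resampling_check(observed, 200))
```

A small share suggests the apparent preference for treatment is unlikely to arise from algorithm stochasticity alone under this toy null; a large share suggests it could. A real analysis would resample from a model fitted to the trial data rather than from this simplified simulator.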

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11364365
DOI: http://dx.doi.org/10.1007/s10994-024-06526-x

Similar Publications

Explainable vision transformer for automatic visual sleep staging on multimodal PSG signals.

NPJ Digit Med

January 2025

Graduate School of Data Science, Seoul National University, Seoul, Republic of Korea.

Hyojin Lee You Rim Choi Hyun Kyung Lee Jaemin Jeong Joopyo Hong

Polysomnography (PSG) is crucial for diagnosing sleep disorders, but manual scoring of PSG is time-consuming and subjective, leading to high variability. While machine-learning models have improved PSG scoring, their clinical use is hindered by their 'black-box' nature. In this study, we present SleepXViT, an automatic sleep staging system using Vision Transformer (ViT) that provides intuitive, consistent explanations by mimicking human 'visual scoring'.

Dissociating social reward learning and behavior in alcohol use disorder.

Transl Psychiatry

January 2025

Division of Psychology, Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden.

Simon Jangard Björn Lindström Lotfi Khemiri Nitya Jayaram-Lindström Andreas Olsson

Background: Alcohol use disorder (AUD) is associated with deficits in social cognition and behavior, but why these deficits are acquired is unknown. We hypothesized that a reduced association between actions and outcomes for others, i.e.

Enhancing lane detection in autonomous vehicles with multi-armed bandit ensemble learning.

Sci Rep

January 2025

School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore, India.

J Arun Pandian Ramkumar Thirunavukarasu L Thanga Mariappan

This study introduces a novel ensemble learning technique, Multi-Armed Bandit Ensemble (MAB-Ensemble), designed for lane detection in road images for autonomous vehicles. The proposed MAB-Ensemble technique draws on multi-armed bandit optimization to select models efficiently for lane segmentation. The TuSimple benchmark dataset is used for training, validating, and testing the proposed and existing lane detection techniques.

Fuzzy reinforcement learning based control of linear systems with input saturation.

ISA Trans

January 2025

Toronto Metropolitan University, Toronto, Canada.

Kainan Liu Xiaojun Ban Shengkun Xie

This research introduces an innovative approach to optimal control for a class of linear systems with input saturation. It leverages the synergy of Takagi-Sugeno (T-S) fuzzy models and reinforcement learning (RL) techniques. To enhance interpretability and analytical accessibility, our approach applies T-S models to approximate the value function and generate optimal control laws while incorporating prior knowledge.

Electrophysiological signatures of the effect of context on exploration: Greater attentional and learning signals when exploration is costly.

Brain Res

January 2025

Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada; Alberta Machine Intelligence Institute, Edmonton, Alberta, Canada; Canada Institute for Advanced Research (CIFAR) AI Chair, Canada.

Thomas D Ferguson Alona Fyshe Adam White

Humans are excellent at modifying their behaviour depending on context. For example, they change how they explore when losses are possible compared to when they are not. However, it remains unclear what specific cognitive and neural processes are modulated when exploring in different contexts.
