Imputation-based Q-learning for optimizing dynamic treatment regimes with right-censored survival outcome.

Biometrics

Departments of Biostatistics and Computational Biology, University of Rochester, Rochester, New York, USA.

Published: December 2023

Q-learning has been one of the most commonly used methods for optimizing dynamic treatment regimes (DTRs) in multistage decision-making. A right-censored survival outcome poses a significant challenge to Q-learning because Q-learning relies on parametric models for counterfactual estimation, which are subject to misspecification and sensitive to missing covariates. In this paper, we propose imputation-based Q-learning (IQ-learning), in which flexible nonparametric or semiparametric models are employed to estimate the optimal treatment rule at each stage, and weighted hot-deck multiple imputation (MI) and direct-draw MI are then used to predict optimal potential survival times. Missing data are handled using inverse probability weighting and MI, and the nonrandom treatment assignment among the observed is accounted for using a propensity-score approach. We investigate the performance of IQ-learning via extensive simulations and show that it is more robust to model misspecification than existing Q-learning methods, imputes only plausible potential survival times, unlike parametric models, and provides more flexibility in terms of baseline hazard shape. Using IQ-learning, we developed an optimal DTR for leukemia treatment based on a randomized trial with observational follow-up that motivated this study.
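The idea of imputing only plausible survival times can be illustrated with a generic weighted hot-deck draw. The sketch below is not the authors' procedure: the function name `weighted_hot_deck_impute`, the one-dimensional `score` used to measure similarity, the Gaussian kernel, and the `bandwidth` parameter are all assumptions made for illustration. For each censored subject, it draws a donor from subjects whose observed event time exceeds that subject's censoring time, so an imputed time never falls below what was already observed.

```python
import numpy as np


def weighted_hot_deck_impute(time, event, score, rng, bandwidth=1.0):
    """Impute right-censored times by drawing from observed donors.

    time  : follow-up time (event time or censoring time)
    event : 1 if the event was observed, 0 if right-censored
    score : hypothetical 1-D similarity score (an assumption of this sketch)
    rng   : numpy.random.Generator
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    score = np.asarray(score, dtype=float)
    imputed = time.copy()

    for i in np.flatnonzero(event == 0):
        # Eligible donors: observed events later than subject i's censoring time.
        donors = np.flatnonzero((event == 1) & (time > time[i]))
        if donors.size == 0:
            continue  # no eligible donor: keep the censoring time as-is

        # Gaussian kernel weights favour donors with a similar score.
        w = np.exp(-0.5 * ((score[donors] - score[i]) / bandwidth) ** 2)
        total = w.sum()
        if total > 0:
            p = w / total
        else:
            # All weights underflowed to zero: fall back to a uniform draw.
            p = np.full(donors.size, 1.0 / donors.size)

        imputed[i] = time[rng.choice(donors, p=p)]

    return imputed


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    times = [2.0, 5.0, 3.0, 8.0, 6.0, 9.0]
    events = [1, 0, 1, 1, 0, 1]
    scores = [0.1, 0.4, 0.2, 0.5, 0.9, 1.0]
    print(weighted_hot_deck_impute(times, events, scores, rng))
```

Repeating the draw with different random seeds would give multiple imputed data sets. How the weights are built and how the imputed times feed into stage-wise treatment rules are specific to the paper and are not reproduced here.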

Source
http://dx.doi.org/10.1111/biom.13872
