Principled estimation and evaluation of treatment effect heterogeneity: A case study application to dabigatran for patients with atrial fibrillation.

J Biomed Inform

Stanford Center for Biomedical Informatics Research, Stanford University, CA, United States of America; Department of Medicine, School of Medicine, Stanford University, Stanford, CA, United States of America; Clinical Excellence Research Center, Stanford University, Stanford, CA, United States of America; Technology and Digital Solutions, Stanford Healthcare, Stanford, CA, United States of America.

Published: July 2023

Objective: To apply the latest guidance for estimating and evaluating heterogeneous treatment effects (HTEs) in an end-to-end case study of the Randomized Evaluation of Long-Term Anticoagulation Therapy (RE-LY) trial, and to summarize the main lessons from an in-depth application of state-of-the-art metalearners and novel evaluation metrics, in order to inform their use for personalized care in biomedical research.

Methods: Based on the characteristics of the RE-LY data, we selected four metalearners (S-learner with Lasso, X-learner with Lasso, R-learner with random survival forest and Lasso, and causal survival forest) to estimate the HTEs of dabigatran. For the outcomes of (1) stroke or systemic embolism and (2) major bleeding, we compared dabigatran 150 mg, dabigatran 110 mg, and warfarin. We assessed whether the metalearners overestimated treatment heterogeneity via a global null analysis, and evaluated their discrimination and calibration using two novel metrics: rank-weighted average treatment effects (RATE) and the estimated calibration error for treatment heterogeneity. Finally, we visualized the relationships between estimated treatment effects and baseline covariates using partial dependence plots.
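The idea behind a global null analysis can be illustrated with a minimal sketch. The abstract does not describe how the null data were constructed, so everything here is an assumption for illustration: the data are simulated, the outcome is continuous rather than a survival time, and the estimator is a simple T-learner with ordinary least squares rather than any of the four metalearners named above. The function name `null_cate_estimates` is hypothetical.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_ols(coef, X):
    """Predictions from a coefficient vector produced by fit_ols."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def null_cate_estimates(seed=0, n=400, p=10):
    """Simulate data with NO treatment effect and return T-learner estimates.

    Illustrative assumption: the outcome depends on one covariate plus noise
    and never on the treatment indicator, so the true effect is zero for
    every individual.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    W = rng.integers(0, 2, size=n)        # randomized treatment indicator
    Y = X[:, 0] + rng.normal(size=n)      # W does not enter the outcome

    # T-learner: fit one outcome model per arm, then take the difference.
    coef_treated = fit_ols(X[W == 1], Y[W == 1])
    coef_control = fit_ols(X[W == 0], Y[W == 0])
    return predict_ols(coef_treated, X) - predict_ols(coef_control, X)


cate_hat = null_cate_estimates()
print("spread of estimated effects under the null:", cate_hat.std())
```

Because the true effect is zero everywhere, any spread in `cate_hat` is estimation noise. Comparing that spread across estimators gives a rough sense of how strongly each one tends to report heterogeneity that is not there.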

Results: The RATE metric suggested that either the applied metalearners performed poorly at estimating HTEs or there was no treatment heterogeneity in either the stroke/SE or major bleeding outcome for any treatment comparison. Partial dependence plots revealed that several covariates had consistent relationships with the treatment effects estimated by multiple metalearners. The applied metalearners showed differential performance across outcomes and treatment comparisons, and the X- and R-learners yielded smaller calibration errors than the others.
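Why a near-zero RATE admits two readings can be seen in a small sketch. It computes one RATE variant, the area under the targeting operator characteristic curve (AUTOC), on simulated data. This is a toy under stated assumptions, not the procedure used in the study: the per-individual `scores` stand in for the doubly robust scores a real analysis would estimate, and the function names `toc_curve` and `autoc` are hypothetical.

```python
import numpy as np


def toc_curve(scores, priorities):
    """Targeting operator characteristic at each top-k cut-off.

    For k = 1..n, returns the mean score among the k individuals with the
    highest priority, minus the mean score over everyone.
    """
    scores = np.asarray(scores, dtype=float)
    priorities = np.asarray(priorities, dtype=float)
    order = np.argsort(-priorities, kind="stable")   # highest priority first
    sorted_scores = scores[order]
    top_k_means = np.cumsum(sorted_scores) / np.arange(1, len(scores) + 1)
    return top_k_means - scores.mean()


def autoc(scores, priorities):
    """Area under the TOC curve, approximated by averaging over cut-offs."""
    return float(toc_curve(scores, priorities).mean())


rng = np.random.default_rng(0)
true_effect = rng.normal(size=500)       # simulated heterogeneous effects
uninformative = rng.normal(size=500)     # priorities unrelated to the effects

print("informative priorities:  ", autoc(true_effect, true_effect))
print("uninformative priorities:", autoc(true_effect, uninformative))
print("no heterogeneity at all: ", autoc(np.ones(500), true_effect))
```

The second and third cases both give values at or near zero: the metric alone cannot tell a prioritization rule that fails to rank patients from a population in which there is nothing to rank.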

Conclusions: HTE estimation is difficult, and a principled estimation and evaluation process is necessary to provide reliable evidence and prevent false discoveries. We have demonstrated how to choose appropriate metalearners based on specific data properties, applied them using the off-the-shelf implementation tool survlearners, and evaluated their performance using recently defined formal metrics. We suggest that clinical implications should be drawn based on the common trends across the applied metalearners.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10529425
DOI: http://dx.doi.org/10.1016/j.jbi.2023.104420
