A comparison of random forest-based missing imputation methods for covariates in propensity score analysis.

Psychol Methods

School of Human Development and Organizational Studies in Education, University of Florida.

Published: June 2024

Propensity score analysis (PSA) is a prominent method for alleviating selection bias in observational studies, but missing covariate data are prevalent and must be handled during propensity score estimation. Through Monte Carlo simulations, this study evaluates imputation methods based on several random forest algorithms for handling missing data in covariates: multivariate imputation by chained equations-random forest (Caliber), proximity imputation (PI), and missForest. The results indicated that PI and missForest outperformed the other methods with respect to bias in the estimated average treatment effect, regardless of sample size and missingness mechanism. A demonstration of these five methods with PSA, evaluating the effect of participation in center-based care on children's reading ability, is provided using data from the Early Childhood Longitudinal Study, Kindergarten Class of 2010-2011. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
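The abstract does not spell out how proximity imputation fills in a value, so the following is only a minimal sketch of the general idea for a single continuous covariate: each missing entry is replaced by the average of the observed entries, weighted by random forest proximity. The function name `proximity_impute_step` and the toy proximity matrix are hypothetical. In practice the proximities would come from a fitted random forest and the step would be repeated over several refits, and the study's own implementation may differ.

```python
from typing import List, Optional


def proximity_impute_step(
    values: List[Optional[float]],
    proximity: List[List[float]],
) -> List[float]:
    """Fill each missing entry (None) with the proximity-weighted mean
    of the observed entries. `proximity[i][j]` is assumed to be the
    random forest proximity between cases i and j (hypothetical input)."""
    observed = [j for j, v in enumerate(values) if v is not None]
    if not observed:
        raise ValueError("need at least one observed value")

    # Plain mean of observed values, used only when a case has zero
    # proximity to every observed case.
    fallback = sum(values[j] for j in observed) / len(observed)

    filled = []
    for i, v in enumerate(values):
        if v is not None:
            filled.append(v)  # observed values are left untouched
            continue
        total_weight = sum(proximity[i][j] for j in observed)
        if total_weight == 0:
            filled.append(fallback)
        else:
            weighted_sum = sum(proximity[i][j] * values[j] for j in observed)
            filled.append(weighted_sum / total_weight)
    return filled


# Toy data (made up for illustration): three cases, the second is missing.
covariate = [2.0, None, 6.0]
proximity = [
    [1.0, 0.75, 0.0],
    [0.75, 1.0, 0.25],
    [0.0, 0.25, 1.0],
]
imputed = proximity_impute_step(covariate, proximity)
```

Here the missing case is closer to the first case than to the third, so its imputed value is pulled toward 2.0 rather than 6.0. The completed covariate would then be used in the propensity score model.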

Source
http://dx.doi.org/10.1037/met0000676
