Publication type
Journal Article
Authors
Publication date
March 2, 2021
Summary
Identifying predictors of attrition is essential for designing longitudinal studies that minimise attrition bias, and for identifying variables that can serve as auxiliary variables in statistical techniques that help correct for non-random dropout. This paper provides a comparative overview of predictive techniques that can be used to model attrition and to identify the risk factors that help predict it. Logistic regression and several tree-based machine learning methods were applied to Wave 2 dropout in an illustrative sample of 5,000 individuals from Understanding Society, a large UK longitudinal study. Each method was evaluated on accuracy, AUC-ROC, plausibility of key assumptions, and interpretability. Our results suggest a 10% improvement in accuracy for random forest compared with logistic regression. However, given the differences in estimation procedures, we suggest that both models could be used in conjunction to provide the most comprehensive understanding of attrition predictors.
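The two quantitative criteria named in the summary, accuracy and AUC-ROC, can be illustrated with a minimal sketch. This is not the paper's code: the function names, the 0.5 classification threshold, and the toy labels and scores are all assumptions made up for illustration, and none of the numbers come from Understanding Society. AUC is computed here as the share of (dropout, non-dropout) pairs in which the dropout case receives the higher predicted risk, with ties counted as one half.

```python
# Illustrative sketch only: hypothetical helpers and made-up data,
# not the authors' implementation or results.

def accuracy(y_true, y_pred):
    """Share of cases whose predicted class equals the observed class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc_roc(y_true, scores):
    """AUC-ROC as the probability that a randomly chosen positive case
    (1 = dropped out) gets a higher score than a randomly chosen
    negative case (0 = stayed); tied scores count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example: observed Wave 2 dropout and predicted dropout risk
    # from two hypothetical models (values invented for illustration).
    dropped_out = [1, 0, 0, 1, 0, 1, 0, 0]
    risk_model_a = [0.9, 0.2, 0.4, 0.6, 0.1, 0.3, 0.5, 0.2]
    risk_model_b = [0.8, 0.1, 0.3, 0.7, 0.2, 0.6, 0.4, 0.1]

    for name, risk in [("model A", risk_model_a), ("model B", risk_model_b)]:
        predicted = [1 if r >= 0.5 else 0 for r in risk]  # assumed threshold
        print(name, accuracy(dropped_out, predicted), auc_roc(dropped_out, risk))
```

Accuracy depends on the chosen threshold, whereas AUC-ROC summarises ranking quality across all thresholds, which is one reason the two metrics can favour different models.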
Published in
SocArXiv
DOI
https://doi.org/10.31235/osf.io/tyszr
Subjects
Notes
Open Access
CC-By Attribution-NonCommercial-NoDerivatives 4.0 International
#536709