As described in Traditional Approaches for Handling Missing Data, single imputation approaches yield distorted estimates of the mean, the variance, or the covariance matrix, depending on the specific technique used. Multiple imputation (MI) gets around these difficulties by generating several imputed versions of the data, each with a random component, and then combining the results. In this way, MI creates values for the missing data that preserve the inherent characteristics of the variables (means, variances, covariances, etc.).
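The contrast between single and multiple imputation can be illustrated with a small simulation. The following is only a minimal sketch, not the FCS procedure covered in the topics below: the data are simulated, the names `mean_impute` and `stochastic_impute` are hypothetical helpers, and the imputation model is assumed to be a simple linear regression refitted on a bootstrap sample, with a random residual added to each imputed value. The results are combined here by simply averaging the estimates across imputations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated (hypothetical) data: y depends linearly on x.
n = 2000
x = rng.normal(50.0, 10.0, n)
y = 3.0 + 0.2 * x + rng.normal(0.0, 2.0, n)

# Delete roughly 30% of the y values completely at random.
missing = rng.random(n) < 0.3
y_obs = np.where(missing, np.nan, y)


def mean_impute(y_obs):
    """Single imputation: replace each missing value with the observed mean."""
    filled = y_obs.copy()
    filled[np.isnan(filled)] = np.nanmean(y_obs)
    return filled


def stochastic_impute(x, y_obs, rng):
    """One imputation: regression prediction plus a random residual draw.

    The regression is fitted on a bootstrap sample of the complete cases,
    so each imputation uses slightly different coefficients (an assumption
    of this sketch, standing in for parameter uncertainty).
    """
    obs = ~np.isnan(y_obs)
    idx = rng.choice(np.flatnonzero(obs), size=obs.sum(), replace=True)
    slope, intercept = np.polyfit(x[idx], y_obs[idx], 1)
    resid_sd = np.std(y_obs[idx] - (intercept + slope * x[idx]), ddof=2)
    filled = y_obs.copy()
    n_mis = (~obs).sum()
    filled[~obs] = intercept + slope * x[~obs] + rng.normal(0.0, resid_sd, n_mis)
    return filled


# Generate m imputed data sets and combine the estimates by averaging.
m = 20
imputed = [stochastic_impute(x, y_obs, rng) for _ in range(m)]
pooled_mean = np.mean([d.mean() for d in imputed])
pooled_var = np.mean([d.var(ddof=1) for d in imputed])

single_var = mean_impute(y_obs).var(ddof=1)
true_var = y.var(ddof=1)

print(f"variance of complete data:        {true_var:.2f}")
print(f"variance after mean imputation:   {single_var:.2f}")
print(f"pooled variance over {m} imputations: {pooled_var:.2f}")
```

In a run of this kind, mean imputation should understate the variance, since every imputed value sits exactly at the mean, while the pooled variance from the randomized imputations should stay close to that of the complete data.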
Topics
- Overview
- Fully conditional specification (FCS)
- Frequency and patterns of missing data
- Simple Imputation and Constraints
- One step of the FCS procedure
- One complete imputation using FCS
- Combining the results of multiple imputations
- Number of imputations
- Multiple regression with missing data
Dear Charles,
Can this approach deal with variables with different units, i.e., variables that measure completely different phenomena?
Cesar,
I believe that multiple imputation can be used with variables that are in different units or that measure different phenomena. The approach relies on there being an association between the variable (or variables) with missing data and the other variables.
Charles