Multiple Imputation (MI)

As described in Traditional Approaches for Handling Missing Data, single imputation approaches distort the mean, the variance, or the covariance matrix, depending on the specific technique used. Multiple imputation gets around these difficulties by generating several imputed versions of the data set, each with a random component, analyzing each one, and then combining the results. In this way, MI creates values for the missing data that preserve the inherent characteristics of the variables (means, variances, etc.).
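The generate-then-combine idea can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from this page: the function names are hypothetical, the imputation model is a simple linear regression of y on x refit on a bootstrap sample for each imputation, and the results are combined using Rubin's rules (the average of the per-imputation estimates, with total variance equal to the within-imputation variance plus (1 + 1/m) times the between-imputation variance). Real analyses would normally use a dedicated package.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return a, b, sd


def impute_once(xs, ys, rng):
    """Return one completed copy of ys, with None entries filled in."""
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    # Assumption: bootstrap the observed cases so that the regression
    # coefficients differ from one imputation to the next.
    sample = [rng.choice(observed) for _ in observed]
    a, b, sd = fit_line([x for x, _ in sample], [y for _, y in sample])
    # Random component: prediction plus normally distributed noise.
    return [y if y is not None else a + b * x + rng.gauss(0, sd)
            for x, y in zip(xs, ys)]


def pool(estimates, variances):
    """Combine per-imputation results using Rubin's rules."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)        # pooled estimate
    within = statistics.fmean(variances)       # within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance
    total = within + (1 + 1 / m) * between
    return q_bar, total


def multiple_imputation_mean(xs, ys, m=20, seed=0):
    """Estimate the mean of y (and its variance) from m imputed data sets."""
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(m):
        completed = impute_once(xs, ys, rng)
        estimates.append(statistics.fmean(completed))
        variances.append(statistics.variance(completed) / len(completed))
    return pool(estimates, variances)


if __name__ == "__main__":
    # Made-up demonstration data: y depends on x, and 30% of y is missing.
    rng = random.Random(42)
    xs = [rng.uniform(0, 10) for _ in range(50)]
    ys = [3 + 2 * x + rng.gauss(0, 1) for x in xs]
    ys = [None if rng.random() < 0.3 else y for y in ys]
    estimate, variance = multiple_imputation_mean(xs, ys)
    print(f"pooled mean = {estimate:.3f}, standard error = {variance ** 0.5:.3f}")
```

The between-imputation term is what single imputation leaves out: it carries the extra uncertainty due to the missing values into the final standard error.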

2 thoughts on “Multiple Imputation (MI)”

  1. Dear Charles,

    Can this approach deal with variables with different units, i.e., variables that measure completely different phenomena?

    • Cesar,
      I believe that multiple imputation can be used with variables that are in different units or that measure different phenomena. The approach relies on there being an association between the variable (or variables) with missing data and the other variables.
      Charles
