The expectation-maximization (EM) algorithm can be used when a data set has missing data elements. The missing data are estimated using an iterative process in which each iteration consists of two steps: (1) an M step (maximization), where the parameters are calculated from the data as filled in by the previous E step (or by a guess in the initial iteration), and (2) an E step (expectation), where each missing data element is re-estimated from the parameters obtained in the M step.
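The iteration described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the method used on this site: the data are pairs (x, y) assumed to be bivariate normal, x is fully observed, and only y may be missing (marked as None). The function name em_bivariate and the parameter n_iter are hypothetical. The sketch also adds the conditional variance of each imputed value to the variance estimate, a correction commonly used in EM for normal data that the description above does not mention.

```python
def em_bivariate(xs, ys, n_iter=100):
    """EM sketch for pairs (x, y): x fully observed, y may be None (missing).

    Assumes bivariate normal data, at least one observed y, and non-constant x.
    """
    n = len(xs)
    observed = [y for y in ys if y is not None]
    # Initial guess: fill each missing y with the mean of the observed y values.
    fill = sum(observed) / len(observed)
    y_hat = [fill if y is None else y for y in ys]
    extra_var = [0.0] * n  # conditional variance attached to each imputed y

    for _ in range(n_iter):
        # M step: estimate means, variances and covariance from the filled-in data.
        mu_x = sum(xs) / n
        mu_y = sum(y_hat) / n
        s_xx = sum((x - mu_x) ** 2 for x in xs) / n
        s_xy = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, y_hat)) / n
        s_yy = sum((y - mu_y) ** 2 + v for y, v in zip(y_hat, extra_var)) / n

        # E step: replace each missing y by its conditional mean given x.
        slope = s_xy / s_xx
        resid_var = s_yy - slope * s_xy  # conditional variance of y given x
        for i, y in enumerate(ys):
            if y is None:
                y_hat[i] = mu_y + slope * (xs[i] - mu_x)
                extra_var[i] = resid_var

    return {"mu": (mu_x, mu_y), "cov": (s_xx, s_xy, s_yy), "y_filled": y_hat}


# Example: the observed pairs lie on the line y = 2x + 1; two y values are missing.
result = em_bivariate([1, 2, 3, 4, 5], [3, 5, None, 9, None])
print(result["y_filled"])
```

A fixed number of iterations keeps the sketch short; in practice one would more likely stop when the parameter estimates change by less than some tolerance between iterations.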
Topics
- Multivariate normally distributed data
- Independence Testing
References
Efron and Hastie (2016) Computer age statistical inference. Cambridge University Press.
https://moodle2.units.it/pluginfile.php/340060/mod_resource/content/1/casi.pdf
Walczak, B., Massart, D. L. (2001) Dealing with missing data: Part II. Chemometrics and Intelligent Laboratory Systems, 58, 29–42
https://www.academia.edu/59642526/Dealing_with_missing_data
Raghunathan, T. (2016) Missing data analysis in practice. CRC Press
https://www.taylorfrancis.com/books/mono/10.1201/b19428/missing-data-analysis-practice-trivellore-raghunathan
Morse, C. (2013) EM algorithm
No longer available online
Hi, is there a big difference between the EM and FIML methods for imputing missing data? Thanks!
Hi Emma,
I don’t know for sure. I would expect that, in general, there wouldn’t be a big difference, although it would depend on the details (e.g., how much data is missing).
Charles