We start by creating a logistic regression model for Y based on the data in X using only the complete rows of X and Y (i.e. using listwise deletion). The result is a column vector of coefficients B and a covariance matrix S for the coefficients.
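As a rough sketch of this step (not the Real Statistics implementation), the complete-case logistic regression might be fit in Python with statsmodels as follows; X_obs and y_obs are illustrative stand-ins for the complete rows of X and Y:

```python
import numpy as np
import statsmodels.api as sm

# Illustrative complete-case data (stand-ins for the complete rows of X and Y)
rng = np.random.default_rng(0)
X_obs = rng.normal(size=(100, 2))              # n = 100 rows, k = 2 predictors
y_obs = (rng.random(100) < 0.5).astype(int)    # binary outcome

# Fit logistic regression on the complete rows only (listwise deletion)
X_design = sm.add_constant(X_obs)              # prepend the intercept column
fit = sm.Logit(y_obs, X_design).fit(disp=0)

B = np.asarray(fit.params)                     # (k+1) x 1 coefficient vector B
S = np.asarray(fit.cov_params())               # (k+1) x (k+1) covariance matrix S
```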
Let LL^T be the Cholesky decomposition of S (i.e. S = LL^T with L lower triangular) and create the revised version of the coefficient vector

B′ = B + LV

where V = [vi] is a (k+1) × 1 column vector whose elements vi are independent random values from the standard normal distribution; i.e. each vi = NORM.S.INV(RAND()).
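A minimal sketch of this perturbation step in Python; the values of B and S are made up for illustration, and in practice come from the logistic regression fit above:

```python
import numpy as np

rng = np.random.default_rng(1)
# Made-up B and S for illustration; in practice use the fitted values
B = np.array([0.2, -0.5, 1.1])
S = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])

L = np.linalg.cholesky(S)            # lower-triangular L with S = L @ L.T
V = rng.standard_normal(len(B))      # each v_i = NORM.S.INV(RAND())
B_prime = B + L @ V                  # revised coefficient vector B'
```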
For each missing data value yi, calculate the probability pi that yi = 1 by

pi = 1 / (1 + exp(−xiB′))

where xi is the row of X data corresponding to yi, augmented with a leading 1 for the intercept.
Now impute the value of yi as follows:

yi = 1 if u ≤ pi, and yi = 0 otherwise
where u is a random value from the uniform distribution between 0 and 1; i.e.
u = RAND()
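Putting the last two formulas together, here is a sketch of the imputation of a single yi; the values of x_i and B_prime are illustrative, and x_i includes the leading 1 for the intercept:

```python
import numpy as np

rng = np.random.default_rng(2)
B_prime = np.array([0.1, -0.4, 1.2])   # perturbed coefficients B' (illustrative)
x_i = np.array([1.0, 0.8, -1.3])       # X row for a missing y_i, with leading 1

p_i = 1.0 / (1.0 + np.exp(-(x_i @ B_prime)))   # p_i = P(y_i = 1)
u = rng.random()                               # u = RAND()
y_i = 1 if u <= p_i else 0                     # imputed value of y_i
```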
This case has not yet been implemented in the Real Statistics Resource Pack, and so we won't discuss it further now. In fact, for now, all categorical variables will be implemented as if they were continuous, but with the appropriate constraint. E.g., a binary categorical variable uses the constraint min = 0, max = 1, round off = TRUE.
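For example, under that scheme an imputed value produced by a continuous-variable method could be mapped back to {0, 1} as sketched below; this is a hypothetical helper, not the Resource Pack's code:

```python
def apply_binary_constraint(value: float) -> int:
    """Apply the constraint min = 0, max = 1, round off = TRUE."""
    rounded = round(value)                # round off = TRUE
    return int(min(max(rounded, 0), 1))   # clamp to [min, max] = [0, 1]

print(apply_binary_constraint(0.73))      # -> 1
print(apply_binary_constraint(-0.20))     # -> 0
```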
References
Mitani, A. A. (2013) Multiple imputation in practice: approaches for handling categorical and interaction variables
https://ayamitani.github.io/files/mitani_qsuseminar_v2.pdf
Carpenter, J., Kenward, M. (2013) Multiple imputation and its application. Wiley
https://books.google.it/books?id=mZMlnTenpx4C&pg=PA122