FCS for Binary Categorical Data

We start by creating a logistic regression model for Y based on the data in X using only the complete rows of X and Y (i.e. using listwise deletion). The result is a column vector of coefficients B and a covariance matrix S for the coefficients.
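As a rough illustration of this complete-case fit, here is a Python sketch using statsmodels; the names X, y, and complete_case_fit are illustrative assumptions, not part of the Real Statistics implementation, and missing values are assumed to be coded as NaN.

```python
import numpy as np
import statsmodels.api as sm

def complete_case_fit(X, y):
    """Fit a logistic model for y on X using only the complete rows."""
    # Listwise deletion: keep rows where y and every column of X are observed
    complete = ~np.isnan(y) & ~np.isnan(X).any(axis=1)
    Xc = sm.add_constant(X[complete])       # leading column of 1s for the intercept
    fit = sm.Logit(y[complete], Xc).fit(disp=0)
    B = fit.params                          # (k+1)-vector of coefficients
    S = fit.cov_params()                    # (k+1) x (k+1) coefficient covariance matrix
    return B, S
```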

Let $LL^T$ be the Cholesky decomposition of S and create the revised version of the coefficient vector

$$B^* = B + LV$$

where $V = [v_i]$ is a $(k+1) \times 1$ column vector whose entries $v_i$ are independent random values from the standard normal distribution; i.e. each $v_i$ = NORM.S.INV(RAND()).
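In code, this draw of $B^*$ might look like the following sketch (draw_coefficients is a hypothetical helper; NumPy's standard_normal plays the role of NORM.S.INV(RAND())):

```python
import numpy as np

def draw_coefficients(B, S, rng=None):
    """Return B* = B + LV, a random perturbation of the fitted coefficients."""
    rng = rng if rng is not None else np.random.default_rng()
    L = np.linalg.cholesky(S)           # S = L L^T (lower-triangular factor)
    V = rng.standard_normal(len(B))     # independent draws from N(0, 1)
    return B + L @ V
```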

For each missing data value $y_i$, calculate the probability $p_i$ that $y_i = 1$ by

$$p_i = \frac{1}{1+e^{-x_i B^*}}$$

where $x_i$ is the row of X corresponding to $y_i$, augmented with a leading 1 for the intercept term.

Now impute the value of $y_i$ as follows

$$y_i = \begin{cases} 1 & \text{if } u \le p_i \\ 0 & \text{if } u > p_i \end{cases}$$

where u is a random value from the uniform distribution between 0 and 1; i.e.

u = RAND()
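Putting the last two formulas together, a sketch of the imputation step (again with illustrative names, and with missing values coded as NaN):

```python
import numpy as np

def impute_binary(X, y, B_star, rng=None):
    """Fill the missing entries of y with Bernoulli draws based on B*."""
    rng = rng if rng is not None else np.random.default_rng()
    missing = np.isnan(y)
    # Rows x_i of X for the missing y_i, with a leading 1 for the intercept
    Xm = np.column_stack([np.ones(missing.sum()), X[missing]])
    p = 1.0 / (1.0 + np.exp(-Xm @ B_star))   # p_i = probability that y_i = 1
    u = rng.uniform(size=p.shape)            # u = RAND()
    out = y.copy()
    out[missing] = np.where(u <= p, 1.0, 0.0)
    return out
```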

This case has not yet been implemented in the Real Statistics Resource Pack, and so we won't discuss it further here. In fact, for now, all categorical variables are handled as if they were continuous, but with an appropriate constraint. E.g., a binary categorical variable uses the constraint: min = 0, max = 1, round off = TRUE, as sketched below.
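For example, under this constraint a continuous imputed value is rounded and then clipped back into {0, 1}; a minimal sketch (constrain_binary is a hypothetical name):

```python
import numpy as np

def constrain_binary(y_cont):
    """Apply min = 0, max = 1, round off = TRUE to continuous imputed values."""
    return np.clip(np.round(y_cont), 0, 1)

# constrain_binary(np.array([0.3, 1.7, -0.2]))  ->  array([0., 1., 0.])
```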

References

Mitani, A. A. (2013) Multiple imputation in practice: approaches for handling categorical and interaction variables
https://ayamitani.github.io/files/mitani_qsuseminar_v2.pdf

Carpenter, J., Kenward, M. (2013) Multiple imputation and its application. Wiley
https://books.google.it/books?id=mZMlnTenpx4C&pg=PA122&lpg=PA122&dq=FCS+for+Binary+Categorical+Data&source=bl&ots=bi0hI0j6Ic&sig=ACfU3U2Ze4eT39ZfIEqyL-jOlSOEky0YHQ&hl=en&sa=X&ved=2ahUKEwiv7_L606v8AhX7RvEDHT8UAX84FBDoAXoECBEQAw#v=onepage&q=FCS%20for%20Binary%20Categorical%20Data&f=false
