Correspondence analysis plays a role similar to that of factor analysis or principal component analysis, but for categorical data expressed as a contingency table (e.g. as described for the chi-square test of independence).
Essentially, correspondence analysis decomposes the chi-square statistic for the test of independence into orthogonal factors. This approach is valid even when some cell counts in the contingency table are less than 5 (or even zero).
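The decomposition described above can be illustrated with a minimal sketch in Python. It assumes NumPy is available and uses one common formulation: take the singular value decomposition of the matrix of standardized residuals, so that the squared singular values (the principal inertias) multiplied by the sample size add up to the chi-square statistic. The function name correspondence_inertias and the 3 x 3 table are made up for illustration and are not part of Real Statistics.

```python
import numpy as np

def correspondence_inertias(table):
    """Split the chi-square statistic of a contingency table into
    one contribution per orthogonal factor (dimension)."""
    counts = np.asarray(table, dtype=float)
    n = counts.sum()                      # total sample size
    P = counts / n                        # correspondence matrix (proportions)
    r = P.sum(axis=1)                     # row masses
    c = P.sum(axis=0)                     # column masses
    expected = np.outer(r, c)             # proportions expected under independence
    S = (P - expected) / np.sqrt(expected)  # standardized residuals
    # Singular values of S; their squares are the principal inertias.
    singular_values = np.linalg.svd(S, compute_uv=False)
    chi_square = n * (S ** 2).sum()       # usual sum of (O - E)^2 / E
    return chi_square, n * singular_values ** 2

# Hypothetical table containing a zero cell; the row and column totals
# must all be positive for the residuals to be defined.
table = [[20, 10, 5],
         [10, 15, 10],
         [0, 12, 18]]

chi_square, parts = correspondence_inertias(table)
print("chi-square statistic:", chi_square)
print("contribution of each factor:", parts)
print("sum of contributions:", parts.sum())
```

For an r x c table at most min(r, c) - 1 factors carry any of the statistic, so the last contribution printed for this 3 x 3 table is zero up to rounding error.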
References
Rencher, A. C. and Christensen, W. F. (2012) Methods of multivariate analysis. 3rd Ed. Wiley.
Johnson, R. A. and Wichern, D. W. (2007) Applied multivariate statistical analysis. 6th Ed. Pearson.
https://www.webpages.uidaho.edu/~stevel/519/Applied%20Multivariate%20Statistical%20Analysis%20by%20Johnson%20and%20Wichern.pdf
Can a multiple correspondence analysis be performed?
If anyone knows please tell me how to do it.
Hello Alan,
I see by googling that there is information online for this topic. See, for example
https://en.wikipedia.org/wiki/Multiple_correspondence_analysis
Real Statistics does not support this topic at present.
Charles
Hi Charles,
You are doing phenomenal work; I have been reading through your content and replies on the website for my project. Thank you for such quality content on regression and statistics in Excel. I want to perform logistic regression on an employee attrition dataset where Y is the attrition status of the employee (Yes/No) and the X variables are Age, Department, Job Role, Monthly Income, Marital Status, etc. Overall I have 44 independent variables like these (30 before one-hot encoding). I have been trying to find a way to implement FAMD (Factor Analysis of Mixed Data) to reduce the number of independent variables before performing the regression. I am looking at FAMD in particular because my independent variables are of mixed type: 14 of the 30 features are numerical (e.g. Age, Monthly Income, Distance From Home, Percent Salary Hike), while 16 of the 30 are categorical (e.g. Department (HR, Sales, R&D), Gender (M/F), Job Role (Sales Representative, Sales Executive, Manager, Laboratory Technician, etc.), Job Level (1, 2, 3, 4), and so on). Is there a way to implement FAMD using the Real Statistics add-in in Excel? If so, how should I interpret the output from running FAMD, and how do I use the reduced dimensions in the logistic regression model?
To reduce the dimensions, you could use Factor Analysis. See
https://www.real-statistics.com/multivariate-statistics/factor-analysis/
Regarding the coding of categorical data, see
https://www.real-statistics.com/real-statistics-environment/data-conversion/coding-categorical-variables/
Charles
Dr., thank you very much. Merry Christmas and Happy New Year.
Doc, thank you very much.
Dr. Zainontz, good morning. Thank you very much for the great benefit your page provides to the research community. Would you implement Multiple Correspondence Analysis in a future version? And 3^k DOE?
Gerardo,
Thank you for your input. I will add these to the list of potential future enhancements.
Charles