Chi-square Post-hoc Functions

After a significant result from the chi-square test of independence (see Independence Testing Follow-up), you can perform follow-up tests to pinpoint the source of the significance. These tests play a role similar to that of contrasts after ANOVA.

Post-hoc worksheet functions

The following Real Statistics array functions are available to assist in post-hoc testing.

POST_CHISQ(R1, srow, scol, lab): returns a column array with the chi-square statistic, p-value and Cramer’s V for the post-hoc test on the contingency table in R1 based on the arguments srow and scol. If lab = TRUE (default FALSE) then a column of labels is appended to the output.

Here, R1 defines a contingency table without headings or totals, and srow and scol specify which rows and columns of R1 to drop or merge. As the examples show, a negative index marks a row or column to drop, and the positive indices listed together are merged. E.g. the array formula =POST_CHISQ(A1:C4, "1,2", "-3") performs a chi-square test of independence on the 4 × 3 contingency table in A1:C4 with the third column dropped and the first two rows merged (resulting in a 3 × 2 contingency table).

The array formula =POST_CHISQ(A1:C4, "-1,3,4") performs a chi-square test on the contingency table in A1:C4 with the first row dropped and the last two rows merged (resulting in a 2 × 3 contingency table).
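The drop-and-merge step followed by the chi-square test can be sketched outside Excel. The Python below is a minimal illustration and not the Real Statistics implementation: the names parse_spec, compact_rows, compact, chi2_sf and chi_square_test are hypothetical, the spec syntax is inferred only from the two examples above, and the counts are made up for illustration.

```python
import math

def parse_spec(spec):
    """Split a spec such as "-1,3,4" into a 1-based (drop set, merge list)."""
    nums = [int(tok) for tok in spec.split(",") if tok.strip()]
    drop = {-n for n in nums if n < 0}      # negative index: drop
    merge = [n for n in nums if n > 0]      # positive indices: merge together
    return drop, merge

def compact_rows(table, spec):
    """Drop and merge rows of a list-of-lists table according to spec."""
    drop, merge = parse_spec(spec)
    out, merged_at = [], None
    for i, row in enumerate(table, start=1):
        if i in drop:
            continue
        if i in merge and merged_at is not None:
            # Add this row's counts into the row that holds the merged total.
            out[merged_at] = [a + b for a, b in zip(out[merged_at], row)]
            continue
        if i in merge:
            merged_at = len(out)
        out.append(list(row))
    return out

def compact(table, srow="", scol=""):
    """Apply srow to the rows, then scol to the columns (via a transpose)."""
    rows_done = compact_rows(table, srow)
    cols_done = compact_rows([list(c) for c in zip(*rows_done)], scol)
    return [list(r) for r in zip(*cols_done)]

def chi2_sf(x, df):
    """Upper-tail chi-square probability, built from the recurrence
    Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / gamma(a + 1) with y = x / 2."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1.0
    return q

def chi_square_test(table):
    """Return (chi-square statistic, p-value, Cramer's V).
    Assumes every expected count is positive."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n
            stat += (obs - exp) ** 2 / exp
    k = min(len(row_tot), len(col_tot)) - 1
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, chi2_sf(stat, df), math.sqrt(stat / (n * k))

# Made-up 4 x 3 table standing in for A1:C4.
table = [[10, 20, 5], [15, 10, 5], [30, 15, 5], [20, 40, 5]]
reduced = compact(table, "1,2", "-3")   # 3 x 2 table, as in the first example
print(reduced, chi_square_test(reduced))
```

Handling the columns by transposing and reusing the row routine keeps the drop and merge logic in one place.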

POST_CHIMAX(R1, srow, scol, lab): performs the same test as POST_CHISQ except that now the maximum likelihood version of the chi-square test is used instead of the usual chi-square test.
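Assuming that the maximum likelihood version refers to the likelihood-ratio (G) statistic, G = 2 Σ O ln(O/E), the calculation can be sketched as follows. The function name g_statistic is hypothetical, and whether POST_CHIMAX computes exactly this quantity is an assumption.

```python
import math

def g_statistic(table):
    """Return (G, df) for a contingency table given as a list of lists.
    Cells with an observed count of zero contribute nothing to the sum."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    total = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            if obs > 0:
                exp = row_tot[i] * col_tot[j] / n
                total += obs * math.log(obs / exp)
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return 2.0 * total, df

# Made-up counts for illustration.
print(g_statistic([[10, 20], [20, 10]]))
```

G is referred to the same chi-square distribution, with the same degrees of freedom, as the usual Pearson statistic.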

COMPACT_TABLE(R1, srow, scol, head): returns the contingency table obtained from the one in R1 by dropping and merging rows and columns as specified by srow and scol, as described above. If head = TRUE (default) then it is assumed that R1 includes row and column headings, and these are also used in the output.

Standard and adjusted residuals functions

Another approach to post-hoc testing is to determine which cells contribute most and least to the result of the independence test. This is done by calculating the standard residual of each cell (similar to a z-score). Cells whose standard residual has an absolute value larger than 1.96 can be viewed as significant (for alpha = .05). A similar, but more refined, approach uses adjusted residuals.
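As a rough illustration, the sketch below uses the usual textbook definitions: the standard residual is (O − E)/√E, and the adjusted residual divides O − E by √(E(1 − row total/n)(1 − column total/n)). The function name residuals is hypothetical, the counts are made up, and it is an assumption that the Real Statistics functions use exactly these formulas.

```python
import math

def residuals(table):
    """Return (standard residuals, adjusted residuals), each with the same
    shape as the contingency table. Assumes all expected counts are positive
    and that no single row or column holds every observation."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    std, adj = [], []
    for i, row in enumerate(table):
        std_row, adj_row = [], []
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n
            std_row.append((obs - exp) / math.sqrt(exp))
            scale = exp * (1 - row_tot[i] / n) * (1 - col_tot[j] / n)
            adj_row.append((obs - exp) / math.sqrt(scale))
        std.append(std_row)
        adj.append(adj_row)
    return std, adj

# Made-up counts; flag the cells whose adjusted residual exceeds 1.96.
std, adj = residuals([[10, 20], [20, 10]])
flagged = [[abs(z) > 1.96 for z in row] for row in adj]
print(std, adj, flagged)
```

The adjusted residual is never smaller in absolute value than the standard residual for the same cell, so it flags at least as many cells at the 1.96 cutoff.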

Real Statistics functions that support this approach are as follows:

StdRes(R1, head): returns an array of the same size and shape as the contingency table in R1; each cell in the output contains the standard residual for the corresponding cell in R1. If head = TRUE (default FALSE), then both R1 and the output contain row and column headings.

AdjRes(R1, head): just like StdRes except that the adjusted residuals are returned instead of the standard residuals.

StdResTest(R1, head): just like StdRes except that the p-values for the standard residuals are returned instead of the standard residuals.

AdjResTest(R1, head): just like AdjRes except that the p-values for the adjusted residuals are returned instead of the adjusted residuals.
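If the residuals are treated as approximately standard normal, a p-value for each cell can be obtained as sketched below. The function name residual_p_values is hypothetical, and the two-tailed convention is an assumption; the description above does not say which tail StdResTest and AdjResTest report.

```python
import math

def residual_p_values(res):
    """Two-tailed standard normal p-value for each residual in a
    list-of-lists array: p = 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))."""
    return [[math.erfc(abs(z) / math.sqrt(2.0)) for z in row] for row in res]

# Made-up residuals for illustration.
print(residual_p_values([[-2.58, 2.58], [1.0, -1.0]]))
```

Under this convention a residual of ±1.96 corresponds to a p-value of about .05, which matches the cutoff quoted above.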
