Vuong’s Test

Objective

The objective of Vuong’s test is to determine whether an excess of zeros makes a ZIP regression model fit the data significantly differently from an ordinary Poisson regression model. Because the ZIP and ordinary Poisson regression models are not nested, we can’t use the approach that we use, for example, to compare Poisson and Negative Binomial regression models. Instead, we use Vuong’s test.

Basic Concepts

To perform this test, we first calculate

$$m_i = \ln\left(\frac{p_i^*}{p_i}\right)$$

where the p_i = P(y_i | X_i) are based on the Poisson regression model and the p_i* are the corresponding probabilities based on the ZIP regression model. The test statistic is

$$z = \frac{\sqrt{n}\,\bar{m}}{s}$$

where m-bar and s are the mean and standard deviation of the m_i, and n is the number of observations. z asymptotically follows a standard normal distribution. If z > zcrit, then ZIP regression is significantly better. If z < -zcrit, then Poisson regression is significantly better. Otherwise, there isn’t a significant difference. Here zcrit = NORM.S.INV(1-α/2).

Alternatively, you can test whether the p-value < α/2. If so, there is a significant difference with ZIP being better if z > 0 and Poisson better if z < 0.
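The decision rule above can be sketched in Python. This is only an illustration, not the Real Statistics implementation: the function name vuong_decision is hypothetical, and the sketch assumes that s is the sample standard deviation (n - 1 denominator), which the text does not specify.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def vuong_decision(m, alpha=0.05):
    """Return (z, z_crit, verdict) from the per-observation log ratios m_i.

    Hypothetical helper for illustration only.
    """
    n = len(m)
    m_bar = mean(m)
    s = stdev(m)  # assumption: sample standard deviation (n - 1 denominator)
    z = sqrt(n) * m_bar / s
    # Two-sided critical value, the equivalent of NORM.S.INV(1 - alpha/2)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    if z > z_crit:
        verdict = "ZIP better"
    elif z < -z_crit:
        verdict = "Poisson better"
    else:
        verdict = "no significant difference"
    return z, z_crit, verdict
```

Calling vuong_decision(m) on the list of m_i values returns the statistic, the critical value and the verdict in one step. With large samples the choice between the n and n - 1 denominators has little effect on z.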

Example

We conduct this test for Example 1 of Constructing a ZIP Regression Model as shown in Figures 1 and 2.

In Figure 1, we calculate the mi values (only the first 13 of 250 data rows are shown). Here, column A contains the y values (count) from column A in Figure 1 of Constructing a ZIP Regression Model. Column C consists of the mu values for the Poisson regression model corresponding to that in Figure 2 of Constructing a ZIP Regression Model. Columns D and E are copied from columns AC and AE in Figure 6 of Constructing a ZIP Regression Model.

Columns G and H contain the p and p* values, respectively. For example, cell G2 contains the formula =POISSON(A2,C2,FALSE) and cell H2 contains the formula

=(1-E2)*POISSON(A2,D2,FALSE)+IF(A2=0,E2,0)

Finally, column J contains the mi values where cell J2 contains the formula =LN(H2/G2).
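For readers working outside Excel, the three worksheet formulas above can be mirrored in Python roughly as follows. The function names are hypothetical, and the arguments stand in for the worksheet cells (y for A2, mu_poisson for C2, mu_zip for D2, pi_zip for E2).

```python
from math import exp, factorial, log


def poisson_pmf(y, mu):
    """Poisson probability of count y with mean mu, like POISSON(y, mu, FALSE)."""
    return exp(-mu) * mu ** y / factorial(y)


def zip_pmf(y, mu, pi):
    """ZIP probability: (1 - pi) * Poisson(y; mu), plus pi when y is zero."""
    return (1 - pi) * poisson_pmf(y, mu) + (pi if y == 0 else 0.0)


def log_ratio(y, mu_poisson, mu_zip, pi_zip):
    """m_i = ln(p*_i / p_i), with ZIP in the numerator and Poisson below."""
    return log(zip_pmf(y, mu_zip, pi_zip) / poisson_pmf(y, mu_poisson))
```

Applying log_ratio to each data row gives the values that column J holds. The direct factorial formula is adequate for small counts such as those in this example; for very large counts a log-scale computation would be safer.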

Figure 1 – Vuong test (part 1)

Using the results in column J, we obtain the test results shown in Figure 2.

Figure 2 – Vuong test (part 2)

We see from Figure 2 that the ZIP model is significantly better than the Poisson model.

Worksheet Function

Real Statistics Function: The Real Statistics Resource Pack provides the following worksheet function.

VuongTest(Ry, R1, R2, R3, lab, alpha): returns an array with the z-stat, p-value and test result where Ry contains the observed y values, R1 contains the corresponding mu values for the Poisson regression model, and R2 and R3 contain the corresponding mu and pi values for the ZIP regression model. If lab = TRUE (default FALSE) a column of labels is appended to the output. alpha defaults to .05.

For the above example the formula

=VuongTest(A2:A251,C2:C251,D2:D251,E2:E251,TRUE)

returns the output shown in range L6:M8 of Figure 2.

Examples Workbook

Click here to download the Excel workbook with the examples described on this webpage.

