Introduction
We describe the use of worksheet functions to make predictions based on the Zero-Inflated Poisson (ZIP) regression models described in Creating ZIP Regression Models.
Predictions
The Zero Inflated Poisson Regression data analysis tool provides the predictions for the X values in the original data, as shown on the right side of Figure 5 in Creating ZIP Regression Models.
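More generally, you can use the ZIPPredC and ZIPPredCC functions to predict the expected count for any combination of regressor values, including combinations that do not appear in the original sample. You can also use the ZIPProb function to calculate the probability of observing a specific count for a given profile. These functions are described in Worksheet Functions below.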
For Example 1 of Creating ZIP Regression Models, we can use the ZIP regression coefficients (repeated in range F1:H5 of Figure 1) to make predictions as shown in Figure 1.
Figure 1 – ZIP regression forecasts
Using the array formula =ZIPPredCC(J2:L5,G2:H5,TRUE) we obtain the forecasts for the four profiles shown in range J2:L5. Note that the first two agree with the forecasts shown for the first two rows in Figure 6 of Creating ZIP Regression Models. The last two rows are for profiles not represented in the data used to construct the regression model. Note that pred = mu*(1-pi), i.e. the Poisson mean mu scaled by the probability 1-pi that the observation is not a structural zero.
We can also obtain the results shown in range N2:N5 by using the array formula =ZIPPredC(J2:L5,G2:H5).
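The relationship pred = mu*(1-pi) can be illustrated outside Excel with a minimal Python sketch. It assumes the usual ZIP parameterization (a log link for the Poisson mean mu and a logit link for the zero-inflation probability pi) and that the intercept is the first coefficient in each column. The function name zip_predict and the coefficient values are hypothetical; they are not the values in Figure 1, and this is not the Real Statistics implementation.

```python
import math

def zip_predict(x, poisson_coefs, logit_coefs):
    """Return (pred, mu, pi) for one profile x.

    Assumptions: intercept first in each coefficient list,
    log link for mu, logit link for pi.
    """
    # Linear predictors for the count part and the zero-inflation part
    eta_count = poisson_coefs[0] + sum(b * v for b, v in zip(poisson_coefs[1:], x))
    eta_zero = logit_coefs[0] + sum(g * v for g, v in zip(logit_coefs[1:], x))
    mu = math.exp(eta_count)                 # Poisson mean
    pi = 1.0 / (1.0 + math.exp(-eta_zero))   # probability of a structural zero
    return mu * (1.0 - pi), mu, pi

# Hypothetical coefficients for three regressors (not taken from Figure 1)
poisson_coefs = [0.5, 0.2, -0.1, 0.3]
logit_coefs = [1.0, -0.4, 0.6, -0.2]

pred, mu, pi = zip_predict([1, 0, 2], poisson_coefs, logit_coefs)
print(f"mu={mu:.6f}  pi={pi:.6f}  pred={pred:.6f}")
```

Replacing the hypothetical coefficients with the two coefficient columns from the worksheet should give forecasts comparable to the worksheet output, provided the assumptions above hold.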
Probabilities
Determine the probability of someone with the profile in range J2:L2 of Figure 1 catching 0, 1, 2, …, 6 fish. Do the same for the profile in range J3:L3 of Figure 1.
The probability of catching zero fish is
π + (1-π)p(0) where p(0) = POISSON(0, μ, FALSE) = EXP(-μ)
For the profile in range J2:L2, this is
π + (1-π)*EXP(-μ) = .677172 + .322828*EXP(-1.031288) = .792275
The probability of catching x fish for x > 0 is simply
(1-π)p(x) where p(x) = POISSON(x, μ, FALSE)
Thus, the probability that someone with the profile in range J2:L2 catches 1 fish is
.322828*POISSON(1,1.031288,FALSE) = .118705
The other probabilities are calculated in a similar way. The full results are shown in Figure 2.
Figure 2 – Probabilities
Here range N2:N15 is filled in using the array formula =ZIPProb(I2:K15,F2:G5,L2:L15) as described below.
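The probability calculations above can be checked with a short Python sketch of the ZIP probability mass function. The function name zip_pmf is hypothetical; the values of π and μ are the ones quoted above for the profile in range J2:L2.

```python
import math

def zip_pmf(x, mu, pi):
    """P(Y = x) under a zero-inflated Poisson with mean mu and zero-inflation pi."""
    # Ordinary Poisson probability, equivalent to POISSON(x, mu, FALSE) in Excel
    poisson = math.exp(-mu) * mu ** x / math.factorial(x)
    if x == 0:
        # Zeros come from the structural-zero part or from the Poisson part
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

pi, mu = 0.677172, 1.031288   # values quoted in the text for profile J2:L2
probs = [zip_pmf(x, mu, pi) for x in range(7)]
for x, p in enumerate(probs):
    print(x, round(p, 6))
```

The entry for x = 1 reproduces the value .118705 calculated above, and summing zip_pmf over all non-negative counts gives 1, as expected for a probability distribution.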
Worksheet Functions
Real Statistics Functions: The Real Statistics Resource Pack provides the following worksheet functions. These pertain to a ZIP regression model based on the coefficients in Rc, the X data in Rx (with k columns), the Y count data in Ry, and the frequency data in Rt. If Rt is omitted, it defaults to a column of ones. Rc is a (k+1) × 2 array whose first column contains the Poisson regression coefficients and whose second column contains the logistic regression coefficients.
ZIPPredC(Rx, Rc, Rt): returns a column array with the forecasts for the profiles in Rx/Rt
ZIPPredCC(Rx, Rc, lab, Rt, alpha): returns an array with 6 columns where each row contains the forecast value for the corresponding profile from Rx/Rt along with mu, pi, the standard error, and the lower and upper ends of the 1-alpha confidence interval for the forecast. If lab = TRUE (default FALSE) then an extra row of headings is appended to the output. alpha defaults to .05.
ZIPProb(Rx, Rc, Ry, Rt): returns a column array with the probabilities for the profiles in Rx/Rt and counts in Ry. If Ry is omitted then it defaults to a column of zeros. You can also set Ry to a single value, in which case all the counts are set to this value.
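To make the argument conventions concrete, here is a rough Python analogue of the behavior described for ZIPProb. It is a sketch only, not the Real Statistics implementation: the name zip_prob_column is hypothetical, the Rt argument is left out because its handling is not specified here, and the code assumes that the first row of rc holds the intercepts, with a log link for mu and a logit link for pi.

```python
import math

def zip_prob_column(rx, rc, ry=None):
    """Return one ZIP probability per profile in rx.

    rx: list of profiles, each with k regressor values
    rc: (k+1) x 2 list; column 0 = Poisson coefs, column 1 = logistic coefs
        (intercepts assumed to be in the first row)
    ry: None (all counts 0), a single int (same count for every profile),
        or a list with one count per profile
    """
    n = len(rx)
    if ry is None:
        counts = [0] * n
    elif isinstance(ry, int):
        counts = [ry] * n
    else:
        counts = list(ry)

    out = []
    for row, y in zip(rx, counts):
        eta_count = rc[0][0] + sum(c[0] * v for c, v in zip(rc[1:], row))
        eta_zero = rc[0][1] + sum(c[1] * v for c, v in zip(rc[1:], row))
        mu = math.exp(eta_count)
        pi = 1.0 / (1.0 + math.exp(-eta_zero))
        p = (1.0 - pi) * math.exp(-mu) * mu ** y / math.factorial(y)
        out.append(pi + p if y == 0 else p)
    return out

# Hypothetical model with k = 1 regressor and all coefficients zero,
# so that mu = 1 and pi = 0.5 for every profile
rc = [[0.0, 0.0], [0.0, 0.0]]
rx = [[1], [2]]
print(zip_prob_column(rx, rc))          # counts default to 0
print(zip_prob_column(rx, rc, 1))       # single count applied to all profiles
print(zip_prob_column(rx, rc, [0, 1]))  # one count per profile
```

The three calls mirror the three ways of supplying Ry described above: omitted, a single value, or a full column of counts.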