Introduction
The Poisson regression model assumes that the observed count data follow a Poisson distribution. One problem with this is that the Poisson assumption that the mean equals the variance seldom holds in practice. Often this is because the data contain many more zeros than is consistent with a Poisson distribution. We address this situation here using the zero-inflated Poisson (ZIP) regression model.
Basic Concepts
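Suppose that π is the proportion of data elements that are excess zeros, i.e. zeros beyond the number predicted by a Poisson distribution with mean μ. Then the count y has the mixture distribution

$$P(y = 0) = \pi + (1 - \pi)e^{-\mu}, \qquad P(y = j) = (1 - \pi)\frac{e^{-\mu}\mu^{j}}{j!} \quad \text{for } j > 0$$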
The mean and variance of this mixture model are

$$E[y] = (1 - \pi)\mu, \qquad \mathrm{Var}(y) = (1 - \pi)\mu(1 + \pi\mu)$$
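As a quick numerical check of the mean and variance of the mixture, the following Python sketch sums the zero-inflated Poisson probabilities directly and compares the result with the standard closed forms (1 − π)μ and (1 − π)μ(1 + πμ). The function names and the values of π and μ are illustrative assumptions, not part of the original text.

```python
import math

def zip_pmf(j, pi, mu):
    """P(y = j) for a zero-inflated Poisson with excess-zero share pi and Poisson mean mu."""
    # Poisson probability computed on the log scale to avoid huge factorials
    poisson = math.exp(-mu + j * math.log(mu) - math.lgamma(j + 1))
    return pi + (1 - pi) * poisson if j == 0 else (1 - pi) * poisson

def zip_moments(pi, mu, upper=100):
    """Mean and variance from a truncated sum of the pmf over j = 0, ..., upper - 1."""
    probs = [zip_pmf(j, pi, mu) for j in range(upper)]
    mean = sum(j * p for j, p in enumerate(probs))
    second = sum(j * j * p for j, p in enumerate(probs))
    return mean, second - mean ** 2

pi, mu = 0.3, 2.5  # illustrative values only
mean, var = zip_moments(pi, mu)
print(mean, (1 - pi) * mu)                  # numerical mean vs. closed form
print(var, (1 - pi) * mu * (1 + pi * mu))   # numerical variance vs. closed form
```

The truncation point of 100 is far in the tail for the chosen μ; a larger μ would need a larger upper limit.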
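Thus, for data element i, we have

$$P(y_i = 0) = \pi_i + (1 - \pi_i)e^{-\mu_i}$$

and for j > 0, we have

$$P(y_i = j) = (1 - \pi_i)\frac{e^{-\mu_i}\mu_i^{j}}{j!}$$

where

$$\mu_i = \exp(\beta_1 x_{i1} + \cdots + \beta_k x_{ik}), \qquad \pi_i = \frac{\exp(\gamma_1 z_{i1} + \cdots + \gamma_m z_{im})}{1 + \exp(\gamma_1 z_{i1} + \cdots + \gamma_m z_{im})}$$

based on a mixture of Poisson regression and logistic regression models with coefficients β1, …, βk and γ1, …, γm, respectively. With this notation, an intercept corresponds to a regressor that equals 1 for every i. The regressors xij for the Poisson model do not have to be the same as those for the logistic model (represented by zij).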
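The two model components can be sketched in Python as follows, assuming the usual log link for the Poisson mean and the logit link for the excess-zero probability. The function names, coefficients and regressor values below are hypothetical and serve only to illustrate how μi and πi combine into the probabilities for one data element.

```python
import math

def poisson_mean(beta, x):
    """mu_i = exp(beta_1*x_i1 + ... + beta_k*x_ik), the Poisson part (log link)."""
    return math.exp(sum(b * v for b, v in zip(beta, x)))

def excess_zero_prob(gamma, z):
    """pi_i from the logistic part (logit link) with regressors z_i1, ..., z_im."""
    return 1.0 / (1.0 + math.exp(-sum(g * v for g, v in zip(gamma, z))))

def zip_prob(j, mu, pi):
    """P(y_i = j) under the zero-inflated Poisson mixture."""
    poisson = math.exp(-mu + j * math.log(mu) - math.lgamma(j + 1))
    return pi + (1 - pi) * poisson if j == 0 else (1 - pi) * poisson

# Hypothetical coefficients and regressors; the leading 1.0 acts as an intercept term.
beta, x = [0.5, 0.2], [1.0, 3.0]     # Poisson model
gamma, z = [-1.0, 0.4], [1.0, 2.0]   # logistic model, with its own regressors

mu_i = poisson_mean(beta, x)
pi_i = excess_zero_prob(gamma, z)
print(mu_i, pi_i)
print([zip_prob(j, mu_i, pi_i) for j in range(5)])
```

Note that x and z are separate lists, which reflects the point that the two parts of the model need not share regressors.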
Log-likelihood
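Let ηi = β1xi1 + … + βkxik and ζi = γ1zi1 + … + γmzim denote the Poisson and logistic linear predictors, so that μi = exp(ηi) and πi = exp(ζi)/(1 + exp(ζi)). We first note that

$$1 - \pi_i = \frac{1}{1 + e^{\zeta_i}}, \qquad P(y_i = 0) = \frac{e^{\zeta_i} + \exp(-e^{\eta_i})}{1 + e^{\zeta_i}}, \qquad P(y_i = j) = \frac{\exp(j\eta_i - e^{\eta_i})}{j!\,(1 + e^{\zeta_i})} \quad \text{for } j > 0$$

Since the likelihood function for a sample y1, …, yn is

$$L = \prod_{i=1}^{n} P(y_i)$$

using the above equalities, it follows that the log-likelihood is

$$LL = \sum_{y_i = 0} \ln\left(e^{\zeta_i} + \exp(-e^{\eta_i})\right) + \sum_{y_i > 0} \left(y_i\eta_i - e^{\eta_i} - \ln y_i!\right) - \sum_{i=1}^{n} \ln\left(1 + e^{\zeta_i}\right)$$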
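The log-likelihood can be sketched in Python in two ways: in the simplified form that is standard for the ZIP model (separate sums over the zero and positive observations, minus a logistic normalizing term), and directly as the sum of the logs of the mixture probabilities. The two should agree up to rounding. The data, coefficients and function names are made up for illustration, and the log and logit links are assumed.

```python
import math

def linpred(coef, row):
    """Linear predictor: sum of coefficient times regressor."""
    return sum(c * v for c, v in zip(coef, row))

def zip_loglik(beta, gamma, X, Z, y):
    """ZIP log-likelihood in simplified form."""
    ll = 0.0
    for xi, zi, yi in zip(X, Z, y):
        eta = linpred(beta, xi)    # ln(mu_i)
        zeta = linpred(gamma, zi)  # logit of pi_i
        if yi == 0:
            ll += math.log(math.exp(zeta) + math.exp(-math.exp(eta)))
        else:
            ll += yi * eta - math.exp(eta) - math.lgamma(yi + 1)  # lgamma(y+1) = ln(y!)
        ll -= math.log1p(math.exp(zeta))
    return ll

def zip_loglik_direct(beta, gamma, X, Z, y):
    """Same quantity computed as the sum of ln P(y_i) from the mixture probabilities."""
    ll = 0.0
    for xi, zi, yi in zip(X, Z, y):
        mu = math.exp(linpred(beta, xi))
        pi = 1.0 / (1.0 + math.exp(-linpred(gamma, zi)))
        poisson = math.exp(-mu + yi * math.log(mu) - math.lgamma(yi + 1))
        ll += math.log(pi + (1 - pi) * poisson if yi == 0 else (1 - pi) * poisson)
    return ll

# Made-up data: the first column of X and the only column of Z are constant 1 (intercepts).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
Z = [[1.0], [1.0], [1.0], [1.0], [1.0]]
y = [0, 2, 0, 5, 3]
beta, gamma = [0.3, 0.25], [-0.5]

ll_simplified = zip_loglik(beta, gamma, X, Z, y)
ll_direct = zip_loglik_direct(beta, gamma, X, Z, y)
print(ll_simplified, ll_direct)
```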
Topics
We use Solver to find the Poisson and logistic coefficients that maximize LL. Click on the following topics for more details.
- Constructing a ZIP regression model
- ZIP Regression predictions
- Vuong’s test
- Data analysis tool options
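The maximization of LL that Solver carries out can be imitated outside Excel. The following is a minimal Python sketch, not the algorithm Solver uses: it fits an intercept-only ZIP model (one Poisson coefficient b and one logistic coefficient g, so μ = exp(b) and π = 1/(1 + exp(−g))) with a simple derivative-free compass search. The data, function names, starting point and step settings are all illustrative assumptions.

```python
import math

def neg_loglik(params, y):
    """Negative ZIP log-likelihood for an intercept-only model, params = (b, g)."""
    b, g = params
    ll = 0.0
    for yi in y:
        if yi == 0:
            ll += math.log(math.exp(g) + math.exp(-math.exp(b)))
        else:
            ll += yi * b - math.exp(b) - math.lgamma(yi + 1)
        ll -= math.log1p(math.exp(g))
    return -ll

def compass_search(f, start, step=0.5, tol=1e-8, max_iter=10000):
    """Minimize f by trying +/- step along each coordinate; halve the step when stuck."""
    point, best = list(start), f(start)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for k in range(len(point)):
            for delta in (step, -step):
                trial = point[:]
                trial[k] += delta
                value = f(trial)
                if value < best:
                    point, best, improved = trial, value, True
        if not improved:
            step /= 2
    return point, best

# Made-up counts with many zeros
y = [0, 0, 0, 0, 0, 0, 1, 2, 2, 3, 3, 4, 5, 0, 0, 1, 2, 3, 0, 4]

(b_hat, g_hat), nll = compass_search(lambda p: neg_loglik(p, y), [0.0, 0.0])
mu_hat = math.exp(b_hat)
pi_hat = 1.0 / (1.0 + math.exp(-g_hat))
print("mu:", mu_hat, "pi:", pi_hat, "LL:", -nll)
print("fitted mean:", (1 - pi_hat) * mu_hat, "sample mean:", sum(y) / len(y))
```

For an intercept-only model, the fitted mean (1 − π)μ at the maximum should match the sample mean, which gives a simple sanity check on the search. With regressors, the same objective would take the full coefficient vectors, and a library optimizer would usually be a better choice than this hand-written search.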