Zero-Inflated Poisson (ZIP) Regression | Real Statistics Using Excel

Introduction

The Poisson regression model assumes that the observed count data follow a Poisson distribution. One problem is that the assumption that the mean equals the variance seldom holds. Often this is because the data contain many more zeros than is consistent with a Poisson distribution. We address this situation here.

Basic Concepts

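Let’s suppose that π = the proportion of data elements that are excess zeros, i.e. zeros beyond the number predicted by a Poisson distribution with mean μ. Then

$$P(y = 0) = \pi + (1 - \pi)e^{-\mu}, \qquad P(y = j) = (1 - \pi)\frac{\mu^{j} e^{-\mu}}{j!} \quad \text{for } j > 0$$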

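The mean and variance of this mixture model are

$$E[y] = (1 - \pi)\mu, \qquad \mathrm{Var}(y) = (1 - \pi)\mu(1 + \pi\mu)$$

Since the variance exceeds the mean whenever π > 0, the excess zeros account for the overdispersion relative to a Poisson distribution.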
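As a numerical check, the following Python sketch assumes the standard zero-inflated Poisson form, in which a zero occurs with probability π + (1 − π)e^(−μ) and a count j > 0 occurs with probability (1 − π)μ^j e^(−μ)/j!. It compares the mean and variance obtained by direct summation with the closed forms (1 − π)μ and (1 − π)μ(1 + πμ). The function names and the values π = 0.3 and μ = 2.5 are illustrative assumptions, not taken from the text above.

```python
import math

def zip_pmf(j, pi, mu):
    """P(y = j) for a zero-inflated Poisson with excess-zero proportion pi and Poisson mean mu > 0."""
    # Poisson probability computed on the log scale to avoid overflow for large j
    poisson = math.exp(-mu + j * math.log(mu) - math.lgamma(j + 1))
    if j == 0:
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson

def zip_moments(pi, mu, max_count=200):
    """Mean and variance by direct summation of the pmf, truncated at max_count."""
    probs = [zip_pmf(j, pi, mu) for j in range(max_count + 1)]
    mean = sum(j * p for j, p in enumerate(probs))
    second_moment = sum(j * j * p for j, p in enumerate(probs))
    return mean, second_moment - mean ** 2

pi, mu = 0.3, 2.5  # hypothetical values chosen only for illustration
mean, var = zip_moments(pi, mu)
closed_mean = (1 - pi) * mu
closed_var = (1 - pi) * mu * (1 + pi * mu)
print(mean, closed_mean)
print(var, closed_var)
```

The two printed pairs should agree up to floating-point error. Setting `pi = 0` reduces the model to an ordinary Poisson distribution, for which the mean and variance coincide.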

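Thus, for data element i, we have

$$P(y_i = 0) = \pi_i + (1 - \pi_i)e^{-\mu_i}$$

and for j > 0, we have

$$P(y_i = j) = (1 - \pi_i)\frac{\mu_i^{j} e^{-\mu_i}}{j!}$$

where

$$\mu_i = \exp\Big(\sum_{j=1}^{k} \beta_j x_{ij}\Big), \qquad \pi_i = \frac{1}{1 + \exp\big(-\sum_{j=1}^{m} \gamma_j z_{ij}\big)}$$

based on a mixture of Poisson regression and logistic regression models with coefficients β₁, …, β_k and γ₁, …, γ_m, respectively. The regressors x_ij for the Poisson model don’t have to be the same as those for the logistic model (represented by z_ij).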
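To make the mixture concrete, here is a minimal Python sketch that evaluates μ_i and π_i for each data element and sums the log-probabilities into a log-likelihood. It assumes a log link for the Poisson mean and a logit link for the excess-zero proportion. The log-likelihood itself is not stated in this section, and the function names, the data, and the coefficient values are all hypothetical. An intercept is represented here by a constant 1 as the first regressor, which is also an assumption of the sketch.

```python
import math

def zip_params(x, z, beta, gamma):
    """Return (mu_i, pi_i) for one data element: log link for mu, logit link for pi."""
    eta = sum(b * xj for b, xj in zip(beta, x))     # Poisson linear predictor
    zeta = sum(g * zj for g, zj in zip(gamma, z))   # logistic linear predictor
    mu = math.exp(eta)
    pi = 1.0 / (1.0 + math.exp(-zeta))
    return mu, pi

def zip_log_likelihood(y, X, Z, beta, gamma):
    """Sum of log P(y_i) over all data elements under the ZIP regression model."""
    total = 0.0
    for yi, xi, zi in zip(y, X, Z):
        mu, pi = zip_params(xi, zi, beta, gamma)
        if yi == 0:
            # a zero is either an excess zero or a Poisson zero
            total += math.log(pi + (1 - pi) * math.exp(-mu))
        else:
            # positive counts can only come from the Poisson component
            total += math.log(1 - pi) - mu + yi * math.log(mu) - math.lgamma(yi + 1)
    return total

# Hypothetical data: the first column of X and Z is a constant 1 (intercept)
y = [0, 0, 3, 1, 0, 2]
X = [[1, 0.5], [1, 1.2], [1, 2.0], [1, 0.8], [1, 0.1], [1, 1.5]]
Z = [[1, 1.0], [1, 0.0], [1, 0.0], [1, 1.0], [1, 1.0], [1, 0.0]]
beta = [0.2, 0.5]     # illustrative Poisson coefficients
gamma = [-0.4, 0.3]   # illustrative logistic coefficients
print(zip_log_likelihood(y, X, Z, beta, gamma))
```

In practice the coefficients would be estimated by maximizing this quantity with a numerical optimizer. Note that `X` and `Z` are passed separately, reflecting that the two component models need not share regressors.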