Overview
Negative binomial regression is another approach to building regression models for count data. It has a number of advantages over Poisson regression, especially when there is over-dispersion.
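The pdf for the negative binomial distribution, written in terms of the parameters ν > 0 and 0 < p < 1 used below, is
f(y) = Γ(y+ν) / (Γ(ν) y!) · p^ν (1–p)^y for y = 0, 1, 2, …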
Let α = 1/ν and p = ν/(ν+μ). Since the mean of a negative binomial distribution is μ = ν(1–p)/p, we can express the variance as ν(1–p)/p² = μ/p = μ(ν+μ)/ν = μ(1+μ/ν) = μ(1+αμ); i.e.
var = μ(1 + αμ) = μ + αμ²
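The variance formula can be checked numerically. The following is a minimal sketch, assuming the standard negative binomial probability function with ν = 1/α and p = ν/(ν+μ) as defined above; the function names, the summation cutoff, and the values of μ and α are illustrative choices, not part of the original text.

```python
import math

def nb2_pmf(y, mu, alpha):
    """P(Y = y) for a negative binomial with mean mu and dispersion alpha.

    Uses nu = 1/alpha and p = nu/(nu + mu), as in the text.
    """
    nu = 1.0 / alpha
    p = nu / (nu + mu)
    # Work on the log scale so the gamma functions do not overflow.
    log_pmf = (math.lgamma(y + nu) - math.lgamma(nu) - math.lgamma(y + 1)
               + nu * math.log(p) + y * math.log(1.0 - p))
    return math.exp(log_pmf)

def moments(mu, alpha, cutoff=2000):
    """Mean and variance by direct summation (cutoff is an illustrative choice)."""
    probs = [nb2_pmf(y, mu, alpha) for y in range(cutoff)]
    mean = sum(y * q for y, q in enumerate(probs))
    var = sum((y - mean) ** 2 * q for y, q in enumerate(probs))
    return mean, var

mu, alpha = 3.0, 0.5  # hypothetical values
mean, var = moments(mu, alpha)
# The summed variance should agree with mu + alpha * mu**2 up to rounding.
print(mean, var, mu + alpha * mu ** 2)
```

The summed mean should match μ, and the summed variance should match μ + αμ², up to floating-point error.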
From the above, we see that the lower the value of α, the lower the variance. In fact, as α approaches zero (i.e. ν becomes large), the variance approaches μ, which is the assumption for the Poisson distribution, namely that the mean and the variance are equal.
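To illustrate the Poisson limit, the sketch below compares the negative binomial probabilities with the Poisson probabilities for the same mean as α shrinks. It assumes the standard negative binomial probability function with ν = 1/α and p = ν/(ν+μ); the helper names and the chosen values of μ and α are hypothetical.

```python
import math

def nb2_pmf(y, mu, alpha):
    """Negative binomial P(Y = y) with nu = 1/alpha and p = nu/(nu + mu)."""
    nu = 1.0 / alpha
    p = nu / (nu + mu)
    log_pmf = (math.lgamma(y + nu) - math.lgamma(nu) - math.lgamma(y + 1)
               + nu * math.log(p) + y * math.log(1.0 - p))
    return math.exp(log_pmf)

def poisson_pmf(y, mu):
    """Poisson P(Y = y), computed on the log scale."""
    return math.exp(y * math.log(mu) - mu - math.lgamma(y + 1))

mu = 3.0  # hypothetical mean
gaps = []
for alpha in (1.0, 0.1, 0.001):
    # Largest absolute difference between the two pmfs over y = 0..49.
    gap = max(abs(nb2_pmf(y, mu, alpha) - poisson_pmf(y, mu)) for y in range(50))
    gaps.append(gap)
    print(alpha, gap)
```

The printed gap should shrink as α decreases, reflecting the convergence to the Poisson distribution.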
The resulting regression model is called the negative binomial 2 model (aka the mean dispersion negative binomial model).
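As a rough sketch of what fitting such a model involves, the function below evaluates the log-likelihood of a negative binomial 2 regression for given coefficients. The text above does not state the link function; a log link, μ = exp(xβ), is assumed here as a common choice for count models. The function name, the toy data, and the coefficient values are hypothetical.

```python
import math

def nb2_log_likelihood(beta, alpha, X, y):
    """Log-likelihood of an NB2 regression, assuming a log link."""
    nu = 1.0 / alpha
    total = 0.0
    for row, count in zip(X, y):
        # Assumed log link: mu = exp(x . beta)
        mu = math.exp(sum(b * x for b, x in zip(beta, row)))
        p = nu / (nu + mu)
        total += (math.lgamma(count + nu) - math.lgamma(nu)
                  - math.lgamma(count + 1)
                  + nu * math.log(p) + count * math.log(1.0 - p))
    return total

# Hypothetical toy data: an intercept column plus one predictor.
X = [(1.0, 0.0), (1.0, 1.0), (1.0, 2.0), (1.0, 3.0)]
y = [1, 0, 4, 9]
ll = nb2_log_likelihood((0.0, 0.5), 0.5, X, y)
print(ll)
```

A fitting routine would search for the values of the coefficients and α that maximize this quantity.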
There is another negative binomial regression model that is based on the following assumption for the variance:
var = μ(1 + α) = μ + αμ
This is called the negative binomial 1 model (aka the constant dispersion negative binomial model); it often provides a better fit for the data. It is obtained by setting ν = μ/α instead of ν = 1/α in the variance formula μ(1 + μ/ν).
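The names "constant dispersion" and "mean dispersion" can be illustrated by comparing the variance-to-mean ratio of the two models. This is a small sketch using only the two variance formulas given above; the function names and the values of α and μ are illustrative.

```python
def nb2_variance(mu, alpha):
    """Negative binomial 2: var = mu + alpha * mu^2."""
    return mu + alpha * mu ** 2

def nb1_variance(mu, alpha):
    """Negative binomial 1: var = mu + alpha * mu."""
    return mu + alpha * mu

alpha = 0.5  # hypothetical dispersion value
for mu in (1.0, 5.0, 20.0):
    ratio_nb1 = nb1_variance(mu, alpha) / mu  # stays at 1 + alpha
    ratio_nb2 = nb2_variance(mu, alpha) / mu  # grows as 1 + alpha * mu
    print(mu, ratio_nb1, ratio_nb2)
```

For the negative binomial 1 model the ratio is 1 + α regardless of the mean, while for the negative binomial 2 model it is 1 + αμ and so increases with the mean.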
Over-dispersion
When the variance is larger than the mean we have over-dispersion. The Poisson model assumes that we don’t have any over-dispersion. Some possible reasons for over-dispersion are:
- Missing predictors (i.e. independent variables)
- Incorrect model
- Outliers
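A quick first check for over-dispersion is to compare the sample variance of the counts with their sample mean. The sketch below is a rough check on the raw counts that ignores any predictors; the function name and the data are hypothetical.

```python
import statistics

def dispersion_ratio(counts):
    """Sample variance divided by sample mean; roughly 1 for Poisson data."""
    return statistics.variance(counts) / statistics.mean(counts)

counts = [0, 0, 1, 1, 2, 3, 5, 8, 13, 0, 2, 21]  # hypothetical count data
ratio = dispersion_ratio(counts)
print(ratio)
```

A ratio well above 1 suggests over-dispersion, although a formal assessment would be based on the fitted regression model rather than on the raw counts.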
On this website we explore the following regression models, which can handle over-dispersion better than a Poisson regression model:
- Negative binomial model: as described on this webpage
- Zero-truncated model: counts start at 1 rather than 0 (e.g. the number of days in the hospital can’t be 0)
- Zero-inflated and hurdle models: handle excess zeros using logistic regression
In addition to truncating at zero, we can also left truncate more generally (e.g. no count less than 3) and right censor (e.g. any count over 8 is recorded as 8). We won’t explore these further, however.
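The two ideas of excluding zero counts and capping counts at an upper limit can be sketched as follows. This is only an illustration: it assumes a Poisson base distribution for the zero-truncated case (a negative binomial base would be handled the same way), and the function names and data are hypothetical.

```python
import math

def poisson_pmf(y, mu):
    """Poisson P(Y = y), computed on the log scale."""
    return math.exp(y * math.log(mu) - mu - math.lgamma(y + 1))

def zero_truncated_poisson_pmf(y, mu):
    """P(Y = y | Y >= 1): rescale the Poisson pmf by 1 - P(Y = 0)."""
    if y < 1:
        return 0.0
    return poisson_pmf(y, mu) / (1.0 - math.exp(-mu))

def right_censor(counts, limit=8):
    """Record any count above `limit` as `limit`."""
    return [min(c, limit) for c in counts]

# The zero-truncated probabilities over y = 1, 2, ... should sum to about 1.
total = sum(zero_truncated_poisson_pmf(y, 2.0) for y in range(1, 100))
print(total)
print(right_censor([0, 3, 8, 12, 25]))  # [0, 3, 8, 8, 8]
```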
Negative Binomial Regression Topics
We discuss the following topics about negative binomial regression:
- Creating a Negative Binomial Regression model using Solver
- Real Statistics Data Analysis Tool (Solver option)
- Real Statistics Data Analysis Tool (Newton’s method option)
- Predictions
- Comparisons with a Poisson regression model
- Additional Insights