Overview
The multinomial distribution is a generalization of the binomial distribution to two or more events.
Definition 1: For an experiment with the following characteristics:
- the experiment consists of n independent trials
- each trial has k mutually exclusive outcomes Ei
- for each trial the probability of outcome Ei is pi
let x1 …, xk be discrete random variables whose values are the number of times outcome Ei occurs in n trials. Then x1 …, xk has a multinomial distribution. The (joint) probability distribution function (pdf) is defined as follows:
The case where k = 2 is equivalent to the binomial distribution.
Key properties of the multinomial distribution are
- E[xi] = npi
- var(xi) = npi(1–pi)
- cov(xi, xj) = –npipj for i ≠j
Example
Example 1: Suppose that a bag contains 8 balls: 3 red, 1 green, and 4 blue. You reach in the bag pull out a ball at random and then put the ball back in the bag and pull out another ball. This experiment is repeated a total of 10 times. What is the probability that the outcome will result in exactly 4 reds and 6 blues?
The possible outcomes for each trial in this experiment are E1Â = a red ball is drawn, E2Â = a green ball is drawn and E3Â = a blue ball is drawn. Thus p1Â = 3/8, p2 = 1/8, p3Â = 4/8, x1Â = 4, x2Â = 0 and x3 = 6, and so
Worksheet Functions
Excel Function: While Excel does not provide a function for the multinomial distribution, it does provide the following function:
MULTINOMIAL(x1 …, xk) = n! / (x1!∙…∙xk!)
Thus we could also calculate the answer to Example 9.10 by using the formula
MULTINOMIAL(4,0,6)*(3/8)^4*(1/8)^0*(4/8)^6 = .064888
We can also use a range as the argument of MULTINOMIAL as shown in Figure 1.
Figure 1 – Multinomial distribution
We can use the following Excel array formula to calculate the same result
=PRODUCT(MULTINOMIAL(B3:B5),B6:B8^B3:B5)
Alternatively, we can use the following more complicated non-array formula
=MULTINOMIAL(B3:B5)*EXP(SUMPRODUCT(B3:B5,LN(B6:B8)))
Real Statistics Function: The following function in the Real Statistics Resource Pack can be used to calculate the multinomial distribution.
MULTINOMDIST(R1, R2) = the value of the pdf of the multinomial distribution where R1 is a row or column array containing the values x1, …, xk and R2 is a row or column array containing the values p1, …, pk, where R1 and R2 have the same shape.
Referring to Figure 1, we have MULTINOMDIST(B3:B5,B6:B8) = 0.064888.
Examples Workbook
Click here to download the Excel workbook with the examples described on this webpage.
Reference
Wikipedia (2021) Multinomial distribution
https://en.wikipedia.org/wiki/Multinomial_distribution
Chen, Y-C. (2020) Multinomial distribution
http://faculty.washington.edu/yenchic/20A_stat512/Lec7_Multinomial.pdf
HI Charles.
Is there a way to calculate the inverse of multinomial distribution?
I’m thinking about a set of success values (s), which is a vector of different states of successes (si), each one with a probability of success, (pi). Need to calculate the total of failures before getting the number of successes desired.
An example (taken from Vose software) will provide clearer understanding:
You want to make sure that at the end of the week you have called 50 people who never heard of your product, 50 people who don’t have internet at home and 200 people who use internet almost daily.
With probabilities of success pi, and states, you can tell the number of failures you’ll have before you’ve called all the people you wanted, so you know the total number of phone calls.
In brief, Vose get the total number with a function:
The total number of phone calls = the total number of successes (300) + the total number of failures (NegMultinomial({50,50,200},{p1,p2,p3})).
How can the total number be calculated in Excel without Vose function?
Hello Johann,
Are you looking for the negative multinomial distribution as described on the following web page?
https://en.wikipedia.org/wiki/Negative_multinomial_distribution
Are you looking for what Vose calls type 1 or type 2?
Charles
Hello Charles, the wikipedia reference is correct, I want to know how to calculate total number of calls, result that Vose Type 1 would provide. I have not found a worked example.
Would be great to get the complimentary result of Vose Type 2 too, I guess need first the result of type 1, and then to get the number of failures in each state is straightforward.
Hello Johann,
Seems like an interesting concept. I will add it to my list of potential future enhancements.
Charles
Hi Charles, think I found an example of calculation for expectation of negative multinomial distribution. A fair 3 sided dice, and the number of expected rolls to get a specific result; however the question was answered using Markov chains.
https://math.stackexchange.com/questions/3307839/expectation-of-negative-multinomial-distribution
Hi Johann,
Thanks for sharing this. I will look into it.
Charles
Thank you for the tools very much!
Hello, thank you for the tools very much.
I use this function MULTINOMDIST(R1, R2) to get the pdf, then get the cdf by adding it up.
but the cdf does not approach 1 (in fact it approach 1.17).
I’m confused for this situation.
The parameters I used are:
(x1,x2,x3,x4,x5)=(1,1,1,1,n-4)
(p1,p2,p3,p4,p5)=(0.02,0.02,0.02,0.02,0.92)
for example:
when n=4, MULTINOMDIST(Rx, Rp)=0.00000384 ; cdf=0.00000384
when n=5, MULTINOMDIST(Rx, Rp)=0.000017664 ; cdf=0.000021504
….
when n=88, MULTINOMDIST(Rx, Rp)=0.007833858 ; cdf=0.994673755
when n=89, MULTINOMDIST(Rx, Rp)=0.007833858 ; cdf=1.002507613
…
when n=170, MULTINOMDIST(Rx, Rp)=0.000125678 ; cdf=1.169895198
Excuse me, could you help me with this question.
Thank you very very much, again.
There was a bug in the MULTINOMDIST function which was corrected yesterday.
I suggest that you download and install the new version.
Charles
sorry, i type wrong.
The parameters I used are:
(p1,p2,p3,p4,p5)=(0.02,0.02,0.02,0.02,0.92) <-correct (=in fact to use.)
So. i get the probability from the stated calculations… how do i get the confidence intervals for each proportion?Is it somewhere on the real statistics site?
The site contains lots of examples of how to calculate confidence interval. Please be more specific about the confidence interval that you are looking for. Is it for a proportion distribution? If so, see the webpage Proportion Distribution.
Charles
Hi Charles,
I found the answer to my question already. No need to spend time on it.
Thanks
Hi,
I’m interested in the answer.
What would be the formula for Confidence Interval in this case ?
Thanks
Hi,
If you have a proportional multinomial distribution with probabilities p1, p2, …, pk for mutually exclusive events E1, E2, …, Ek. Then for any pi you can look at this as a binomial distribution with p = pi. 1-p is therefore the sum of the pj’s excluding pi. The confidence interval for the population version of pi is therefore as calculated for the proportional binomial distribution. This is described on the following webpage using the normal distribution approximation.
https://real-statistics.com/binomial-and-related-distributions/proportion-distribution/
Later today I plan to add a slightly more accurate version of the confidence interval due to Edwin Wilson, which can be used for both the binomial and multionomial distributions.
Charles
Charles,
I need to sort the weights of a large sample of people within different intervals (110kg), from which I calculate proportions.
Then I compute the confidence intervals.
So I think that I can use the CI of the proportional binomial distribution. Right?
Thanks
(110kg)
less than 50;50-69;70-89;90-109; more than 110kg
Sorry, but I don’t have enough information to provide a response.
Charles
Hi Charles,
I have a question that relates to a multinomial distribution (not even 100% sure about this) that I hope you can help me with.
If I take a sample (lets assume n=400) on a categorical variable that has more than two possible outcomes (e.g. blue, black, green, yellow) and plot the frequencies so that I can get the probabilities. E.g.:
black 10%
blue 25%
green 35%
yellow 30%
How could I compute the 95% confidence interval for those probabilities?
Any help is highly appreciated.
Thanks a lot again for creating such a great resource.
Kind regards,
Dirk
Hi,
an alternative method to calculate the probability would be an array formula:
=PRODUCT(B9,B6:B8^B3:B5) + CTRL+SHIFT+ENTER
Hi Dirk,
Yes. I like your approach, especially since it is simpler than the one I wrote on the webpage. I have just added your suggestion to the referenced webpage. Thanks for your very useful comment.
Charles