Basic Concepts
Suppose there are n1 elements of one type (say 1) and n2 of the other type (say 2) and let x be the random variable equal to the number of runs. We now describe the (discrete) distribution function of x.
First, note that P(x = r) = the # of permutations that result in r runs divided by the # of permutations of Β 1βs and Β 2βs. The denominator is the number of ways of placing n1 1βs in the n1+n2 positions, i.e. C(n1+n2, n1). The formula for the numerator is more complicated.
We ignore the case where r = 1 since this only occurs if n1 = 0 or n2 = 0.
Even Case
Suppose that r is even. Thus r = 2m for some positive integer m. Ignoring the repetitions of the 2βs for a moment and assuming that the first number in the sequence is a 1 we count the number of sequences with r runs. Since 1 is in the first position 2 must be in the last position (for r to be even), and so the problem is equivalent to counting the number of ways of placing exactly one 2 after mβ1 of the n1β1Β 1βs. This is C(n1β1,Β mβ1).
E.g. if n1 = 7 and n2 = 4 then runs of length 8 can be achieved as follows
12111211212, 11121212112, etc.
If instead, the first number in the sequence is a 2 then the last number is a 1, and so the problem is equivalent to counting the number of ways of placing exactly one 2 before m-1 of the n1-1 1βs, which is again C(n1β1,Β mβ1). Thus if r is even we have 2β C(n1β1,Β mβ1)Β possibilities.
We next need to account for the fact that there can be multiple 2βs in sequence. The same logic shows that there are C(n2β1,Β mβ1) ways of distributing the 2βs (treating the sequences of 1βs as the spacers between sequences of 2βs).
Thus if r = 2mΒ there are 2β C(n1β1,Β mβ1) β C(n2β1,Β mβ1) ways of obtaining r runs. Thus
Odd number of runs
If r is odd then r = 2m+1 for some positive integer m. The approach is the same, except that this time sequences must begin and end with the same number. Suppose that the sequence begins (and ends) with a 1.Β Since we must insert one extra 2 (compared to the case where r is even), we have C(n1β1,Β m)Β possible sequences where a single 2 acts as a spacer between sequences of 1βs.
Once again there are C(n2β1,Β mβ1)Β ways of distributing the 2βs and so there are C(n1β1,Β m) β Β C(n2β1,Β mβ1) ways of obtaining r runs if the first in the sequence is a 1. Similarly, if the first in the sequence is a 2 then there are C(n2β1,Β m) β Β C(n1β1,Β mβ1)Β ways of obtaining r runs. Thus
Note that if n1 β₯ n2 then the maximum value of r is 2n2+1. Note that the above formula works fine as long as Β C(n2β1,Β m) = C(mβ1 m) = 0. Unfortunately, the Excel formula =COMBIN(a,b) returns an error if a < b.
Example
Example 1: Is 22211121121221111 a random sequence?
Here n1Β = 10, n2Β = 7 and r = 8. Using the above formulas we can create the worksheet shown in Figure 1.
Figure 1 β Runs Test
The maximum number of runs is 2n2+1 = 2(7)+1 = 15. Some sample formulas are shown in Figure 2.
Cells | Item | Formula |
B18 | r = 13 (odd) | =COMBIN(B$3-1,QUOTIENT(A18,2))*COMBIN(B$4-1,QUOTIENT(A18,2)-1)+COMBIN(B$4-1,QUOTIENT(A18,2))*COMBIN(B$3-1,QUOTIENT(A18,2)-1) |
B19 | r = 14 (even) | =2*COMBIN(B$3-1,QUOTIENT(A19,2)-1)*COMBIN(B$4-1,QUOTIENT(A19,2)-1) |
B20 | r = 15 (last) | =COMBIN(B$3-1,QUOTIENT(A20,2)) |
B21 | max | =SUM(B7:B20) or =COMBIN(B3+B4,B3) |
C18 | P(r = 13) | =B18/B21 |
D18 | cdf at r = 13 | =C18+D17 |
Figure 2 β Sample formulas from Figure 1
If we perform a two tailed test with Ξ± β .05, from Figure 1 we see that P(rΒ β€ 5) = .0245 and P(r β₯ 13) = 1 – .990 = .010, but P(rΒ β€ 6) = .080 and P(r β₯ 12) = 1 – .957 = .043.Β Thus, we canβt reject the null hypothesis that the sequence of values is random for any values of rΒ in the interval [6, 12], while we reject the null hypothesis for values of rΒ outside this interval. For Example 1, r = 8, which is inside this interval, and so we cannot reject the null hypothesis that the sequence is random.
Worksheet Functions
Real Statistics Functions: The following functions are provided in the Real Statistics Pack:
RUNSDIST(r, n1, n2, cum) = the probability density function value at r of the runs distribution (i.e. the probability that there are r runs in a sequence of n1 of one type and n2 of the other type) when cum = FALSE and the corresponding cumulative probability distribution value at r (i.e. the probability there are at most r runs) when cum = TRUE.
RUNSINV(p, n1, n2) = the critical value; i.e. the smallest value of r such that RUNSDIST(r, n1, n2,Β TRUE)Β β₯ p.
Observations
For the table lookup functions RLowerCRIT and RUpperCRIT, defined in Single Sample Runs Test, the following equalities hold.
RLowerCRIT(n1,Β n2,Β Ξ±, 1) = RUNSINV(Ξ±, n1,Β n2)
RUpperCRIT(n1,Β n2,Β Ξ±, 1) = RUNSINV(1βΞ±, n1,Β n2)
RLowerCRIT(n1,Β n2,Β Ξ±, 2) = RUNSINV(Ξ±/2, n1,Β n2)
RUpperCRIT(n1,Β n2,Β Ξ±, 2) = RUNSINV(1βΞ±/2, n1,Β n2)
Examples Workbook
Click hereΒ to download the Excel workbook with the examples described on this webpage.
Reference
Mehta, C. R. and Patel, N. R. (2013) IBM SPSS exact tests
https://www.ibm.com/docs/en/SSLVMB_27.0.0/pdf/en/IBM_SPSS_Exact_Tests.pdf
Hi,
If I have n1 + n2 + n3 (3 elements) is there any way to run the Runs Distribution in excel?
Hello Cameron,
Yes, there is. I will add such a test to the website in the next couple of days.
Charles
Many thanks Charles – I look forward to seeing it π
Hello Charles.
I have one more question. If n1 and n2 are equal (for example n1 = 10, n2 = 10) then maximum value of r is 2*n1 or 2*n2, am I right?
Joe
Hello Joe,
Yes, that is correct.
I just noticed that there is an error in the calculation of the Real Statistics function RUNSDIST(r,n1,n2,cum) when n1 is not equal to n2 and r = 2*n1 + 1 (assuming n1 < n2). This will be corrected in the next release of the Real Statistics software, due out in a few days. Charles
Hello,
this is great site but I have two remarks.
1) If r is odd the correct equation is r = 2m β 1.
2) Figure 1
How did You calculate the value in C20? I am asking cause one part of the equation have to be C(n2-1;m-1) = C(7-1,(15+1)/2-1) = C(6,7) = error so the C20 is error and no number one. Or am I missing something?
Thank you for explanation π
Hello Joe,
1) r = 2m + 1, as stated on the webpage
2) The formula in cell C20 is =B20/B21
Charles
Hello Charles.
Thank You for your fast answer.
I calculated the cells accordingly with publication of “Swed and Eisenhart – Tables for testing randomness of grouping in a sequence of alternatives” and it leads to the same results as on figure 1 accept cell C20 and, of course, D20. The way I used leads to the error in the cells, but It seems like your way leads to the result. Very interesting! π
There is r = 2m β 1 in case of r = odd in the article I mentioned above, so probably, in your way, there should really be r = 2m + 1. π
Thank you for this web! It helps me a lot to understand a statistics. π
Hello Joe,
Glad that I could help. I am using r = 2m + 1 and not r = 2m – 1.
Charles