Runs Test with more than two categories

Basic Concepts

In the test described in One-Sample Runs Test, there are two categories (e.g. T and H, or 1 and 2). We now consider the case where there are three or more categories, which we label 1, 2, 3, … For a sequence containing k categories, suppose category j occurs n_j times, so that the length of the sequence is n = n_1 + … + n_k. If the null hypothesis that the sequence is random is true, then for sufficiently large n the number of runs r approximately follows a normal distribution with

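mean

$$\mu = \frac{n(n+1) - \sum_{j=1}^{k} n_j^2}{n}$$

and variance

$$\sigma^2 = \frac{\sum_{j=1}^{k} n_j^2 \left[\sum_{j=1}^{k} n_j^2 + n(n+1)\right] - 2n \sum_{j=1}^{k} n_j^3 - n^3}{n^2 (n-1)}$$

where each sum is taken over the k categories j = 1, …, k.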
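The mean and variance of the number of runs can be sketched in a few lines of Python. This is only an illustration of the standard k-category formulas (mean = [n(n+1) − Σn_j²]/n and variance = [Σn_j²(Σn_j² + n(n+1)) − 2nΣn_j³ − n³]/[n²(n−1)]); the function name runs_moments and the category counts 3, 4, 5 are made up for the example and are not part of the Real Statistics software.

```python
def runs_moments(counts):
    """Mean and variance of the number of runs under the null hypothesis.

    counts: the category frequencies n_1, ..., n_k (hypothetical input format).
    """
    counts = list(counts)
    n = sum(counts)                      # total sequence length
    s2 = sum(c ** 2 for c in counts)     # sum of n_j^2
    s3 = sum(c ** 3 for c in counts)     # sum of n_j^3
    mean = (n * (n + 1) - s2) / n
    var = (s2 * (s2 + n * (n + 1)) - 2 * n * s3 - n ** 3) / (n ** 2 * (n - 1))
    return mean, var


# Illustrative counts for three categories (not taken from the article)
mean, var = runs_moments([3, 4, 5])
print(mean, var)
```

With only two categories, the same function reproduces the familiar two-category values, e.g. runs_moments([2, 1]) gives a mean of 7/3 and a variance of 2/9.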

Example

Example 1: Determine whether the following sequence is random.

ABBCCBAACCACBBAABBBC

We see that there are 12 runs: A BB CC B AA CC A C BB AA BBB C
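The run count can be checked mechanically. The following is a minimal sketch using Python's itertools.groupby, which groups consecutive equal items; the helper name split_runs is hypothetical.

```python
from itertools import groupby


def split_runs(seq):
    """Split a sequence of category labels into its maximal runs."""
    # groupby starts a new group each time the label changes
    return ["".join(group) for _, group in groupby(seq)]


runs = split_runs("ABBCCBAACCACBBAABBBC")
print(len(runs))       # 12
print(" ".join(runs))  # A BB CC B AA CC A C BB AA BBB C
```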

Since the p-value for the two-tailed test is greater than the significance level α = .05, we cannot reject the null hypothesis that the sequence is random. The analysis is shown in Figure 1.
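For readers who want to reproduce the analysis outside Excel, here is a rough Python sketch of the whole test. It assumes the plain normal approximation with no continuity correction, and it treats a one-tailed test as the tail in the direction of the observed z; both are assumptions, so the result may differ slightly from the spreadsheet in Figure 1. The function name runs_test_pvalue is hypothetical and is not the Real Statistics RUNS_TEST function.

```python
import math
from collections import Counter
from itertools import groupby


def runs_test_pvalue(seq, tails=2):
    """Normal-approximation runs test for a sequence with k >= 2 categories.

    Returns (number of runs, z statistic, p-value).
    Assumes at least two distinct categories occur in seq.
    """
    counts = list(Counter(seq).values())
    n = len(seq)
    s2 = sum(c ** 2 for c in counts)
    s3 = sum(c ** 3 for c in counts)
    mean = (n * (n + 1) - s2) / n
    var = (s2 * (s2 + n * (n + 1)) - 2 * n * s3 - n ** 3) / (n ** 2 * (n - 1))
    r = sum(1 for _ in groupby(seq))                 # observed number of runs
    z = (r - mean) / math.sqrt(var)
    tail = 0.5 * math.erfc(abs(z) / math.sqrt(2))    # normal area beyond |z|
    return r, z, min(1.0, tails * tail)


r, z, p = runs_test_pvalue("ABBCCBAACCACBBAABBBC")
print(r, round(z, 3), round(p, 3))
```

Under these assumptions, the sequence in Example 1 has 12 runs against an expected 14.2, giving a two-tailed p-value of roughly .27, which is consistent with the conclusion above.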

Figure 1 – Runs Test

Note that to obtain the values in column B, we can insert the formula =LEN(I$8)-LEN(SUBSTITUTE(I$8,A3,"")) in cell B3, highlight range B3:B5, and press Ctrl-D.
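The LEN/SUBSTITUTE formula counts a category by deleting every occurrence of its label and measuring how much shorter the string becomes. The same idea can be sketched in Python; this assumes that cell I8 holds the sequence from Example 1 and that cells A3:A5 hold the labels A, B and C, since the figure itself is not reproduced here. The helper name count_by_removal is hypothetical.

```python
s = "ABBCCBAACCACBBAABBBC"   # assumed contents of cell I8


def count_by_removal(s, label):
    """Count occurrences of a one-character label, as LEN(s)-LEN(SUBSTITUTE(s,label,"")) does."""
    # Removing every occurrence of label shortens s by exactly the number of occurrences
    return len(s) - len(s.replace(label, ""))


counts = {label: count_by_removal(s, label) for label in "ABC"}
print(counts)   # {'A': 6, 'B': 8, 'C': 6}
```

This trick relies on each label being a single character; for longer labels the length difference would have to be divided by the label length.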

Worksheet Functions

Real Statistics Functions: The Real Statistics Resource Pack provides the following worksheet functions.

RUNS_TEST(s, tails) = the p-value of the above runs test for the sequence defined in the string s.

RUNS_TEST(R1, tails) = the p-value of the above runs test for the sequence defined by the column or row array or range R1.

RUNS(s) = the number of runs in the string s.

RUNS(R1) = the number of runs in the row or column array or range R1.

tails = 1 or 2 (default).

To obtain the runs count in cell C2 of Figure 1, we can use the formula =RUNS(I8), and to obtain the p-value in cell G6, we can use the formula =RUNS_TEST(I8).

Observation

When there are only two categories, the p-value from the runs test described on this webpage matches the p-value from the traditional two-category runs test described in One-Sample Runs Test.

For example, the p-value obtained for Example 1 in One-Sample Runs Test is .795676, which can be obtained using the RUNSTEST function. Using the RUNS_TEST function described on this webpage we obtain the same p-value.
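One way to see why the two tests agree is to set k = 2 in the k-category mean and variance. The short derivation below is a sketch; it assumes the k-category formulas are μ = [n(n+1) − Σn_j²]/n and σ² = [Σn_j²(Σn_j² + n(n+1)) − 2nΣn_j³ − n³]/[n²(n−1)], and that the two-category test uses the usual normal approximation.

```latex
% With k = 2 and n = n_1 + n_2:
\sum_j n_j^2 = n^2 - 2 n_1 n_2, \qquad
\sum_j n_j^3 = n^3 - 3 n\, n_1 n_2

% Mean
\mu = \frac{n(n+1) - (n^2 - 2 n_1 n_2)}{n} = \frac{2 n_1 n_2}{n} + 1

% Variance: the numerator simplifies to 2 n_1 n_2 (2 n_1 n_2 - n)
\sigma^2 = \frac{(n^2 - 2 n_1 n_2)(2n^2 + n - 2 n_1 n_2) - 2n(n^3 - 3 n\, n_1 n_2) - n^3}{n^2 (n-1)}
         = \frac{2 n_1 n_2 \,(2 n_1 n_2 - n)}{n^2 (n-1)}
```

These are the familiar mean and variance of the number of runs for two categories, so the z statistic, and hence the normal-approximation p-value, is the same in both tests.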

Reference

Sheskin, D. J. (2000) Handbook of parametric and nonparametric statistical procedures 2nd Ed. Chapman & Hall/CRC
https://psycnet.apa.org/record/2000-08524-000

2 thoughts on “Runs Test with more than two categories”

    • Hello Cameron,
      Yes, the p-value is approximate. I have not tried to produce an exact p-value. I assume that the approach is similar to that for two categories, but there will be more possibilities.
      Charles
