Boschloo Exact Test

Objective

We now explore the Boschloo Exact Test, another exact test for 2 × 2 contingency tables. We start by reviewing the Fisher Exact Test for 2 × 2 contingency tables.

Fisher’s Exact Test

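A 2 × 2 contingency table takes the following form, where m1 and m2 are the row totals, n1 and n2 are the column totals, and n is the grand total:

             Column 1   Column 2   Total
Row 1        n11        n12        m1
Row 2        n21        n22        m2
Total        n1         n2         n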

Here, we assume that n11 is the smallest of the four shaded cells. In this case, the p-value for the one-sided test is

p-value = HYPGEOM.DIST(n11, n1, m1, n, TRUE)

Note that we can exchange the n1 and m1 values in the above formula and get the same result.
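Written out, the cumulative hypergeometric probability returned by HYPGEOM.DIST takes the following form. This is a sketch in standard notation, using the convention that a binomial coefficient is zero when its lower index is out of range; the random variable X (the count in the upper left cell) is introduced here only for illustration.

```latex
\text{p-value} = P(X \le n_{11})
  = \sum_{k=0}^{n_{11}} \frac{\binom{m_1}{k}\binom{n-m_1}{n_1-k}}{\binom{n}{n_1}}
```

The exchange of n1 and m1 works because each term satisfies

```latex
\frac{\binom{m_1}{k}\binom{n-m_1}{n_1-k}}{\binom{n}{n_1}}
  = \frac{\binom{n_1}{k}\binom{n-n_1}{m_1-k}}{\binom{n}{m_1}}
```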

To obtain the two-sided test, we need to find the equivalent right side of the test. This is found by first calculating

x = HYPGEOM.DIST(n11, n1, m1, n, FALSE)

Now assume that n1 ≤ m1 (if not, we reverse their roles). We now need to find the smallest value of m such that

HYPGEOM.DIST(n1-m, n1, m1, n, FALSE) > x

Thus, the equivalent right tail begins at n1-m+1. Let p be the one-tailed p-value and let

q = 1-HYPGEOM.DIST(n1-m, n1, m1, n, TRUE)

be the probability of observing n1-m+1 or more. Then the two-sided p-value is

p-value = p + q
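The calculation above can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the Real Statistics implementation; the function names hyp_pmf and fisher_p are hypothetical, the right tail is taken to be the probability of n1-m+1 or more, and the counts in the usage lines are chosen purely for illustration.

```python
from math import comb

def hyp_pmf(k, n1, m1, n):
    """P(X = k); same quantity as HYPGEOM.DIST(k, n1, m1, n, FALSE)."""
    if k < max(0, n1 + m1 - n) or k > min(n1, m1):
        return 0.0
    return comb(m1, k) * comb(n - m1, n1 - k) / comb(n, n1)

def fisher_p(n11, n1, m1, n, tails=2):
    """Fisher p-value, assuming n11 is the smallest cell (hypothetical helper)."""
    if n1 > m1:                      # ensure n1 <= m1, as in the text
        n1, m1 = m1, n1
    # one-sided p-value: HYPGEOM.DIST(n11, n1, m1, n, TRUE)
    left = sum(hyp_pmf(k, n1, m1, n) for k in range(n11 + 1))
    if tails == 1:
        return left
    x = hyp_pmf(n11, n1, m1, n)
    # smallest m with pmf(n1 - m) > x; a small relative tolerance guards
    # against floating-point ties
    m = 0
    while n1 - m > n11 and hyp_pmf(n1 - m, n1, m1, n) <= x * (1 + 1e-9):
        m += 1
    # right tail: probability of n1 - m + 1 or more
    right = sum(hyp_pmf(k, n1, m1, n) for k in range(n1 - m + 1, n1 + 1))
    return left + right

# Illustrative counts: n11 = 2, n1 = 9, m1 = 11, n = 23
print(round(fisher_p(2, 9, 11, 23, tails=1), 6))  # 0.060237
print(round(fisher_p(2, 9, 11, 23, tails=2), 6))  # 0.089379
```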

It turns out that Fisher’s exact test is conservative. This means that for any given significance level alpha, the actual type I error rate of the test may be lower than alpha. As a result, it is harder to obtain a significant result, which reduces the power of the test. Essentially, we need to find the real significance level.

Example

Example 1: A new treatment (Treat 1) is being tested against its predecessor (Treat 2) to determine whether it is more effective in treating cancer, where effectiveness is defined as the patient being alive after 3 years. 24 patients were studied, half selected randomly to receive the new drug and half given the old drug. Before the study began, one patient in the Treat 1 group left the study, leaving 11 patients in Treat 1 and 12 in Treat 2. The results are summarized in Figure 1:

Figure 1 – Example contingency table

The null and alternative hypotheses are described as follows:

One-sided test

H0: Treatments and Die/Live are independent

H1: Treat 1 is more effective than Treat 2 (Live is more likely)

Two-sided test

H0: Treatments and Die/Live are independent

H1: Treatments are not independent (i.e. treatments are not equally effective)

In Fisher’s exact test, we essentially consider 10 contingency tables where cell n11 takes values 0, 1, 2, …, 9 with row and column totals fixed. Using the FISHER22 worksheet function, as explained below, we obtain the one-tailed p-value = .060237 or two-tailed p-value = .089379.

Worksheet Functions

The FISHERTEST worksheet function, described in Fisher’s Exact Test, performs Fisher’s exact test in Excel. Starting with Rel 9.8, we add the following worksheet functions.

FISHER2x2(R1, tails) = p-value of the Fisher exact test for the 2 × 2 contingency table in R1 (without headings or totals)

FISHER22(n11, n21, m1, m2, tails) = p-value of the Fisher exact test for the 2 × 2 table where n11, n21, m1, and m2 are as defined above.

Here, tails = 1 or 2 (default).

Boschloo’s Exact Test

The key assumption for Fisher’s Exact Test is that the row and column totals are known; i.e. m1, m2, n1, and n2 are known (and therefore n is known). This is not usually the situation. A more common situation occurs when the row totals are known, but the column totals are not known (or vice versa). This is the assumption for Boschloo’s Exact Test.

We model the situation as follows

n11 ∼ Binom(m1, π1)          n21 ∼ Binom(m2, π2)

for some π1, π2 in the interval [0, 1]. The null hypothesis is that

H0: π1 = π2

Under the null hypothesis, we let π be the common value of π1 and π2. Here, π is unknown.

Step 1: Calculate the p-value f0 of the Fisher Exact test for the original contingency table.

Step 2: For each candidate value p of π, perform the Fisher Exact tests replacing n11 by the values i = 0, 1, …, m1 and replacing n21 by j = 0, 1, …, m2. This results in (m1 + 1)(m2 + 1) such tests.

For each (i, j), we first test whether f ≤ f0, where f is the p-value of the Fisher exact test for the (i, j) contingency table. If so, we then calculate the probability q that the table with (i, j) occurs, assuming independence, namely

=BINOM.DIST(i, m1, p, FALSE) * BINOM.DIST(j, m2, p, FALSE)

We sum up all the qualifying q and divide by 2. This is the potential one-sided Boschloo p-value associated with p.
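Steps 1 and 2 for a single candidate p can be sketched in Python as follows. This is only an illustration under stated assumptions: the one-sided Fisher p-value of a table is assumed to be the smaller of its two hypergeometric tails, which may differ from what the FISHER22 worksheet function returns; the names binom_pmf, fisher_one_sided and potential_p are hypothetical; and the counts (n11 = 2, n21 = 7, m1 = 11, m2 = 12) are an assumption chosen to be consistent with the row totals and the one-tailed Fisher p-value reported for Example 1.

```python
from math import comb

def binom_pmf(k, m, p):
    """Same quantity as BINOM.DIST(k, m, p, FALSE)."""
    return comb(m, k) * p**k * (1 - p)**(m - k)

def fisher_one_sided(i, j, m1, m2):
    """ASSUMED one-sided Fisher p-value: the smaller of the two tails."""
    n1 = i + j                                   # first column total
    denom = comb(m1 + m2, n1)
    lo, hi = max(0, n1 - m2), min(n1, m1)        # feasible upper-left values
    pmf = {k: comb(m1, k) * comb(m2, n1 - k) / denom for k in range(lo, hi + 1)}
    left = sum(v for k, v in pmf.items() if k <= i)
    right = sum(v for k, v in pmf.items() if k >= i)
    return min(left, right)

def potential_p(n11, n21, m1, m2, p, tol=1e-12):
    """Potential one-sided Boschloo p-value for one candidate p."""
    f0 = fisher_one_sided(n11, n21, m1, m2)      # Step 1
    total = 0.0
    for i in range(m1 + 1):                      # Step 2: all (i, j) tables
        for j in range(m2 + 1):
            if fisher_one_sided(i, j, m1, m2) <= f0 + tol:
                total += binom_pmf(i, m1, p) * binom_pmf(j, m2, p)
    return total / 2                             # divide by 2, as in the text

# Assumed counts for Example 1; p = 0.25 is an arbitrary candidate value
print(potential_p(2, 7, 11, 12, 0.25))
```

The small tolerance tol is a design choice to keep tables whose Fisher p-value equals f0 from being dropped because of floating-point rounding.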

Test p-value

The p-value of the Boschloo test is the maximum of these potential Boschloo p-values; i.e. the maximum for all values of p in the interval [0, 1].
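In symbols, the procedure described above can be summarized as follows. The notation B(p) for the potential one-sided p-value and f(i, j) for the Fisher p-value of the (i, j) table is introduced here for illustration only.

```latex
B(p) = \frac{1}{2} \sum_{i=0}^{m_1} \sum_{j=0}^{m_2}
       \mathbf{1}\!\left[ f(i,j) \le f_0 \right]
       \binom{m_1}{i} \binom{m_2}{j} \, p^{\,i+j} (1-p)^{\,m_1+m_2-i-j},
\qquad
\text{p-value} = \max_{0 \le p \le 1} B(p)
```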

One approach for doing this is to use 0, .01, .02, …, .98, .99 as the candidate values for p, and take the maximum of the corresponding potential Boschloo p-values.

It turns out that if we always rearrange contingency tables so that the smallest entry is in the upper left-hand corner, then we only need to consider values of p between 0 and .5. Thus, we need to explore p = 0, .01, …, .50 (51 values instead of 100).

Example

In Fisher’s exact test for Example 1, we essentially consider 10 contingency tables where cell n11 takes values 0, 1, 2, …, 9 with row and column totals fixed. Using the formula =FISHER22(B2,B3,D2,D3,1), we obtain the one-tailed p-value = .060237, as shown in cell C6 of Figure 2.

Figure 2 – Boschloo test, step 1

For this example, only the row totals are fixed. Therefore, we need to consider the 12 × 13 = 156 tables where n11 varies from 0 to 11 (m1) and n21 varies from 0 to 12 (m2). We use the Real Statistics formula =SEQ2S(0,D2,0,D3) to obtain the 156 pairs of (n11, n21) values, as shown in columns F and G of Figure 3 (only the first 9 and last 14 rows are displayed).
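For readers working outside Excel, the same set of pairs can be generated in Python as sketched below. The row order produced by SEQ2S is not stated here, so the ordering used in this sketch (first value varying slowest) is an assumption.

```python
from itertools import product

m1, m2 = 11, 12   # row totals for Example 1

# All (n11, n21) pairs with 0 <= n11 <= m1 and 0 <= n21 <= m2
pairs = list(product(range(m1 + 1), range(m2 + 1)))

print(len(pairs))   # 156
print(pairs[:3])    # [(0, 0), (0, 1), (0, 2)]
print(pairs[-1])    # (11, 12)
```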

For each such table, we compute the one-sided Fisher’s exact test for that table as shown in column H. This is done by placing the formula =FISHER22(F2,G2,$D$2,$D$3,1) in cell H2, highlighting range H2:H157, and pressing Ctrl-D.

We now obtain the binomial probabilities for each of the 156 entries, as shown in column I, where we only include those entries whose Fisher p-value from column H is less than or equal to the Fisher p-value for the original contingency table, as shown in cell C6 of Figure 2. This is accomplished by placing the formula

=IF(H2<=$C$6,BINOM.DIST(F2,$D$2,$C$7,FALSE)*BINOM.DIST(G2,$D$3,$C$7,FALSE),0)

in cell I2, highlighting range I2:I157, and pressing Ctrl-D.

Figure 3 – Boschloo test, step 2

Result

Finally, we sum the entries in column I and divide by 2 using the formula =SUM(I2:I157)/2 to obtain the potential one-sided Boschloo p-value = .02274, as shown in cell I158.

We need to do this for values of p in cell C7 from 0 to .5. As explained above, we do this for 51 values of p = 0, .01, .02, …, .49, .50. When we do so, we find that we obtain the highest Boschloo p-value when we use the candidate binomial p = .31. Changing cell C7 to .31 yields the one-sided Boschloo p-value of .027531 in cell I158.
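The search over the 51 candidate values can be sketched in Python as follows. Since the Fisher p-values in column H do not depend on p, the sketch finds the qualifying tables once and then re-evaluates only the binomial sum for each candidate. The assumptions are labeled in the code: the one-sided Fisher p-value is taken to be the smaller of the two hypergeometric tails, the names min_tail and boschloo_grid are hypothetical, and the counts (2, 7, 11, 12) are assumed values consistent with the row totals of Example 1. Whether the output matches the values reported above depends on how FISHER22 defines its one-tailed p-value.

```python
from math import comb

def min_tail(i, j, m1, m2):
    """ASSUMED one-sided Fisher p-value: smaller of the two tails."""
    n1 = i + j
    denom = comb(m1 + m2, n1)
    lo, hi = max(0, n1 - m2), min(n1, m1)
    pmf = {k: comb(m1, k) * comb(m2, n1 - k) / denom for k in range(lo, hi + 1)}
    left = sum(v for k, v in pmf.items() if k <= i)
    right = sum(v for k, v in pmf.items() if k >= i)
    return min(left, right)

def boschloo_grid(n11, n21, m1, m2, tol=1e-12):
    """Maximize the potential p-value over p = 0, .01, ..., .50."""
    f0 = min_tail(n11, n21, m1, m2)
    # Qualifying tables do not depend on p, so find them once
    qualifying = [(i, j)
                  for i in range(m1 + 1) for j in range(m2 + 1)
                  if min_tail(i, j, m1, m2) <= f0 + tol]

    def potential(p):
        total = sum(comb(m1, i) * comb(m2, j)
                    * p**(i + j) * (1 - p)**(m1 + m2 - i - j)
                    for i, j in qualifying)
        return total / 2             # divide by 2, as in the text

    candidates = [k / 100 for k in range(51)]
    best_p = max(candidates, key=potential)
    return best_p, potential(best_p)

# Assumed counts for Example 1
best_p, best_value = boschloo_grid(2, 7, 11, 12)
print(best_p, best_value)
```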

The two-sided Boschloo p-value is double this, namely .055061. Another approach is to use the same method as described above, but instead of using the one-sided Fisher’s exact test p-values, we use the two-sided Fisher’s exact test p-values. In this case, we don’t divide the resulting sum by two. For Example 1, this results in p-value = .063075.

Worksheet Function

Starting with Rel 9.8, the Real Statistics Resource Pack provides the following worksheet function.

BOSCHLOO(R1,tails) = p-value of the Boschloo exact test for the 2 × 2 contingency table in R1; tails = 1 or 2 (default).

For Example 1, =BOSCHLOO(B2:C3) yields p-value = .055061, while =BOSCHLOO(B2:C3, 1) yields p-value = .027531.

tails can also take the value -2, in which case, we use the alternative approach for calculating the two-sided test, obtaining p-value = .063075.

Examples Workbook

Click here to download the Excel workbook with the examples described on this webpage.

References

Boschloo, R. D. (1970) Raised conditional level of significance for the 2 × 2 table when testing the equality of two probabilities. Statistica Neerlandica 24, 1-35.
https://ir.cwi.nl/pub/8128/8128D.pdf

Vanhove, J. (2024) Exact significance tests for 2 × 2 tables
https://janhove.github.io/posts/2024-09-10-contingency-p-value/

Lydersen, S., Fagerland, M. W., Laake, P. (2009) Recommended tests for association in 2×2 tables
https://pubmed.ncbi.nlm.nih.gov/19170020/

Metricgate (2026) Fisher-Boschloo exact test
https://metricgate.com/docs/fisher-boschloo-exact-test/
