Bayesian Mann-Whitney Test

Objective

Describes the Bayesian version of the non-parametric Mann-Whitney test. This test is typically used when the assumptions of the two independent sample t-test are not met.

Basic Concepts

Suppose that you have two independent samples X = x1, …, xm and Y = y1, …, yn, and want to test the hypotheses

H0: μX > μY

H1: μX ≤ μY

We define

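$$U_X = \sum_{i=1}^{m} \sum_{j=1}^{n} I(x_i > y_j)$$

$$U_Y = \sum_{i=1}^{m} \sum_{j=1}^{n} I(y_j > x_i)$$

where I(·) equals 1 when the condition holds and 0 otherwise.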

We next define the population parameters

$$\theta_X = P(X > Y), \qquad \theta_Y = P(Y > X)$$

Since θY = 1 – θX, we only need to consider θX. If X is stochastically dominant then θX > .5, while if θX < .5 then Y is stochastically dominant.
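As a rough illustration, the sample statistics can be computed by direct pair counting. This sketch assumes that UX counts the pairs (xi, yj) with xi > yj, that UY counts the pairs with yj > xi, and that the data contain no ties. The function name and the data values are made up for the example.

```python
def u_statistics(x, y):
    """Return (U_X, U_Y) by counting pairwise comparisons.

    Assumes U_X = number of pairs with x_i > y_j and
    U_Y = number of pairs with y_j > x_i (no ties expected).
    """
    u_x = sum(1 for xi in x for yj in y if xi > yj)
    u_y = sum(1 for xi in x for yj in y if yj > xi)
    return u_x, u_y


# Hypothetical data, for illustration only
x = [1.2, 3.4, 5.6]
y = [0.5, 2.0, 4.1, 6.3]
u_x, u_y = u_statistics(x, y)

# Proportion of pairs won by X: a simple point estimate of theta_X
theta_hat = u_x / (u_x + u_y)
```

With no ties, UX + UY = mn, so the proportion UX/(mn) is a natural sample counterpart of θX.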

Small Sample Approach

The test consists of performing Monte Carlo sampling for each of 200 discrete values of θX ranging from .0025 to .9975 in increments of .005. We will call these θi where θi = .0025 + .005(i–1) for i = 1 to 200.

For each of the 200 values of θi, we create a large number of samples (default iter = 30000) of size m from X and of size n from Y. This is done using a little trick. Each Y sample consists of n random values from the exponential distribution f(y) = kexp(-ky) with k = 1, while each X sample consists of m random values from the exponential distribution with k = (1 – θi)/θi. For each pair of samples, we calculate the value of UX and see whether it matches the UX value of the original sample. The proportion of matches is an estimate of the likelihood function P(UX|θi).
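The reason this choice of k works can be checked with a short calculation. The sketch below assumes that θX denotes P(X > Y) and that the two samples are independent; the symbol k_X for the rate of the X distribution is introduced here only for the derivation.

```latex
% X ~ exponential with rate k_X, Y ~ exponential with rate 1, independent
P(X > Y) = \int_0^\infty f_Y(y)\, P(X > y)\, dy
         = \int_0^\infty e^{-y}\, e^{-k_X y}\, dy
         = \frac{1}{1 + k_X}

% Substituting k_X = (1 - \theta_i)/\theta_i
\frac{1}{1 + \frac{1 - \theta_i}{\theta_i}}
  = \frac{\theta_i}{\theta_i + 1 - \theta_i}
  = \theta_i
```

So each simulated pair of samples comes from populations for which P(X > Y) is exactly θi.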

The prior assigns equal probability to each θi, namely P(θi) = 1/200.

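Using Bayes' Theorem, it follows that

$$P(\theta_i \mid U_X) = \frac{P(U_X \mid \theta_i)\,P(\theta_i)}{\sum_{j=1}^{200} P(U_X \mid \theta_j)\,P(\theta_j)} = \frac{P(U_X \mid \theta_i)}{\sum_{j=1}^{200} P(U_X \mid \theta_j)}$$

where the second equality holds because the uniform prior P(θj) = 1/200 cancels.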

In summary, for each θi we draw iter pairs of samples from distributions whose θX value is θi, and then count the proportion of sample pairs whose UX value equals that of the original sample. This proportion is our estimate of the likelihood P(UX|θi).
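The whole small sample procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Real Statistics implementation: UX is taken to be the number of pairs with xi > yj, the function names are hypothetical, and the posterior is obtained by normalizing the estimated likelihoods (the uniform prior cancels).

```python
import random


def u_x_statistic(x, y):
    """U_X, assumed here to be the number of pairs with x_i > y_j."""
    return sum(1 for xi in x for yj in y if xi > yj)


def small_sample_posterior(u_obs, m, n, iters=30000, seed=None):
    """Estimate the discrete posterior P(theta_i | U_X) on the 200-point grid."""
    rng = random.Random(seed)
    thetas = [0.0025 + 0.005 * i for i in range(200)]
    likelihood = []
    for theta in thetas:
        rate_x = (1.0 - theta) / theta  # rate k for the X samples
        matches = 0
        for _ in range(iters):
            xs = [rng.expovariate(rate_x) for _ in range(m)]
            ys = [rng.expovariate(1.0) for _ in range(n)]  # rate k = 1
            if u_x_statistic(xs, ys) == u_obs:
                matches += 1
        likelihood.append(matches / iters)  # estimate of P(U_X | theta_i)
    total = sum(likelihood)
    if total == 0:
        raise ValueError("no simulated sample matched u_obs; increase iters")
    # Uniform prior 1/200 cancels, so the posterior is the normalized likelihood
    posterior = [lik / total for lik in likelihood]
    return thetas, posterior


# Small iters keeps the demonstration fast; the text uses 30000 by default
thetas, posterior = small_sample_posterior(u_obs=12, m=3, n=4, iters=200, seed=1)
prob_x_dominant = sum(p for t, p in zip(thetas, posterior) if t > 0.5)
```

Pure Python loops are slow at the default iter = 30000, so a practical version would probably vectorize the sampling. Summing the posterior over the grid points above .5 gives the posterior probability that X is stochastically dominant.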

Large Sample Approximation

Let h = 2mn/(m+n) be the harmonic mean of m and n. If h > 20, then we can use the following beta distribution approximation.

Define the beta distribution parameters as

a = θ[h(1.028 + .75u) + 2]

b = (1–θ)[h(1.028 + .75u) + 2]

where

Formula for u

Here θ is a large sample approximation for θX. Obviously, θ = a/(a+b), as expected for the mean of a beta distribution.
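The following sketch shows how a and b could be computed once θ and u are known. It treats θ and u as inputs, since their formulas are defined separately, and the function names and numeric values are hypothetical.

```python
def harmonic_mean(m, n):
    """Harmonic mean of the two sample sizes: h = 2mn/(m+n)."""
    return 2.0 * m * n / (m + n)


def beta_parameters(theta, u, m, n):
    """Return (a, b) for the large sample beta approximation.

    theta and u are assumed to have been computed elsewhere.
    """
    h = harmonic_mean(m, n)
    if h <= 20:
        raise ValueError("large sample approximation requires h > 20")
    scale = h * (1.028 + 0.75 * u) + 2.0  # this equals a + b
    return theta * scale, (1.0 - theta) * scale


# Hypothetical inputs, for illustration only
a, b = beta_parameters(theta=0.7, u=0.4, m=30, n=30)
assert abs(a / (a + b) - 0.7) < 1e-12  # mean of the beta distribution is theta
```

Since a + b = h(1.028 + .75u) + 2, the ratio a/(a+b) reduces to θ regardless of the values of h and u.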

We now show how to estimate θ. First, we define x

Formula for x

and X = (1, x, x², x³, x⁴, x⁵). Note that x ≥ .5.

Next, define Y = (.5, y1, y2, y3, y4, y5) where

y_5 formula

For i = 1, 2, 3, 4 define

y_i formula

where

w_i formula

with V = (4.813, 2.520, 2.111, 1.833).

We now define the 6 × 6 matrix L

L matrix

Then our estimate for y (and so θ) is equal to the following matrix multiplication value

Formula for y

Finally, our estimate θ of θX is

Final formula for theta

The above estimates for the posterior beta parameters a and b are based on a uniform prior. For a beta prior Bet(a0, b0), the posterior beta distribution is Bet(a+a0–1, b+b0–1) where a and b are as described above.
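As an illustration of the prior update, the sketch below combines a and b with a beta prior and then approximates the posterior probability that θX > .5 by a midpoint-rule integration of the beta density. The function names and the choice of numerical integration are assumptions made for this example, and the integration is only reliable when both posterior parameters are at least 1.

```python
import math


def posterior_with_beta_prior(a, b, a0, b0):
    """Posterior beta parameters for a Bet(a0, b0) prior."""
    return a + a0 - 1.0, b + b0 - 1.0


def beta_pdf(t, a, b):
    """Beta density at 0 < t < 1, computed on the log scale for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(t)
                    + (b - 1.0) * math.log(1.0 - t))


def prob_theta_above_half(a, b, steps=20000):
    """Midpoint-rule approximation of P(theta > .5) under Bet(a, b)."""
    width = 0.5 / steps
    return width * sum(beta_pdf(0.5 + (k + 0.5) * width, a, b)
                       for k in range(steps))


# Hypothetical values: a, b from a uniform prior, updated with a Bet(2, 2) prior
a_post, b_post = posterior_with_beta_prior(29.288, 12.552, a0=2, b0=2)
p_dominant = prob_theta_above_half(a_post, b_post)
```

With a uniform prior, a0 = b0 = 1 and the parameters are unchanged, which is consistent with the statement above.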

Examples and Excel Support

Click here for examples of how to carry out the Bayesian Mann-Whitney Test in Excel. Also includes a description of various worksheet functions in support of the Bayesian Mann-Whitney Test, as well as an Excel-based data analysis tool.

References

Kruschke, J. K. (2015) Doing Bayesian data analysis. 2nd Ed. Elsevier
https://sites.google.com/site/doingbayesiandataanalysis/

Chechile, R. A., Barch, D. H. Jr. (2025) Distribution-free Bayesian analyses with the DFBA statistical package
https://link.springer.com/article/10.3758/s13428-025-02605-6

Chechile, R. A. (2019) A Bayesian analysis for the Mann-Whitney statistic
https://doi.org/10.1080/03610926.2018.1549247
