Non-parametric test functions | Real Statistics Using Excel

The following is a summary of worksheet functions provided in the Real Statistics Resource Pack that support non-parametric tests.

These functions are organized into the following categories:

Chi-square tests
Other non-parametric tests
Other non-parametric tests (array functions)
Runs tests
Power and sample size for non-parametric tests
Kolmogorov-Smirnov and Lilliefors functions
Anderson-Darling test
Other Goodness-of-Fit tests
Distribution functions for non-parametric tests
Table lookup functions

You can find a list of functions that support non-parametric versions of ANOVA at Real Statistics Regression/ANOVA Functions.

Chi-square tests

CHI_STAT2(R1, R2)	Pearson’s chi-square statistic for observation values in R1 and expectation values in R2
CHI_MAX2(R1, R2)	maximum likelihood chi-square statistic for observation values in R1 and expectation values in R2
CHI_STAT(R1)	Pearson’s chi-square statistic for observation values in R1
CHI_MAX(R1)	maximum likelihood chi-square statistic for observation values in R1
CHI_TEST(R1)	p-value for Pearson’s chi-square statistic for observation values in R1
CHI_MAX_TEST(R1)	p-value for maximum likelihood chi-square statistic for observation values in R1
CHISQ_STAT(R1, R2, chi)	chi-square statistic for the data in column arrays R1 and R2; if chi = TRUE (default), then Pearson’s chi-square statistic is returned; otherwise the maximum likelihood statistic is returned
CHISQ_TEST(R1, R2, chi)	p-value of the chi-square test for independence where R1, R2, and chi are as for the CHISQ_STAT function.
CHISQ_SIM(R1, lab, iter, chi, alpha)	array function that returns a column array with the values: p-value, standard error, and lower/upper ends of a 1–alpha confidence interval for a simulated quasi-exact chi-square test of independence with iter simulations (default 10,000) based on the contingency table (with headings) in R1.
FISHERTEST(R1, tails, slow)	probability calculated by the Fisher exact test for the contingency table in R1, where for 2 × 2 tables tails = the number of tails: 1 (one-tail) or 2 (two-tail, default), and tails = 2 for other tables; slow is used to increase the acceptable sum of counts in R1.
FISHER_TEST(R1, lab, slow)	array function that returns a column array containing: p-value for the two-tailed Fisher’s exact test for the data in R1, sample size, df, chi-square statistic, Cramer’s V and w effect size; slow is used to increase the acceptable size of R1.
FISHER_MIDP(R1, tails)	mid p-value for the 2 × 2 contingency table contained in R1. tails = 1 or 2 (default)
ODDS_RATIO(R1, lab, alpha)	array function that returns a column array containing: odds-ratio for the 2×2 contingency table in R1, the standard error of ln OR, and the lower/upper limits of the 1-alpha confidence interval.
CMHTest(R1, lab, yates, alpha)	array function that returns a column array with the values M, p-value, r, lower and upper for the CMH test, where yates = TRUE (default) if the continuity correction is used.
WoolfTest(R1, lab)	array function that returns a column array with the values W, df, p-value for Woolf’s heterogeneity test.

If lab = TRUE (default FALSE), then an extra column of labels is appended. alpha is the significance level (default .05 for CMHTest and .01 for CHISQ_SIM). If chi = TRUE (default) Pearson’s chi-square test is used, while if chi = FALSE the maximum likelihood version of the test is used.
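
As a rough illustration of the two statistics selected by the chi argument, the following Python sketch computes Pearson's statistic and the maximum likelihood (likelihood-ratio) statistic from observed and expected counts. The function names and sample counts are hypothetical, and this is not the Real Statistics implementation, only the textbook formulas.

```python
import math

def pearson_chi_square(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def max_likelihood_chi_square(observed, expected):
    """Likelihood-ratio statistic: 2 * sum of O * ln(O / E); empty cells contribute 0."""
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)

# Hypothetical observed and expected counts for three categories.
observed = [18, 22, 20]
expected = [20, 20, 20]
pearson = pearson_chi_square(observed, expected)
g_stat = max_likelihood_chi_square(observed, expected)
print(pearson, g_stat)
```

For counts this close to their expected values, the two statistics are nearly equal; they tend to diverge as cells move further from expectation.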

Other non-parametric tests

SignTest(R1, med, tails)	p-value for the one-sample sign test where R1 contains the sample data and med = the hypothesized median
SignTest(R1, R2, tails)	p-value for the paired sign test where R1 and R2 contain sample data
TrinomTest(R1, med, tails)	p-value for the one-sample trinomial test where R1 contains the sample data and med = the hypothesized median
TrinomTest(R1, R2, tails)	p-value for the paired trinomial test where R1 and R2 contain sample data
TRINOM_TEST(d, n0, n, tails)	p-value for the trinomial test where d = |n+ – n-|, n0 = # of ties and n = sample size
RANK_COMBINED(x, R1, R2, order)	rank of x in the combined arrays R1 and R2 taking ties into account; order is as in RANK_AVG
RANK_SUM(R1, R2, order)	sum of the ranks of all the elements in R1 based on the combined arrays R1 and R2 taking ties into account; order is as in RANK_AVG
RANK_SUM(R1, k, order)	sum of the ranks of all the elements in the k^th column of R1 taking ties into account; order is as in RANK_AVG
WILCOXON(R1, R2)	minimum of W and W′ for the samples contained in R1 and R2
WILCOXON(R1, n)	minimum of W and W′ for the samples contained in the first n columns of R1 and the remaining columns of R1. If the second argument is omitted it defaults to 1
WTEST(R1, R2, tails)	p-value of the Wilcoxon rank-sum test for the samples contained in R1 and R2
WTEST(R1, n, tails)	p-value of the Wilcoxon rank-sum test for the samples contained in the first n columns of R1 and the remaining columns of R1. If the second argument is omitted it defaults to 1
MANN(R1, R2)	U for the samples contained in R1 and R2
MANN(R1, n)	U for the samples contained in the first n columns of R1 and the remaining columns of R1. If the second argument is omitted it defaults to 1
MWTEST(R1, R2, tails, ties, cont)	p-value of the Mann-Whitney U test for the samples contained in R1 and R2
MW_EXACT(R1, R2, tails)	p-value of the Mann-Whitney exact test for the samples contained in R1 and R2; tails = 1 or 2 (default)
SRANK(R1, med)	T for a single sample contained in R1 minus med. If the second argument is omitted it defaults to zero.
SRTEST(R1, med, tails, ties, cont)	p-value for Signed-Rank test using the normal distribution approximation for the sample contained in R1 minus med; if the second argument is omitted it defaults to zero.
SR_EXACT(R1, med, tails)	p-value for Signed-Rank exact test for the sample in R1 based on a hypothetical median med (default 0)
SRANK(R1, R2)	T for a pair of samples contained in the one-column R1 and R2, representing the paired sample
SRANK(R1)	T for a pair of samples contained in R1, where R1 consists of two columns, one for each paired sample
SRTEST(R1, R2, tails, ties, cont)	p-value for Signed-Rank test using the normal distribution approximation for the pair of samples contained in the one-column arrays R1 and R2, representing the paired sample
SR_EXACT(R1, R2, tails)	p-value for Signed-Rank exact test for the samples in arrays R1 and R2 (if R2 is omitted then R1 must contain two columns, one for each sample).
SRTEST(R1,,tails, ties, cont)	p-value for Signed-Rank test using the normal distribution approximation for the pair of samples contained in R1, where R1 consists of two columns, one for each paired sample
FPSTAT(R1, R2)	Fligner-Policello test statistic for the data in column arrays R1 and R2
FPTEST(R1, R2)	p-value of the Fligner-Policello test for the data in column arrays R1 and R2
MOODS_STAT(R1)	chi-square statistic for Mood’s Median Test for the data in R1
MOODS_TEST(R1)	p-value for Mood’s Median Test for the data in R1
COCHRAN(R1, raw, cont)	Cochran’s Q statistic for the data in R1
QTEST(R1, raw, cont)	p-value for Cochran’s Q Test for the data in R1
PERM_TEST(R1, R2, tails)	p-value of the permutation test for the paired data in column arrays R1 and R2
PERM_TEST(R1, hyp, tails)	p-value of the one-sample permutation test for the data in column array R1 against the hypothetical median hyp (default 0)
PERM2_TEST(R1, R2, tails)	p-value of the permutation test for two independent samples with data in column arrays R1 and R2
TOLERANCE_LOWER(R1, p, α, type)	lower end of the tolerance interval for the data in R1 where p = tolerance interval percentage (default .9), type = -1: one-sided non-parametric, -2: two-sided non-parametric, 1: one-sided normal, 2: two-sided normal (default)
TOLERANCE_UPPER(R1, p, α, type)	upper-end of tolerance interval for the data in R1 where arguments are as in TOLERANCE_LOWER
TOLERANCE_SIZE(p, α, tails)	minimum sample size for a non-parametric tolerance interval where p = tolerance interval percentage (default .9)

The argument tails = the # of tails: 1 or 2 (default); alpha = significance level (default .05); if ties = TRUE (default) then a ties correction is applied; if cont = TRUE (default) then a continuity correction is applied.
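
To make the rank-based entries above more concrete, here is a small Python sketch of a rank sum with tied values sharing their average rank, and of the Mann-Whitney U derived from it. The helper names are hypothetical, and returning the smaller of the two U values is an assumption about what MANN reports, not something stated above.

```python
def average_ranks(values):
    """Ascending ranks of values, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def rank_sum(sample1, sample2):
    """Sum of the ranks of sample1 within the combined sample."""
    ranks = average_ranks(list(sample1) + list(sample2))
    return sum(ranks[:len(sample1)])

def mann_whitney_u(sample1, sample2):
    """Smaller of the two U statistics (assumed convention)."""
    n1, n2 = len(sample1), len(sample2)
    u1 = rank_sum(sample1, sample2) - n1 * (n1 + 1) / 2
    return min(u1, n1 * n2 - u1)

print(rank_sum([1, 3, 5], [2, 4, 6, 8]))        # rank sum of the first sample
print(mann_whitney_u([1, 3, 5], [2, 4, 6, 8]))  # smaller U
```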

Two data input formats are supported for the COCHRAN, QTEST and QRATIO functions: raw data (raw = TRUE) and summarized data, i.e. a multi-variable frequency table (raw = FALSE, default). If cont = TRUE (default), then a continuity correction is used with two samples (equivalent to McNemar’s Test).
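
For the raw data format, Cochran's Q can be sketched in Python as below, using the standard textbook formula on a 0/1 table with one row per subject and one column per treatment. The function name and the sample data are hypothetical, and no continuity correction is applied in this sketch.

```python
def cochran_q(rows):
    """Cochran's Q for raw 0/1 data: one row per subject, one column per treatment."""
    k = len(rows[0])
    col_totals = [sum(row[j] for row in rows) for j in range(k)]
    row_totals = [sum(row) for row in rows]
    n = sum(row_totals)  # grand total of successes
    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - n * n)
    # The denominator is zero when every subject scores all 0s or all 1s.
    denominator = k * n - sum(r * r for r in row_totals)
    return numerator / denominator

# Hypothetical raw data: 6 subjects, 3 treatments.
raw = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
    [1, 0, 0],
]
q = cochran_q(raw)
print(q)
```

Under the null hypothesis, Q is approximately chi-square distributed with k − 1 degrees of freedom, which is how a p-value would be obtained from it.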

Non-parametric tests (array functions)

MW_TEST(R1, R2, lab, tails, ties, cont, exact, iter, effect)	returns the column array: U-stat, z-stat, r-effect, p-values (normal approximation, exact test, and simulation) for the Mann-Whitney test for the data in R1 and R2
MW_CONF(R1, R2, lab, ttype, alpha)	returns the column array: U-crit, alpha, lower, upper, median, U-crit+1, alpha, lower, upper for the Mann-Whitney test for the data in R1 and R2.
MW_SIMUL(R1, R2, lab, iter)	returns a column array with the U value for R1 and R2, i.e. U = MANN(R1, R2), p-value for the left tailed test, p-value for the right-tailed test, and p-value for the two-tailed test; iter = # of iterations in the simulation (default 10,000).
SR_TEST(R1, R2, lab, tails, ties, cont, exact, iter, effect)	returns the column array: T-stat, z-stat, r-effect, p-values (normal approximation, exact test, and simulation) for the paired Wilcoxon signed ranks test for the data in R1 and R2
SR_TEST(R1, med, lab, alpha, tails, ties, cont, exact, iter, effect)	returns the column array: T-stat, z-stat, r-effect, p-values (normal approximation, exact test, and simulation) for the one-sample signed ranks test for the data in R1 and hypothetical median med (default = 0)
SR_CONF(R1, R2, lab, ttype, alpha, nzero)	returns the column array: T-crit, alpha, lower, upper, median, T-crit+1, alpha, lower, upper for the signed-ranks test for the data in R1 and R2. If nzero = TRUE (default), then differences between values in R1 and R2 that are zero are eliminated.
SR_CONF(R1, med, lab, ttype, alpha, nzero)	returns the column array: T-crit, alpha, lower, upper, median, T-crit+1, alpha, lower, upper for the signed-ranks test for the data in R1 with hypothetical median med. If nzero = TRUE (default), then differences between values in R1 and med that are zero are eliminated.
SR_SIMUL(R1, R2, lab, iter)	returns a column array with the T-stat for R1 and R2, i.e. T = SRANK(R1, R2), the p-value for the left-tailed test, p-value for the right-tailed test, and p-value for the two-tailed test; iter = # of iterations in the simulation (default 10,000).
SR_SIMUL(R1, med, lab, iter)	returns a column array with the T-stat for R1 and med, i.e. T = SRANK(R1, med), the p-value for the left-tailed test, p-value for the right-tailed test, and p-value for the two-tailed test; iter = # of iterations in the simulation (default 10,000).
CHANGEPT_TEST(R1, lab)	returns a column array with the values: change point, w-stat, z-stat, and p-value for the change point test.
CHANGEPT_BTEST(R1, lab, dist, alpha)	returns a column array with the values: change point, D-stat, D-crit, and p-value for the change point test where R1 contains a time series with binary data.
QRATIO(R1, raw)	returns a row array with the percentages of each of the independent variables in Cochran’s Q test
CLIFF_DELTA(R1, R2, lab, alpha, sym)	returns a column array with Cliff’s delta for the data in R1 and R2, the standard error, and lower/upper limits of the 1-alpha confidence interval; use the symmetric confidence interval if sym = TRUE (default FALSE)

If lab = TRUE then the output includes a column of labels, while if lab = FALSE (the default) only the data is outputted, tails = the # of tails: 1 or 2 (default), alpha is the significance level (default = .05), and if ties = TRUE (default) then a correction for ties is applied. If cont = TRUE (default) then a continuity correction is applied. If exact = TRUE (default FALSE, although this is overridden if the sample size(s) are not too big) then the p-value from an exact test is reported. If iter ≠ 0 then the p-values from a simulation are reported (default is 10,000).

If effect = 0 then the rank-biserial correlation effect size is used; if effect = 1 (default) then z/√n is used (more options for signed-rank test).
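
The two effect size options can be sketched in Python as follows. This is a hedged illustration: the function names are hypothetical, n is assumed to be the total number of observations, and the rank-based option is assumed to follow the common rank-biserial formula 1 − 2U/(n1·n2) with U the smaller Mann-Whitney statistic.

```python
import math

def z_effect(z, n):
    """Effect size r = z / sqrt(n), with n the total number of observations (assumed)."""
    return z / math.sqrt(n)

def rank_biserial(u, n1, n2):
    """Rank-biserial correlation from the smaller Mann-Whitney U (assumed formula)."""
    return 1 - 2 * u / (n1 * n2)

print(z_effect(1.96, 49))
print(rank_biserial(3, 3, 4))
```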

For MW_CONF, if ttype = 0 (default) then the normal approximation is used; if ttype = 1 then MWINV and MWDIST are used. The situation is similar for SR_CONF.
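
For the CLIFF_DELTA entry above, the point estimate itself is simple to sketch in Python: it is the proportion of pairs where the first sample's value is larger minus the proportion where it is smaller. The function name is hypothetical, and the standard error and confidence interval returned by the array function are not reproduced here.

```python
def cliffs_delta(sample1, sample2):
    """Cliff's delta: (#pairs with x > y minus #pairs with x < y) / (n1 * n2)."""
    greater = sum(1 for x in sample1 for y in sample2 if x > y)
    less = sum(1 for x in sample1 for y in sample2 if x < y)
    return (greater - less) / (len(sample1) * len(sample2))

print(cliffs_delta([1, 3, 5], [2, 4, 6, 8]))
```

The result ranges from -1 (every value in the first sample is smaller) to 1 (every value is larger), with 0 indicating complete overlap.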

Runs Tests

RLower_CRIT(n1, n2, α, tails)	lower critical value for the runs test with n1 T’s and n2 F’s, α is the significance level (default .05), tails = 1 or 2 (default)
RUpper_CRIT(n1, n2, α, tails)	upper critical value for the runs test with n1 T’s and n2 F’s, α is the significance level (default .05), tails = 1 or 2 (default)
RUNSTEST(s, lab, tails)	returns the column array: n1, n2, mean, std dev, runs, tails, z-stat, p-value (normal approx.), p-value (exact test) for the one-sample runs test on the data in string s
RUNSTEST(R1, lab, tails)	returns the column array: n1, n2, mean, std dev, runs, tails, z-stat, p-value (normal approximation), p-value (exact test) for the one-sample runs test on the numeric data in R1
RUNS2TEST(R1, R2, lab, iter)	returns the column array: n1, n2, mean, std dev, runs, tails, z-stat, p-value (normal approximation), p-value (exact test) for the two-sample runs test on the data in R1 and R2 when iter = 0 (default); if iter > 0, the test is run iter times to deal with ties, with one column of output for each unique value for the # of runs
RUNS_TEST(s, tails)	p-value of the runs test for the sequence (with 2 or more values) defined in the string s
RUNS_TEST(R1, tails)	p-value of the runs test for the sequence (with 2 or more values) defined by the column or row array or range R1
RUNS(s, tails)	number of runs in string s (with 2 or more categories)
RUNS(R1, tails)	number of runs in the row or column array or range R1 (with 2 or more categories)
RunsUpDn(R1, lab, tails)	returns the column array: n1, n2, mean, std dev, runs, tails, z-stat, p-value (normal approximation) for the runs up/down test on the numeric data in R1

If lab = TRUE then the output includes a column of labels, while if lab = FALSE (the default) only the data is outputted, tails = # of tails: 1 or 2 (default), alpha is the significance level (default .05).
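
The quantities reported by the one-sample runs test (runs, mean, std dev) can be illustrated with a short Python sketch for a two-category sequence. The function names are hypothetical, and the formulas are the standard normal approximation for the number of runs, not code taken from the package.

```python
import math

def count_runs(sequence):
    """Number of maximal blocks of identical consecutive items."""
    items = list(sequence)
    if not items:
        return 0
    return 1 + sum(1 for a, b in zip(items, items[1:]) if a != b)

def runs_normal_approx(n1, n2):
    """Mean and standard deviation of the number of runs for n1 T's and n2 F's."""
    n = n1 + n2
    mean = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return mean, math.sqrt(variance)

s = "TTFTFFFTTF"  # hypothetical sequence with 5 T's and 5 F's
runs = count_runs(s)
mean, sd = runs_normal_approx(s.count("T"), s.count("F"))
z = (runs - mean) / sd
print(runs, mean, sd, z)
```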

Statistical Power

CHISQ_POWER(w, n, df, α, iter, prec)	power of a chi-square goodness of fit or independence test where w = Cohen’s effect size and df = degrees of freedom
MW_POWER(effect1, effect2, n1, n2, iter, dist, tails, ties, cont, alpha)	power of Mann-Whitney test with effect sizes effect1 and effect2 for distribution defined by dist (default “norm”) based on iter simulations (default 1000); other arguments as for MW_TEST.
SR_POWER(effect, n, iter, dist, tails, ties, cont, alpha)	power of signed-ranks test with effect size effect for distribution defined by dist (default “norm”) based on iter simulations (default 1000); other arguments as for SR_TEST.
CHISQ_SIZE(w, df, 1−β, α, iter, prec)	minimum sample size required to obtain power of at least 1−β in a chi-square goodness of fit or independence test where w = Cohen’s effect size and df = degrees of freedom

n, n1, n2 = the sample size, tails = # of tails: 1 or 2 (default), α = alpha (default = .05) and iter and prec as for the noncentral distribution functions.

Kolmogorov-Smirnov and Lilliefors functions

KDIST(x, m)	value of the Kolmogorov distribution function at x
KINV(p, m)	inverse of KDIST; i.e. KINV(p, m) = x where 1−KDIST(x, m) = p
KSDIST(x, n)	p-value of the one-sample Kolmogorov-Smirnov test at x for samples of size n
KSINV(p, n)	critical value of the one-sample Kolmogorov-Smirnov test at p for samples of size n
KSSTAT(R1, avg, sd)	statistic for the Kolmogorov-Smirnov test on the data in R1
KSTEST(R1, avg, sd, txt)	p-value for the Kolmogorov-Smirnov test on the data in R1
KS_BSTAT(R1)	statistic for the KS test for Benford’s distribution on the data in R1
KS_BTEST(R1, lab, alpha)	statistic, p-value, and critical value for the KS test for Benford’s distribution on the data in R1
LDIST(x, n)	p-value of the Lilliefors test for normality at x for samples of size n
LINV(p, n)	critical value of the Lilliefors test at p for samples of size n
LTEST(R1)	D-max for the data in R1 based on KS test or Lilliefors test
KSDIST(x, n1, n2, b, m)	p-value of the two-sample Kolmogorov-Smirnov test at x (i.e. D-stat) for samples of size n1 and n2
KSINV(p, n1, n2, b, iter, m)	critical value of the two-sample Kolmogorov-Smirnov test at p for samples of size n1 and n2
KS2TEST(R1, R2, lab, α, b, iter, m)	array function which outputs a column vector with the values D-stat, p-value, D-crit, n1, n2 from the two-sample KS test for the samples in R1 and R2, where α is the significance level (default = .05) and b, iter, and m are as in KSINV.

If R2 is omitted (the default) then R1 is treated as a frequency table. If lab = TRUE (default FALSE) then an extra column of labels is included in the output. m = the # of iterations used in calculating an infinite sum (default = 10).

For KSSTAT and KSTEST, R1 can be either a one-column array containing raw data or a two-column array containing a frequency table. If avg and sd are omitted then the mean and standard deviation are calculated from the data in R1 and the p-value is based on the KS table; otherwise, the values from avg and sd are used and the p-value is based on the Lilliefors table. In either case, the txt argument is treated as for KSPROB or LPROB (see below).
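
As a sketch of what the D statistic measures for raw data, the following Python function computes the largest gap between the empirical cdf and a normal cdf with a given mean and standard deviation. The function name is hypothetical, the frequency-table input format is not handled, and this is only the textbook definition rather than the package's code.

```python
from statistics import NormalDist

def ks_statistic(data, mean, sd):
    """Largest gap between the empirical cdf of data and the Normal(mean, sd) cdf."""
    dist = NormalDist(mean, sd)
    xs = sorted(data)
    n = len(xs)
    d_max = 0.0
    for i, x in enumerate(xs):
        f = dist.cdf(x)
        # The empirical cdf jumps from i/n to (i+1)/n at x; check both sides of the jump.
        d_max = max(d_max, (i + 1) / n - f, f - i / n)
    return d_max

d = ks_statistic([-1.0, 0.0, 1.0], mean=0.0, sd=1.0)
print(d)
```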

Anderson-Darling Test functions

AD_DIST(x)	p-value of the AD test for the Anderson-Darling statistic x for large samples when the distribution parameters are known
AD_INV(p)	critical value of the AD test at the significance level p for large samples when the distribution parameters are known
ANDERSON(R1, dist, k)	Anderson-Darling AD statistic for the theoretical cdf values in R1 (assumed to be sorted in ascending order) based on the distribution defined by dist. k is the shape parameter (only used for the Gamma distribution)
AD_BSTAT(R1)	Anderson-Darling AD statistic for testing the data in R1 for Benford’s distribution
ADTEST(R1, dist, lab, iter, alpha)	array containing the Anderson-Darling AD statistic, the critical value, and the estimated p-value for the Anderson-Darling test on the data in R1 (not necessarily in sorted order) based on the distribution defined by dist.
AD2TEST(R1, R2, lab, alpha, ties)	column array with the AD statistics for a two-sample AD test, a p-value, the critical value, and the sample sizes. ties = TRUE (default) if a ties correction is applied.
AD2PROB(AD, m, n)	large sample estimate of the p-value for the two sample AD statistic for samples of size m and n
AD2CRIT(m, n, alpha)	large sample estimate of the critical value for the two-sample Anderson-Darling test for samples of size m and n and the significance level alpha (default .05)

If lab = TRUE (default FALSE), then an extra column of labels is appended to the output.

For ADTEST, iter is the number of iterations used in the calculation of an approximate p-value (default 20); if iter = -1 then the pure method of moments is used; if iter = -2, then the regression approach is used (this is only valid for the Weibull distribution) and if iter > 0 (default 20) then an iterative approach is used with iter many iterations (except that no iteration is actually performed for the normal or exponential distribution; instead the result is the same as iter = -1).

Other Goodness of Fit functions

GOFTEST(R1, dist, lab, iter)	array containing the parameters for the distribution specified by dist, the p-value of the chi-square goodness of fit test, and the data value in R1 which has the lowest expected frequency value followed by this expected frequency value
GOFTESTExact(R1, dist, lab, param1, param2)	array containing param1 and param2, p-value of the chi-square goodness of fit test for the distribution specified by dist and the parameter values, and the data value in R1 which has the lowest expected frequency value followed by this expected frequency value.
FIT_TEST(R1, R2, npar)	p-value for the chi-square goodness of fit test where R1 = the array of observed data, R2 = the array of expected values, and npar = the number of unknown parameters.

dist = 0 (default) for a generic distribution with no unknown parameters, dist = 1 for the normal distribution, dist = 2 for the exponential distribution, dist = 3 for the Weibull distribution, dist = 4 for the gamma distribution, dist = 5 for the beta distribution, dist = 6 for the uniform distribution, dist = 7 for the Gumbel distribution, dist = 8 for the logistic distribution, dist = 9 for the log-normal distribution, dist = 10 for the Laplace distribution, dist = 11 for the generalized Pareto distribution, dist = 12 for the Cauchy distribution, and dist = 13 for the Benford distribution.

If iter = 0, then the method of moments is used to estimate the unknown parameters; if iter = -1 then the pure method of moments is used; if iter = -2, then the regression approach is used for the Weibull distribution or the weighted ordered statistics estimate is used for the Cauchy distribution, and if iter > 0 (default 20) then an iterative approach is used with iter many iterations (except that no iteration is actually performed for the normal or exponential distribution; instead the result is the same as iter = -1).

In addition, we have the following function where dist = 1, 2, 3, 6, 7, 8, 9, 10, and 12 are as described above, but we also have dist = 20 or “mnorm” that represents a multivariate normal distribution.

ICF_GOF(R1, dist, lab, iter, param1, param2)	returns a column array with the values I-stat, I-crit for alpha = .05 and .10 for the distribution specified by dist and the data in the column array R1 (with 10 to 400 elements).

Distribution functions for non-parametric tests

PERMDIST(x,n, cum)	pdf of permutation distribution with n elements at x if cum = FALSE and the cdf value if cum = TRUE
PERMINV(p, n)	inverse of the permutation distribution at p; i.e. the least value of x such that PERMDIST(x, n, TRUE) ≥ p
PERM2DIST(x,n1, n2, cum)	pdf of two sample permutation distribution with n1 and n2 elements at x if cum = FALSE and the cdf value if cum = TRUE
PERM2INV(p, n1, n2)	inverse of the two sample permutation distribution at p; i.e. the least value of x such that PERM2DIST(x, n1, n2, TRUE) ≥ p
MWDIST(x, n1, n2, tails)	p-value of the Mann-Whitney exact distribution with n1, n2 elements at x, where tails = 1 or 2 (default)
MWINV(p, n1, n2, tails)	inverse of the Mann-Whitney exact distribution at p; i.e. the least value of x such that MWDIST(x, n1, n2, tails) ≥ p
RUNSDIST(r, n1, n2, cum)	probability of getting r runs from a string with n1 T’s and n2 F’s if cum = FALSE (i.e. the pdf at r) and the probability of getting at most r runs from a string of n1 T’s and n2 F’s if cum = TRUE (i.e. the cdf at r)
RUNSINV(p, n1, n2)	inverse at p of the runs distribution with n1 T’s and n2 F’s

The PERM2DIST, PERM2INV, MWDIST, and MWINV functions can also take the forms PERM2DIST(x, n1, n2, cum, FALSE), PERM2INV(p, n1, n2, FALSE), MWDIST(x, n1, n2, tails, FALSE), and MWINV(p, n1, n2, tails, FALSE), in which case the Wilcoxon Rank Sum exact test is used instead of the Mann-Whitney exact test.

The following array functions are also supported.

PERM_DIST(n, cum)	returns a column array with the cdf values from 0 to C(n+1,2) of the one-sample permutation distribution when cum = TRUE (default) and the frequency values when cum = FALSE.
PERM2_DIST(n1, n2, cum)	returns a column array with the cdf values from 0 to n1·n2 of the two-sample permutation distribution when cum = TRUE (default) and the frequency values when cum = FALSE.
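
The kind of output PERM_DIST describes can be sketched in Python under one stated assumption: that the one-sample permutation distribution is the null distribution of the signed-rank sum, where each of the ranks 1 to n is independently included or not. Under that assumption, the frequency at t is the number of subsets of {1, ..., n} that sum to t, and the support runs from 0 to n(n+1)/2 = C(n+1,2). The function names are hypothetical.

```python
def signed_rank_frequencies(n):
    """freq[t] = number of subsets of {1, ..., n} whose elements sum to t."""
    max_sum = n * (n + 1) // 2
    freq = [0] * (max_sum + 1)
    freq[0] = 1  # the empty subset
    for rank in range(1, n + 1):
        # Iterate downward so that each rank is used at most once per subset.
        for t in range(max_sum, rank - 1, -1):
            freq[t] += freq[t - rank]
    return freq

def signed_rank_cdf(n):
    """Cumulative probabilities for sums 0 .. n(n+1)/2, each subset having weight 1/2^n."""
    total = 2 ** n
    cdf, running = [], 0
    for f in signed_rank_frequencies(n):
        running += f
        cdf.append(running / total)
    return cdf

print(signed_rank_frequencies(3))
print(signed_rank_cdf(3))
```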

Table lookup

Here R1 defines a table, including both the data and row/column headings.

INTERPOLATE(r, r1, r2, v1, v2, h)	the value between v1 and v2 that is proportional to the distance that r is between r1 and r2, where v1 corresponds to r1 and v2 corresponds to r2; the type of interpolation is defined by h, where h = 0 means linear interpolation, h = 1 (default) means log interpolation, and h = 2 means harmonic interpolation.
MLookup(R1, r, c)	the value in the table defined by R1 in the row headed by r and the column headed by c.
ILookup(R1, r, c, hc, hr)	the value in the table defined by R1 corresponding to row r and column c. Here r or c can refer to a value that must be interpolated between row or column headings (using the INTERPOLATE function with h = hr for rows and h = hc for columns), provided those headings are numbers. If the first row (or column) heading is preceded by “<”, it refers to values smaller than the next row (or column) heading. If the last row (or column) heading is preceded by “>”, it refers to values bigger than the previous row (or column) heading.
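
The three interpolation modes can be sketched in Python as below. This is an assumption about how the modes are defined, not the package's code: the sketch takes log interpolation to mean linear interpolation in ln(r) and harmonic interpolation to mean linear interpolation in 1/r. The function name and the TRANSFORMS table are hypothetical.

```python
import math

# Assumed meaning of each h value: the scale on which r is interpolated linearly.
TRANSFORMS = {
    0: lambda x: x,      # linear
    1: math.log,         # log
    2: lambda x: 1 / x,  # harmonic
}

def interpolate(r, r1, r2, v1, v2, h=1):
    """Value between v1 and v2 matching the position of r between r1 and r2."""
    if h not in TRANSFORMS:
        raise ValueError("h must be 0, 1, or 2")
    f = TRANSFORMS[h]
    fraction = (f(r) - f(r1)) / (f(r2) - f(r1))
    return v1 + fraction * (v2 - v1)

for h in (0, 1, 2):
    print(h, interpolate(15, 10, 20, 1.0, 2.0, h))
```

In this sketch the three modes agree at the endpoints r1 and r2 and differ only in between, which is why the choice matters most when table headings are widely spaced.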

There are also the following lookup functions for specific tables. In the following n, n1, n2, k, and df are generally positive integers, α is a decimal between 0 and 1 (default .05) and tails takes the value 1 or 2 (default). If α exceeds the largest value for alpha in the table or is smaller than the smallest value for alpha in the table then #N/A is returned. For values between those in the associated table, if interp = TRUE (default) the recommended interpolations are used, while if interp = FALSE a linear interpolation is used.

TauCRIT(n, α, tails, interp)	critical value in Kendall’s Tau table
RhoCRIT(n, α, tails, interp)	critical value in Spearman’s Rho table
KSCRIT(n, α, tails, interp)	critical value in the One-Sample Kolmogorov-Smirnov table
KS_BCRIT(n, α, interp)	critical value for the KS test of the Benford distribution
LCRIT(n, α, tails, interp)	critical value in the Lilliefors table
QCRIT(k, df, α, tails, interp)	critical value in the Studentized Range Q table
DCRIT(k, df, α, interp)	critical value in the two-sided Dunnett’s test table
DLowerCRIT(n, k, α, interp)	lower critical value in the Durbin-Watson Table
DUpperCRIT(n, k, α, interp)	upper critical value in the Durbin-Watson Table
MSSD_CRIT(n, alpha, interp)	critical value of the z-statistic of the MSSD test
ADCRIT(n, α, dist, k, interp)	critical value in the one-sample Anderson-Darling Tables where dist and k are as for ANDERSON.
AD2CRITX(m, n, α)	critical value in the two-sample Anderson-Darling Table for small samples of size m and n
KS2CRIT(n, α, tails, interp)	critical value in the Two-Sample Kolmogorov-Smirnov table
PageCRIT(k, n, α, interp)	critical value in Page’s L test table for k within-subjects groups and n samples

Related to the above critical values determined by one of the statistics tables are the following estimates of p-values based on linear interpolation, if necessary, of the values in these statistics tables. The parameter iter = the # of iterations used to arrive at the p-value. When txt = FALSE (default) if the p-value exceeds the largest value for alpha in the table then a value of 1 is returned and if the p-value is less than the smallest value for alpha in the table then 0 is returned, while when txt = TRUE then the output takes a form such as “< .01” or “> .2”. If interp = TRUE (default), then the recommended interpolations are used; otherwise, linear interpolation is used.

KSPROB(x, n, tails, iter, interp, txt)	p-value for the One Sample Kolmogorov-Smirnov test
KS_BPROB(x, n, iter, interp, txt)	p-value for the KS test for Benford’s distribution
LPROB(x, n, tails, iter, interp, txt)	p-value for the Lilliefors test
QPROB(q, k, df, tails, iter, interp, txt)	p-value for the Studentized Range Q
SWPROB(n, W, roy, interp)	p-value for the Shapiro-Wilk test for a sample of size n and statistic W; if roy = TRUE then the Royston algorithm is used, while if roy = FALSE then a table lookup (table 2) is used
DPROB(q, k, df, iter, interp, txt)	p-value for Dunnett’s Test using the two-sided Dunnett’s table
MSSD_PROB(z, n, iter, interp, txt)	p-value of the MSSD test based on the z-statistic
ADPROB(x, dist, k, iter, interp, txt)	p-value for the Anderson-Darling test where dist and k are as in ANDERSON, except that k is also used to specify the sample size for the Laplace or GPD distributions.
KS2PROB(x, n1, n2, iter, interp, txt)	p-value for the Two Sample Kolmogorov-Smirnov test
PagePROB(x, k, n, iter, interp, txt)	p-value for Page’s L test for k within-subjects groups and n samples

For the following, an exact p-value is returned (i.e. it is not estimated from the critical value).

KENDALLU_PROB(k, n,u)	p-value for Kendall’s u test for paired comparisons when the number of raters is k and the number of subjects being rated is n for the specified value of u.
TC_PROB(k, n, Tc)	p-value for T_C test when the number of judges is k and the number of criteria being rated is n for the specified value of T_C.

3 thoughts on “Real Statistics Non-Parametric Test Functions”

Giacomo Diaz

December 27, 2022 at 11:06 am

Thanks, Daniel.
I’m using your splendid package intensively.
Giacomo
Giacomo Diaz

December 11, 2022 at 9:04 pm

I can’t find the KRUSKAL function, which was previously included among the non-parametric tests.
Is it now present in other sections?
- Charles
  
  December 12, 2022 at 12:23 pm
  
  Hi Giacomo,
  KRUSKAL is still supported. You can find the reference to this function at Real Statistics Regression/ANOVA Functions.
  I have also included a link to that webpage.
  Charles