Real Statistics Distribution Functions

The following is a summary of worksheet functions provided in the Real Statistics Resource Pack that support probability distributions. These include functions that carry out parametric tests.

These functions are organized into the following categories:

  • Normal distribution
  • t distribution
  • Chi-square and F distributions
  • Coefficient of variation
  • Noncentral distribution functions
  • Statistical power functions
  • Sample size functions
  • Weibull distribution
  • Other distributions
  • Distribution fitting functions
  • Confidence intervals for fitted parameters
  • Order statistics for continuous and discrete distributions
  • Order statistics for finite distributions

Worksheet functions that support distributions for non-parametric test parameters or pertain to statistical table lookup can be found at Real Statistics Non-parametric Functions.

Normal distribution

NORM_CONF(R1, alpha) value k such that (k, k) is the confidence interval for the population mean based on the specified alpha value and the sample data in R1 assuming a normal distribution
NORM_LOWER(R1, alpha) value  − k such that ( − k,  + k) is the confidence interval for the population mean based on the specified alpha value and the sample data in R1 assuming a normal distribution
NORM_UPPER(R1, alpha) value k such that ( − k,  + k) is the confidence interval for the population mean based on the specified alpha value and the sample data in R1 assuming a normal distribution
HNORM_DIST(x, µ, σ, cum, a, b) cdf of the half-normal distribution N(µ, σ2, a, b) at x when cum = TRUE, and pdf of the distribution at x when cum = FALSE
HNORM_INV(p, µ, σ, a, b) value of the inverse distribution function of the half-normal distribution N(µ, σ2, a, b) at p
TNORM_DIST(x, µ, σ, cum, a, b) cdf of the truncated normal distribution N(µ, σ2, a, b) at x when cum = TRUE, and pdf of the  distribution at x when cum = FALSE
TNORM_INV(p, µ, σ, a, b) value of the inverse distribution function of the truncated normal distribution N(µ, σ2, a, b) at p
TNORM_PARAM(p, µ, σ, a, b, lab) array function that returns a column array consisting of the mean, median, mode, variance, skewness, and kurtosis of the truncated normal distribution N(µ, σ2, a, b)
TOLERANCE_NORM (n, p, alpha, type) value k such that (k, + k) is the tolerance interval for the normal distribution where n = sample size; p = tolerance (default .9); type = 2 (default) for two-sided interval, type = 1 for a one-side interval using a non-central t distribution and type = 0 for a one-sided interval using the Natrella approach.
MSSD(R1, lab) array function that returns a column array with MSSD of R1, the variance of R1, z-statistic and p-value for the MSSD test 
SHAPIRO(R1, roy) Shapiro-Wilk test statistic W for the data in R1
SWTEST(R1, roy, h) the p-value of the Shapiro-Wilk test on the data in R1
SWCoeff(n, k, roy) kth Shapiro-Wilk coefficient for samples of size n
SWCoeff(R1, C1, roy) Shapiro-Wilk coefficient for cell C1 within the ordered sample in R1
JARQUE(R1, pop) Jarque-Barre statistic for the data in R1
JBTEST(R1, pop) p-value of the Jarque-Barre test on the data in R1
SKEWTEST(R1, lab, alpha) array function to test whether the skewness of the data in R1 is zero (consistent with a normal distribution); output contains sample skewness, standard error, test statistic, p-value and 1–alpha confidence interval
KURTTEST(R1, lab, alpha) array function to test whether the kurtosis of the data in R1 is zero (consistent with a normal distribution); output contains sample kurtosis, standard error, test statistic, p-value and 1–alpha confidence interval
SKEWPTEST(R1, lab) array function to test whether the skewness of the data in R1 is zero (consistent with a normal distribution); output contains population skewness, test statistic, and p-value
KURTPTEST(R1, lab) array function to test whether the kurtosis of the data in R1 is zero (consistent with a normal distribution); output contains population kurtosis, test statistic, and p-value
DAGOSTINO(R1, pop) d’Agostino-Pearson statistic for the data in R1; if pop = TRUE (default) then the population version is used
DPTEST(R1, pop) p-value of the d’Agostino-Pearson test on the data in R1; if pop = TRUE (default) then the population version is used

In the above functions, when roy = TRUE (default) the Royston algorithm is used, and when roy = FALSE the original Shapiro-Wilk algorithm is used (employing harmonic interpolation unless h = FALSE when linear interpolation is used).

T distribution

CONFIDENCE_T(alpha,  s, size) value k such that ( − k, + k) is the confidence interval for the population mean based on the specified alpha value, standard dev s and sample size, assuming a t distribution
T_CONF(R1, alpha) value k such that ( − k, + k) is the confidence interval for the population mean based on the stated alpha value and sample data in R1 assuming a t distribution
T_LOWER(R1, alpha) value − k such that ( − k, + k) is the confidence interval for the population mean based on the stated alpha value and sample data in R1 assuming a t distribution
T_UPPER(R1, alpha) value k such that ( − k, + k) is the confidence interval for the population mean based on the stated alpha value and sample data in R1 assuming a t distribution
DF_POOLED(R1, R2) degrees of freedom for the two-sample t-test for samples in R1 and R2, especially when the two samples have unequal variances
STDERR_POOLED(R1, R2, b) pooled standard error for two-sample t-test for samples in R1 and R2; if b = TRUE assume equal variances, while if b = FALSE assume unequal variances
STDEV_POOLED(R1, R2) pooled standard deviation for the two-sample t-test for samples in R1 and R2 when the two samples have equal variances
VAR_POOLED(R1, R2) pooled variance for the two-sample t-test for samples in R1 and R2 when the two samples have equal variances
T_DIST(x, df, cum) cdf of the t distribution with df degrees of freedom at x when cum = TRUE, and pdf of the  distribution at x when cum = FALSE
T3_DIST(x, df, µ, σ, cum) cdf of the non-standardized t distribution T(df, µ, σ) at x when cum = TRUE, and pdf of the  distribution at x when cum = FALSE
T3_INV(p, df, µ, σ, cum) value of the inverse distribution function of the non-standardized t distribution T(df, µ, σ) at p

T_DIST is like Excel’s T.DIST function except that df doesn’t have to be an integer. In addition, the following functions are available which have a similar advantage over their Excel counterpart: T_DIST_RT, T_DIST.2T, T_INV, T_INV.2T.

The following array functions are used to calculate confidence intervals for Cohen’s effect size. The output takes the form of a column array with the values Cohen’s d, Hedges’ g, and the lower and upper confidence interval limits for d. If lab = TRUE (default FALSE) then a column of labels is appended to the output. alpha = significance level (default .05).

The last three arguments of each of these functions are as for the NT_NCP function, except that if iter = 0 then the Hedges and Olkin estimate of the confidence interval is employed (iter0 and prec are not used), while if iter > 0 (default 1000) then the estimate of the confidence interval described above using the noncentrality parameter. Note that there is no Hedges and Olkin estimate for T_EFFECT2P and TT_EFFECT2P.

T_EFFECT1(m, s, n, lab, alpha,  iter, iter0, prec) based on a one-sample t-test for data with mean m, standard deviation s, and sample size n.
T_EFFECT2(m1, s1, n1, m2, s2, n2, lab, alpha, iter, iter0, prec) based on a two independent sample t-test for sample 1 with mean m1, standard deviation s1, and sample size n1, and sample 2 with mean m2, standard deviation s2, and sample size n2.
T_EFFECT2P(m1, m2, s1, s2, r, n, lab, alpha, iter, iter0, prec) based on dav effect size for a paired sample t-test for sample 1 with mean m1, standard deviation s1,  and sample size n, and sample 2 with mean m2, standard deviation s2, and sample size n. r = the correlation coefficient between samples 1 and 2.
TT_EFFECT1(R1, hyp, lab, alpha, 0) =T_EFFECT(m, s, n, lab, alpha, 0) where m = AVERAGE(R1) – hyps = STEV.S(R1) and n = COUNT(R1).
TT_EFFECT2(R1, R2, lab, alpha,  iter, iter0, prec) =T_EFFECT2(m1, s1, n1, m2, s2, n2, lab, alpha, iter, iter0, prec) where m1 = AVERAGE(R1), s1 = STEV.S(R1), n1 = COUNT(R1), m2 = AVERAGE(R2), s2 = STEV.S(R2) and n2 = COUNT(R2)
TT_EFFECT2P(R1, R2, lab, alpha, iter, iter0, prec) =T_EFFECT2P(m1, m2, s1, s2, r, n, lab, alpha, iter, iter0, prec) where m1 = AVERAGE(R1), s1 = STEV.S(R1), n = COUNT(R1) = COUNT(R2), m2 = AVERAGE(R2), s2 = STEV.S(R2) and r = CORREL(R1, R2).

The following array functions are used to perform the Trimmed t-test and Yuen-Welch tests. Here R1 and R2 are arrays containing the data for two independent samples. If lab = TRUE (default = FALSE) then an extra column is added with labels. tails = the number of tails (1 or 2, default 2). p = trim percentage (default 0.2)

TRIMTEST(R1, R2, lab, p, tails) outputs a column array with the values: t-stat, df, p-value for the trimmed means t-test
YUENTEST(R1, R2, lab, p, tails) outputs a column array with the values: t-stat, df, p-value for Yuen-Welch’s t-test

There are also the following array formulas for identifying outliers:

GRUBBS(R1, lab, alpha) outputs a 4 × 1 column array with the following entries: one potential outlier, G-stat, G-crit,  and test significance
ESD(R1, lab, alpha, k) outputs a 4 × k array where k = max #  of potential outliers and each column contains: a potential outlier, G-stat, G-crit,  and test significance (based on ESD test); if k is omitted then k = the # of data rows in highlighted range
OUTLIERS(R1, alpha, k) outputs a column array with up to k outliers from array R1 (based on the ESD test); if k is omitted then k = the # of rows in highlighted range

If lab = TRUE (default = FALSE) then an extra column is added with labels.

Chi-square and F distributions

CHISQ_DIST(x, df, cum) equivalent to CHISQ.DIST(x, df, cum) except that df does not need to be an integer
CHISQ_INV(p, df) equivalent to CHISQ.INV(p, df) except that df does not need to be an integer
F_DIST(x, df1, df2, cum) equivalent to F.DIST(x, df1, df2, cum) except that df1 and df2 do not need to be integers.
F_DIST(p, df1, df2) equivalent to F.INV(p, df1, df2) except that df1 and df2 do not need to be integers.

The following Real Statistics functions are also similar to their Excel counterpart except that df does not need to be an integer: CHISQ_DIST_RT and CHISQ_INV.RT. The same is true for F_DIST_RT and F_INV_RT.

Coefficient of Variation

CVTEST(R1, lab, alpha) returns an array with the sample CV, the unbiased CV, standard error, p-value, and the lower and upper bounds of the 1-alpha confidence interval for the one sample CV test based on the data in R1
CV2TEST(R1, R2, lab, alpha) returns an array with the CV for sample 1, the CV for sample 2, the pooled CV, the z-statistic, p-value, and the lower and upper bounds of the 1-alpha confidence interval for the two-sample CV test based on the data in R1 and R2.

alpha = the significance level (default .05). If lab = TRUE (default FALSE) then a column of labels is appended to the output.

Noncentral distribution functions

NT_DIST (t, df, δ, cum, iter, prec) value of the noncentral t distribution T(df,δ) at t if cum = TRUE and pdf at t if cum = FALSE
NCHISQ_DIST (x, df, λ, cum, iter, prec) value of the noncentral chi-square distribution χ2(df,λ) at x if cum = TRUE and pdf at x if cum = FALSE
NF_DIST(x, df1, df2, λ, cum, iter, prec) value of the noncentral F distribution F(df1,df2) at x if cum = TRUE and pdf at x if cum = FALSE
NT_INV (p, df, δ, iter, iter0) inverse of the noncentral t distribution T(df,δ) at p
NCHISQ_INV (p, df, λ, iter, iter0, prec) inverse of the noncentral chi-square distribution χ2(df,λ) at p
NF_INV(p, df1, df2, λ, iter, iter0, prec) inverse of the noncentral F distribution F(df1,df2) at p
NT_NCP (p, df, x, iter, iter0, prec) ncp value δ such that NT_DIST(x, df, δ, TRUE, m) = p
NCHISQ_NCP (p, df, x, iter, iter0, prec) ncp value λ such that NCHISQ_DIST(x, df, λ, TRUE, m) = p
NF_NCP(p, df1, df2, x, iter, iter0, prec) ncp value λ such that NF_DIST(x, df1, df2, λ, TRUE, m) = p

iter = maximum number of terms in the infinite sum that will be calculated (default 1000), prec = desired level of accuracy for the calculation (default 0.000000001 for the χ2 and F distribution and 0.000000000001 for the t distribution). iter0 = the number of iterations used to calculate the inverse or noncentrality parameter (default 40).

Statistical power functions

NORM1_POWER(d, n, tails, α) power of a one-sample normal test when d = Cohen’s effect size
NORM2_POWER(mm, s1, s2, n1, n2, tails, α) power of a two-sample normal test when mm = |μ2μ1| and s1 and s2 are the corresponding population standard deviations
BINOM_POWER(p0, p1, n, tails, α) power of a one-sample binomial test when p0 = probability of success on a single trial based on the null hypothesis, p1 = expected probability of success on a single trial
CORREL1_POWER(r0, r1, n, tails, α) power of a one-sample correlation test using the Fisher transformation when r0 = population correlation (based on the null hypothesis), r1 = effect size (observed correlation)
T1_POWER(d, n, tails, α, iter, prec) power of a one-sample t-test when d = Cohen’s effect size
T2_POWER(d, n1, n2, tails, α, iter, prec) power of a two-sample t-test when d = Cohen’s effect size; if n2 is omitted or set to 0 then n2 is considered to be equal to n1
T1_POWER_Conf(d, n, tails, lab, alpha, iter, prec) a column array containing the power of a one-sample t-test, plus the lower and upper  confidence interval limits of the power, where if lab = TRUE (default FALSE) then a column of labels is appended to the output and all the other arguments are as for T1_POWER
T2_POWER_Conf(d, n1, n2, tails, lab, alpha, iter, prec a column array containing the power of a two independent sample t-test, plus the lower and upper confidence interval limits of the power, where if lab = TRUE (default FALSE) then a column of labels is appended to the output and all the other arguments are as for T2_POWER
VAR1_POWER(ratio, n, tails, α) power of a one-sample variance test where ratios^2/\sigma_0^2 (effect size)
VAR2_POWER(ratio, n1, n2, tails, α) power of a two-sample variance test where ratio = \sigma_2^2/\sigma_1^2 (effect size)

n, n1, n2 = the sample size, tails = # of tails: 1 or 2 (default), α = alpha (default = .05) and m and prec as for the noncentral distribution functions.

Sample size functions

NORM1_SIZE(d, 1−β, tails, α) minimum sample size required to obtain power of at least 1−β in a one-sample normal test when d = Cohen’s effect size
NORM2_SIZE(mm, s1, s2, 1−β, tails, α, nratio) minimum sample size required to obtain power of at least 1−β in a two-sample normal test when mm = |μ2μ1| and s1 and s2 are the corresponding population standard deviations
BINOM_SIZE(p0, p1, 1−β, tails, α) minimum sample size required to obtain power of at least 1−β in a one-sample binomial test when p0 = probability of success on a single trial based on the null hypothesis, p1 = expected probability of success on a single trial
CORREL1_SIZE(r0, r1, 1−β, tails, α) minimum sample size required to obtain power of at least 1−β in a one sample correlation test using the Fisher transformation when r0 = population correlation (based on the null-hypothesis), r1 = effect size (observed correlation)
T1_SIZE(d, 1−β, tails, α, iter, prec) minimum sample size required to obtain power of at least 1−β in a one-sample t-test when d = Cohen’s effect size
T2_SIZE(d, 1−β, tails, α, nratio, iter, prec) minimum sample size required to obtain power of at least 1−β in a two-sample t-test when d = Cohen’s effect size
VAR1_SIZE(ratio, 1−β, tails, α) minimum sample size required to obtain power of at least 1−β in a one-sample variance test where ratio = s^2/\sigma_0^2 (effect size)
VAR2_SIZE(ratio, 1−β, tails, α, nratio) minimum sample size required to obtain power of at least 1−β in a two-sample variance test where ratio =\sigma_2^2/\sigma_1^2 (effect size)

tails = # of tails: 1 or 2 (default), α = alpha (default = .05), iter and prec as for the noncentral distribution functions, nratio = the size of the second sample divided by the size of the first sample (default = 1) or negative of a specific value for the size of the second sample; default for 1−β = .80.

Weibull distribution functions

WEIBULL_INV(pα, β) inverse at p of the Weibull distribution with parameters α and β
WEIBULL_MRL(xα, β) the mean residual life of an element that follows a Weibull distribution with parameters α and β that has not yet failed at time x
WEIBULL_CMEAN(R1, α, β, ncensor, censor) expected mean of data in R1 that follows a Weibull distribution with parameters α and β when augmented by ncensor censored elements at time censor.
WEIBULL_CVAR(R1, α, β, ncensor, censor, iter) expected variance of data in R1 that follows a Weibull distribution with parameters α and β when augmented by ncensor censored elements at time censor based on a Monte Carlo simulation of iter random draws.

Other distribution functions

MULTINOMDIST(R1, R2) value of multinomial distribution where R1 contains the number of successes and R2 contains the corresponding probabilities of success
UNIFORM_DIST(xα, β, cum) pdf value at x for uniform distribution with endpoints α and β when cum = FALSE and corresponding cdf value when cum = TRUE
UNIFORM_INV(pα, β) inverse at p of the uniform distribution with endpoints α and β
POISSON_INV(p, λ) inverse at p of the Poisson distribution with mean λ
NEGBINOM_INV(p, k, pp) inverse at p of the negative binomial distribution with parameters k and pp
EXPON_INV(p, lambda) inverse at p of the exponential distribution with parameter lambda
HYPGEOM_INV(pn, k, m) inverse at p of the hypergeometric distribution with parameters n, k and m
BETA(x, y) value of the beta function at x and y
QDIST(q, k, df) studentized q cumulative distribution value for q with k independent variables and df degrees of freedom
QINV(p, k, df, tails) inverse of the studentized q distribution, i.e. the critical value for the studentized q range; tails = 1 or 2 (default)
PERT_DIST(x, a, b, c, cum) pdf/cdf of the PERT distribution at x based on the parameters a (minimum), b (mode) and c (maximum). If cum = TRUE then the cdf is returned, while if cum = FALSE then the pdf is returned.
PERT_INV(p, a, b, c) inverse at p of the PERT distribution based on PERT_DIST
TRIANG_DIST(x, a, b, c, cum pdf/cdf of the triangular distribution at x based on the parameters a (minimum), b (mode), and c (maximum). If cum = TRUE then the cdf is returned, while if cum = FALSE then the pdf is returned.
TRIANG_INV(p, a, b, c) inverse at p of the triangular distribution based on TRIANG_DIST
GUMBEL_DIST(x, mu, beta, cum) Gumbel distribution at x
GUMBEL_INV(p, mu, beta) inverse of the Gumbel distribution at p
LAPLACE_DIST(x, mu, beta, cum) Laplace distribution at x
LAPLACE_INV(p, mu, beta) inverse at p of the Laplace distribution
LOGISTIC_DIST(x, mu, beta, cum) Logistic distribution at x
LOGISTIC_INV(p, mu, beta) inverse of the Logistic distribution at p
GEV_DIST(x, mu, sigma, xi, cum) Generalized Extreme Value distribution at x
GEV_INV(p, mu, sigma, xi) inverse of the Generalized Extreme Value distribution at p
FRECHET_DIST(p, alpha, beta, gamma, cum) Frechet distribution at x
FRECHET_INV(p, alpha, beta, gamma) inverse of the Frechet distribution at p
PARETO_DIST(x, alpha, mn, cum) Pareto distribution at x
PARETO_INV(p, alpha, mn) inverse at p of the Pareto distribution
IGAMMA_DIST(x, alpha, beta, cum) inverse gamma distribution at x
IGAMMA_INV(p, alpha, beta) inverse of the inverse gamma distribution at p
ICHISQ_DIST(x, df, cum) inverse chi-square distribution at x
ICHISQ_INV(p, df) inverse at p of the inverse chi-square distribution cdf
DIRICHLET_DIST(pvector, avector) Dirichlet distribution at pvector (whose values add up to 1) based on alpha parameters in avector
DIRICHLET_RAND(avector) an array pvector consisting of random values for the Dirichlet distribution with alpha values in avector

The following functions return the mean and variance for a variety of distributions.

MEAN_DIST(dist, param1, param2, param3) the mean of the distribution dist based on the listed parameters. dist takes the values “norm”, “chisq”, “t”, etc. The values of the parameters depend on the value of dist.
VAR_DIST(dist, param1, param2, param3) the variance of the distribution dist based on the listed parameters. dist takes the values “norm”, “chisq”, “t”, etc. The values of the parameters depend on the value of dist.
GEVSKEW(xi) the population skewness for the GEV distribution based on the xi parameter

The following array functions are also supported.

PERM_DIST(n, cum) returns a column array with the cdf values from 0 to C(n+1,2) of the one-sample permutation distribution when cum = TRUE (default) and the frequency values when cum = FALSE.
PERM2_DIST(n1, n2, cum) returns a column array with the cdf values from 0 to n1*n2  of the two-sample permutation distribution when cum = TRUE (default) and the frequency values when cum = FALSE.

Distribution fitting functions

BETA_FITM(R1, lab, pure) an array consisting of the estimated beta distribution alpha and beta values based on the method of moments, plus MLE.
GAMMA_FITM(R1, lab, pure) an array consisting of the estimated gamma distribution alpha and beta values based on the method of moments, plus MLE.
UNIFORM_FITM(R1, lab, pure) an array consisting of the estimated uniform distribution alpha and beta values based on the method of moments, plus MLE.
TRIANG_FITM(R1, lab, lo, hi, iter, exta, extc, iterb, iterc) an array consisting of the estimated triangular distribution parameters (a, b, c) based on the method of moments, plus actual and estimated mean, variance, skewness, and MLE.
PERT_FITM(R1, lab, lo, hi, iter, exta, extc, iterb, iterc) an array consisting of the estimated PERT distribution parameters (a, b, c) based on the method of moments, plus actual and estimated mean, variance, skewness, and MLE.
WEIBULL_FITM(R1, lab, pure, iter, bguess) an array consisting of the estimated Weibull distribution alpha and beta values based on the method of moments, plus MLE. 
GUMBEL_FITM(R1, lab, pure) an array consisting of the estimated Gumbel distribution mu and beta values based on the method of moments, plus MLE. 
GEV_FITM(R1, lab, xguess, iter, prec, incr, pure) an array consisting of the estimated GEV distribution mu, sigma, and xi values based on the method of moments, plus MLE. 
LAPLACE_FITM(R1, lab, pure) an array consisting of the estimated Laplace distribution mu and beta values based on the method of moments, plus MLE. 
LOGISTIC_FITM(R1, lab, pure) an array consisting of the estimated Logistic distribution mu and beta values based on the method of moments, plus MLE. 
PARETO_FITM(R1, lab) an array consisting of the estimated Pareto distribution alpha and mn values based on the method of moments, plus MLE. 
CAUCHY_FITM(R1, lab, med, exc) an array consisting of the estimated Cauchy distribution mu and sigma values based on a modified method of moments, plus MLE. If med = TRUE, then the median of R1 is used to estimate mu; otherwise, the 76% trimmed mean is used (default FALSE). If exc = TRUE, the exclusive IQR is used to estimate sigma; otherwise, the inclusive IQR is used (default FALSE).
BETA_FIT(R1, lab, iter, aguess, bguess) an array consisting of the estimated beta distribution alpha and beta values based on the MLE approach, plus actual and estimated mean and variance, and MLE.
GAMMA_FIT(R1, lab, iter, aguess) an array consisting of the estimated gamma distribution alpha and beta values based on the MLE approach, plus actual and estimated mean and variance, and MLE.
EXPON_FIT(R1, lab) an array consisting of the estimated exponential distribution lambda value, plus actual variance (sample and population), estimated variance, and MLE.
UNIFORM_FIT(R1, lab, iter) an array consisting of the estimated uniform distribution alpha and beta values based on the MLE approach, plus actual and estimated mean and variance, and MLE.
TRIANG_FIT(R1, lab, lo, hi, iter, exta, extc, iterc) an array consisting of the estimated triangular distribution parameters (a, b, c) based on the MLE approach, plus actual and estimated mean, variance, skewness, and MLE.
PERT_FIT(R1, lab, lo, hi, iter, exta, extc, iterb, iterc)
an array consisting of the estimated PERT distribution parameters (a, b, c) based on the MLE approach, plus actual and estimated mean, variance, skewness, and MLE.
GUMBEL_FIT(R1, lab, iter, bguess) an array consisting of the estimated Gumbel distribution mu and beta values based on the MLE approach, plus actual and estimated mean and variance, and MLE.
GEV_FIT(R1, lab, iter, prec, incr, mguess, sguess, xguess) a 3 × 4 array consisting of the estimated GEV distribution mu, sigma, and xi values and their standard errors based on the MLE approach.
LAPLACE_FIT(R1, lab) an array consisting of the estimated Laplace distribution mu and beta values based on the MLE approach, plus actual and estimated mean and variance, and MLE.
GEOM_FIT(R1, lab) an array consisting of the estimated geometric distribution p parameter value, plus actual and estimated mean and variance, and MLE.
LOGISTIC_FIT(R1, lab, iter, bguess, mguess) an array consisting of the estimated Logistic distribution mu and beta values based on the MLE approach, plus actual and estimated mean and variance, and MLE.
PARETO_FIT(R1, lab) an array consisting of the estimated Pareto distribution alpha and mn values based on the MLE approach, plus actual and estimated mean and variance, and MLE.
CAUCHY_FIT(R1, lab, iter, sguess, mguess) an array consisting of the estimated Cauchy distribution mu and sigma values based on the MLE approach, plus actual median, 75% trimmed mean, inclusive and exclusive IQR, and MLE.
WEIBULL_FIT(R1, lab, iter, bguess, ncensor, censor) an array consisting of the estimated Weibull distribution alpha and beta values based on the MLE approach, plus actual and estimated mean and variance, and MLE; ncensor = the number of censored data elements (default 0) and censor = time of censoring
WEIBULL_FITR(R1, lab, benard) an array consisting of the estimated Weibull distribution alpha and beta values based on regression, plus actual and estimated mean and variance, and MLE and R-square; if benard = TRUE (default) then the Benard approximation is used.
CAUCHY_FITX(R1, lab) an array consisting of the estimated Cauchy distribution mu and sigma values based on the weighted ordered statistics approach, plus MLE.
POLYGAMMA(z, k) digamma function at z if k = 0 (default), and trigamma function at z if k = 1.
KDE(R1, R2, h, ktype) array function that returns a column array with f(x) pdf values corresponding to the x values in column array R1 based on the sample data in column array R2, the bandwidth h, and the kernel specified by ktype.

If lab = TRUE, then an extra column of labels is appended to the output (default is FALSE). If pure = TRUE, then the pure method of moments is used (default is FALSE). MLE is the likelihood estimate for the parameters that are output. 

iter is the number of iterations used in calculating the solution; for UNIFORM_FIT if iter = 0 then alpha and beta are set to the minimum and maximum data values (biased MLE). aguess is the initial guess for alpha (use 0 for default initial guess), bguess is the initial guess for beta (use 0 for default initial guess),  sguess is the initial guess for sigma (use 0 for default initial guess), xguess is the initial guess for xi, and mguess us the initial guess for mu.

The parameters for TRIANG_FIT, TRIANG_FITM, PERT_FIT, and PERT_FITM are a little different from the others. See the links for more details. Related to these functions are the following two worksheet functions:

TRIANG_MLE(R1,  a, b, c) LL for a triangular distribution with parameters a, b, and c which fits the data in array R1.
PERT_MLE(R1, a, b, c) LL for a PERT distribution with parameters a, b, and c which fits the data in array R1.

Confidence intervals for fitted parameters

BETA_CONF(alfa, beta, n, lab, ttype, iter, alpha, fiter) an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the alfa and beta parameters of the beta distribution
EXPON_CONF(lambda, n, lab, iter, alpha) an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the lambda parameter of the exponential distribution
GAMMA_CONF(alfa, beta, n, lab, ttype, iter, alpha, fiter) an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the alfa and beta parameters of the gamma distribution
GEOM_CONF(p, n, lab, ttype, iter, alpha) an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the p parameter of the geometric distribution
GUMBEL_CONF(mu, beta, n, lab, ttype, iter, alpha, fiter) an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the mu and beta parameters of the Gumbel distribution
LAPLACE_CONF(mu, beta, n, lab, ttype, iter, alpha) an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the mu and beta parameters of the Laplace distribution
LOGISTIC_CONF(mu, beta, n, lab, ttype, iter, alpha, fiter) an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the mu and beta parameters of the Logistic distribution
PARETO_CONF(alfa, mn, n, lab, ttype, iter, alpha) an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the alfa and mn parameters of the Pareto distribution
LOGNORM_CONF(mu, sigma, n, lab, ttype, iter, alpha) an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the mu and sigma parameters of the log-normal distribution
NORMAL_CONF(mu, sigma, n, lab, iter, alpha) an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the mu and sigma parameters of the normal distribution
UNIFORM_CONF(alfa, beta, n, lab, ttype, iter, alpha, fiter) an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the alfa and beta parameters of the uniform distribution

n = sample size. iter = # of bootstrap iterations (default 1000). ttype = 0 (MLE, default), 1 (method of moments), 2 (pure method of moments), and 3 (regression). alpha defaults to .05. Each parameter is represented by one column with 4 rows in the output; if lab = TRUE a column of row labels and a row of column labels are appended to the results (default FALSE). fiter is the # of iterations required to fit data to the specific distribution; e.g. GUMBEL_FIT(R1, lab, fiter).

Bayesian Statistics

SAMPLE_HDI(R1, lab, p) array function which returns a column array with the endpoints of the 1–p HDI based on the data in R1
GRID_HDI(R1, R2, lab, p, lprec, uprec) array function which returns a column array with the following entries: endpoints of the HDI, length of the HDI, and the actual value for 1–p based on the HDI. lprec and hpred specify the tolerances for p on the left and right (default .1).
BETA_HDI(p, alpha, beta, lab, iter)
array function that returns a column array with the following entries: endpoints of the 1– p HDI, length of the HDI, and the pdf values at the two endpoints. iter = the number of iterations (default 40)
ESS(R1, cutoff) the effective sample size for the sample in R1 produced by the Metropolis algorithm; the ACF values are summed while ACF(k) < cutoff  (default .05)
NORM_GAMMA(μ0φ0n0mφ, n, lab) a column array with the posterior values of the mean, variance, and size based on the NormGamma prior with parameters μ0φ0n0 and sample data of size n with mean and variance φ.

The default for p is .05; if lab = TRUE (default FALSE) then a column of labels is appended to the output.

Order Statistics (Continuous or Discrete Distribution)

ORDER_DIST(x, k, n, cum, dist, param1, param2, param3) pdf f(x) for the kth order statistic x(k) from a sample of size n for the specified distribution if cum = FALSE and the corresponding cdf F(x) if cum = TRUE
ORDER_INV(p, k, n, dist, param1, param2, param3) inverse of the kth order statistic from a sample of size n from the specified distribution at p
ORDER_MEAN(k, n, iter, dist, param1, param2, param3) expected value for the kth order statistic x(k) from a sample of size n for the specified distribution.
ORDER2_DIST(x, y, j, k, n, cum, dist, param1, param2, param3)  joint pdf at (x, y) for the jth and kth order statistics when cum = FALSE and the corresponding cdf when cum = TRUE
RANGE_DIST(wj, k, n, cum, dist, param1, param2, param3) pdf at w for the range x(k) – x(j) when cum = FALSE and the corresponding cdf when cum = TRUE
ORDER_SIM(k, n, lab, iter, alpha, dist, param1, param2, param3) column array with the estimated value of the kth order statistic based on iter simulated samples of size n from the distribution specified by dist with the specified parameters; output also contains the standard error of the estimate along with the 1-alpha confidence interval
ORDER_BOOTSTRAP(R1, k, lab, iter, alpha) column array with the estimated value of the kth order statistic based on the sample in R1 using bootstrapping with iter iterations; output also contains the standard error of the estimate along with the 1-alpha confidence interval.
RANGE_SIM(j, k, n, lab, iter, alpha, dist, param1, param2, param3) column array with the estimated value of the range x(k) – x(jbased iter simulated samples of size n from the distribution specified by dist with the specified parameters; the output also contains the standard error of the estimate along with the 1-alpha confidence interval.
MEDIAN_CI(n, lab, alpha, dist, param1, param2, param3) a column array with estimates of the population median, standard error of the median, and 1-alpha confidence interval based on a sample of size n for the specified continuous distribution using the Sample Median Theorem
ORDER_CI(k, n, lab, alpha) column array that estimates the 1-alpha confidence interval around the kth order statistic for samples of size n using a binomial distribution approach.

The default for alpha is .05; the default for iter is 1000.

Order Statistics (Finite Distribution)

OrderDist(x, N, n, k) probability that x is the kth order statistic when a sample of size n is drawn from the population 1, 2, …, N.
MeanOrder(N, n, k) expected value of the kth order statistic when a sample of size n is drawn from the population 1, 2, …, N
RangeDist(x, y, N, n) probability that x and y are the first and last elements, in order, from a sample of size n drawn from the population 1, 2, …, N
MeanRange(N, n) expected value of y – x where x and y are the first and last elements, in order, from a sample of size n drawn from the population 1, 2, …, N
SpreadDist(x, N, n) probability that the largest element minus the smallest element in the sample of size n taken from the population 1, 2, …, N is x

3 thoughts on “Real Statistics Distribution Functions”

  1. Dear Charles.

    why The formula from the real resource pack, is not working, I mean every i use that formula from the resource, my excell cannot detect it and the result become name#?

    Reply
    • After downloading the Real Statistics software, did you follow the installation instructions?
      If you did, please consult the Troubleshooting section on the the webpage from which you downloaded the software. Especially important is the instructions about Trust.
      Charles

      Reply
  2. Dear Charles

    What is the formula that I should use to calculate the 90th and 95th percentile? It is of your help because I have not been able to find it even though I already downloaded the real statistics plugin.

    greetings

    Reply

Leave a Comment