The following is a summary of worksheet functions provided in the Real Statistics Resource Pack that support probability distributions. These include functions that carry out parametric tests.
These functions are organized into the following categories:
- Normal distribution
- t distribution
- Chi-square and F distributions
- Coefficient of variation
- Noncentral distribution functions
- Statistical power functions
- Sample size functions
- Weibull distribution
- Other distributions
- Distribution fitting functions
- Confidence intervals for fitted parameters
- Order statistics for continuous and discrete distributions
- Order statistics for finite distributions
Worksheet functions that support distributions for non-parametric test parameters or pertain to statistical table lookup can be found at Real Statistics Non-parametric Functions.
Normal distribution
NORM_CONF(R1, alpha) | value k such that (x̄ − k, x̄ + k) is the confidence interval for the population mean based on the specified alpha value and the sample data in R1 assuming a normal distribution |
NORM_LOWER(R1, alpha) | value x̄ − k such that (x̄ − k, x̄ + k) is the confidence interval for the population mean based on the specified alpha value and the sample data in R1 assuming a normal distribution |
NORM_UPPER(R1, alpha) | value x̄ + k such that (x̄ − k, x̄ + k) is the confidence interval for the population mean based on the specified alpha value and the sample data in R1 assuming a normal distribution |
HNORM_DIST(x, µ, σ, cum, a, b) | cdf of the half-normal distribution N(µ, σ2, a, b) at x when cum = TRUE, and pdf of the distribution at x when cum = FALSE |
HNORM_INV(p, µ, σ, a, b) | value of the inverse distribution function of the half-normal distribution N(µ, σ2, a, b) at p |
TNORM_DIST(x, µ, σ, cum, a, b) | cdf of the truncated normal distribution N(µ, σ2, a, b) at x when cum = TRUE, and pdf of the distribution at x when cum = FALSE |
TNORM_INV(p, µ, σ, a, b) | value of the inverse distribution function of the truncated normal distribution N(µ, σ2, a, b) at p |
TNORM_PARAM(p, µ, σ, a, b, lab) | array function that returns a column array consisting of the mean, median, mode, variance, skewness, and kurtosis of the truncated normal distribution N(µ, σ2, a, b) |
TOLERANCE_NORM (n, p, alpha, type) | value k such that (x̄ − k, x̄ + k) is the tolerance interval for the normal distribution where n = sample size; p = tolerance (default .9); type = 2 (default) for two-sided interval, type = 1 for a one-side interval using a non-central t distribution and type = 0 for a one-sided interval using the Natrella approach. |
MSSD(R1, lab) | array function that returns a column array with MSSD of R1, the variance of R1, z-statistic and p-value for the MSSD test |
SHAPIRO(R1, roy) | Shapiro-Wilk test statistic W for the data in R1 |
SWTEST(R1, roy, h) | the p-value of the Shapiro-Wilk test on the data in R1 |
SWCoeff(n, k, roy) | kth Shapiro-Wilk coefficient for samples of size n |
SWCoeff(R1, C1, roy) | Shapiro-Wilk coefficient for cell C1 within the ordered sample in R1 |
JARQUE(R1, pop) | Jarque-Barre statistic for the data in R1 |
JBTEST(R1, pop) | p-value of the Jarque-Barre test on the data in R1 |
SKEWTEST(R1, lab, alpha) | array function to test whether the skewness of the data in R1 is zero (consistent with a normal distribution); output contains sample skewness, standard error, test statistic, p-value and 1–alpha confidence interval |
KURTTEST(R1, lab, alpha) | array function to test whether the kurtosis of the data in R1 is zero (consistent with a normal distribution); output contains sample kurtosis, standard error, test statistic, p-value and 1–alpha confidence interval |
SKEWPTEST(R1, lab) | array function to test whether the skewness of the data in R1 is zero (consistent with a normal distribution); output contains population skewness, test statistic, and p-value |
KURTPTEST(R1, lab) | array function to test whether the kurtosis of the data in R1 is zero (consistent with a normal distribution); output contains population kurtosis, test statistic, and p-value |
DAGOSTINO(R1, pop) | d’Agostino-Pearson statistic for the data in R1; if pop = TRUE (default) then the population version is used |
DPTEST(R1, pop) | p-value of the d’Agostino-Pearson test on the data in R1; if pop = TRUE (default) then the population version is used |
In the above functions, when roy = TRUE (default) the Royston algorithm is used, and when roy = FALSE the original Shapiro-Wilk algorithm is used (employing harmonic interpolation unless h = FALSE when linear interpolation is used).
T distribution
CONFIDENCE_T(alpha, s, size) | value k such that (x̄ − k, x̄ + k) is the confidence interval for the population mean based on the specified alpha value, standard dev s and sample size, assuming a t distribution |
T_CONF(R1, alpha) | value k such that (x̄ − k, x̄ + k) is the confidence interval for the population mean based on the stated alpha value and sample data in R1 assuming a t distribution |
T_LOWER(R1, alpha) | value x̄ − k such that (x̄ − k, x̄ + k) is the confidence interval for the population mean based on the stated alpha value and sample data in R1 assuming a t distribution |
T_UPPER(R1, alpha) | value x̄ + k such that (x̄ − k, x̄ + k) is the confidence interval for the population mean based on the stated alpha value and sample data in R1 assuming a t distribution |
DF_POOLED(R1, R2) | degrees of freedom for the two-sample t-test for samples in R1 and R2, especially when the two samples have unequal variances |
STDERR_POOLED(R1, R2, b) | pooled standard error for two-sample t-test for samples in R1 and R2; if b = TRUE assume equal variances, while if b = FALSE assume unequal variances |
STDEV_POOLED(R1, R2) | pooled standard deviation for the two-sample t-test for samples in R1 and R2 when the two samples have equal variances |
VAR_POOLED(R1, R2) | pooled variance for the two-sample t-test for samples in R1 and R2 when the two samples have equal variances |
T_DIST(x, df, cum) | cdf of the t distribution with df degrees of freedom at x when cum = TRUE, and pdf of the distribution at x when cum = FALSE |
T3_DIST(x, df, µ, σ, cum) | cdf of the non-standardized t distribution T(df, µ, σ) at x when cum = TRUE, and pdf of the distribution at x when cum = FALSE |
T3_INV(p, df, µ, σ, cum) | value of the inverse distribution function of the non-standardized t distribution T(df, µ, σ) at p |
T_DIST is like Excel’s T.DIST function except that df doesn’t have to be an integer. In addition, the following functions are available which have a similar advantage over their Excel counterpart: T_DIST_RT, T_DIST.2T, T_INV, T_INV.2T.
The following array functions are used to calculate confidence intervals for Cohen’s effect size. The output takes the form of a column array with the values Cohen’s d, Hedges’ g, and the lower and upper confidence interval limits for d and g. If lab = TRUE (default FALSE) then a column of labels is appended to the output. alpha = significance level (default .05).
The last three arguments of each of these functions are as for the NT_NCP function, except that if iter = 0 then the Hedges and Olkin estimate of the confidence interval is employed (iter0 and prec are not used), while if iter > 0 (default 1000) then the estimate of the confidence interval described above using the noncentrality parameter. Note that there is no Hedges and Olkin estimate for T_EFFECT2P and TT_EFFECT2P.
T_EFFECT1(m, s, n, lab, alpha, iter, iter0, prec) | based on a one-sample t-test for data with mean m, standard deviation s, and sample size n. |
T_EFFECT2(m1, s1, n1, m2, s2, n2, lab, alpha, iter, iter0, prec) | based on a two independent sample t-test with equal variances for sample 1 with mean m1, standard deviation s1, and sample size n1, and sample 2 with mean m2, standard deviation s2, and sample size n2. |
T_EFFECT2P(m1, m2, s1, s2, r, n, lab, alpha, iter, iter0, prec) | based on dav effect size for a paired sample t-test for sample 1 with mean m1, standard deviation s1, and sample size n, and sample 2 with mean m2, standard deviation s2, and sample size n. r = the correlation coefficient between samples 1 and 2. |
T_EFFECT3(m1, s1, n1, m2, s2, n2, lab, alpha, iter, iter0, prec) | based on a two independent sample t-test with unequal variances for sample 1 with mean m1, standard deviation s1, and sample size n1, and sample 2 with mean m2, standard deviation s2, and sample size n2. |
TT_EFFECT1(R1, hyp, lab, alpha, 0) | =T_EFFECT(m, s, n, lab, alpha, 0) where m = AVERAGE(R1) – hyp, s = STEV.S(R1) and n = COUNT(R1). |
TT_EFFECT2(R1, R2, lab, alpha, iter, iter0, prec) | =T_EFFECT2(m1, s1, n1, m2, s2, n2, lab, alpha, iter, iter0, prec) where m1 = AVERAGE(R1), s1 = STEV.S(R1), n1 = COUNT(R1), m2 = AVERAGE(R2), s2 = STEV.S(R2) and n2 = COUNT(R2) |
TT_EFFECT2P(R1, R2, lab, alpha, iter, iter0, prec) | =T_EFFECT2P(m1, m2, s1, s2, r, n, lab, alpha, iter, iter0, prec) where m1 = AVERAGE(R1), s1 = STEV.S(R1), n = COUNT(R1) = COUNT(R2), m2 = AVERAGE(R2), s2 = STEV.S(R2) and r = CORREL(R1, R2). |
TT_EFFECT3(R1, R2, lab, alpha, iter, iter0, prec) | =T_EFFECT3(m1, s1, n1, m2, s2, n2, lab, alpha, iter, iter0, prec) where m1 = AVERAGE(R1), s1 = STEV.S(R1), n1 = COUNT(R1), m2 = AVERAGE(R2), s2 = STEV.S(R2) and n2 = COUNT(R2) |
The following array functions are used to perform the Trimmed t-test and Yuen-Welch tests. Here R1 and R2 are arrays containing the data for two independent samples. If lab = TRUE (default = FALSE) then an extra column is added with labels. tails = the number of tails (1 or 2, default 2). p = trim percentage (default 0.2)
TRIMTEST(R1, R2, lab, p, tails) | outputs a column array with the values: t-stat, df, p-value for the trimmed means t-test |
YUENTEST(R1, R2, lab, p, tails) | outputs a column array with the values: t-stat, df, p-value for Yuen-Welch’s t-test |
There are also the following array formulas for identifying outliers:
GRUBBS(R1, lab, alpha) | outputs a 4 × 1 column array with the following entries: one potential outlier, G-stat, G-crit, and test significance |
ESD(R1, lab, alpha, k) | outputs a 4 × k array where k = max # of potential outliers and each column contains: a potential outlier, G-stat, G-crit, and test significance (based on ESD test); if k is omitted then k = the # of data rows in highlighted range |
OUTLIERS(R1, alpha, k) | outputs a column array with up to k outliers from array R1 (based on the ESD test); if k is omitted then k = the # of rows in highlighted range |
If lab = TRUE (default = FALSE) then an extra column is added with labels.
Chi-square and F distributions
CHISQ_DIST(x, df, cum) | equivalent to CHISQ.DIST(x, df, cum) except that df does not need to be an integer |
CHISQ_INV(p, df) | equivalent to CHISQ.INV(p, df) except that df does not need to be an integer |
F_DIST(x, df1, df2, cum) | equivalent to F.DIST(x, df1, df2, cum) except that df1 and df2 do not need to be integers. |
F_DIST(p, df1, df2) | equivalent to F.INV(p, df1, df2) except that df1 and df2 do not need to be integers. |
The following Real Statistics functions are also similar to their Excel counterpart except that df does not need to be an integer: CHISQ_DIST_RT and CHISQ_INV.RT. The same is true for F_DIST_RT and F_INV_RT.
Coefficient of Variation
CVTEST(R1, lab, alpha) | returns an array with the sample CV, the unbiased CV, standard error, p-value, and the lower and upper bounds of the 1-alpha confidence interval for the one sample CV test based on the data in R1 |
CV2TEST(R1, R2, lab, alpha) | returns an array with the CV for sample 1, the CV for sample 2, the pooled CV, the z-statistic, p-value, and the lower and upper bounds of the 1-alpha confidence interval for the two-sample CV test based on the data in R1 and R2. |
alpha = the significance level (default .05). If lab = TRUE (default FALSE) then a column of labels is appended to the output.
Noncentral distribution functions
NT_DIST (t, df, δ, cum, iter, prec) | value of the noncentral t distribution T(df,δ) at t if cum = TRUE and pdf at t if cum = FALSE |
NCHISQ_DIST (x, df, λ, cum, iter, prec) | value of the noncentral chi-square distribution χ2(df,λ) at x if cum = TRUE and pdf at x if cum = FALSE |
NF_DIST(x, df1, df2, λ, cum, iter, prec) | value of the noncentral F distribution F(df1,df2,λ) at x if cum = TRUE and pdf at x if cum = FALSE |
NT_INV (p, df, δ, iter, iter0) | inverse of the noncentral t distribution T(df,δ) at p |
NCHISQ_INV (p, df, λ, iter, iter0, prec) | inverse of the noncentral chi-square distribution χ2(df,λ) at p |
NF_INV(p, df1, df2, λ, iter, iter0, prec) | inverse of the noncentral F distribution F(df1,df2,λ) at p |
NT_NCP (p, df, x, iter, iter0, prec) | ncp value δ such that NT_DIST(x, df, δ, TRUE, m) = p |
NCHISQ_NCP (p, df, x, iter, iter0, prec) | ncp value λ such that NCHISQ_DIST(x, df, λ, TRUE, m) = p |
NF_NCP(p, df1, df2, x, iter, iter0, prec) | ncp value λ such that NF_DIST(x, df1, df2, λ, TRUE, m) = p |
iter = maximum number of terms in the infinite sum that will be calculated (default 1000), prec = desired level of accuracy for the calculation (default 0.000000001 for the χ2 and F distribution and 0.000000000001 for the t distribution). iter0 = the number of iterations used to calculate the inverse or noncentrality parameter (default 40).
Statistical power functions
NORM1_POWER(d, n, tails, α) | power of a one-sample normal test when d = Cohen’s effect size |
NORM2_POWER(mm, s1, s2, n1, n2, tails, α) | power of a two-sample normal test when mm = |μ2−μ1| and s1 and s2 are the corresponding population standard deviations |
BINOM_POWER(p0, p1, n, tails, α) | power of a one-sample binomial test when p0 = probability of success on a single trial based on the null hypothesis, p1 = expected probability of success on a single trial |
CORREL1_POWER(r0, r1, n, tails, α) | power of a one-sample correlation test using the Fisher transformation when r0 = population correlation (based on the null hypothesis), r1 = effect size (observed correlation) |
T1_POWER(d, n, tails, α, iter, prec) | power of a one-sample t-test when d = Cohen’s effect size |
T2_POWER(d, n1, n2, tails, α, iter, prec) | power of a two-sample t-test when d = Cohen’s effect size; if n2 is omitted or set to 0 then n2 is considered to be equal to n1 |
T1_POWER_Conf(d, n, tails, lab, alpha, iter, prec) | a column array containing the power of a one-sample t-test, plus the lower and upper confidence interval limits of the power, where if lab = TRUE (default FALSE) then a column of labels is appended to the output and all the other arguments are as for T1_POWER |
T2_POWER_Conf(d, n1, n2, tails, lab, alpha, iter, prec) | a column array containing the power of a two independent sample t-test, plus the lower and upper confidence interval limits of the power, where if lab = TRUE (default FALSE) then a column of labels is appended to the output and all the other arguments are as for T2_POWER |
VAR1_POWER(ratio, n, tails, α) | power of a one-sample variance test where ratio = (effect size) |
VAR2_POWER(ratio, n1, n2, tails, α) | power of a two-sample variance test where ratio = (effect size) |
n, n1, n2 = the sample size, tails = # of tails: 1 or 2 (default), α = alpha (default = .05) and m and prec as for the noncentral distribution functions.
Sample size functions
NORM1_SIZE(d, 1−β, tails, α) | minimum sample size required to obtain power of at least 1−β in a one-sample normal test when d = Cohen’s effect size |
NORM2_SIZE(mm, s1, s2, 1−β, tails, α, nratio) | minimum sample size required to obtain power of at least 1−β in a two-sample normal test when mm = |μ2−μ1| and s1 and s2 are the corresponding population standard deviations |
BINOM_SIZE(p0, p1, 1−β, tails, α) | minimum sample size required to obtain power of at least 1−β in a one-sample binomial test when p0 = probability of success on a single trial based on the null hypothesis, p1 = expected probability of success on a single trial |
CORREL1_SIZE(r0, r1, 1−β, tails, α) | minimum sample size required to obtain power of at least 1−β in a one sample correlation test using the Fisher transformation when r0 = population correlation (based on the null-hypothesis), r1 = effect size (observed correlation) |
T1_SIZE(d, 1−β, tails, α, iter, prec) | minimum sample size required to obtain power of at least 1−β in a one-sample t-test when d = Cohen’s effect size |
T2_SIZE(d, 1−β, tails, α, nratio, iter, prec) | minimum sample size required to obtain power of at least 1−β in a two-sample t-test when d = Cohen’s effect size |
VAR1_SIZE(ratio, 1−β, tails, α) | minimum sample size required to obtain power of at least 1−β in a one-sample variance test where ratio = (effect size) |
VAR2_SIZE(ratio, 1−β, tails, α, nratio) | minimum sample size required to obtain power of at least 1−β in a two-sample variance test where ratio = (effect size) |
tails = # of tails: 1 or 2 (default), α = alpha (default = .05), iter and prec as for the noncentral distribution functions, nratio = the size of the second sample divided by the size of the first sample (default = 1) or negative of a specific value for the size of the second sample; default for 1−β = .80.
Weibull distribution functions
WEIBULL_INV(p, α, β) | inverse at p of the Weibull distribution with parameters α and β |
WEIBULL_MRL(x, α, β) | the mean residual life of an element that follows a Weibull distribution with parameters α and β that has not yet failed at time x |
WEIBULL_CMEAN(R1, α, β, ncensor, censor) | expected mean of data in R1 that follows a Weibull distribution with parameters α and β when augmented by ncensor censored elements at time censor. |
WEIBULL_CVAR(R1, α, β, ncensor, censor, iter) | expected variance of data in R1 that follows a Weibull distribution with parameters α and β when augmented by ncensor censored elements at time censor based on a Monte Carlo simulation of iter random draws. |
Other distribution functions
MULTINOMDIST(R1, R2) | value of multinomial distribution where R1 contains the number of successes and R2 contains the corresponding probabilities of success |
UNIFORM_DIST(x, α, β, cum) | pdf value at x for uniform distribution with endpoints α and β when cum = FALSE and corresponding cdf value when cum = TRUE |
UNIFORM_INV(p, α, β) | inverse at p of the uniform distribution with endpoints α and β |
POISSON_INV(p, λ) | inverse at p of the Poisson distribution with mean λ |
NEGBINOM_INV(p, k, pp) | inverse at p of the negative binomial distribution with parameters k and pp |
EXPON_INV(p, lambda) | inverse at p of the exponential distribution with parameter lambda |
HYPGEOM_INV(p, n, k, m) | inverse at p of the hypergeometric distribution with parameters n, k and m |
BETA(x, y) | value of the beta function at x and y |
QDIST(q, k, df) | studentized q cumulative distribution value for q with k independent variables and df degrees of freedom |
QINV(p, k, df, tails) | inverse of the studentized q distribution, i.e. the critical value for the studentized q range; tails = 1 or 2 (default) |
PERT_DIST(x, a, b, c, cum) | pdf/cdf of the PERT distribution at x based on the parameters a (minimum), b (mode) and c (maximum). If cum = TRUE then the cdf is returned, while if cum = FALSE then the pdf is returned. |
PERT_INV(p, a, b, c) | inverse at p of the PERT distribution based on PERT_DIST |
TRIANG_DIST(x, a, b, c, cum) | pdf/cdf of the triangular distribution at x using parameters a (minimum), b (mode), c (maximum). If cum = TRUE then the cdf is returned; otherwise, the pdf is returned. |
TRIANG_INV(p, a, b, c) | inverse at p of the triangular distribution based on TRIANG_DIST |
GUMBEL_DIST(x, mu, beta, cum) | Gumbel distribution at x with parameters mu and beta. If cum = TRUE then cdf is returned; otherwise, pdf is returned. |
GUMBEL_INV(p, mu, beta) | inverse of the Gumbel distribution at p |
LAPLACE_DIST(x, mu, beta, cum) | Laplace distribution at x with parameters mu and beta. If cum = TRUE then cdf is returned; otherwise, pdf is returned. |
LAPLACE_INV(p, mu, beta) | inverse at p of the Laplace distribution |
GEOM_DIST(x, p, cum) | Geometric distribution at x with parameter p. If cum = TRUE then the cdf is returned; otherwise, the pdf is returned. |
GEOM_INV(p, pp) | inverse of the geometric distribution with parameter pp at p |
LOGISTIC_DIST(x, mu, beta, cum) | Logistic distribution at x with parameters mu and beta. If cum = TRUE then the cdf is returned; otherwise, the pdf is returned. |
LOGISTIC_INV(p, mu, beta) | inverse of the Logistic distribution at p |
GEV_DIST(x, mu, sigma, xi, cum) | Generalized Extreme Value distribution at x. If cum = TRUE then the cdf is returned; otherwise, the pdf is returned. |
GEV_INV(p, mu, sigma, xi) | inverse of the Generalized Extreme Value distribution at p |
FRECHET_DIST(p, alpha, beta, gamma, cum) | Frechet distribution at x. If cum = TRUE then the cdf is returned; otherwise, the pdf is returned. |
FRECHET_INV(p, alpha, beta, gamma) | inverse of the Frechet distribution at p |
PARETO_DIST(x, alpha, mn, cum) | Pareto distribution at x with parameters mu and mn. If cum = TRUE then the cdf is returned; otherwise, the pdf is returned. |
PARETO_INV(p, alpha, mn) | inverse at p of the Pareto distribution |
IGAMMA_DIST(x, alpha, beta, cum) | inverse gamma distribution at x with parameters alph, beta. If cum = TRUE, cdf is returned; otherwise, pdf is returned. |
IGAMMA_INV(p, alpha, beta) | inverse of the inverse gamma distribution at p |
ICHISQ_DIST(x, df, cum) | inverse chi-square distribution at x |
ICHISQ_INV(p, df) | inverse at p of the inverse chi-square distribution cdf |
DIRICHLET_DIST(pvector, avector) | Dirichlet distribution at pvector (whose values add up to 1) based on alpha parameters in avector |
DIRICHLET_RAND(avector) | an array pvector consisting of random values for the Dirichlet distribution with alpha values in avector |
The following functions return the mean and variance for a variety of distributions.
MEAN_DIST(dist, param1, param2, param3) | the mean of the distribution dist based on the listed parameters. dist takes the values “norm”, “chisq”, “t”, etc. The values of the parameters depend on the value of dist. |
VAR_DIST(dist, param1, param2, param3) | the variance of the distribution dist based on the listed parameters. dist takes the values “norm”, “chisq”, “t”, etc. The values of the parameters depend on the value of dist. |
GEVSKEW(xi) | the population skewness for the GEV distribution based on the xi parameter |
The following array functions are also supported.
PERM_DIST(n, cum) | returns a column array with the cdf values from 0 to C(n+1,2) of the one-sample permutation distribution when cum = TRUE (default) and the frequency values when cum = FALSE. |
PERM2_DIST(n1, n2, cum) | returns a column array with the cdf values from 0 to n1*n2 of the two-sample permutation distribution when cum = TRUE (default) and the frequency values when cum = FALSE. |
Distribution fitting functions
BETA_FITM(R1, lab, pure) | an array consisting of the estimated beta distribution alpha and beta values based on the method of moments, plus MLE. |
GAMMA_FITM(R1, lab, pure) | an array consisting of the estimated gamma distribution alpha and beta values based on the method of moments, plus MLE. |
UNIFORM_FITM(R1, lab, pure) | an array consisting of the estimated uniform distribution alpha and beta values based on the method of moments, plus MLE. |
TRIANG_FITM(R1, lab, lo, hi, iter, exta, extc, iterb, iterc) | an array consisting of the estimated triangular distribution parameters (a, b, c) based on the method of moments, plus actual and estimated mean, variance, skewness, and MLE. |
PERT_FITM(R1, lab, lo, hi, iter, exta, extc, iterb, iterc) | an array consisting of the estimated PERT distribution parameters (a, b, c) based on the method of moments, plus actual and estimated mean, variance, skewness, and MLE. |
WEIBULL_FITM(R1, lab, pure, iter, bguess) | an array consisting of the estimated Weibull distribution alpha and beta values based on the method of moments, plus MLE. |
GUMBEL_FITM(R1, lab, pure) | an array consisting of the estimated Gumbel distribution mu and beta values based on the method of moments, plus MLE. |
GEV_FITM(R1, lab, xguess, iter, prec, incr, pure) | an array consisting of the estimated GEV distribution mu, sigma, and xi values based on the method of moments, plus MLE. |
LAPLACE_FITM(R1, lab, pure) | an array consisting of the estimated Laplace distribution mu and beta values based on the method of moments, plus MLE. |
LOGNORM_FITM(R1, lab, pure) | an array consisting of the estimated lognormal distribution mu and sigma values based on the method of moments, plus MLE. |
LOGISTIC_FITM(R1, lab, pure) | an array consisting of the estimated Logistic distribution mu and beta values based on the method of moments, plus MLE. |
PARETO_FITM(R1, lab) | an array consisting of the estimated Pareto distribution alpha and mn values based on the method of moments, plus MLE. |
CAUCHY_FITM(R1, lab, med, exc) | an array consisting of the estimated Cauchy distribution mu and sigma values based on a modified method of moments, plus MLE. If med = TRUE, then the median of R1 is used to estimate mu; otherwise, the 76% trimmed mean is used (default FALSE). If exc = TRUE, the exclusive IQR is used to estimate sigma; otherwise, the inclusive IQR is used (default FALSE). |
BETA_FIT(R1, lab, iter, aguess, bguess) | an array consisting of the estimated beta distribution alpha and beta values based on the MLE approach, plus actual and estimated mean and variance, and MLE. |
GAMMA_FIT(R1, lab, iter, aguess) | an array consisting of the estimated gamma distribution alpha and beta values based on the MLE approach, plus actual and estimated mean and variance, and MLE. |
EXPON_FIT(R1, lab) | an array consisting of the estimated exponential distribution lambda value, plus actual variance (sample and population), estimated variance, and MLE. |
UNIFORM_FIT(R1, lab, iter) | an array consisting of the estimated uniform distribution alpha and beta values based on the MLE approach, plus actual and estimated mean and variance, and MLE. |
TRIANG_FIT(R1, lab, lo, hi, iter, exta, extc, iterc) | an array consisting of the estimated triangular distribution parameters (a, b, c) based on the MLE approach, plus actual and estimated mean, variance, skewness, and MLE. |
PERT_FIT(R1, lab, lo, hi, iter, exta, extc, iterb, iterc) |
an array consisting of the estimated PERT distribution parameters (a, b, c) based on the MLE approach, plus actual and estimated mean, variance, skewness, and MLE. |
GUMBEL_FIT(R1, lab, iter, bguess) | an array consisting of the estimated Gumbel distribution mu and beta values based on the MLE approach, plus actual and estimated mean and variance, and MLE. |
GEV_FIT(R1, lab, iter, prec, incr, mguess, sguess, xguess) | a 3 × 4 array consisting of the estimated GEV distribution mu, sigma, and xi values and their standard errors based on the MLE approach. |
LAPLACE_FIT(R1, lab) | an array consisting of the estimated Laplace distribution mu and beta values based on the MLE approach, plus actual and estimated mean and variance, and MLE. |
GEOM_FIT(R1, lab) | an array consisting of the estimated geometric distribution p parameter value, plus actual and estimated mean and variance, and MLE. |
LOGISTIC_FIT(R1, lab, iter, bguess, mguess) | an array consisting of the estimated Logistic distribution mu and beta values based on the MLE approach, plus actual and estimated mean and variance, and MLE. |
NORM_FIT(R1, lab, pure) | an array consisting of the estimated normal distribution mu and sigma values based on the MLE approach, plus actual and estimated variance, and MLE. |
BINOM_FIT(R1, lab, ntrials) | an array consisting of the estimated binomial distribution p parameter based on the MLE approach, plus actual and estimated mean and variance, and MLE. |
POISSON_FIT(R1, lab) | an array consisting of the estimated Poisson distribution lambda value based on the MLE approach, plus sample variance, and MLE. |
PARETO_FIT(R1, lab) | an array consisting of the estimated Pareto distribution alpha and mn values based on the MLE approach, plus actual and estimated mean and variance, and MLE. |
CAUCHY_FIT(R1, lab, iter, sguess, mguess) | an array consisting of the estimated Cauchy distribution mu and sigma values based on the MLE approach, plus actual median, 75% trimmed mean, inclusive and exclusive IQR, and MLE. |
WEIBULL_FIT(R1, lab, iter, bguess, ncensor, censor) | an array consisting of the estimated Weibull distribution alpha and beta values based on the MLE approach, plus actual and estimated mean and variance, and MLE; ncensor = the number of censored data elements (default 0) and censor = time of censoring |
WEIBULL_FITR(R1, lab, benard) | an array consisting of the estimated Weibull distribution alpha and beta values based on regression, plus actual and estimated mean and variance, and MLE and R-square; if benard = TRUE (default) then the Benard approximation is used. |
CAUCHY_FITX(R1, lab) | an array consisting of the estimated Cauchy distribution mu and sigma values based on the weighted ordered statistics approach, plus MLE. |
POLYGAMMA(z, k) | digamma function at z if k = 0 (default), and trigamma function at z if k = 1. |
KDE(R1, R2, h, ktype) | array function that returns a column array with f(x) pdf values corresponding to the x values in column array R1 based on the sample data in column array R2, the bandwidth h, and the kernel specified by ktype. |
If lab = TRUE, then an extra column of labels is appended to the output (default is FALSE). If pure = TRUE, then the pure method of moments is used (default is FALSE). MLE is the likelihood estimate for the parameters that are output.
iter is the number of iterations used in calculating the solution; for UNIFORM_FIT if iter = 0 then alpha and beta are set to the minimum and maximum data values (biased MLE). aguess is the initial guess for alpha (use 0 for default initial guess), bguess is the initial guess for beta (use 0 for default initial guess), sguess is the initial guess for sigma (use 0 for default initial guess), xguess is the initial guess for xi, and mguess us the initial guess for mu.
If the ntrials argument in BINOM_FIT is omitted then the first element in R1 serves as ntrials.
The parameters for TRIANG_FIT, TRIANG_FITM, PERT_FIT, and PERT_FITM are a little different from the others. See the links for more details. Related to these functions are the following two worksheet functions:
TRIANG_MLE(R1, a, b, c) | LL for a triangular distribution with parameters a, b, and c which fits the data in array R1. |
PERT_MLE(R1, a, b, c) | LL for a PERT distribution with parameters a, b, and c which fits the data in array R1. |
Confidence intervals for fitted parameters
BETA_CONF(alfa, beta, n, lab, ttype, iter, alpha, fiter) | an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the alfa and beta parameters of the beta distribution |
EXPON_CONF(lambda, n, lab, iter, alpha) | an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the lambda parameter of the exponential distribution |
GAMMA_CONF(alfa, beta, n, lab, ttype, iter, alpha, fiter) | an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the alfa and beta parameters of the gamma distribution |
GEOM_CONF(p, n, lab, ttype, iter, alpha) | an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the p parameter of the geometric distribution |
GUMBEL_CONF(mu, beta, n, lab, ttype, iter, alpha, fiter) | an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the mu and beta parameters of the Gumbel distribution |
LAPLACE_CONF(mu, beta, n, lab, ttype, iter, alpha) | an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the mu and beta parameters of the Laplace distribution |
LOGISTIC_CONF(mu, beta, n, lab, ttype, iter, alpha, fiter) | an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the mu and beta parameters of the Logistic distribution |
PARETO_CONF(alfa, mn, n, lab, ttype, iter, alpha) | an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the alfa and mn parameters of the Pareto distribution |
LOGNORM_CONF(mu, sigma, n, lab, ttype, iter, alpha) | an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the mu and sigma parameters of the log-normal distribution |
NORMAL_CONF(mu, sigma, n, lab, iter, alpha) | an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the mu and sigma parameters of the normal distribution |
UNIFORM_CONF(alfa, beta, n, lab, ttype, iter, alpha, fiter) | an array with the parameter value, estimated standard error, and lower and upper limit of a 1-alpha confidence interval for the alfa and beta parameters of the uniform distribution |
n = sample size. iter = # of bootstrap iterations (default 1000). ttype = 0 (MLE, default), 1 (method of moments), 2 (pure method of moments), and 3 (regression). alpha defaults to .05. Each parameter is represented by one column with 4 rows in the output; if lab = TRUE a column of row labels and a row of column labels are appended to the results (default FALSE). fiter is the # of iterations required to fit data to the specific distribution; e.g. GUMBEL_FIT(R1, lab, fiter).
Bayesian Statistics
SAMPLE_HDI(R1, lab, p) | array function which returns a column array with the endpoints of the 1–p HDI based on the data in R1 |
GRID_HDI(R1, R2, lab, p, lprec, uprec) | array function which returns a column array with the following entries: endpoints of the HDI, length of the HDI, and the actual value for 1–p based on the HDI. lprec and hpred specify the tolerances for p on the left and right (default .1). |
BETA_HDI(p, alpha, beta, lab, iter) |
array function that returns a column array with the following entries: endpoints of the 1– p HDI, length of the HDI, and the pdf values at the two endpoints. iter = the number of iterations (default 40) |
ESS(R1, cutoff) | the effective sample size for the sample in R1 produced by the Metropolis algorithm; the ACF values are summed while ACF(k) < cutoff (default .05) |
NORM_GAMMA(μ0, φ0, n0, m, φ, n, lab) | a column array with the posterior values of the mean, variance, and size based on the NormGamma prior with parameters μ0, φ0, n0 and sample data of size n with mean m and variance φ. |
The default for p is .05; if lab = TRUE (default FALSE) then a column of labels is appended to the output.
Order Statistics (Continuous or Discrete Distribution)
ORDER_DIST(x, k, n, cum, dist, param1, param2, param3) | pdf f(x) for the kth order statistic x(k) from a sample of size n for the specified distribution if cum = FALSE and the corresponding cdf F(x) if cum = TRUE |
ORDER_INV(p, k, n, dist, param1, param2, param3) | inverse of the kth order statistic from a sample of size n from the specified distribution at p |
ORDER_MEAN(k, n, iter, dist, param1, param2, param3) | expected value for the kth order statistic x(k) from a sample of size n for the specified distribution. |
ORDER_VAR(k, n, iter, dist, param1, param2, param3) | variance for the kth order statistic x(k) from a sample of size n for the specified distribution. |
ORDER2_DIST(x, y, j, k, n, cum, dist, param1, param2, param3) | joint pdf at (x, y) for the jth and kth order statistics when cum = FALSE and the corresponding cdf when cum = TRUE |
RANGE_DIST(w, j, k, n, cum, dist, param1, param2, param3) | pdf at w for the range x(k) – x(j) when cum = FALSE and the corresponding cdf when cum = TRUE |
ORDER_SIM(k, n, lab, iter, alpha, dist, param1, param2, param3) | column array with the estimated value of the kth order statistic based on iter simulated samples of size n from the distribution specified by dist with the specified parameters; output also contains the standard error of the estimate along with the 1-alpha confidence interval |
ORDER_BOOTSTRAP(R1, k, lab, iter, alpha) | column array with the estimated value of the kth order statistic based on the sample in R1 using bootstrapping with iter iterations; output also contains the standard error of the estimate along with the 1-alpha confidence interval. |
RANGE_SIM(j, k, n, lab, iter, alpha, dist, param1, param2, param3) | column array with the estimated value of the range x(k) – x(j) based iter simulated samples of size n from the distribution specified by dist with the specified parameters; the output also contains the standard error of the estimate along with the 1-alpha confidence interval. |
MEDIAN_CI(n, lab, alpha, dist, param1, param2, param3) | a column array with estimates of the population median, standard error of the median, and 1-alpha confidence interval based on a sample of size n for the specified continuous distribution using the Sample Median Theorem |
ORDER_CI(k, n, lab, alpha) | column array that estimates the 1-alpha confidence interval around the kth order statistic for samples of size n using a binomial distribution approach. |
The default for alpha is .05; the default for iter is 1000.
Order Statistics (Finite Distribution)
OrderDist(x, N, n, k) | probability that x is the kth order statistic when a sample of size n is drawn from the population 1, 2, …, N. |
MeanOrder(N, n, k) | expected value of the kth order statistic when a sample of size n is drawn from the population 1, 2, …, N |
RangeDist(x, y, N, n) | probability that x and y are the first and last elements, in order, from a sample of size n drawn from the population 1, 2, …, N |
MeanRange(N, n) | expected value of y – x where x and y are the first and last elements, in order, from a sample of size n drawn from the population 1, 2, …, N |
SpreadDist(x, N, n) | probability that the largest element minus the smallest element in the sample of size n taken from the population 1, 2, …, N is x |
Dear Charles.
why The formula from the real resource pack, is not working, I mean every i use that formula from the resource, my excell cannot detect it and the result become name#?
After downloading the Real Statistics software, did you follow the installation instructions?
If you did, please consult the Troubleshooting section on the the webpage from which you downloaded the software. Especially important is the instructions about Trust.
Charles
Dear Charles
What is the formula that I should use to calculate the 90th and 95th percentile? It is of your help because I have not been able to find it even though I already downloaded the real statistics plugin.
greetings