Real Statistics Release 7.0

This release focuses on Bayesian statistical analysis. More on this subject will follow in the next few releases.

The Non-parametric 2 examples workbook has been revised for compatibility with the new release, and a new Bayesian examples workbook has been added. Both are available for free download at Download Examples Workbooks.

Over the course of the next several days, the website will be updated for compatibility with the new release.

If you are getting value from the Real Statistics website or software, I would appreciate a donation to help offset the costs of the website; see Please Donate.

The following is an overview of the new features in Release 7.0.

Beta Distribution High Density Interval (HDI)

The following array function identifies the HDI for a beta distribution.

BETA_HDI(alpha, beta, lab, p, iter): returns a column array containing the endpoints of the 1–p HDI for the beta distribution Bet(alpha, beta), along with the length of the interval and the pdf values at the endpoints.

If lab = TRUE (default FALSE), then a column of labels is appended to the output. iter is the number of iterations used by the algorithm (default 40).

Bayesian Grid

A new Creating a Grid for Bayesian Analysis data analysis tool has been added. This type of grid can be useful in Bayesian statistics.

The following array function has also been added to estimate a high density interval (HDI) based on such a grid.

GRID_HDI(R1, R2, lab, p, lprec, hpred): returns a column array containing the endpoints of the approximately 1–p HDI, the length of the interval and the actual value of p used.

If lab = TRUE (default FALSE) then a column of labels is appended to the output.

Since it is unlikely that an exact 1–p HDI will be found, lprec and hpred specify the tolerances for p on the left and right (default .1).
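To illustrate the idea behind a grid-based HDI, here is a minimal Python sketch. It is not the add-in's implementation: the function name grid_hdi, its arguments, and the assumption that the grid consists of parameter values paired with (possibly unnormalized) posterior weights are all hypothetical. It also assumes a unimodal posterior, so that the highest-density grid points form a single interval.

```python
def grid_hdi(values, weights, p=0.05):
    """Approximate (1 - p) HDI from grid values and posterior weights.

    Hypothetical helper: assumes a unimodal posterior, so the most
    probable grid points form one contiguous interval.
    """
    total = sum(weights)
    # Rank grid points from most to least probable.
    order = sorted(range(len(values)), key=lambda i: weights[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += weights[i] / total
        if mass >= 1 - p:  # stop once enough probability is covered
            break
    lower = min(values[i] for i in kept)
    upper = max(values[i] for i in kept)
    # Endpoints, interval length, and the probability actually covered.
    return lower, upper, upper - lower, mass


# Example grid: flat prior with 7 successes in 10 trials (Beta(8, 4) up to a constant).
grid = [i / 1000 for i in range(1001)]
post = [t**7 * (1 - t) ** 3 for t in grid]
lo, hi, length, mass = grid_hdi(grid, post, p=0.05)
```

Because the probability is accumulated one grid point at a time, the covered mass usually overshoots 1–p slightly, which is why a tolerance on p is needed.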

Gibbs Sampler

A new Gibbs Sampler data analysis tool has been added. This method enables you to generate a sample from the joint distribution for two parameters (θ1, θ2), even when you don’t have access to this distribution directly, provided you have access to the conditional distributions θ1|θ2 and θ2|θ1.
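As a rough illustration of the technique (not of the data analysis tool itself), the following Python sketch runs a Gibbs sampler on a standard bivariate normal distribution with correlation rho. This target is chosen only because both conditionals are known normal distributions; the function name, the burn-in length and the seed are assumptions made for the example.

```python
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Sample (theta1, theta2) from a standard bivariate normal with
    correlation rho, using only the two conditional distributions."""
    rng = random.Random(seed)
    cond_sd = (1 - rho**2) ** 0.5  # sd of each conditional distribution
    theta1, theta2 = 0.0, 0.0      # arbitrary starting point
    draws = []
    for step in range(burn_in + n_samples):
        theta1 = rng.gauss(rho * theta2, cond_sd)  # draw theta1 | theta2
        theta2 = rng.gauss(rho * theta1, cond_sd)  # draw theta2 | theta1
        if step >= burn_in:                        # discard warm-up draws
            draws.append((theta1, theta2))
    return draws


draws = gibbs_bivariate_normal(rho=0.8, n_samples=20000)
```

Successive draws are correlated with one another, so the sample carries less information than the same number of independent draws.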

HDI for a Sample

The HDI for sample data in a column array R1 can be calculated by the following new function.

SAMPLE_HDI(R1, lab, p): returns a column array with the endpoints of the 1–p HDI based on the data in R1; default for p is .05; if lab = TRUE (default FALSE) then a column of labels is appended to the output.
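One common way to estimate an HDI from sample data is to take the narrowest interval that contains a fraction 1–p of the sorted values. The Python sketch below follows that approach; whether SAMPLE_HDI uses exactly this rule is an assumption, and the function name sample_hdi is hypothetical.

```python
import math


def sample_hdi(data, p=0.05):
    """Narrowest interval containing at least (1 - p) of the sample."""
    xs = sorted(data)
    n = len(xs)
    k = math.ceil((1 - p) * n)  # number of points the interval must cover
    # Slide a window of k consecutive sorted values and keep the narrowest one.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]


lower, upper = sample_hdi([1, 2, 3, 4, 10, 20, 30, 40], p=0.5)
```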

Metropolis Algorithm

The Metropolis algorithm is not yet supported, but the following new function calculates the effective sample size for this algorithm.

ESS(R1, cutoff) = the effective sample size for the sample in the column array R1 produced by the Metropolis algorithm; the ACF values are summed until ACF(k) < cutoff (default .05).
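A widely used definition of the effective sample size is n / (1 + 2 Σ ACF(k)), with the sum truncated once the autocorrelation becomes negligible. The Python sketch below uses that definition and stops at the first lag whose ACF falls below the cutoff; that the ESS function uses exactly this formula is an assumption, and the helper names are hypothetical.

```python
import random


def acf(xs, k):
    """Sample autocorrelation of xs at lag k."""
    n = len(xs)
    mean = sum(xs) / n
    denom = sum((x - mean) ** 2 for x in xs)
    num = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k))
    return num / denom


def effective_sample_size(xs, cutoff=0.05):
    """n / (1 + 2 * sum of ACF(k)), summing lags until ACF(k) < cutoff."""
    n = len(xs)
    acf_sum = 0.0
    for k in range(1, n):
        r = acf(xs, k)
        if r < cutoff:  # remaining lags are treated as noise
            break
        acf_sum += r
    return n / (1 + 2 * acf_sum)


# Example: an autocorrelated AR(1) chain standing in for MCMC output.
rng = random.Random(42)
chain = [0.0]
for _ in range(4999):
    chain.append(0.5 * chain[-1] + rng.gauss(0, 1))
ess = effective_sample_size(chain)
```

For a chain with positive autocorrelation, the result is smaller than the number of draws, which reflects the information lost to dependence between successive values.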

Truncated Normal Distribution

Support for the truncated normal distribution has been added via the following new functions:

TNORM_DIST(x, m, s, cum, a, b) = the pdf of the specified truncated normal distribution f(x) at x when cum = FALSE and the corresponding cumulative distribution function F(x) when cum = TRUE.

TNORM_INV(p, m, s, a, b) = the inverse of the specified truncated normal distribution at p.

TNORM_PARAM(m, s, a, b, lab): returns a column array consisting of the mean, median, mode, variance, skewness and kurtosis of the specified truncated normal distribution; if lab = TRUE (default FALSE) then a column of labels is appended to the output.

Here, m = mean parameter, s = standard deviation parameter, a = the left endpoint and b = the right endpoint.
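The standard formulas for a normal distribution truncated to [a, b] can be sketched in Python as follows. This is only an illustration of the mathematics, using the standard library's NormalDist; the function names are hypothetical and the add-in's own implementation may differ.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def tnorm_pdf(x, m, s, a, b):
    """pdf of a normal(m, s) distribution truncated to [a, b]."""
    if x < a or x > b:
        return 0.0
    mass = _STD.cdf((b - m) / s) - _STD.cdf((a - m) / s)  # probability kept
    return _STD.pdf((x - m) / s) / (s * mass)


def tnorm_cdf(x, m, s, a, b):
    """cdf of a normal(m, s) distribution truncated to [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    low = _STD.cdf((a - m) / s)
    mass = _STD.cdf((b - m) / s) - low
    return (_STD.cdf((x - m) / s) - low) / mass


def tnorm_inv(p, m, s, a, b):
    """Inverse cdf at p, for 0 < p < 1."""
    low = _STD.cdf((a - m) / s)
    high = _STD.cdf((b - m) / s)
    return m + s * _STD.inv_cdf(low + p * (high - low))


x = tnorm_inv(0.3, 0, 1, -1, 2)
```

Note that m and s are the parameters of the underlying normal distribution; the mean and standard deviation of the truncated distribution are generally different.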

Support for the three-parameter t distribution

The T3_DIST and T3_INV functions have been added with the following properties:

T3_DIST(x, df, mu, sigma, TRUE) = T_DIST((x – mu) / sigma, df, TRUE)

T3_DIST(x, df, mu, sigma, FALSE) = T_DIST((x – mu) / sigma, df, FALSE)/sigma

T3_INV(p, df, mu, sigma) = sigma * T_INV(p, df) + mu
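These identities follow from a change of variables. As a short derivation, write f_ν and F_ν for the pdf and cdf of the standard t distribution with ν = df degrees of freedom (this notation is introduced here only for the derivation), and let X = μ + σT with σ > 0:

```latex
\begin{aligned}
F_X(x) &= P(\mu + \sigma T \le x)
        = P\!\left(T \le \frac{x-\mu}{\sigma}\right)
        = F_\nu\!\left(\frac{x-\mu}{\sigma}\right), \\[4pt]
f_X(x) &= \frac{d}{dx} F_X(x)
        = \frac{1}{\sigma}\, f_\nu\!\left(\frac{x-\mu}{\sigma}\right), \\[4pt]
F_X^{-1}(p) &= \mu + \sigma\, F_\nu^{-1}(p).
\end{aligned}
```

The factor 1/sigma appears only in the pdf because it comes from differentiating the cdf with respect to x.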

Fitting a Cauchy Distribution

The Cauchy distribution is the three-parameter t distribution where df = 1. Thus, the pdf at x can be calculated via the formula T3_DIST(x, 1, mu, sigma, FALSE) and the cdf via the formula T3_DIST(x, 1, mu, sigma, TRUE). The inverse function at p can be obtained from the formula T3_INV(p, 1, mu, sigma).
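For df = 1 these functions have simple closed forms, which the following Python sketch writes out. The function names are hypothetical; the formulas are the standard ones for a Cauchy distribution with location mu and scale sigma.

```python
import math


def cauchy_pdf(x, mu, sigma):
    """pdf of the Cauchy distribution with location mu and scale sigma."""
    z = (x - mu) / sigma
    return 1 / (math.pi * sigma * (1 + z * z))


def cauchy_cdf(x, mu, sigma):
    """cdf of the Cauchy distribution."""
    return 0.5 + math.atan((x - mu) / sigma) / math.pi


def cauchy_inv(p, mu, sigma):
    """Inverse cdf at p, for 0 < p < 1."""
    return mu + sigma * math.tan(math.pi * (p - 0.5))


upper_quartile = cauchy_inv(0.75, 2, 3)
```

The quartiles fall at mu − sigma and mu + sigma, so the interquartile range equals 2·sigma.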

To fit a Cauchy distribution to data in a column array R1 you can use the following new function based on a method of moments-like approach.

CAUCHY_FITM(R1, lab, med, exc): returns an array with the Cauchy distribution parameter values mu, sigma and MLE.

If lab = TRUE, then an extra column of labels is appended to the output (default FALSE). If med = TRUE, then the median of R1 is used to estimate mu; otherwise, the 76% trimmed mean is used (default FALSE). If exc = TRUE, the exclusive IQR is used to estimate sigma; otherwise, the inclusive IQR is used (default FALSE).
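The quantile-based estimates can be sketched in Python as follows. This is a simplified illustration rather than the add-in's implementation: it covers only the median option for mu (the trimmed-mean option is omitted), it takes sigma to be half the IQR because the quartiles of a Cauchy distribution lie at mu ± sigma, and it assumes that Python's "inclusive" and "exclusive" quantile methods correspond to the inclusive and exclusive IQR. The function name is hypothetical.

```python
from statistics import median, quantiles


def cauchy_fit_quantiles(data, exclusive=False):
    """Estimate Cauchy mu and sigma from the median and the IQR."""
    method = "exclusive" if exclusive else "inclusive"
    q1, _, q3 = quantiles(data, n=4, method=method)  # quartile cut points
    mu = median(data)
    sigma = (q3 - q1) / 2  # for a Cauchy distribution, IQR = 2 * sigma
    return mu, sigma


mu_hat, sigma_hat = cauchy_fit_quantiles([1, 2, 3, 4, 5])
```

Quantile-based estimates are used because the mean and variance of a Cauchy distribution are undefined, so the ordinary method of moments does not apply.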

We can obtain the maximum likelihood estimates of the Cauchy distribution parameters using the following new function.

CAUCHY_FIT(R1, lab, iter, sguess, mguess): returns an array with the Cauchy distribution parameters mu, sigma, median, 76% trimmed mean, inclusive IQR, exclusive IQR and MLE.

If lab = TRUE, then an extra column of labels is appended to the output (default FALSE). iter is the number of iterations used in calculating the solution (default 20).

sguess and mguess are the initial guesses used for the sigma and mu parameters. If sguess = 0 or is omitted, then the initial guesses are the estimates output by the CAUCHY_FITM function.
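One way to compute the maximum likelihood estimates iteratively is sketched below in Python. It uses the EM-style reweighting iteration for a t distribution with df = 1; this is a swapped-in technique chosen for brevity, not necessarily the algorithm CAUCHY_FIT uses. The function name, the iteration count and the starting values (median and half the inclusive IQR) are assumptions made for the example.

```python
import math
import random
from statistics import median, quantiles


def cauchy_mle(data, iterations=100, mu0=None, sigma0=None):
    """Approximate Cauchy MLE of (mu, sigma) plus the log-likelihood."""
    if mu0 is None or not sigma0:
        # No usable guesses supplied: start from quantile-based estimates.
        q1, _, q3 = quantiles(data, n=4, method="inclusive")
        mu, sigma = median(data), (q3 - q1) / 2
    else:
        mu, sigma = mu0, sigma0
    n = len(data)
    for _ in range(iterations):
        # Weights shrink the influence of points far from the current mu.
        w = [2 / (1 + ((x - mu) / sigma) ** 2) for x in data]
        mu = sum(wi * x for wi, x in zip(w, data)) / sum(w)
        sigma = math.sqrt(sum(wi * (x - mu) ** 2 for wi, x in zip(w, data)) / n)
    log_lik = -sum(
        math.log(math.pi * sigma * (1 + ((x - mu) / sigma) ** 2)) for x in data
    )
    return mu, sigma, log_lik


# Simulated Cauchy data with mu = 2 and sigma = 3 (inverse-cdf method).
rng = random.Random(1)
sample = [2 + 3 * math.tan(math.pi * (rng.random() - 0.5)) for _ in range(2000)]
mu_hat, sigma_hat, log_lik = cauchy_mle(sample)
```

At a fixed point of this iteration, both likelihood equations for mu and sigma are satisfied, which is why the weighted mean and weighted scale updates lead to the maximum likelihood estimates.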

Bug Fixes

  • The QSORT, QSORTRows, QSORT2Rows and QSORT2ROWSMixed functions have all been revised to avoid an error that sometimes occurred when the data being sorted was already in sorted order.
  • Fixed an error in the Basic Forecasting data analysis tool in the case where the Holt-Winters Additive and Column headings included with data options were selected.