Real Statistics Release 6.0

I am pleased to announce Release 6.0 of the Real Statistics Resource Pack. The new release is now available for free download at Download Resource Pack for Excel 2010, 2013, 2016 users as well as for Excel 2013, 2019 and 365 users in both Windows and Mac environments.

In a few days, a new release, Release 5.10, for Excel 2007 and 2011 (Mac) users will also be available.

All the examples workbooks have been updated for compatibility with the new release. In addition, there is a new examples workbook called Goodness of Fit, which includes the tests for normality.

Over the course of the next several days, the website will be updated for compatibility with the new releases.

If you are getting value from the Real Statistics website or software, I would appreciate your donations to help offset the costs of the website by going to Please Donate.

The following is a summary of the new features in Release 6.0.

Use of Excel 2010 functions

Versions of Real Statistics for Excel 2010, 2013, 2016, 2019 and 365 now output Excel 2010 functions instead of the corresponding Excel 2007 functions. E.g. T.DIST is output instead of TDIST. Real Statistics for Excel 2007 and 2011 environments will continue to use the older (compatibility) versions of these functions.

Multinomial Logistic Regression Enhancement

The output from the MLogitRSquare function now includes the number of iterations completed. This is important when the Multiple Logistic Regression data analysis tool is forced to terminate before the maximum number of specified iterations is reached. Previously an error was generated. (Thanks to Dillon for identifying the problem that inspired this enhancement)

D’Agostino-Pearson Test for Normality

The following new array functions have been added:

SKEWPTEST(R1, lab): population version of the skewness test; outputs a column array with the skewness of the data in R1, the test statistic and p-value

KURTPTEST(R1, lab): population version of the kurtosis test; outputs a column array with the kurtosis of the data in R1, the test statistic and p-value

If lab = TRUE (default FALSE), then an extra column of labels is added to the output.

These function provide the population version of the existing SKEWTEST and KURTTEST functions. The following existing functions that are used to perform the D’Agostino-Pearson test have been enhanced to provide both the sample and population versions of the test (with the population version used as the default option):

DAGOSTINO(R1, pop) = chi-square statistic for the D’Agostino-Pearson test

DPTEST(R1, pop) = p-value of the D’Agostino-Pearson test

If pop = TRUE (default) then the population version of the test is used; otherwise the sample version of the test is used.

The population version of the D’Agostino-Pearson test is now used in the Descriptive Statistics and Normality data analysis tool (when the Shapiro-Wilk option is chosen). (Thanks to Andrew for pointing out that there is an alternative way of calculating the D’Agostino-Pearson test).

Tolerance Interval

The following function has been added to define a 1 – α tolerance interval (m-ks, m+ks).

TOLERANCE_NORM(n, p, α, type) = k value for the p% tolerance interval based on a sample of size n and confidence level of  1 – α. If type = 2 (default) then k for the two-sided tolerance interval is returned, while if type = 1 then the one-sided version is returned using the noncentral t distribution and if type = 0 then the Natrella one-sided version is used. The default for p is .90 and the default for α is .05.

The tolerance interval is based on sample data of size n taken from a normally distributed population. In particular, we want to find an interval (m-ks, m+ks) that contains p% of the population with 100(1–α)% confidence, where m is the sample mean and s is the sample standard deviation. (Thanks to Troy for requesting this enhancement)

Random Numbers with a Seed

The existing Real Statistics random number generator function RANDOM has been enhanced so that a seed can be used. The new version of this function now takes the following form:

RANDOM(a, b, FALSE, seed) = random number between a and b; i.e a non-volatile version of a + (b − a) * RAND()

RANDOM(a, b, TRUE, seed) = random integer between a and b, inclusive; i.e. a non-volatile version of RANDBETWEEN(a, b)

If a is omitted it defaults to 0, if b is omitted it defaults to 1 and if the third argument is omitted it defaults to FALSE.

If seed ≤ 0 or omitted then no seed is used, while if it is a positive value, then this value is used as a seed. A seed can be used to generate a repeatable sequence of pseudo-random values.

Two Factor ANOVA without Replication Follow-up

A new Two Factor ANOVA without Replication Follow-up data analysis tool has been added to the Anova tab that provides similar functionality to the existing Two Factor ANOVA with Replication Follow-up data analysis tool, but in support of post-hoc tests after a two-way ANOVA without replications.

Stepwise Regression

If on the Multiple Linear Regression dialog box both the Stepwise Regression option is selected and the Include constant term (intercept) option is not selected, then stepwise regression is now performed based on regression through the origin.

In addition, the existing RegStepwise, RegCoeffP and RegRank functions now take a fourth parameter which when TRUE (default) indicates that the regression has a constant term, while when FALSE indicates that the regression is through the origin. (Thanks to Keith for requesting this enhancement)

Cliff’s Delta

The following function has been added to calculate Cliff’s Delta (non-parametric) effect size for two samples. (Thanks to Erik for requesting this enhancement)

CLIFF_DELTA(R1, R2) = Cliff’s delta for the samples in R1 and R2

Mann-Kendall Test

The following new function implements the Mann-Kendall Test for determining whether a trend exists in a time series.

MK_TEST(R1, lab, tails, alpha): returns the MK test statistic, standard error, z-stat, p-value and trend (yes/no, upward/downward) for the time series in R1.

If lab = TRUE then an extra column of labels is added to the output (default FALSE). tails = 1 or 2 (default) and alpha = significance level (default .05). (Thanks to Sylvester for requesting this enhancement)

Sen’s Slope

The following function calculates Sen’s non-parametric measure of the slope of a time series.

SEN_SLOPE(R1, lab, alpha): returns the slope for the time series in R1 and the lower and upper bounds on the  1 – α confidence interval for this slope

If lab = TRUE then an extra column of labels is added to the output (default FALSE). alpha = significance level (default .05).  (Thanks to Sylvester for requesting this enhancement)

Seasonal ARIMA (SARIMA)

A new SARIMA data analysis tool has been added that creates a Seasonal ARIMA model for a time series and produces a forecast based on that model. (Thanks to Triston for requesting this enhancement)

Granger Causality Test

The following functions have been added in support of the Granger Causality Test that time series x Granger-causes time series y, where Rx is a column array with the data for time series x and Ry is a column array with the data for time series y:

GRANGER(Rx, Ry, lags) = the F statistic of the test

GRANGER_TEST(Rx, Ry, lags) = p-value of the test

Engle-Granger Test

The following function have been added to support the Engle-Granger two-step approach to testing two time series for cointegration where Rx and Ry are column arrays containing the two time series being tested.

EGTEST(Rx, Ry, lab, lag, criteria, trend, alpha): outputs a column array with the values tau-stat, tau-critical, cointegrated (yes/no), lags, p-value

Here lab, lag, criteria and alpha are as for the ADFTEST function (for the Augmented Dickey-Fuller Test). tr = TRUE if there is a trend (default FALSE).

The Engle-Granger test is equivalent to the ADF test on the residuals of the OLS regression of the time series in Ry on the time series in Rx, but using a different table of critical values. This table of critical values is used in the following two new functions:

EGCRIT(n, alpha, tr) = critical value of the Engle-Granger test for two time series of length n for the significance level alpha (between .01 and .10, default .05) based on either a trend (when tr = TRUE) or no trend (when tr = FALSE, default)

EGPROB(t, n, tr) = estimated p-value of the Engle-Granger test for two time series of length n when the test statistic is t, where tr is as for EGCRIT.

In addition to these new functions, the Cointegration data analysis test has also been added that supports the Engle-Granger test.

ADF Test Enhancements

The ADFCRIT and ADFPROB functions have been enhanced for slightly greater accuracy. The ADFTEST function, which performs the Augment Dickey-Fuller test now uses these functions to improve its accuracy.

Prime Numbers

The following functions concerning prime numbers have been added. Here, n is any positive integer up to 231-1 = 2,147,483,647.

PrimeCount(n, start) = the number of prime numbers between start and n (inclusive)

IsPrime(n) = TRUE if n is a prime number

NextPrime(n) = next prime number after n

PriorPrime(n) = largest prime number smaller than n

NthPrime(n) = the nth prime number

PrimeList(start, size) = a column array with size number of prime numbers starting with the first prime ≥ start

Factors(n) = a column array containing the prime factors of n (with repetitions if necessary)

While these functions are not needed for statistical analysis, they can be useful in other contexts.

Bitwise Functions

The BIT_AND, BIT_OR, BIT_XOR, BIT_LSHIFT and BIT_RSHIFT functions have been added with functionality similar to the standard Excel functions  BIT_AND, BIT_OR, BIT_XOR, BIT_LSHIFT and BIT_RSHIFT that were introduced with Excel 2013. These new Real Statistics functions may be useful to Excel 2010 users.

Bug Fixes

  • Fixed a bug in the One Sample Runs Test data analysis tool and RUNSTEST function which presents itself in some rare circumstances when numbers are treated as characters. (Thanks to Christian for identifying this error)
  • Fixed a bug in the Latin Squares Design data analysis tool when there are replications in calculating the value of SS_Replications. (Thanks to Karen for identifying this error)
  • Fixed a bug in the calculation of the p-value for the coefficients in Poisson regression. In fact, a z-statistic is now used instead of the Wald statistic in calculating the p-value. This change is made to the Poisson Regression data analysis tool and the PoissonCoeff function. (Thanks to GAD for helping to identify this bug)
  • Fixed a bug in the Multiple ANOVA option of the MANOVA data analysis tool which neglected to calculate a Bonferroni correction (Thanks to Joel for identifying this error)
  • Fixed a bug in the ARIMA data analysis tool that occurs in earlier versions of Excel, but not in Excel 2016, when the time series data values are large. If a user encounters this type of problem in the future, the user can scale the data values down by dividing all values by a large number (e.g. 10,000). (Thanks to Kevin for helping to bring this problem to my attention)

4 thoughts on “Real Statistics Release 6.0”

  1. I thought I had correctly installed version 6.02, (under Excel 2010), but the VERS() function is reporting”5.9 Excel 2013/2016/2019″. What might be going wrong?

    • Jonathan,
      Press Alt-TI and uncheck the file containing the 5.9 version. Then click on the Browse button to locate the file where you stored the 6.02 version and make sure that there is a check mark next to this file in the Add-ins list.
      If Excel is still confused, then I suggest that first you rename the file that contains the 6.0.2 version (e.g. RealStats6.xlam). Now the above approach should work since Excel won’t get confused between the two versions (since they have different names). Real Statistics will work file even when you change the name of the file.
      Charles

      • I had old versions saved in the same directory with altered extension names (“.-xlam”). I discovered that this was not enough to stop them showing up in the Projects Explorer.
        I have now deleted those files completely and finally succeeded in getting the new version to load:
        vers()=6.0.2 Excel 2010/2013/2016/2019/365 Windows
        Thanks!

Comments are closed.