Release 4.5 Announcement

I am pleased to announce Release 4.5 of the Real Statistics Resource Pack. The new release is now available for free download (Download Resource Pack) for Excel 2007, 2010, 2013 and 2016 (Windows version) environments.

The spreadsheets for all the examples used on the Real Statistics website, including those related to the new Release 4.5 features) are available for free download (Download Examples Workbooks). These are contained in three Excel files (i.e. workbooks): Examples Workbook Part 1, Examples Workbook Part 2 and Multivariate Examples. See Workbook Examples for a description of which examples are contained in which files.

The Real Statistics website is in the process of being updated to reflect the new features. These changes will be made over the next several days.

Release 4.5 contains the following new features:

Time series analysis

In this release we initiate support for time series analysis. This work will continue in the next release as well. New functions that are available in this release are:

ACF(R1, k) – autocorrelation function of order k for the data in the column range R1

ACVF(R1, k) – autocovariance function of order k for the data in the column range R1

PACF(R1, k) – partial autocorrelation function of order k for the data in column range R1

ACOV(R1, k) – autocovariance matrix for the data in column range R1

ACORR(R1, k) – autocorrelation matrix for the data in column range R1

Bland-Altman

Bland-Altman is a method for comparing two measurements of the same variable. A Bland-Altman data analysis tool is now available, which can be accessed via the Reliability menu option.

Correlation Exact Test

The Correlation Exact Test can be used in place of the Fisher transformation approach to testing the value of the correlation coefficient. The following functions have been added:

CORRDIST(r, n, rho) – correlation distribution function (cdf) for a sample of size n with sample correlation r and population correlation rho

CORRINV(alpha, n, rho) – inverse of the correlation distribution function

CORRETEST(R1, R2, rho, tails) – p-value for the correlation exact test for the data in ranges R1 and R2, with population correlation rho and tails = 1 or 2

CORRELETEST(R1, R2, rho, lab, alpha, tails) – array function similar to CORRETEST, except that a 1-alpha confidence interval is also output

Thanks to António Teixeira who suggested this topic and contributed greatly to its development.

Skewness and Kurtosis Testing

The following array functions test whether sample data comes from a distribution which has zero skewness and/or kurtosis.

SKEWTEST(R1, lab, alpha) – array function which tests whether the skewness of the sample data in range R1 is zero (consistent with a normal distribution). The output consists of sample skewness, standard error, test statistic, p-value and 1-alpha confidence interval

KURTTEST(R1, lab, alpha) – array function which tests whether the kurtosis of the sample data in range R1 is zero (consistent with a normal distribution). The output consists of sample kurtosis, standard error, test statistic, p-value and 1-alpha confidence interval

D’Agostino-Pearson Test for Normality

The  D’Agostino-Pearson test is yet another test to determine whether data comes from a population which is normally distributed. Generally, Shapiro-Wilk will give more accurate results, but this test can be useful where the sample data has a number of repetitions.

DAGOSTINO(R1) – the test statistic s2 + k2 where s is the test statistic for skewness testing and k is the test statistic for kurtosis testing (described above).

DPTEST(R1) – p-value for the D’Agostino-Pearson test.  When the data comes from a population with a normal distribution, the D’Agostino-Pearson test statistic has a chi-square distribution with 2 degrees of freedom.

Population skewness and kurtosis functions

The follow functions have been added:

SKEWP(R1) – skewness of the distribution for the population in range R1. This is equivalent to the Excel 2013/2016 function SKEW.P, and can be useful to users with prior versions of Excel.

KURTP(R1, excess) – kurtosis of the distribution for the population in range R1. If excess = TRUE then 3 is subtracted from the result (the usual approach so that a normal distribution has kurtosis of zero).

Adjusted Correlation Coefficient and Coefficient of Determination

The following function output the usual adjusted R2 value:

RSQ_ADJ(r, n) – adjusted R2 value = 1-(1-r2)(n-1)/(n-2)  where r = the sample correlation and n = the size of the sample

RSQ_ADJ(R1, R2) – adjusted R2 value for the data in ranges R1 and R2

The following function outputs a relatively unbiased value for the population correlation coefficient. Note that this value is not the square root of the adjusted R2 value described above, and provides a better estimate of the population correlation coefficient.

CORREL_ADJ(r, n) – estimated population correlation coefficient = r[1 +(1 –r2) /2(n -3)]  where r = the sample correlation and n = the size of the sample

CORREL_ADJ(R1, R2) – estimated population correlation coefficient for the data in ranges R1 and R2

Thanks to António Teixeira who suggested this topic.

Matrices raised to a Power

For any square matrix A, A0 = the identity matrix and An+1 = AnA. The following function calculates An where A is the matrix contained in the k × k range R1.

MPOWER(R1, n) = An where A consists of the data in range R1.

We use this function with Markov chains.

Bug Fixes

The results for the Scheirer Ray Hare Test data analysis tool had an error, which produced incorrect results. This has now been corrected. Thanks for Kevin Bluxome for identifying this error.

The values for the adjusted means in the ANCOVA data analysis tool were in error. This has now been corrected. Thanks to Bill G. for identifying this error.

2 thoughts on “Release 4.5 Announcement”

  1. I downloaded RealStats pack and I installed it on Execl 2016. When I tried WRegression function, the R, R-square, R-square Adjusted, F-stats values, and few others are different compared to R using the same data set and using lm() in R with weights parameter set. Why is that?

Comments are closed.