Release 3.8 of the Real Statistics Resource Pack

I am pleased to announce Release 3.8 of the Real Statistics Resource Pack. The new release is now available for free download (Download Resource Pack) for Excel 2010 and 2013 environments. The version for Excel 2007 will be available on 14 April 2015.

The Real Statistics Examples Workbook and Real Statistics Multivariate Examples Workbook have both been updated and can be downloaded for free.

The Real Statistics website is in the process of being updated to reflect the new features. These changes will be made in the next several days.

The release contains the following new features:

Enhancements to Cohen’s Kappa and Weighted Kappa

In addition to calculating the value of kappa, the WKAPPA function and the equivalent functionality provided by the Reliability data analysis tool now calculates the standard error and confidence intervals for kappa.

Instead of providing an explicit weights array, you can now optionally specify that the weights are linear, quadratic, etc. In fact, if you specify a value of r for the weights, this is  equivalent a weights range containing zeros on the main diagonal and values (|i – j|)r in the ith row and jth column when i ≠ j.

Exponential Regression

On the webpage Exponential Regression we review Excel’s approach to exponential regression using linear regression. The Real Statistics Resource Pack now provides a more accurate non-linear exponential regression model using the following array functions where R1 contains a column range with the values for x and R2 contains a column range with the corresponding values for y:

ExpCoeff(R1, R2, iter): a 2 × 2 range whose first column contains the coefficients α and β for the regression and the second column contains the corresponding standard errors for these coefficients.

ExpPred(R, R1, R2, iter): an m × 1 column range with the values predicted by the exponential model for R1 and R2 based on the data in the m × 1 column vector of x values.

ExpPredC(R, α, β): an m × 1 column range with the values predicted by the exponential model with coefficients α and β based on the data in the m × 1 column vector of x values.

Here iter = the number of iterations (default 20). The last two functions can also be used as ordinary (i.e. non-array) functions of form

ExpPred(x, R1, R2, iter) = value predicted by the exponential model for x based on the data in R1 and R2

ExpPredC(xα, β) = value predicted by an exponential model with coefficients α and β for x

Schur’s Factorization

A Schur’s factorization (or Schur’s decomposition) of a square matrix A consists of an orthogonal matrix Q and an upper triangular matrix T such that A = QTQT. The following new functions have been added:

SCHUR(R1, iter, order): returns matrices Q and T such that A = QTQT is a Schur’s factorization of A

SCHURQ(R1, iter, order): returns only matrix Q of the Schur’s factorization of A

The algorithm used is iterative with iter iterations (default 100). The diagonal of T contains the eigenvalues of A. If order = TRUE (default) then the eigenvalues are arranged in descending order based on their absolute values; if order = FALSE then the eigenvalues are in descending order.

Eigenvalues and eigenvectors for non-symmetric matrices

The eVALUES(R1, iter, order) and eVECTORS(R1, iter, order) functions only calculate real eigenvalues (and not eigenvalues that are imaginary numbers). For statistical purposes this is not a problem since we only need to calculate eigenvalues for symmetric matrices, and all the eigenvalues (and eigenvectors) of such matrices are real.

For non-symmetric matrices it is possible that some or all eigenvalues are imaginary numbers, in which case the corresponding values calculated by eVALUES and eVECTORS will be bogus. It is easy to identify such values since if λ is a true eigenvalue of A then det(A- λI) = 0.

The eigenvalues calculated by eVALUES are presented in a row. Below each eigenvalue λ is the value of det(A−λI). If A is a symmetric matrix and the value of det(A−λI) for some eigenvalue λ is not sufficiently close to zero then you should increase the number of iterations by increasing the value of the iter argument in eVALUES until you obtain a value close enough to zero.

This is not the case for non-symmetric matrices, since increasing the number of iterations won’t change an imaginary eigenvalue into a real one.

Whereas the eVALUES and eVECTORS functions will produce all real eigenvalues for any square matrix, eVECTORS will only reliably calculate eigenvectors for symmetric matrices. For non-symmetric matrices even eigenvectors corresponding to real eigenvalues may not be correct. for this reason, you should use the following new function with non-symmetric matrices:

eVECT(R1, iter, order): returns an n+3 × n range, where n = the number of rows/columns in the square range R1. The first row of the output consists of the real eigenvalues of the square matrix A corresponding to the data in R1. Below each eigenvalue λ in the first row is a unit n × 1 eigenvector corresponding to λ. In the second-to-last row of the output are the values det(A−λI). In the last row of the output, below each eigenvalue λ and eigenvector X is the value max {bi: i = 1 to n} where B = AX− λX.

The iter and order arguments are as for the SCHUR function described above. All the values in the second-to-last row of the output which correspond to real eigenvalues should be close to zero. You may need to increase the value of iter to find all the real eigenvalues, but values that don’t get close to zero correspond to imaginary eigenvalues. Since B = AX− λX = O for an eigenvalue λ with corresponding eigenvector X, values in the last row that correspond to real eigenvalues should be close to zero (if not then you may need to increase the value of iter).

Change to the output of the eVECTORS function

The format of the output of the eVECTORS function has been revised. The new output format is identical to that described for eVECT above. The eVECTORS function should only be used for symmetric matrices.

New Spectral Decomposition function

A spectral decomposition of a square matrix A takes the form A = UDUT where U is an orthogonal matrix and D is a diagonal matrix. All symmetric matrices have a spectral decomposition. The following new function has been added:

SPECTRAL(R1, iter): returns a 2n × n range whose top half is the matrix U and whose lower half is the matrix D in the spectral decomposition of A where A is the matrix of values in range R1.

Here iter is the number of iterations in the algorithm used to compute the spectral decomposition (default 100).

New SVD functions

A singular vector decomposition (SVD) of a matrix A takes the form A = UDVT where U and V are orthogonal matrices and D is a diagonal matrix. The following new functions have been added.

SVD_U(R1, iter) = U matrix of the SVD for the matrix A corresponding to range R1

SVD_D(R1, iter) = D matrix of the SVD for the matrix A corresponding to range R1

SVD_V(R1, iter) = V matrix of the SVD for the matrix A corresponding to range R1

Here iter is the number of iterations in the algorithm used to compute the SVD (default 100). The current implementation of these functions only supports square matrices.

Enhancements to the Matrix data analysis tool

The following capabilities have been added to the Matrix data analysis tool

  • Eigenvalues/vectors for non-symmetric matrices (uses eVECT instead of eVECTORS function)
  • Schur factorization
  • Singular vector decomposition (SVD) − only supports square matrices
  • Spectral decomposition

Other new functions

SUBRANGE(R1, row1, col1, nrows, ncols): an array function which outputs an nrows × ncols subrange of R1 starting at the cell in the row1th row and col1th column of the m × n range R1.

The default value for row1 and col1 are 1. The default value for nrows is m – row1 + 1 and the default value for ncols is n – col1 + 1.

FIRSTBLANK(R1) = the address (e.g. “B7”) of the first empty cell in range R1. If R1 contains no empty cells then the value “None” is returned.

FIRSTNonNum(R1) = the address (e.g. “B7”) of the first empty or non-numeric cell in range R1. If R1 contains no non-numeric cells then the value “None” is returned.

REV(R1): array function that reverses the elements in R1; resulting range should have the same size and shape as R1. Available with Release 3.8.1 (see below).

Bug fixes

There is a bug in the ties correction for the Mann-Whitney test that is fixed in Release 3.8.1 of the Real Statistics Resource Pack. This is the version of the Real Statistics Resource Pack that was released for the Excel 2013 environment today. The version for Excel 2010 does not have this bug fix yet; this will be available by 14 April 2015.

Release 3.8 (i.e. the release for Excel 2010) also has an error in the calculation of the last row of the output from the eVECTORS function. This will be corrected in Release 3.8.1.