Real Statistics Release 9.2

I am pleased to announce Release 9.2 of the Real Statistics Resource Pack. The new release is now available for free download at Download Resource Pack for Excel 2010, 2013, 2016, 2019, 2021, and 365 Windows and Mac environments. 

I want to thank everyone who has made suggestions or has identified errors in the website or software. Your help has improved the utility and accuracy of Real Statistics.

I also want to express my appreciation to all of you who have donated to Real Statistics. These donations help to offset the costs of maintaining the website. If you are getting value from the Real Statistics website or software, I would appreciate your donation by going to Please Donate.

The following is an overview of the new features in Release 9.2. 

Tobit Regression

The Tobit regression model (aka the censored regression model) is designed to model the linear relationship between independent variables and a censored dependent variable. A new Tobit Regression data analysis tool has been added to support such regressions. In addition, the following new worksheet functions are now available:

TobitCoeff(Rx, Ry, iter, prec, guess): returns an array with two columns consisting of the estimated  Tobit regression coefficients and their standard errors for the X data in Rx and Y data in Ry using Newton’s method to maximize the log-likelihood function.

TobitLL(Rx, Ry, Rc) = the maximum log-likelihood for the X data in Rx and Y data in Ry based on the Tobit regression coefficients in Rc.

TobitPred(Rx0, Rc): returns an array with two columns consisting of the estimated mean uncensored and censored values for the X data in Rx0 based on the Tobit regression coefficients in Rc.

Hampel outliers

This release adds the Hampel outlier detection and filtering capability. This capability detects outliers in a time series and replaces such outliers by an appropriate median value. In particular, the following worksheet function has been added.

HAMPEL(R1, nlen, xsigma); returns a column array of the same size as the column array or cell range R1 containing the time series data. Each outlier is replaced by the associated median in a moving window of size 2*nlen+1 centered on the outlier. nlen defaults to 3. An outlier is defined as more than the equivalent of xsigma (default 3) units from the median of the window (actually more than 1.4826*MAD units from the median).

See Hampel Outlier Detection and Filtering for more details.

Weighted percentiles

The existing MEAN and MED worksheet functions can be used to calculate the weighted mean and weighted median. The MED function has now been enhanced to also calculate the weighted percentile as specified by p, which takes a value between 0 and 1

MED(Rd, Rw, p) =  weighted percentile for the data in Rd based on the weights in Rw and the percentile specified by p. If p is omitted it defaults to .5 (the weighted median). 

Graph of y = f(x)

A new Graph y = f(x) data analysis tool has been added to create graphs of functions y = f(x) where f(x) is expressed using Real Statistics’ lambda capability.

For example, you can define the function f(x) = ex – x2 +3 via the formula =Exp(A1)-A1^2+3 and use the new data analysis tool to draw the graph of y = f(x) from x = -1 to x = 2.

MAD-based outliers

A new option has been added to the Descriptive Statistics and Normality Testing data analysis tool that identifies potential outliers based on the MAD statistic. E.g. any data element in R1 that is more than 2*MAD(R1) distance from the median MEDIAN(R1) is flagged.

New lambda functions

The following new worksheet functions have been added based on Real Statistics’ lambda capability.

EVALArray(expression, Rx); returns the value of f(X) where f is a multivariate function defined by expression and Rx is a column array or cell range containing the values for X.

GRADIENT(Rx, expression, incr): returns a column array with the gradient of f(X) where f is a multivariate function defined by expression and Rx is a column array or cell range containing the values for X. incr is as defined for DERIV (default 0.000001).

Minimization functions

MGRADIENT(expression, Rx, learn, iter, prec, incr): returns a column array with the value of X which minimizes f(X) using gradient descent where f is a multivariate function defined by expression and Rx is a column array or cell range containing the initial values for X. learn is the fixed learning rate (default .1), iter (default 500) is the maximum number of iterations, prec (default 0.0000001) determines when to terminate the algorithm, and incr (default 0.000001) is as for DERIV.

MGRADIENTX(expression, Rx, param, iter, prec, incr): returns a column array with the value of X which minimizes f(X) using gradient descent where f is a multivariate function defined by expression and Rx is a column array or cell range containing the initial values for X. This function is similar to MGRADIENT except that the learning rate is not fixed and instead uses the reduction rate parameter param.

BGFS(expression, Rx, param, iter, prec, incr): returns an array whose first column contains the value of X which minimizes f(X) using the BGFS algorithm where f is a multivariate function defined by expression and Rx is a column array or cell range containing the initial values for X. This function is similar to MGRADIENTX except that it uses the more sophisticated BGFS algorithm instead of gradient descent. As a result, the default for iter = 100.

The second column of the output contains the gradient, whose values should be close to zero if convergence has been reached. The third column contains the minimum value of f(X) in row 1 and the # of iterations in row 2. The remainder of the output consists of the estimated Hessian.

Bug fixes and minor changes

  • Fixes bugs in the MauchlyTest and JNSTest worksheet functions caused by a recent change in the default order of eigenvalues
  • Corrects a bug in the Mac version of the Weighted Moving Average data analysis tool
  • Fixes the MSQRT(R1) worksheet function so that it returns the square root of v when R1 is a 1 × 1 matrix with the value v.
  • Minor changes have been made to the CVTEST function. 
  • Revised the BOXTEST function to output an N/A error when a determinant takes the value zero, resulting in an error since LN(0) is undefined. This also affects the Discriminant Analysis data analysis tool.
  • The sqrt mse field in the output from the SARIMA data analysis tool has been replaced by rmse. The value is now the square root of the previous value.

7 thoughts on “Real Statistics Release 9.2”

  1. WHY WHEN I USE THE BNORMSDIST FUNCTION FOR AN X=0.253347103 , Y=-0.253347103 AND RHO= 0.34459301
    I GET P= 0.791747345 ?

        • Hello Antony,
          Thanks for bringing this problem to my attention. There is indeed a bug. But it only occurs in very exceptional circumstances.
          In particular, the bug occurs in the BNORMSDIST(x, y, rho, cdf, don) function only when x = -y; x ≠ 0; cdf = TRUE; rho is not 0, 1, or -1; and don = TRUE. This means that the bug doesn’t occur when using the Genz algorithm.
          I plan on issuing Rel 9.2.1 in a few days. This release will fix this bug.
          Thanks for all your support. I appreciate, as always, your help in improving the accuracy and user experience of the Real Statistics capabilities.
          Charles

Comments are closed.