Real Statistics Release 7.8 | Real Statistics Using Excel

I am pleased to announce Release 7.8 of the Real Statistics Resource Pack. The new release is now available for free download at Download Resource Pack for Excel 2010, 2013, 2016, 2019 and 365 in both Windows and Mac environments.

Examples workbooks will be updated shortly for compatibility with the new release. Over the course of the next few days, the website will also be updated for compatibility with the new release.

If you are getting value from the Real Statistics website or software, I would appreciate your donation to help offset the costs of the website by going to Please Donate.

The following is an overview of the new features in Release 7.8:

Order Statistics

The Real Statistics Resource Pack now supports the following worksheet functions. These functions refer to a distribution dist (“uniform”, “normal”, etc.) with the specified parameters as described for the existing MEAN_DIST and VAR_DIST functions.

ORDER_DIST(x, k, n, cum, dist, param1, param2, param3) = the pdf f(x) for the kth order statistic x_(k) from a sample of size n for the specified distribution if cum = FALSE and the corresponding cdf F(x) if cum = TRUE.

For example, the formula =ORDER_DIST(7,5,11,TRUE,”laplace”,10,5) takes the value .1576. Thus, for a random sample of size 11 taken from the Laplace distribution with mu = 10 and beta = 5, the cdf F(7) = .1576 for the 5^th order statistic. This means that 15.76% is the probability that the 5^th order statistic from a sample of size 11 taken from the specified Laplace distribution is less than or equal to 7.

ORDER_INV(p, k, n, dist, param1, param2, param3) = the inverse of the kth order statistic from a sample of size n from the specified distribution at p

E.g. =ORDER_INV(.1576,6,11,”laplace”,10,5) is 7.

ORDER_MEAN(k, n, iter, dist, param1, param2, param3) = the expected value for the kth order statistic x_(k) from a sample of size n for the specified distribution.

This function uses Simpson’s rule to calculate the area under a curve based on iter iterations (default 100).

For example, the formula =ORDER_MEAN(6,11,,”laplace”,10,5) takes the value 10 as expected since the 6^th order statistic is the median, which for a symmetric distribution is the same as the mean which in this case is equal to the mu parameter of 10.

ORDER2_DIST(x, y, j, k, n, cum, dist, param1, param2, param3) = the joint pdf at (x, y) for the jth and kth order statistics when cum = FALSE and the corresponding cdf when cum = TRUE

RANGE_DIST(w, j, k, n, cum, dist, param1, param2, param3) = the pdf at w for the range x_(k) – x_(j) when cum = FALSE and the corresponding cdf when cum = TRUE

Confidence Intervals for Order Statistics and the Median

The following array functions have been added:

ORDER_SIM(k, n, lab, iter, alpha, dist, param1, param2, param3): returns a column array with the estimated value of the kth order statistic based on iter (default 1000) simulated samples of size n from the distribution specified by dist with the specified parameters; the output also contains the standard error of the estimate along with the 1-alpha confidence interval (default for alpha is .05).

ORDER_BOOTSTRAP(R1, k, lab, iter, alpha): returns a column array with the estimated value of the kth order statistic based on the sample in R1 using bootstrapping with iter iterations (default 1,000); the output also contains the standard error of the estimate along with the 1-alpha confidence interval (default for alpha is .05).

If k = 0 (default) then the output estimates the median instead of the kth order statistic. If k < 0 then the mean is used. If lab = TRUE (default FALSE), then a column of labels is appended to the output.

RANGE_SIM(j, k, n, lab, iter, alpha, dist, param1, param2, param3): returns a column array with the estimated value of the range x_(k) – x_(j)based iter (default 1000) simulated samples of size n from the distribution specified by dist with the specified parameters; the output also contains the standard error of the estimate along with the 1-alpha confidence interval (default for alpha is .05).

MEDIAN_CI(n, lab, alpha, dist, param1, param2, param3): returns a column array with estimates of the population median, standard error of the median and 1-alpha confidence interval based on a sample of size n for the specified continuous distribution using the Sample Median Theorem.

Finally, the following array function has been added.

ORDER_CI(k, n, lab, alpha): returns a column array that estimates the 1-alpha confidence interval around the kth order statistic for samples of size n using a binomial distribution approach.

Three such confidence intervals are specified: the interval that is at least as big as 1-alpha, the next smaller interval and an exact 1-alpha confidence interval that interpolates between these two.

If k = 0 (default) then the output estimates the median instead of the kth order statistic. If lab = TRUE (default FALSE), then a row of labels is appended to the output. The default for alpha is .05.

Pareto Distribution Support

The following new functions have been added where alpha and mn are the parameters of the specified Pareto distribution.

PARETO_DIST(x, alpha, mn, cum) = the pdf of the Pareto distribution f(x) when cum = FALSE and the corresponding cumulative distribution function F(x) when cum = TRUE.

PARETO_INV(p, alpha, mn) = the inverse of the Pareto distribution at p

Data can be fit to a Pareto distribution by using the new functions PARETO_FITM (method of moments) or PARETO_FIT (MLE approach). Also, the PARETO_CONF function has been added to calculate the standard error and confidence interval of these estimates.

The existing MEAN_DIST and VAR_DIST functions now also support the Pareto distribution.

New Random Number Support

The following new function has been added.

RANDX(n, seed): returns a column array of size n consisting of non-volatile random numbers between 0 and 1.

If seed ≤ 0 or is omitted then no seed is used, while if it is a positive value, then this value is used as a seed. A seed can be used to generate a repeatable sequence of pseudo-random values.

E.g. to obtain an estimate of the mean of a beta distribution with α = 4 and β = 6, you can perform a simulation of size 10,000 via the worksheet array formula =AVERAGE(BETA.INV(RANDX(10000),4,6)) to obtain a result such as 3.99901, which is close to the theoretical value of α/( α+β) = 4/(4+6) = .4. You can also use a seed such as in the formula = AVERAGE(BETA.INV(RANDX(10000,123),4,6)) to obtain .39985.

As usual, when using Excel 365 you can press the Enter key to activate such formulas, while in other versions of Excel you need to press Ctrl-Shft-Enter.

Solving Complex Linear Equations

Extends existing capabilities for solving a system of linear equations to those that have complex number coefficients. The following new functions have been added.

ZLinEqSolve(R1, prec): an array function that outputs the solution vectors for the system of linear equations defined by the augmented matrix found in the complex array R1. Similar to LinEqSolve. The key difference is that complex number solutions are now supported.

IMLinEqSolve(R1, prec): an array function that outputs the solution vectors for the system of linear equations defined by the augmented matrix found in the array R1. Similar to ZLinEqSolve. The key difference is that the entries in both the solution and R1 use Excel’s complex number format.

Complex Eigenvalues

Extends existing capabilities for finding eigenvalues and eigenvectors of square real-valued matrices to include support for complex eigenvectors. The following new functions have been added.

ZEigVAL(R1, check, iter, prec): returns real and complex eigenvalues for the square matrix in R1. Similar to the existing eigVAL function. The key difference is that the order of the eigenvalues is different.

ZEigVECT(R1, check, iter, prec): returns real and complex eigenvalues and eigenvectors for the square matrix in R1. Similar to the existing eigVECT function. The key difference is that complex eigenvectors are included and repeated eigenvalues are handled even for non-symmetric matrices.

IMEigVECT(R1, check, iter, prec): returns real and complex eigenvalues and eigenvectors for the square matrix in R1. Similar to ZEigVECT. The key difference is that complex eigenvalues and eigenvectors are written in Excel’s complex number format.

ZEigMultVECT(R1, re, im, prec): returns the eigenvectors for the eigenvalue re + im*i, which is presumably a repeated eigenvalue. Similar to eigMultVECT. The key difference is that both real and complex eigenvalues (and eigenvectors) are supported.

Additional Complex Matrix Support

The following are new functions concerning complex matrices in Real Statistics format.

ZReal(Z) = the real part of the complex matrix Z

ZImag(Z) = the imaginary part of the complex matrix Z

ZSet(R, C) = the complex matrix whose real part is R and whose imaginary part is C

ZLen(Z) = the length of the complex vector Z

ZNorm(Z) = the complex matrix Z where all the column vectors in Z are normalized

New Bayesian Statistics Function

The following new function has been added.

NORM_GAMMA(μ₀, φ₀, n₀, m, φ, n, lab) = returns a column array with the posterior values of the mean, variance and size based on the NormGamma prior with parameters μ₀, φ₀, n₀ and sample data of size n with mean m and variance φ.

If lab = TRUE (default FALSE) then a column of labels is appended to the result.

Bug Fixes and Miscellaneous

Fixed a bug in the BNORMSDIST and BNORMDIST functions that support bivariate normal distributions
Fixed a bug in the TNORM_INV function that calculates the inverse of the truncated normal distribution.
The default for the iter argument in the ROOTS function has been changed to 200.
The processing time for the CountRowsUnique function has been speeded up.

12 thoughts on “Real Statistics Release 7.8”

ANTONY ABANTO CHAVARRI

September 7, 2021 at 3:34 am

DOCTOR GOD BLESS YOU, BECAUSE I ALWAYS GET ERROR FOR LLMAX, R -SQUARE, AIC AND G-SQUARE IN POISSON REGRESSION MODEL?
GREETINGS FROM PERU-TRUJILLO
- Charles
  
  September 7, 2021 at 9:13 am
  
  Antony,
  Can you email me an Excel file with a Poisson regression model that produces these types of errors?
  Charles
  - ANTONY ABANTO CHAVARRI
    
    September 7, 2021 at 9:26 pm
    
    Doctor I just sent the file to your gmail, blessings
    - Charles
      
      September 8, 2021 at 9:52 am
      
      Thanks. I will take a look at it shortly.
      Charles
ANTONY ABANTO CHAVARRI

August 29, 2021 at 8:09 am

I was researching, how to find h4 (upper control limit) for multivariate EWNA chart (MEWMA), when the data is in subgroups, but I can’t quite understand the algorithm.
annex algorithm information:
http://lib.stat.cmu.edu/jqt/31-1
- Charles
  
  August 30, 2021 at 7:25 am
  
  Sorry Antony, but I am not familiar with this algorithm.
  Charles
  - ANTONY ABANTO CHAVARRI
    
    August 31, 2021 at 7:01 pm
    
    Would there also be the possibility to include the negative binomial count regression model in the program?
    - Charles
      
      September 1, 2021 at 7:21 am
      
      Hello Antony,
      Yes, I intend to add this capability, but I don’t have a specific date. For now, Poisson regression can often be used for this purpose. See
      https://www.real-statistics.com/poisson-regression/
      Charles
      - ANTONY ABANTO CHAVARRI
        
        September 6, 2021 at 5:03 am
        
        Could you also implement equimax rotation?
      - Charles
        
        September 6, 2021 at 11:20 am
        
        Hi Antony,
        It is already on the list of potential future enhancements.Thanks for adding another vote for this enhancement.
        Charles
ANTONY ABANTO CHAVARRI

August 27, 2021 at 6:01 am

IS THERE A POSSIBILITY THAT THE PROGRAM INCLUDES MULTIVARIATE CONTROL CHARTS?
- Charles
  
  August 28, 2021 at 8:13 am
  
  Hello Antony,
  Is this the sort of thing that you are looking for?
  https://bookdown.org/lawson/an_introduction_to_acceptance_sampling_and_spc_with_r26/multivariate-control-charts.html
  Charles

Comments are closed.