Real Statistics Data Analysis Tools

Although all the statistical analyses described on this website can be done with standard Excel capabilities, it is often easier to use the supplemental functions and data analysis tools provided in the Real Statistics Resource Pack.

The functions provided in the Real Statistics Resource Pack are summarized in Real Statistics Functions. Here we briefly review the available supplemental data analysis tools.

Accessing Real Statistics Data Analysis Tools

You can access the Real Statistics data analysis tools by pressing Ctrl-m or via the Add-Ins ribbon (as described in Accessing Real Statistics Tools). One of two dialog boxes will appear which lists all the available supplemental data analysis tools (see Figures 1 and 2).

Original main menu

Figure 1 – Original Real Statistics data analysis tools main menu

Main menu (multipage format)

Figure 2 – Real Statistics multipage data analysis tools main menu

You next choose one of the data analysis tools from this list. A dialog box will now appear which is similar to that presented in Figure 2 of Excel’s Data Analysis Tools, as described in detail in Using Real Statistics Data Analysis Tools.

The following are the currently supported Real Statistics data analysis tools.

List of Real Statistics Data Analysis Tools

You can click on any of the data analysis tools listed below (or the listed options within these tools) to get additional information about that tool (or option). The data analysis tools are listed in the tab order of the multipage interface.

Descriptive Statistics Tab

Descriptive Statistics and Normality: includes:

Frequency Table Descriptive Statistics

Histogram with Normal Curve Overlay

Kernel Density Estimation

Diversity Indices

Homogeneity of Variances

ROC Curve and Classification Table

Step Chart

Extract Columns from Data Range – allows you to create a new data range by selecting certain columns from an existing data range

Reformatting a Data Range

  • Reformat Data Range, includes:
  • Reshape – reproduce input data in a range of a different shape
  • Reverse – reverse the order of the input data range
  • Sort – reproduce the data in ascending sorted order
  • Sort without duplicates – reproduce the data in sorted order removing duplicates
  • Shuffle – randomly permute the data (selection without replacement)
  • Randomize – randomly select data from a data range (with replacement)
  • Remove blank cells – remove all empty cells
  • Remove non-numeric cells – remove all cells with non-numeric data
  • Remove error cells – replace any error cells with empty cells
  • Copy exact formulas – copy a range to a new location without changing the cell addresses

Reformatting a Data Range by Rows

Matrix Operations

  • Matrix Operations, includes:
  • Transpose – transposes a matrix
  • Correlation Matrix – creates the correlation matrix for the specified data range
  • Covariance Matrix – creates the population or sample covariance matrix for the specified data range
  • Inverse – inverts a square matrix
  • Diagonal – creates a diagonal matrix with specified diagonal
  • Eigenvalues and Eigenvectors – finds the eigenvalues and eigenvectors of a symmetric square input matrix
  • QR Factorization – finds an orthogonal matrix Q and upper triangular R where the input matrix = QR
  • Schur Factorization – finds an orthogonal matrix Q and upper triangular T such that the input matrix = QTQT.
  • SVD Factorization – finds the Singular Value Decomposition (SVD) consisting of diagonal matrix D and orthogonal matrices U and V where input matrix = UDVT.
  • Spectral Factorization – finds the Spectral Decomposition of a symmetric input matrix which is equal to QDQT where D is a diagonal matrix and Q is an orthogonal matrix
  • LU Factorization – finds the LUP Decomposition of a square input matrix, which is equal to PTLU where P is a permutation matrix, L is a lower triangular matrix and U is an upper triangular matrix
  • Hessenberg Factorization– finds the Hessenberg Decomposition of a square matrix, which is equal to QTHQ where Q is an orthogonal matrix and H is a Hessenberg matrix.
  • Identity Matrix – creates an identity matrix of the specified size
  • Empty Box – creates an empty matrix of the specified size and shape

Change Chart Axes Bounds

Multiple Scatter Charts

Color Assignment

Restore Real Statistics on Add-ins Ribbon

Regression Tab

Regression, includes

Multiple Linear Regression, includes

Other types of Regression 

Autocorrelation

Logistic and Probit Regression

Multinomial Logistic Regression

Confidence and Prediction Interval Plots

ANOVA Tab

Single Factor Anova, supports Excel and standard formats and includes:

Two Factor Anova, supports Excel and standard formats and includes:

Three Factor Anova, supports standard formats by row and by column and includes:

One Factor Repeated Measures Anova, includes

Two Factor Repeated Measures Anova

Two Mixed Repeated Measures Anova, includes

Three Mixed Repeated Measures Anova, includes

Nested Anova

Randomized Complete Block Design, supports Excel and standard formats and includes:

Split-plot Design, supports Excel and standard formats and includes:

Latin Squares Design, supports Excel and standard formats and includes:

2^k Factorial Design, supports Excel and standard formats and includes:

Two Factor Anova with Replications Follow-up

Two Factor Anova without Replications Follow-up

Anova Follow-up (esp. used after Three Factor ANOVA)

Analysis of Covariance (Ancova)

Friedman Test

Time Series Tab

Basic Time Series Forecasting

ARIMA Modelling and Forecasting

SARIMA Modelling and Forecasting

Time Series Testing

Forecast Accuracy

Correlogram

Mann-Kendall and Sen’s Slope

Cointegration

Cross Correlations

Panel Data Analysis

Multivariate Analysis Tab

Hotelling’s T-square Test

Single Factor MANOVA

Two Factor MANOVA

Multivariate Repeated Measures Analysis

Factor Analysis – includes correlation matrix, eigenvalues/vectors, factor matrix, Varimax rotation, factor scores

Cluster Analysis 

Discriminant Analysis 

Correspondence Analysis 

Confidence Ellipse

Permutational MANOVA

Correlation, Reliability, and Missing Data Tab

Correlation (one-sample), includes

Association Tests

Internal Consistency Reliability

Interrater Reliability

Item Analysis

  • Item analysis (item difficulty, item discrimination, and point-serial correlation)

Rasch Item Response Analysis

Polychoric Correlation

Multiple Imputation (MI)

Full Information Maximum Likelihood (FIML)

Estimation-Maximization (EM)

Neural Networks

Coding an Image

Miscellaneous Tab

T Tests and Non-parametric equivalents

Non-parametric Tests

Chi-square Test for Independence – tests m × n contingency tables for independence, data can alternatively be in standard (stacked) format

Cochran-Mantel-Haenszel Test

Goodness of Fit Testing

Distribution Fitting

Resampling 

Proportion Tests

Multiple Tests

Solve Set of Linear Equations; includes:

Spline Curve Fit

Bayesian Tools

Survival Analysis, includes

Network Diagram

Statistical Power and Sample Size Requirements 

This data analysis tool supports the following tests:

  • One-sample normal test
  • Two-sample normal test
  • One-sample and paired-sample t-test
  • Two-sample t-test
  • One-sample binomial test
  • One-sample correlation test
  • One-sample variance test
  • Two-sample variance test
  • Chi-square test (goodness of fit and independence tests)
  • One-way ANOVA
  • Multiple regression
  • Logistic regression (normally and binomially distributed independent variables)
  • Cronbach’s alpha test
  • Intraclass correlation
  • Cohen’s kappa

66 thoughts on “Real Statistics Data Analysis Tools”

  1. Dear Charles,
    I’ve tried to calculate the ADC in accordance with 8.3.2.2(j) of CMH-17-1G, but its result is unexpected. Do you have a solution? thanks.

    Reply
  2. May I ask where can I find the z test two sample for means in your add-in in excel?
    It would be wonderful if I can receive a reply due to my upcoming research. Thank you very much!

    Reply
  3. Hi

    Is there a way to use real statistics to calculate the minimum required sample size for a study to determine the sensitivity and specificity of a diagnostic tool?

    Reply
      • I guess he wants you to add functionality for splitting the data into Training Set and Testing Set to make it work like Data Analytics tools used for forecasting.

        Reply
        • Supposing that your sample size is n and you want to use 10% of the data for testing. You just need to randomly select 10% of the data to be in the testing set and the rest will be placed in the training set. One way to accomplish this is to add a sequence number from 1 to n to each row in the data set. You can then use the Real Statistics SHUFFLES function to shuffle the values 1 to n. The first 10% of the resulting rows would be used for testing and the others would be used for training.
          I will look into adding this capability to Real Statistics.
          Charles

          Reply
  4. Dear Professor,

    First of all thanks for this great add-in.
    How can I work with a base where data needs to be weighted first? because of a complex sample? I know how to do the statistic but I have not yet find an option “weight by”. Thanks!

    Reply
  5. Like the program abilities but after 1st installation the program shows up on addins but does not run.

    Tried all the methods listed but have to unintall and reinstall to get it to work.
    Also shows password when using macro method.

    Is there a way to stop the program not working or reinstalling every time I get out of excel.

    Reply
  6. Hi

    I had a problem with the functions such as QSORT not being recognised with the #NAME appearing. Actually I had to go to the original downloaded file and click “unblock” since the AddIns had also disappeared.

    Great site- I am going enjoy using it

    Reply
  7. Thank you very much for making this suite available to us. Appreciate your efforts (and the “educational” spirit) in crafting a value added option to a popular workhorse for the masses.

    Reply
  8. Respected sir,
    I want find the “relationship between emotional intelligence and math abilities of secondary school students.”
    how can i compare emotional intelligence componets &with what type of math abilities get little confusion. piz help in this regards.

    Reply
    • Mulla,
      You haven’t provided enough information for me to respond in any detail. You could start by exploring the correlation coefficient. Perhaps a t test would be useful.
      Charles

      Reply
  9. I have downloaded the resource pack for Mac many thanks
    The control M produces the real stats menu but it does no contain a survival analysis option.. I am trying to perform cox proportional hazards model – why can’t i access survival analysis from menu??

    Reply
  10. Thanks for a great tool. I am a NOVICE stat guy just trying to crunch some numbers for fantasy football. The Jenks Natural Breaks algorithm is great for determining the cutoff value for tiers, or what I think you call classes. However, it relies on me telling it how many classes (k) that I want. Is there a way to determine the optimal k? Is it always the case that the larger k is, the better the fit would be? So I guess I would be looking for the lowest k that does not significantly improve with a larger k. If that makes sense, how could I statistically calculate that. I am looking at less than 100 samples, so the number of permutations will not be impossible. Thanks again.

    Reply
    • Mike,
      Yes, the larger the number of classes the better the fit would be. In fact, the best fit would occur when each data element is in its own class, but this defeats the purpose of the whole exercise.
      I don’t know of a statistical test which determines whether a higher number of classes no longer yields a significant difference. With a small sample, you can experiment until you see that a higher number of classes doesn’t seem to matter much. Usually after about 6 classes, the Jenks Natural Breaks algorithm slows down considerably (assuming that you want to test all possible partitions) and so this may be the determining factor in how many classes to consider.
      Charles

      Reply
  11. Dear Professor,

    First of all thanks for this great add-in.
    But, who write a Tesis, they have to put/write (at least in APA) the p-value.
    But, in some tables, special Contrast, Tukey HSD (I use .. )etc, you only put the “sig” and “no” or “yes”.

    It is not possible to put the real p-value in next version?

    Thank you, very, very much!
    José

    Reply
    • Jose,
      That is a reasonable request. I am in the process of testing the next release and I will try to add this information. In the meantime, for Tukey HSD, you can use the QDIST function to get the p-value.
      Charles

      Reply
  12. Hi Charles, thanks for your Excel add-in.
    This is very beneficial for me as I am not fortunate enough to have Minitab or SPSS in my official system.
    I have installed Real Statistics as per instruction given on site. but when I am trying to use a function by pressing Ctrl +m and selecting the function and data an error comes “Compile error in hidden module: frmMatrix ”
    This is happening with each and every function.
    please help.

    Reply
  13. Came across this hidden gem . I’m a data scientist and mostly play around with R. But this is helpful for my colleagues who are not comfortable with R and not fortunate enough to have Minitab or SPSS in their machine. At least its a stop-gap arrangement where we could run statistical tests locally before communicating findings.

    Thanks

    Reply
  14. Hi,
    I am seeing in other posts there is a cluster analysis included but when I download this version I do not see it? Do I need an upgrade?
    Nicole

    Reply
  15. Hi Charles, thanks for your Excel add-in.
    The option of logistic regression does not appear in the dialog box .
    I have the 2013 Excel.
    Thank you again.

    Reply
  16. I downloaded your add-in and I playing with it now. I have a question, not directly related to the add-in – Excel contains the following functions SUMSQ, SUMX2MY2, SUMX2PY2 and SUMXMY2, what practical applications do these functions have – where would on use them?

    Thanks,
    Shane

    Reply
    • Shane,

      SUMSQ is used very often in statistics: to compute variance, Durbin-Watson, regression, length of vector, etc.
      SUMXMY2 is used for calculating Durbin-Watson and in K-means cluster analysis

      Since SUMX2MY2(R1, R2) = SUMSQ(R1) – SUMSQ(R2) and SUMX2PY2(R1, R2) = SUMSQ(R1) + SUMSQ(R2), they seem much less useful to me.

      Charles

      Reply
  17. Hi Charles. Thanks for the availability of this excel add-in. I have attempted to run the SlopeTest function (Excel 2013) and the output is only on cell, the Std Err value. There are no other cells reported. Any suggestions? Thanks

    Reply
  18. Charles, the toolbar opens, but whe I try to run any function it returns the same error as if I access trough the supplement.

    Reply
  19. Charles, I get the software version: 3.8.1 Excel 2007. When I try to access the realstat tool bar it returns a system error &H80004005 (-21476467229). Non specified error and then a compilation error in the hidden module frmlnput.

    In fact, I can use the functions directly (and in fact I do use them this way), but turns the process unproductive.
    Andre

    Reply
    • Andre,

      This seems to be a known bug in Excel. Most people are using the Excel 2007 version of the Real Statistics software without any problems. A few people have reported some problems getting access to the software, but as far as I am aware this the first time this bug has been reported.

      If you google “Excel error message &H80004005 (-21476467229)” you will see information about the bug. The webpage http://peltiertech.com/unspecified-painfully-frustrating-error/ seems to summarize the situation well.

      Since the bug is only indirectly related to the Real Statistics software, if at all, I don’t know what to suggest. Please let know whether you were using a previous release of the software, in which I can send a copy of that version to you.

      Charles

      Reply
  20. Hi, I’m using a brazilian portughese version of excell and installing the realstat2007 version. But when I try to access the resource pack it returns a non specified error and indicates that it may probably be due to a problem running the hidden module. Do you have any clue on how to solve this? Thanks a lot.
    Andre

    Reply
    • Andre,
      I don’t know why you are getting this error message. What do you get when you enter the formula =VER()? In the error message does it state which hidden module is causing the problem?
      Charles

      Reply
  21. I followed your directions and got to the dialogue box in fig. 2 but each time I send a command it shows a notice; “Compile error in hidden module: Analysis”. How can I solve this problem Sir. Dear Charles, VER() gives 3.6.2. My excel is 2007.

    Reply
  22. Dear Charles, VER() gives 3.1.2. I have installed the 2010 version on another pc, windows server 2008 r2, and it seem to work. Thank you.

    Reply
  23. Thank You for your great job. I get “Compile error in hidden module: Analysis” while using the “ctrl-m” part of the module. My excel is 2007, SO is Windows XP.
    Marco

    Reply
  24. I followed your directions and got to the dialogue box in fig. 2 but each time I send a command it shows a notice; “Compile error in hidden module: Analysis”. How can I solve this problem Sir.

    Reply
    • Charles
      Which version of Excel are you using? Which version of the Real Statistics Resource Pack are you using? (you can find this out by enterring the formula =VER() in any cell).
      Charles

      Reply
  25. It is a very useful website. How do I calculate the CORNALPHA using the Resource Pack? I do not think there is an option if and when I press Ctrl+m

    Regards

    Reply

Leave a Comment