Welcome to the Real Statistics Using Excel website. As described on the Home page, this website provides the resources (tutorials and statistics software) necessary to perform statistical analysis in the Excel environment.
In order to make this website more valuable to the user community, we welcome your comments, suggestions and feedback. We would especially like to hear about any errors in the website, examples or software. You can contact me as described below or by leaving a comment below or on any of the webpages.
Charles Zaiontz
My email: czaiontz@gmail.com, info@real-statistics.com
My profile: click here
Hello
I’ve just run the post hoc test (Games-Howell) along with a single factor ANOVA test, but I’ve got some odd results the post hoc test. I’m assuming that it’s because it’s not calculated the standard error, which is ultimately being driven by the values in the column labeled “C”, but I don’t know what this relates to, and is completely empty. Can you help, what should this contain? I’m not able to locate the formula for this test.
Hello,
You simply need to place the values +1 and -1 in the c column for two groups that you want to compare. These are the contrast coefficients. You can compare any two groups without increasing experimentwise error. This is explained on the website.
Charles
Thanks – can you point me to the place where this is explained on the website. Also is there a way of comparing all the days against each other (was looking to compare two days)
See Unplanned Comparisons
The following page may also be helpful
Planned comparisons
Charles
Pui-Shan,
See the following two webpages:
Planned Comparisons
Unplanned Comparisons
Charles
Dear Charles
Could I use K_S test on p value 0.01 ?
Thanks a lot
Sani
Sani,
Yes you can. See the following webpages for more information:
Kolmogorov-Smirnov Test for Normality
Kolmogorov-Smirnov Single Sample Test
Kolmogorov-Smirnov Table
Charles
Dear Charles,
I wonder if you could help me on the followin:
” I have conducted a survey using likert scale – asked questions to a specific group, and received their feedback on my questions. My goal is to understand how perception of these group of people relates to my depepndent varible (DV). So, through Q1…..Q2 I would like to devise a conclusion/relationship to my DV. I am using EXCEL, and I believe, I should be using logistic regression as my data’s are ordinal.
I wanted to do the linear regression first ( I know, I should not be doing this), just to get familiar with the process. It requires two variables as an input to perform linear regression. As I only have Q1…..Q2, I am not sure how to figure out the second variable. In my head, Q1….Q2 are just questions.
Any help would be apprecited.
Dear Monjurul,
For linear regression you need at least one independent variable (IV) and a dependent variable (DV). This is also true of logistic regression (except that the dependent variable only takes two possible values). Do your questions provide some way of establishing a relationship between the IV and the DV?
Charles
Dear Charles,
I have sent you a mail – I wonder if you could look at it once time permit?
Thanks in advance
Monjurul,
I have looked at the document that you sent me, but it doesn’t really help me very much. You need to do something first.
Before deciding on which tests to use or how to run those tests, you need to decide what is the objective of your study and what hypotheses you want to prove or disprove. You don’t need to state these initially in statistical terms. In fact initially it may be better to state these in terms that are consistent with your domain of study/research (although these statements should be clear and potentially quantifiable).
Only you can do this part.
Charles
Hey Charles,
Just to notify you, I have taken a good look over your site and have sent you an email. If you would like I could post my question up here
Hamed,
Ok. I will take a look at your email.
Charles
Dear Charles,
Thank you so much, I can’t express he grateful i am, I just didn’t want to spam your page in the first place, looking forward to hearing from you soon! Thank you again
Hamed,
I will take a look at your email and get to you.
Charles
Charles,
Is it possible to use your functions with VBA code? Excel maths functions can be used with WorksheetFunction.Correl(x,y) for example. However, when I try it with your functions such as WorksheetFunction.durbin(x,y) it doesn’t work.
Anyway of getting your functions to work in the macro VB code?
Thanks,
Joseph
Joseph,
See the instructions on the webpage
Calling Real Statistics Functions in VBA
Also for Real Statistics functions, you should use durbin(x,y) instead of WorksheetFunction.durbin(x,y).
Charles
I downloaded the EXCEL file Real Stat Using EXCEL for EXCEL 2007.
When running it gives error: Can’t Find Project OR Library.
Pl. advise
Prof S Chandrasekhar
Director Business Analytics
IFIM Business School B”lore INDIA
The most common reasons for getting this message are:
1. The software was not installed properly. To find out if this is the problem, press Alt-TI and see whether RealStats is on the list of add-ins with a check mark next to it. If you don’t see RealStats on the list, you need to press the Browse button and locate where you stored the Real Statistics software on your computer.
2. You are trying to open up the RealStats file (containing the software). You should not do this. Simply open any other Excel file.
3. There is some other problem. In this case, please insert the formula =VER() in any cell on a blank spreadsheet and let me know what value you get.
Charles
Dear Charles,
First of all, thank you very much for all those helpful calculations in Excel.
There is one statistical test for which I did not find any help. It is the Cochran`s C test for homogeneity of variances. Could you perhaps help me, how to do the test statistics in Excel?
Thanks a lot, Anke
Sorry Anke, but at present I don’t support Cochran’s C test since I find Levene’s test to be more useful. Cochran’s C test tests to see if the variance of one of the samples is significantly different from the variance of the other samples. Thus it really tests that the one variance is an outlier compared to the others.
Charles
Any way to get Odds Ratios from the excel linear regression program?
Jordan,
You can get the odds ratio for logistic regression, but I am not familiar with the use of the odds ratio for linear regression.
Charles
Hi
I installed the package 3 days back, everything was working fine. Suddenly there is some issue appearing MATRIX OPERATION – CORRELATION especialy when i am going to create matrix on new page. it is giving error as run time error and matrix is not created. However it is working if i select any particular cell in the same sheet
Hi,
I have just retested the Correlation option of the Matrix operation data analysis tool and it works fine.
If you send me an Excel file with your data, I will try to figure out what is going wrong.
Charles
Hi Charles
I wanted to reference this page, could you let me know what year this page was created
Thanks
Hi Salma,
The referenced page was created in 2013. Other pages on the website were created in 2014 and 2015.
Charles
Dear Sir,
I am busy with research for my PhD and have been asked by the University to test the objectivity of my research art effects. The idea is to make 6 short films and then have them tested by 4 peer reviewers. I believe Fleiss’ Kappa will be the best to do that, but I’m rather unsure how to set it up. I think that I will have the reviewers watch the films and then respond to a questionnaire with seven closed-ended, scaled check box type questions. I will appreciate your input on this.
Kind regards,
Willie Bouwer
Willie,
The following webpage describes how to perform Fleiss’ Kappa.
Fleiss’ Kappa
Charles
I’m no pro in statistics and am looking for a simple way to calculate the sample size when using a cohen’s kappa.
Denise
Denise,
It really depends on how you have stored your data. Generally in Excel you calculate the sample size using the COUNT function.
Charles
Hi Charles! I’d like to ask. What statistical technique should I use if I would like to check if there is a significant difference on the effect of the treatment and control group if I only have three trials but with ten specimens per trial.
It really depends on what you mean by specimen and trial, and how you conduct the experiment. E.g., this could be a two factor ANOVA with factor A having two levels: Treatment and Control and factor B having three levels, one for each trial. This would mean a sample of 60 specimens (2*3*10).
Charles
Hi,
I have Microsoft Excel for Mac ver 15.11.2 from 2015 and have installed the Real Statistics Add-On for Mac but am not able to find Frequency Table in my data analysis tools. Where should I be looking??
Karen,
First make sure that the Real Statistics add-in has been installed properly by inserting the formula =VER() in any cell. If you get the version number of the software, then that is a good sign.
You won’t find the Real Statistic data analysis tools combined with the standard Excel data analysis tool. Instead, you can access the Real Statistics data analysis tools by entering Control-m (i.e. control key + m) or clicking on the Real Statistics menu.
Charles
Charles,
Is there a way to remove the ctrl+m shortcut? I frequently use this shortcut for other macros and don’t mind navigating to the add-ins page to access the Real Statistics menu. Thanks for the great tool!
Annie,
I’ve been looking for a way for you to disable the Ctrl-m shortcut for Real Statistics. Unfortunately, I haven’t found a simple way for you to do this.
Instead, I will provide such a capability in the next release of the software.
Charles
Annie,
This capability has been added to Release 4.4, which was released earlier today. See Blog for details.
Charles
Your web page very useful appreciate it.
How should I site/refer your web page (I am using Harvard referencing style).
Thank you,
Based on the rules described on the wbepage http://is-uwa.v1.libguides.com/content.php?pid=43218&sid=328596, I suggest the following regarding Harvard referencing style:
Zaiontz, C. 2015 Real Statistics using Excel. Statistical Analysis. Available from: <https://real-statistics.com>. [1 July 2015].
The reference for a particular webpage is
Zaiontz, C. 2015 Title of webpage. Available from: <https://real-statistics.com/rest of URL>. [1 July 2015].
The reference for the software is
Real Statistics Resource Pack, computer software 2015. Available from: <https://real-statistics.com>. [1 July 2015].
Charles
Dear Sir
thank you very much for your support through your valuable web page. I have a doubt. i did multiple choice test with individual insect providing 4 choices and recorded the data on amount of time spent in each treatment and also number of entries in to each treatment. Now i wrote the paper, reviewers are asking me to do non parametric analysis. Will you please advice me on this. i did ANOVA, the comment is since it is violating ANOVA assumptions (independency of samples), i have to use non parametric tests etc. Sir, pl advice me which analysis i can do? thanks in advance.
If you performed one-way ANOVA with a fixed factor, then the usual nonparametric test is called Kruskal-Wallis.
If you violated the independence of samples assumption, you should simply use a different form of ANOVA, namely ANOVA with repeated measures. The usual nonparametric version of this test is called Friedman’s test.
Charles
Thank you very much for your reply. In that case should i use the data (amount of time spent in minutes, number of entries) directly or should i convert them in to ranks and then subject to Freidman test or Repeated measures of ANOVA. Pl advice!
Kamala,
If the data meets the assumption for ANOVA then use the data directly. If not Friedman’s test will convert the data to ranks.
Charles
thanks a lot Sir!
I will have to check this out. If not, then I will go with Excel. I realize that you cannot be all things to all programs.
Thanks.
Fred
I am re-learning statistics. My interest is real property valuation and analysis.
My question is, “Will your Excel add-ons work with LibreOffice spreadsheets?” LibreOffice is an open-source product that seems to be very compatible with the Microsoft Office programs (Word, Excel, and the like).
Thanks.
Fred Cull
Fred,
I have never used LibreOffice, but I doubt it will work since I don’t believe that LibreOffice supports VBA, which is the programming language that I used.
Charles
Dear Charles,
I just came across your website when I was searching for information regarding intra-rater reliability. I find your website very informative and useful.
Do you have any tools or recommendations on how to evaluate intra-rater reliability of One rater across a set of patients with multiple trials (around 6) done per patient?
Thanks a lot,
Praveen
Sorry, but I don’t know how to calculate intra-rater reliability with one rater and multiple trials. Perhaps someone else in the community has some suggestions.
Charles
Dear Charles,
Thank you for your prompt response.
Kind regards,
Theo
Charles,
My sincere congratulations for the amazing work you are doing. It has been extremely helpful for me. The tools are amazing and the tutorial very clear and didatic.
Thank you a lot!
Pedro
Dear Charles,
I can not thank you enough for putting this website and resource pack together. I work for an important international organization, where due to budget cuts, SPSS has unicorn like qualities. Being able to use these statistics in Excel has been a lifesaver for us and I have made it part of the induction kit for our department.
Yours truly,
A very greatly international civil servant!
Thank you so much for making a palatable statistics website. I am definitely sharing this with all of my coworkers.
Only issue with the software that I have encountered:
Calculating the sample size needed for a given power level in a 2 sample t-test.
All the fields are filled out properly and occasionally I get an “Overflow” error.
I am not professing to be an IT expert, so if this is an issue with my PC, then I will have it addressed by our internal IT folks.
Thank you again!
Elise
Elise,
Can you give me a specific example where you got the overflow message? For the 2 sample t test I need to know the effect size, power desired, # of tails, alpha, sample size ratio and sum count (if you changed it from the default of 40). With this information I can try to figure out what is causing the overflow and possibly fix it.
Charles
I tried posting this question, but it doesn’t seem to be showing up in the feed. My apologies if this ends up being a duplicate question. Anyway, I am having trouble with the Cluster Analysis data analysis tool. Specifically, I am unclear how to format my data, and I can’t seem to find any explanations on the website. Any chances there might be an explanation page added in the near future? Thanks!
Mike,
Sorry that you weren’t able to see your comment. This was because I lost my Internet connection this weekend and so hadn’t the opportunity to approve your comment for publication yet. Please see my response to your original comment.
Charles
I am trying to do K-Means cluster analyses using Real Statistics. I am unclear how to go about formatting entering my data, and I can’t seem to find anything on the website describing how to use the “Cluster Analysis” data analysis tool. In any case, thanks for creating this Add-In.
Mike,
The reason that you haven’t seen anything on the website about cluster analysis is that I haven’t yet issued the cluster analysis capabilities for general release. I plan to do this shortly (Release 3.9). The fact that some of the cluster analysis capabilities were included in the Release 3.7/3.8 software has indeed caused some confusion. Sorry about this; I plan to explain how to use the cluster analysis tools sometime next week.
Charles
Hello
i am trying to make a mathematical model by Linear regression model and at end i calculated R value(regression coefficient).but i don’t know it is meaningful or not. tell me how i can know that it is a good regression coefficient?
Ali,
R is a correlation coefficient, and so takes values between -1 and 1. The closer R is to 1 the better that the linear regression model fits the data.
Charles
Charles
I would like to ask a question regarding data analysis. I am working on some data having 5 to 6 independent variables and 3 dependent variables. The data is panel with large n and t=10. The whole data is divided into two groups depending on the presence or absence of one of the independent variable in individuals. This division has made the groups unbalanced. Can you please suggest which data analysis tool would be appropriate in judging out the efficiency of the two groups (which of the two groups is better) and the relationship between dependent variables and independent variables.
Sorry, but I don’t understand your question.
Charles
I was telling that I am working on panel data having large n and t is 10 years. There are 6 independent variables and 3 dependent variables. One of the independent variable is discrete dummy variable. Like for example, whether a committee is present is a company or not is the observations. The observations found out that for some years, it was present and for other years it was not present. This was observed for n number of companies. Number of presence and absence of the committee is not same for all companies. Then I divided the whole data into two groups to find out whether the presence or absence of committee has any effect on my dependent variable or not. This has made both groups unbalanced, having different t for n. Individually I can run regression on both groups. But I wanted to know how can I compare two groups. Is there some better method or technique to find out the results.
Nidhi,
I have read your comment several times, but I don’t understand it. For one thing I don’t know what panel data is.
Charles
I am confused. I submitted a question on goodness of fit a few days ago and it appears that the question was deleted with no response.
The question hasn’t been deleted, but I haven’t had time to respond yet.
Charles
By transforming the x values as (LN(x)^2.8) + 2.8
By transforming the y values as (LN(y))
By setting the target cell to the Pearson Product Moment=-.993
By using solver we can improve the Pearson to -.998148
By inventing two additional poker hands we can fill in the gaps
The result is a continuous poker rating equation that can be useful in a decision support process for players
Left out a step…
By setting the target cell to the square of the Pearson moment we eventually get a perfectly straight line with
R^2 = .999_
Dr. Charles,
This is a world-class web site. Even my associates in the Malcolm Baldrige community can’t argue with me.
Dr. Stephen Druley
Charles,
Thank you! I’ll have my tester re-download the software and try again. In his most recent email to me he mentioned something about labels – maybe not all of the slots are labeled in a way that Jaws can read? I’m not sure. I’ll ask him to contact you to explain, if the problem still exists with the latest version.
Megan
Charles,
Thank you for your work in re-doing the tab order for dialog boxes. I have some new accessibility info from my tester. He looked at the T-test and nonparametric equivalents dialog box. My tester said that, in order for Jaws to be able to read items in the dialog box to him, they have to be labeled. Everything that has a radio button or a check box is labeled (including “column headings included with data” check-box, “one-sample,” “two paired samples,” and “two independent samples” radio buttons, and “non-parametric,” “use ties correction,” and “include exact test” check boxes). Probably the “t-test” check box is also labeled but my tester didn’t mention it at all. These radio buttons & check boxes are grouped into themes that you and I can see (“options,” “non-parametric test options,” and “test type”) but there is no such group labeling that Jaws can see. Finally, all fill boxes (input range 1 box, input range 2 box, alpha box, hyp mean/median box, output range box) are not labeled – that is, he can tell that a fill box is there and can tell what is pre-filled-in, but does not know what the fill boxes are for. He wasn’t able to tell me what you would need to do to create labels for the fill boxes – he doesn’t do that kind of programming – but he said that Real Statistics was very close to being accessible to text readers. I would say that getting labels on fill boxes is most important, whereas getting group labeling is probably less critical. I assume that dialog boxes for other statistical tests have the same pattern of labeling, but I do not know for sure since my tester only looked at the T-test dialog box. Thanks again for your work in making it possible for folks with visual impairments to use Real Statistics!
Megan,
I am a bit confused. Check boxes like “one-sample” in some sense have a label, namely the words “one-sample”. What qualifies as a label? Inside the program all the fields have a label.
Charles
I’m a little confused, too, as I don’t do this kind of programming. I think the check-boxes and radio buttons all have labels that JAWS can read aloud. Somehow the fill boxes where you can type in cell ranges do not have any labels that JAWS can read. There must be something different about how you get Excel to put a fill box into your customized dialog box (as opposed to a check-box or radio button), and how you label them. Alternately, it’s possible JAWS has a glitch… I don’t know. As always, thanks for taking a look at this.
Megan,
The check-boxes, radio buttons and command buttons (e.g. the OK button) all have “captions”. These are what you are calling “labels”. The text boxes and refedit controls are the “fill boxes” and these don’t have captions. All of these “controls” are reachable using the Tab key.
In addition to these controls is a control called a “label”. All the fill boxes have a label control to their left. These label controls do have a caption. While currently these label controls are not reachable via the Tab key, I can change this so that they are reachable via the Tab key. In this case when JAWS reads this caption out loud the next tabbed control will be one of the fill boxes (in fact this fill box can be identified by the caption of the label control).
Another possible alternative approach is for JAWS to read the “control tip” instead of the caption (or at least read the control tip for a control without a caption). The fill boxes all have control tips.
Charles
Dear Charles,
Wouah impressed, yesterday I was looking desperately for some examples of multi reader multi cases Kappa statistics examples but not in SAS and not in SPSS software programing languages. And I found your fantastic website and the webpage:
https://real-statistics.com/reliability/fleiss-kappa/
Not a big deal but there are some wrong references in two formulas:
in case H9 for s.e.: it should be =B20*SQRT(SUMPRODUCT(B17:E17;(1-B17:E17))^2-SUMPRODUCT(B17:E17;1-B17:E17;1-2*B17:E17))/SUMPRODUCT(B17:E17;1-B17:E17)
in case B18 (it should be B19) for k1: it should be =1-SUMPRODUCT(B4:B15;$H$4-B4:B15)/($H$4*$H$5*($H$4-1)*B17*(1-B17))
Best regards
Cédric Marchessoux
Dear Cedric,
Thanks for catching this. I had changed the worksheet, but forgot to update the formulas in the table. I have now updated the webpage using the formulas that are actually used.
Charles
I will be teaching a statistics course in Spring 2015 and am looking for statistics software that is accessible for visually impaired folks who use text-reading software (such as Jaws). Jaws reads words available on an active window (such as options on menus) out loud for the user, and allows users to advance to and choose the desired option via keyboard shortcuts. I’ve been told that Excel is accessible but did not know if your tools were or not. Do you happen to know the answer to this?
I came across your web site in my search for possible alternatives to SPSS and SAS University Edition. We have found that SPSS is not fully accessible, and are in the process of testing SAS University Edition. It would be great to have an alternative to SAS, since I am teaching social work students who would find the task of learning the programming language an additional burden on top of learning statistics. Another advantage of your software is that students would still be able to use your Excel tools once they are graduated and do not have access to university computers. But of course your tools would have to be fully accessible for all students or I could not use them for the class. Thank you for your help.
Megan,
I don’t know whether my software is accessible using Jaws or other text-reading software. You can download and install my statistics software in a few minutes and test it out yourself. Since all the user-related mechanisms used in my software are standard Excel, there is good chance that it will work if Jaws works with Excel. Let me know whether it works and let me know whether I need to modify anything to get it to work.
Charles
Thanks, Charles.
I’ll have the tester at our school’s accessibility office test it out – it would be great if it would work! I’ll let you know.
Megan Petra
Charles,
Your software is almost (but not quite) accessible with Jaws. My understanding is that Jaws will read through options in a dialogue box, advancing from one thing to another via the tab key and choosing options via the space key.
Try tabbing through the software’s t-test dialogue box. You’ll notice that, when you first call up the t-test dialogue box, the cursor is automatically active in the box where you enter the cell range for the first variable. If you tab, however, it doesn’t take you to the entry box for the second variable. It eventually gets there, but you have to tab through the rest of the dialogue box first. If the tab-order were logical, however, I could tell my student exactly how to move through a dialogue box and ask for a t-test (or whatever statistical procedure I was teaching). Some of the dialogue boxes I checked had logical tab orders and some did not.
I think that is the only barrier to accessibility. I don’t know if changing the tabbing order in dialogue boxes is possible for you – but if it was, I think that might make Real Statistics accessible. Thanks for your consideration.
Megan Petra
Megan,
I would be happy to change the tab order. I have been fixing this of late, but apparently I have missed a few of the dialog boxes.
Can you help me by giving your opinion about the following:
1. If there are two input fields (with two Fill buttons) should the tab order be (a) Input 1, Fill 1, Input 2, Fill 2 or (b) Input 1, Input 2, Fill 1, Fill 2 or (c) Input 1, Input 2 with Fill 1 and Fill 2 much later in the tab order?
2. Should I generally use a tab order that goes from the top to the bottom? If so should I skip items (e.g. Alpha) that generally are not changed?
3. Should the OK, Cancel and Help buttons be the last items in the tab order?
Also please let me know of any dialog boxes that are in the correct order.
Charles
Megan,
I have revised the tab order on the dialog boxes in the latest release, Rel 3.2.2 (issued today). Please let me know if this order is acceptable.
Charles
Zaiontz: your tools that you are sharing with the world are so wonderful! thanks again. i feel ungrateful for even asking . . . but are you considering adding some clustering algorithms (k-means, &c) in the future? (that’d be so awesome!)
Good to hear that you appreciate the tools. I do plan to add clustering algorithms in the future.
Charles
Dear Charles,
I most confess that I am a huge fan of your website, the work you have done it’s simply amazing. Thank you so much for making your knowledge available to everyone.
Greeting from México.
Thank you very much Antonio,
Charles
Mr. Zaiontz,
You have saved me so much time and energy with your real statistics add on package! Thank you so much for such an excellent tool! I am very grateful!
Dear Charles,
Thanks for your response. I do get 2.16.2 when I check the version. However, when I launch the add-in either from the add-in menu or by ctrl-m I still get the error message.
Apparently, it is not possible to reinstall the add-in, since this requires a password.
Thanks for your advice,
Sam
Sam,
This seems strange to me. The add-in must be operational since you got 2.16.2. For some reason ctrl-m is generating an error message. I am using Excel 2010 on my computer without any problems, as well as many many other users.
You never need the password. This is only needed to modify the software and not to use the software. I suggest that you delete the software from your computer and start all over again. I plan to issue a new release of the software this week (Release 2.17). Hopefully the problem doesn’t reappear.
Another approach is to use the version of the software for Excel 2007, and see whether you get the same problem.
Charles
Charles
Dear Charles Zaiontz,
I have followed your instructions for installation but when I try to run your macor (ctrl-m) and choose an analysis, I get an #424 error message (under excel 2010) saying “object required”. When I tried to remove the macro, it requires a password.
Thanks for your help.
Sam
Sam,
It sounds like the Real Statistics Resource Pack was not installed properly. To find out for sure, enter the formula =VER() in any cell in an Excel worksheet. You should see the version number of the Real Statistics Resource Pack that you downloaded (e.g. 2.16.2). If you don’t see this, then the software is not recognized as an Excel add-in (which is what the installation instructions are intended to do). See the webpage https://real-statistics.com/free-download/real-statistics-resource-pack/ for more details. The key step is described in Figure 2 on that webpage.
Charles
Great work! Crisp and clear solutions to the example problems.
Do you have any plans to add support for regression trees or random forests ?
Kumar,
Thank you very much. I will add your ideas to the list of new potential enhancements. It is likely that these will be added at some future time. Stay tuned.
Charles
Dr. Charles Zaiontz,
Your website and resources have been very helpful. I have taken econometrics in college but I am having some trouble. I have a large data set of customers with when they ordered products as well as the amount they ordered. I used basic excel linear regression to predict the amount of an order the model has a R squared of 92%. I used the same excel linear model to predict time between orders with a R squared of 95%. I want to predict if someone will order in the next X days as well as the amount they will order. What statistical package/ method do I use for this?
Any help would be much appreciated.
John,
You can create two separate regression models using EXCEL or other tools. If you want a single regression model with two dependent variables then you probably want multivariate regression.
R provides this capability.
Charles
Hi Charles,
I’ve forgotten a question: are the new functions (WPROB, MPROB, SRankPROB, functions for power & sample size, etc…) documented in more detail (synthax) anywhere on the website?
Theo
Theo,
All the new functions are documented on the webpage https://real-statistics.com/excel-capabilities/supplemental-functions/. Additional details can be found in various other webpages. Soon I plan to modify the https://real-statistics.com/excel-capabilities/supplemental-functions/ webpage so that you can click on any supplemental function and get additional information about that function.
Charles
Re: Repeated measures Anova-
Hi Charles,
Can you tell me where you deal with the repeated measures Anova, for one intervention and one control group, pre and post-test?
The paired-sample t-test takes care of the pre-post, but how to take simultaneously effect of the control-intervention issue?The research question is whether the mean change in the dependent variable from pre to post is different.
If your package covers this case, can you kindly indicate how to input the data?
Many thanks!
Hi Leslie,
Repeated Measures Anova is explained at the following webpage: https://real-statistics.com/anova-repeated-measures/
To access the data analysis tool for repeated measures Anova (assuming the software is installed on your computer), press Ctrl-m and select Repeated Measures Anova on the dialog box that appears.
Charles
Hello Charles,
The last bugs I have reported (no reporting at all of critical values for non-parametric tests, problems when using either the Dunn/Sidak or Bonferroni correction) occur also with Windows7/Excel 2010.
I have installed the Resource Pack on 3 computers, all running with Windows7 64 bit OS and having Excel 2010 installed:
• One machine with 4 GB RAM runs under Windows 7 Home Premium
• A notebook with 4 GB RAM runs under Windows 7 Professional
• Finally, a workstation with 16 GB RAM running under Windows 7 Professional (my computer at work).
The problems reported occur on the first two machines while the Resource Pack runs properly on the third machine. Also the permissions on the three machines appear to be the same. The only striking difference is the RAM.
Kind regards,
Theo
Theo,
This is very strange. Since your original message I have tested the software on three computers all running Windows 7 with 4 GM RAM but with Excel 2007, 2010 and 2013. Only the last one had the problem.
I am going to make a small change in the next release (Rel 2.15) and see if it makes a difference. I expect to issue this release either tomorrow or Wednesday.
Charles
Theo,
I have tried to correct the Dunn/Sidak or Bonferroni problem in the latest release (Rel 2.15). Do you see any change?
Charles
Dear Charles,
I think I have found the reason for the bugs, but so far I have checked just for Excel 2010: as soon as on the Windows level the decimal separator is set to period (.) and the group separator to comma (,) the tools work well, even for the previous release. The problems occur only if in the OS settings this is the other way round, i.e., decimal separator as ” ,” and group seprator as ” .”. In this case the problems persist in a different way.
As soon as I’ll be back to work I will check it also for Excel 2013. I’ll send you a more detailed report of my test runs next week by email.
Kind regards,
Theo
Hello Charles,
I’m using periods (.) for decimal representations.
Also, comming back to reporting critical values (T-crit, U-crit) for the nonparametrical tests: with Excel2013 /Windows 8 these are not reported at all (not even for small sample sizes).
Best regards,
Theo
Thanks Theo for your response. Very strange indeed. I need to figure out what is going on. Charles
Hi again Charles,
On issues:
1- I found the package raeding this section…ctrl m.
2- On the Manova, I need this repeated measures Manova because I have the same subjects taking pre and post-tests. Have you developped this test?
I may also need a two-factor Manova because I have two factors, age (in months, these are children (5-6 years) and group (contrôle and test).
Thanks.
Leslie,
I have developed Anova repeated measures capabilities, including pre and post tests. Although some of these may be useful for Manova follow-up testing, I have not as yet developed specific Manova repeated measures or two factor Manova capabilities yet. I plan to add these in the future.
Charles
Hello Charles,
First of all, Thank you!! for all your work. It is great that you have decided to help us with statistics and have made available your add-in.
Now, the issues (sorry):
1- The add-in is checked but it does not show in Data.
2- I did a pre-post test and have to run a Revised measures anova (possibly manova), is it possible to do with Excel and your package? I have no local access to SPSS or other package.
Many thanks!
Hello Leslie,
Thanks for bringing up the two issues. I am pleased when people make suggestions or find errors. This only makes the website better. Regarding your two issues:
1. As you mentioned in your next comment, you can access the data analysis tools by pressing Ctrl-m. I would still like to make it available from one of the ribbons or quick access table, but so far I haven’t been able to figure out how to do it.
2. I am not sure what you mean by “Revised measures anova”. In any case The Real Statistics Resource Pack provides a Manova data analysis tool. You can get more information about Manove at https://real-statistics.com/multivariate-statistics/multivariate-analysis-of-variance-manova/.
Charles
Hi Charles,
I have observed that the boxplot option from the resource pack gives slightly different results compared to other tools from the internet, GraphPad or SPSS. More precisely it’s the lenght of the box (i.e., the IQR) which is different. Considering for example the data 45, 46, 49, 54, 55, 53, 47, 48, all other tools give an IQR of 7.5, whereas the resource pack gives 6.5. The reason is that the resource pack uses QUARTILE.INC instead of QUARTILE.EXC (I have verified this with EXCEL). QUARTILE.INC includes the minimum and the maximum values as the 0-th and the 4-th quartile, respectively, and so these values are taken out when calculating Q1 and Q3. Therefore, I consider QUARTILE.EXC more apropriate for IQR calculation.
All the best, Theo
Hi Theo,
It is interesting that you bring this issue up at this time. I am about to make the changes that you suggest. I plan to put these in the next release. Thanks for your suggestion.
Charles
Hi Charles,
Thanks a lot! You are really doing a fantastic job! With your Resource Pack Excel indeed becomes competitive to “professional” statistical packages. Many statistical tasks which can’t be done with EXCEL in a straightforward way (i.e., boxplots, but also many others) become available with your tools. This is extremely benificial, as EXCEL still is the most widely used data analysis tool in many companies and universities.
Thanks again and much success for the future!
Theo
Theo,
Thanks for your very kind remarks.
Charles
Hello Charles,
I have an additional question and a remark which possibly can help improve your tools. I am really very much interested in removing possible bugs, since I extensively use Excel for statistical purposes and when I found the Resource Pack on your website it was a real treasure trove.
The question: the 2-sample t-test for independent samples reports critical t-values, t-crit, for both the equal variance and the unequal variance case. However, only one value for the test statistic, t, namely that calculated for the unequal variance case, is reported. Wouldn’t it be more appropriate to report both t-values, as these are different for unequal sample sizes? For the following data (the numbers measuring some effect of a drug) I compared your tool with the corresponding standard Excel add-ins tool (confidence level 95%):
Drug A: 10.5, 7.8, 9.3, 8.7, 10.2, 8.9, 7.4, 9.3, 8.7, 7.9
Drug B: 11.2, 10.3, 9.2, 9.7, 8.9, 10.7, 9.9, 10.1
Resource Pack: test statistic t: 2.723;
Equal variance: t-crit (one): 1.746
Unequal Variance: t-crit (one): 1.753
Excel Add-in: Unequal variance: test statistic t: 2.723; t-crit (one): 1.746
Equal variance: test statistic t: 2.634; t-crit (one): 1.746
Since the test statistic for the equal variance case is different, I think it should be also reported.
A second slight differences between the standard Excel add-in and the Resource Pack is t-crit for the unequal variance case (1.746 versus 1.753). This difference results from the fact that within the Resource Pack the number of degrees of freedom used to calculate t-crit is obtained by truncating the value of the Welch–Satterthwaite equation [=T.INV(1-0.05;TRUNC(15.96))= 1.75305036], whereas the Excel Add-in does this by rounding [=T.INV(1-0.05;ROUND(15.96;0))= 1.74588368].
The remark, finally, concerns the one sample tests: the one sample t-test works well regardless whether the sample data are in a row or in a column, for the non-parametric alternative the sample data has to be in a column, otherwise the result is wrong (i.e., only one value from the sample data is considered resulting in count=1).
Kind regards & all the best
Theo
Hello Theo,
Thanks for persisting with this. You are right. I was convinced that I had calculated the t-stat correctly, but you are correct that there is a bug in the software. I will be correcting the software shortly. Thanks very much for catching these errors.
Charles
Charles:
I cannot thank you enough for creating this site and sharing your knowledge. This was so very helpful for me during a statistics course which required the use of excel.
Thanks so much,
Kris from the US
Charlie,
I have just added Cholesky Decomposition support to the Real Statistics Resource Pack. It is function CHOL in Release 2.13.
Charles
Charles
A stunning website, well done.
I have one suggestion for a minor addition: the Yates algorithm. It is so useful when analysing blocks.
regards
Ivor
Ivor,
Glad you like the website. I will look into the Yates algorithm.
Charles
Really helpful website..keep up the good work