Contact Us

Welcome to the Real Statistics Using Excel website. As described on the Home page, this website provides the resources (tutorials and statistics software) necessary to perform statistical analysis in the Excel environment.

In order to make this website more valuable to the user community, we welcome your comments, suggestions and feedback. We would especially like to hear about any errors in the website, examples or software. You can contact me at the addresses below or by leaving a comment on this page or on any of the webpages.

Charles Zaiontz

My email: czaiontz@gmail.com, info@real-statistics.com
My profile: click here 

1,088 thoughts on “Contact Us”

  1. Dear Sir,
    I have a doubt about the conclusions of a one-sample Wilcoxon test. After applying the test to my samples at a significance level of 5%, I obtain a p-value of 0.45 and a corresponding z-score of 0.123. If I'm not wrong, on the basis of the obtained p-value there is very little chance of rejecting the null hypothesis (kindly correct me if I'm wrong at any point). What other information can I conclude from these results?

    with regards

    Reply
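
    [Note: as a quick consistency check of the figures above, under the normal approximation used by the Wilcoxon test a z-score of 0.123 corresponds to a one-tailed p-value of about 0.45. In Excel:

        =1-NORM.S.DIST(0.123,TRUE)        returns about 0.451 (one-tailed)
        =2*(1-NORM.S.DIST(0.123,TRUE))    returns about 0.902 (two-tailed)

    Either way p > .05, so the null hypothesis cannot be rejected.]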
  2. Charles

    The table below compares the Abdi & Molin CVs with those produced by my own routine (see ANNEX).

    Abdi & Molin: 1E5 samples/size

    n      My CVs              samples/size   Abdi & Molin
           .01      .05                       .01      .05
    50     .1450    .1245      4E6            .1457    .1246
    100    .1036    .0889                     .1026    .0888
    150    .0850    .0729                     .0840    .0727
    200    .0738    .0633                     .0729    .0630
    250    .0662    .0567      1E6            .0652    .0564
    300    .0605    .0519      4E5            .0596    .0515
    400    .0525    .0450                     .0516    .0447
    500    .0471    .0404                     .0462    .0400

    Remember that it was proposed that

    f_n = (0.83 + n)/√n - 0.01

    CV(.05) = 0.895/f_n        CV(.01) = 1.035/f_n

    [1] Hervé Abdi, Paul Molin, Lilliefors/Van Soest's test of normality
    _____________
    The other confidence levels (20%, …) are completely useless; nobody uses them,
    and there is no guarantee in NHST.
    __________________________
    Conclusion

    Owing to an error in the expression that transforms z into a p-value, I wrongly assumed that the Abdi & Molin fitting expressions (never the values they evaluated by Monte Carlo) were strongly under-evaluated. After revising my routine I verified that the error was mine, as can be seen in the table above, where my CV values were recalculated.
    I hope my apologies will be accepted.

    Luis

    Reply
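
    [Note: the Abdi & Molin approximation quoted above is easy to apply in Excel. With the sample size n in cell A2 (cell reference illustrative):

        =0.895/((0.83+A2)/SQRT(A2)-0.01)    critical value at alpha = .05
        =1.035/((0.83+A2)/SQRT(A2)-0.01)    critical value at alpha = .01

    For n = 50 the first formula gives 0.1247, matching the tabled .1246.]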
  3. Charles

    Sure, it would be fine . . .
    What do you propose people should do in this instance?
    I think more people are needed to break the deadlock.
    Do you think so, Charles?

    Luis

    Reply
  4. Charles

    That's the point . . .
    How can I be judge and jury in my own cause?
    That was precisely the reason I invited you to recalculate the critical
    values, so that you could be sure who is right, Abdi or me.
    Never mind; remember "Nemo iudex in causa sua".
    Science always depends on cross-validation; without it, science crashes . . .

    Luis

    Reply
  5. This is a private e-mail

    Charles

    How can I reconcile the Abdi & Molin
    D(1000, .95) = 0.0283 with my evaluation using 400,000 samples/size?
    n = 250: 0.0722
    500: 0.0727
    750: 0.0737
    1000: 0.0742

    One of us is dead wrong, Abdi or me, for sure.
    Do you agree?

    Luis

    Reply
  6. Hi, Charles

    Monte Carlo Method (MCM) and me

    Personal computers (PCs) have now been available to everyone for more than four decades, and from the start I was concerned with their usefulness in statistical matters. Because it aims to control chance-dependent procedures and also to internalize randomness educationally, the MCM is of undisputable value in practical and learning issues. Here is an example.
    We know from theory that the extreme values, a matter of importance in several branches, such as climate change, are, for the Uniform(0,1) distribution:

    P(min ≤ x) = 1 - (1 - x)^n
    P(max > x) = 1 - x^n

    The results follow for sample sizes 20, 30, 50, with 1 million samples per size:

    n = 20 (1,000,000 samples)
    x        obs P(min ≤ x)   obs P(max > 1-x)   theory
    0.005    0.0953           0.0952             0.0954
    0.050    0.6416           0.6412             0.6415
    0.100    0.8788           0.8786             0.8784
    0.200    0.9884           0.9883             0.9885
    0.300    0.9992           0.9992             0.9992

    n = 30 (1,000,000 samples)
    x        obs P(min ≤ x)   obs P(max > 1-x)   theory
    0.001    0.0296           0.0296             0.0296
    0.005    0.1397           0.1397             0.1396
    0.050    0.7854           0.7854             0.7854
    0.100    0.9574           0.9578             0.9576
    0.200    0.9987           0.9988             0.9988

    n = 50 (1,000,000 samples)
    x        obs P(min ≤ x)   obs P(max > 1-x)   theory
    0.001    0.0488           0.0488             0.0488
    0.005    0.2217           0.2217             0.2217
    0.050    0.9231           0.9229             0.9231
    0.100    0.9948           0.9950             0.9948
    0.200    1.0000           1.0000             1.0000

    Therefore, we reach a two-fold conclusion:
    the theory is correct, as everybody knows,
    and my RNG is of quite acceptable quality, which was of course the true intent.

    Luis

    Reply
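
    [Note: a minimal VBA sketch of the extreme-value check described above, using Excel's built-in Rnd as the uniform generator; the constants are illustrative:

        ' Estimate P(min <= x) and P(max > 1-x) for samples of size n
        ' drawn from Uniform(0,1), and print the theoretical value.
        Sub ExtremeValueCheck()
            Const n As Long = 20, reps As Long = 100000
            Const x As Double = 0.05
            Dim r As Long, i As Long, u As Double, mn As Double, mx As Double
            Dim hitMin As Long, hitMax As Long
            Randomize
            For r = 1 To reps
                mn = 1: mx = 0
                For i = 1 To n
                    u = Rnd
                    If u < mn Then mn = u
                    If u > mx Then mx = u
                Next i
                If mn <= x Then hitMin = hitMin + 1
                If mx > 1 - x Then hitMax = hitMax + 1
            Next r
            Debug.Print "observed:", hitMin / reps, hitMax / reps
            Debug.Print "theory:", 1 - (1 - x) ^ n
        End Sub

    With reps = 1,000,000 this reproduces the n = 20, x = 0.05 row of the table above.]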
  7. Hi Charles
    I obtained the quantiles of the Geary test (goodness of fit) for
    normal data.
    Is it possible to display the following table, in order to get feedback from
    the readers to confirm that the values are acceptable for practical purposes?
    1E7 simulated samples for each sample size.
    Thank you.

    n      0.80   0.85   0.90   0.95   0.98   0.99

    20 .847 .855 .864 .878 .892 .901
    25 .841 .848 .857 .869 .883 .891
    30 .837 .843 .852 .863 .876 .884
    35 .834 .840 .847 .858 .870 .878
    40 .831 .837 .844 .854 .866 .873
    45 .829 .834 .841 .851 .862 .869
    50 .827 .832 .839 .848 .859 .865
    55 .826 .831 .837 .846 .856 .862
    60 .824 .829 .835 .844 .853 .860
    65 .823 .828 .834 .842 .851 .857
    70 .822 .827 .832 .840 .849 .855
    75 .821 .826 .831 .839 .848 .853
    80 .820 .825 .830 .838 .846 .852
    90 .819 .823 .828 .835 .843 .849
    100 .818 .822 .826 .833 .841 .846
    120 .816 .819 .824 .830 .837 .842
    140 .814 .818 .822 .828 .834 .839
    160 .813 .816 .820 .826 .832 .836
    180 .812 .815 .819 .824 .830 .834
    200 .812 .814 .818 .823 .828 .832

    Reply
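
    [Note: assuming the statistic tabled above is the classic Geary ratio of mean absolute deviation to population standard deviation, it can be computed for data in A1:A20 (range illustrative) as

        =AVEDEV(A1:A20)/STDEV.P(A1:A20)

    For normal data this ratio tends to SQRT(2/PI()), about 0.7979, as n grows, which is consistent with the tabled quantiles decreasing toward that value.]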
  8. Hi Charles,

    First of all, I would like to thank you for this wonderful website and the code. I started learning econometrics and your help is invaluable.

    I have several case studies in the form of panel data. I was wondering if we can apply your logistic regression function to panel data?

    Thanks for your reply.

    Arnaud

    Reply
    • Arnaud,
      I am very pleased that you value the website and software.
      I have not yet delved into panel data on the website and software. I hope to get to that in the future. The logistic regression capabilities described on the website and implemented in the software are only the beginnings of the approaches used with panel data.
      Stay tuned, more to come.
      Charles

      Reply
  9. Good morning Prof,
    I am Jude. I want to do my thesis on a generalized approach to the analysis of variance. Please, sir, can you direct me on how to commence? Thanks

    Reply
  10. Hello,
    Would you please help me figure out the appropriate test. I have results from satisfaction surveys for 2 different companies. 40 surveys per company. I want to find out if the satisfaction is the same or different. Would I run a two sample t test with equal variances? Thanks for any guidance you can give me.

    Reply
    • Lu,
      You can run a two sample t test. If the variances of the two samples are relatively similar you can use the equal variances version of the test; otherwise you should use the unequal variances version of the test. You should also make sure that the assumptions for the t test are met; otherwise you might use a nonparametric test such as Mann-Whitney. See t Test for details.
      Charles

      Reply
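
      [Note: both versions of the test mentioned above are available directly in Excel. For samples in A1:A40 and B1:B40 (ranges illustrative):

          =T.TEST(A1:A40,B1:B40,2,2)    two-tailed p-value, equal variances
          =T.TEST(A1:A40,B1:B40,2,3)    two-tailed p-value, unequal variances

      The Real Statistics add-in provides the same tests with fuller output.]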
  11. Dr. Zaiontz,

    Thank you. Your website is wonderful, as well as the Excel add-in. I believe I have correctly run a logistic regression on my relatively small data set (n ~140). One input, one output, pretest ~ course success (pass/fail).

    My question is: Can you recommend how I can find someone to double check my interpretation of the output? I did read your website and posted a question, and if I interpreted your data set correctly, then I think my interpretation is correct.

    Are there statisticians who will check my work affordably?

    Reply
  12. Dear Dr. Zaiontz,

    I just wanted to say thank you for sharing so much of your knowledge on your website. I am a math/stat major, and I have repeatedly returned to your website to get fuller, more intuitive explanations of the concepts in my stat textbooks, which usually throw formulas at you without much explanation of their derivation.

    A thousand thanks, Dr. Zaiontz!

    Best regards,
    James

    Reply
  13. Dear Charles,
    I am carrying out a research and want to analyse the effect of some meteorological parameters (temperature, relative humidity, evaporation, wind speed..) on river flow.
    Can you kindly assist me and suggest the best statistical method for such an analysis.

    Thank you

    Reply
  14. Hi again Charles,

    Speaking of eVECTORS: the output of this array fn is a range of values, the first one being the eigenvalue and the rest being the coordinates of the (unit) eigenvector. Each time I use this function, I notice that the eigenvectors are listed in descending order with respect to the value of their corresponding eigenvalues. Is that a built-in feature of eVECTORS, such that I can trust that the output always will be organised that way (ie the highest eigenvalue first)? If so, that would simplify my calculations considerably.

    Reply
    • Hi Dan,

      The third parameter of the eVECTORS function describes the order in which the eigenvalues are listed.

      If order is TRUE or omitted then the eigenvalues are listed in order from highest in absolute value to smallest. If order is FALSE then they are listed in order from highest to lowest.

      Charles

      Reply
    • Hi Charles;

      I'm getting an error message when trying to send you an email through your published addresses. I'm in need of guidance and hoping you can help. I have a sample of passengers boarding a vehicle and how far they traveled. The samples were collected every 6th day for a year. I need to be able to prove (or disprove) that these samples meet a minimum 95% confidence level and a minimum precision of +/- 10%. How would I go about proving this? There are so many tests, I get confused by them. Thank you for your help!

      Reply
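
      [Note on the precision question above: one common reading of "95% confidence, +/- 10% precision" is that the half-width of the 95% confidence interval for the mean should be at most 10% of the mean. With the boarding counts in A2:A62 (range illustrative), the relative precision is

          =CONFIDENCE.T(0.05,STDEV.S(A2:A62),COUNT(A2:A62))/AVERAGE(A2:A62)

      and the requirement is that this value be at most 0.10. This is a sketch of one standard approach, not the only possible reading.]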
  15. Bringing to light towards dismantling it, and it is our duty to do so…

    Douglas H. Johnson
    The Insignificance of Statistical Significance Testing (1999)
    http://digitalcommons.unl.edu/usgsnpwTC/225

    1. Are Null Hypothesis Really True?

    “Most null hypotheses tested, however, state that some parameter equals zero, or some set of parameters are all equal. These hypotheses, called point null hypotheses, are almost invariably known to be false before any data are collected (Berkson 1938, Savage 1957, Johnson 1995)” [citation ended]

    My comment
    The parameter could very well be equal to zero, and yet almost every time the observed value issued from the data is different from zero: it is randomness acting. The author is wrong; in fact we are completely unable, by observation or experiment, to state the true value of any parameter whatsoever. Using calculated confidence intervals for increasing sample sizes, we obtain progressively narrower intervals until a sufficiently low imprecision for practical purposes is reached, in the case of data homogeneity, and the test is strictly exact. On the contrary, with approximate test statistics it could very well happen that this final interval does not contain the true parameter value.

    Note: I do not care at all whether the null is true or untrue. My intent is to choose judiciously between it and the alternative. From the test value I can discard H0 with sufficient confidence, or alternatively fail to do so when the p-value is larger than alpha, the value I chose for the risk of wrongly rejecting a really true parameter value.

    Luis A. Afonso

    Reply
  16. Hi Charles,

    Thank you for these excel-lent (pun half-intended) stats tools. They work fine for me under MacOS 10 Yosemite apart from one major snag: execution of the functions takes a very long time! Like half a minute or so. I seem to remember having similar problems with my own VBA macros in Excel until I put these code lines in at the start of every function:

    Application.EnableEvents = False
    Application.Calculation = xlCalculationManual

    and then ending with reversing them:

    Application.Calculation = xlCalculationAutomatic
    Application.EnableEvents = True

    Is there any way I can make the execution work faster without putting these lines into all the macros of RealStats? I guess there is something in the prefs of my Excel that brings this problem about, and that others don't experience it. I would be very grateful for your comments. (Btw, the RealStats VBA code is password-protected – do you distribute the password?)

    Reply
    • Dan B,
      You won’t be able to make these changes yourself. I don’t distribute the password.
      My experience in the past was that the execution was relatively fast. Which function was this slow?
      Charles

      Reply
      • Hi Charles,

        Thanks for the prompt reply. Since I posted my question, I have experimented with Excel preferences, and it works like this: if “Calculation” prefs are set to “Automatic”, then execution of the RealStats function “EVECTORS()” takes like 30 s. However, if I change the Excel Calculation prefs to “Manual”, I have to prompt the execution of all calculations using “Calc Now” or “Calc Sheet”, but the actual calculation is quite fast. I don’t know why this happens specifically to me (and presumably not others, since I imagine that most people have their prefs set to automatic calculation).

        Reply
        • Dan,
          Strange indeed. Since eVECTORS uses an iterative technique it could take some time to calculate for a large matrix, but in my experience to date, it is still pretty fast.
          Charles

          Reply
          • I don’t think it is specific to evectors, or even to RealStats. As I said, my own VBA macros behaved the same way, but there I could remedy it by putting in the code lines I mentioned earlier. So it’s probably not really a RealStats issue. I rather hoped you would know the solution. I guess I will have to search for the answer somewhere else or make do with manual calc.

  17. Good morning Charles,
    I am hoping that you might be able to help me out. I have a Mac and successfully managed to download the add-in and get it to run. However, when I try to run my analysis, irrespective of whether it is a t-test or a repeated-measures ANOVA, I keep getting the same error message: "compile error in hidden module: Analysis". I have checked all of the fields and believe I have entered and selected everything correctly, but cannot seem to get rid of this error. Any assistance would be very much appreciated!

    Reply
    • I do not care, at all, whether the null hypothesis is true or not: NHST
      measures the extent to which H0 is unlikely. If p < alpha we reject it;
      if not, we fail to reject. Never say that you accept H0. In effect, we are
      completely unable to state such a thing.

      Reply
  18. Gliner, Leech and Morgan [1] state, as the major problem with NHST:

    "Kirk (1996) [2] went on to explain that this was a trivial exercise because the null hypothesis is always false, and rejecting it is merely a matter of having enough power."

    Our comment
    The null hypothesis concerning a parameter is stated as H0: p = p0 against Ha: p = p1, where p0 and p1 are real numbers, even though the exact value p0 is practically impossible to attain: I let out a loud laugh. When the real matter is to reject or fail to reject the null, I do not care at all about this imprecision, because with sufficient power (sample size) it is always possible, given alpha, to decide between the two hypotheses, thus performing a rational choice. In fact the confidence interval (CI), once found, is decisive: if the test statistic is inside, we fail to reject H0; if outside, we reject it. We know that a Type I error can occur, i.e. H0 is true but we reject it, or a Type II error, where we fail to reject even though H0 is untrue.

    Luis

    [1] Jeffrey A. Gliner, Nancy L. Leech, George A. Morgan, "Problems With Null Hypothesis Significance Testing (NHST): What Do the Textbooks Say?", The Journal of Experimental Education, 2002, 71(1), 83-92.
    [2] R. E. Kirk, "Practical significance: A concept whose time has come", Educational and Psychological Measurement, 1996, 56, 746-759.

    Reply
  19. Charles

    I am fully acquainted, as you know, with the set of people at war with NHST. However, items like the one I point out here border on the ridiculous.
    Such as [1]:
    "Because the proposed null hypothesis H0: p = p0 is practically impossible to attain, the null hypothesis is always false." "If there is no expectation regarding the possible truth of the null hypothesis, its falsification by data is a redundant nonsense." "If the probability of a point hypothesis is indeterminate, the empirical discovery that such a hypothesis is false is no discovery at all, and thus adds nothing to what is already known."

    No doubt a pearl of great price . . .

    The "author" confuses "point estimation" with the practical impossibility of attaining certainty in the real, probabilistic world.

    No matter . . . I guess that the fathers of NHST would be terrified to find such argumentation.

    I am just starting not to worry about it . . .

    [1] Joachim Krueger, "Null Hypothesis Significance Testing: On the Survival of a Flawed Method", American Psychologist, January 2001, Vol. 56, No. 1, 16-26.

    Luis

    Reply
  20. Thank you for being so helpful to our study, Charles.

    When I used QCRIT(x,y,z) in the tools, a compile error on lookup occurred.
    On Excel 2010 version 14.0.7166.5000 (the old one of my PCs) there was no problem, but on version 14.0.7166.5000 (the latest of my PCs, the one used for my study) it caused a compatibility problem.
    How can I fix the error on the latest Excel?
    Or, if you can, could you build a compatible latest edition?

    Thank you for reading.

    Reply
    • Shiggy,
      The two Excel version numbers that you provided are identical.
      Which release of Real Statistics are you using?
      What are the values of x, y and z?
      Charles

      Reply
      • Thank you for your reply, Charles
        Sorry, the Excel version numbers are 14.0.7172.5000 and 14.0.7166.5000.
        For Excel version 14.0.7172.5000 I use the Real Statistics release that I downloaded in June 2016, and for Excel version 14.0.7166.5000 I use an old one that I downloaded in November 2014.
        The values are x: 6, y: 12, z: 0.05, for example. I get the same error with the Real Statistics release downloaded in June 2016 and Excel version 14.0.7172.5000.
        Shiggy

        Reply
        • Shiggy,
          I don't know why you are having this problem. I used the formula =QCRIT(6,12,0.05) in the latest release (Rel 4.9 of July 19) and get the answer 4.75. What do you see when you use the formula =VER()?
          Charles

          Reply
          • Charles
            I tried the old version of the add-in (3.2.2), which I used for a past study, with Excel version 14.0.7172.5000.
            In this case, I get the correct value 4.75 from the formula =QCRIT(6,12,0.05).
            However, the latest version (4.9) seems to fail to load in Excel.
            So, for now, I will use version 3.2.2.
            Because this problem may be specific to the software configuration of my PC, I will ask an engineer at our university about it.
            Thank you so much, Charles.
            Shiggy

  21. Charles

    Would you be so kind as to send a comment, please?
    (This is only because some psychologists, such as J. Cohen and similar fauna, insist on sweeping NHST out of the literature, which, IMO, is too radical . . .)

    Thank you, so much

    Luis

    A reasonable meaning of H0: p=p0

    a) In a real, probabilistic world, any parameter, when estimated through experimental or observational data, is by its proper nature error-carrying, even if the estimator is unbiased.

    b) Remember that, even so, the aim is to get sufficient evidence to falsify the null hypothesis.

    c) Read algebraically as an equation, H0: p = p0 leads directly to the absurdity that there is no room for a difference even in the last decimal unit, and consequently the null hypothesis has no chance of being true.

    What is really intended is to find out whether the result is significant, i.e., whether p is sufficiently far from p0 that, given the observed data, we can state that the opposite, complementary hypothesis Ha: p ≠ p0 is true. In short, I think that H0: p = p0 must be understood as indicating an approximate equality, not statistically discernible given the data. The same holds for two parameter values, H0: p1 = p2.

    Reply
    • Luis,
      A lot of people have criticized the null hypothesis approach to statistical analysis, and with good reason. Other approaches have been proposed and are used (e.g. Bayesian approach), but the null hypothesis approach is still the one that is most commonly used and so I have adopted this approach for the website.
      Charles

      Reply
  22. Hello,

    I am analyzing wind speed data. I want to find the Weibull distribution of the wind data. I have hourly averaged wind speed data for one year. I drew a graph of speed vs percentage of time. How can I find the alpha and beta values of the Weibull distribution function? Thank you.

    Reply
    • Chaminda,
      If you know that the data follows a Weibull distribution, you can find the alpha and beta values that produce the minimum squared error (MSE) using Solver. This approach is illustrated in a number of places on the website: Using Solver for Logistic Regression, Using Solver for Exponential Regression and Using Solver for Time Series Analysis.
      Charles

      Reply
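
      [Note: a minimal worksheet sketch of the Solver approach described above, assuming observed speeds in A2:A101, cumulative proportions of time in B2:B101, and trial values for alpha and beta in F1 and F2 (layout illustrative):

          C2: =WEIBULL.DIST(A2,$F$1,$F$2,TRUE)    fitted CDF (copy down)
          D2: =(B2-C2)^2                          squared error (copy down)
          F4: =SUM(D2:D101)                       total squared error

      Then run Solver to minimize F4 by changing F1:F2, with both constrained to be positive.]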
  23. Hi, Charles.

    Thanks again for this great tool.

    Recently, when I open a new spreadsheet (also older files I’ve created using this Addin), the resource isn’t available because the macros have been disabled.

    Is this something I did or an update I'm missing? I never used to have this problem. Do you have a fix? Like I said, I have older files that make use of the resource in a template that I hope to automate.

    Thanks so much for your help.

    Michael

    Reply
    • Michael,
      I am quite pleased that you like the tool.
      I don’t know any reason why the macros would be disabled (assuming that you didn’t disable them).
      If you have updated the Real Statistics software or moved the location of the software, then perhaps you need to tell an older file where to locate the add-in. You can do this exactly as described for the Real Statistics examples workbooks on the webpage:
      https://real-statistics.com/free-download/examples-installation/
      Charles

      Reply
  24. I liked the site for the clarity, rigor and explanation of the process.
    This is an excellent site for learning.

    I wish there were examples showing potential pitfalls of using an “apparently obvious” method and/or hints in each chapter indicating, “What to Watch For”.

    Reply
  25. Dear Charles,
    I’d like to convey many thanks, once again, and many times more, if need be, for all your kind help.

    Unfortunately, following brief use of Real Stats in Win 8.1, French, Excel 2007, the Data section would no longer open, and would instead CRASH Excel….

    I can no longer install the add-in…. (be it 2003 or 2007 or some other one), in EITHER of the two files mentioned in other parts of this blog.

    Ἔῤῥωσθε! == keep well, in good health (take care, etc…)
    Georgios

    Reply
  26. Mr. Zaiontz,
    I would appreciate it if you can find time to answer my question. I'm analyzing a big number of point sets (I would call them attributes). Each of these attributes is in XYZ format. I'm doing this to measure the evolution of the data, so I want to analyse it through the flow where the data are changed. So I'm generating these attributes (XYZ, 90 attributes each time) for several stages during the changing of the data.

    For each attribute I have statistics summarized with the following parameters: minimum, maximum, median, mean, standard deviation, skewness, kurtosis. I would like to create a pdf function in Excel based on these parameters as input. Although I have the data points themselves, I don't want to load them into Excel (it would be too much data); I want to use these extracted statistics to plot PDFs. Do you think this is possible? Could you advise a workflow for this?

    Reply
    • Ilya,
      Sorry, but I don’t know how you would be able to create a pdf based on the minimum, maximum, median, mean, standard deviation, skewness and kurtosis. This would give you the first four moments of the distribution, but not the moments after that. I would imagine you could make an estimate of a possible pdf, but I’m not sure how you would do that. Perhaps some other reader has an idea for how to do this.
      Charles

      Reply
  27. Hi Charles, loving your website by the way.
    I'm working on some time series data and using the Time Series Testing tool; my question regards the output from the ADF test. The answer is most probably written somewhere, but I can't find it…. under stationary (yes/no), aic, bic, there is a lags box. My data at diff 1 (with trend) comes back as stationary with lags 6. Does this mean the data is stationary after 6 lags? I.e. do I need to difference the original data 6 times before I can use it for other tests and comparisons?

    Thanks in advance, I think I’m having a dumb day!

    Reply
  28. Charles – great product that I use more and more – thanks so much for making and maintaining such a great product! I recently upgraded from Excel 2013 to Excel 2016, and now am having trouble making ‘Add-Ins’ appear on the ribbon. (Real Statistics is installed as an add-in and I can access it via Ctrl-m)

    It’s not a big deal (since I can always use Ctrl-m), but I was wondering if this is just an issue with Excel 2016 or if I am not doing something right.

    Thanks again,

    Alvin

    Reply
    • Alvin,
      Sometimes the problem is the order in which the various add-ins need to be installed. I think the Analysis ToolPak needs to be installed first, then Solver and then Real Statistics. This may indeed be irrelevant to the problem you are having, but sometimes Excel is sensitive to these sorts of things.
      Charles

      Reply
  29. I was looking for a non-parametric test equivalent to two-way ANOVA and thanks to your website I found how to conduct the Scheirer-Ray-Hare test.
    So first, thanks a lot!!!
    Second, I couldn't find much information about this test, but I did find a comment saying it was not very robust… but what would be the alternatives, GLMs? I bet there are many cases out there where data with two factors do not follow normality, so I'm quite surprised not to find a clear pathway.
    And third, if this test is perfectly valid, could I use any post-hoc test to check where the differences within factors are (e.g. my two factors are location and month, and the interaction is significant, so I want to see in which months that happens)? Any suggestions?
    Thanks a lot in advance!

    Reply
    • Just an additional question I had doubts about… If we have missing data, I see it doesn't work leaving the cell blank or with NA; it only works introducing "0". Is it considering that a value?
      Thanks again

      Reply
    • Ara,
      It is true that the Scheirer-Ray-Hare test is not very robust. You can use the various post-hoc tests, provided their assumptions are met.
      Charles

      Reply
  30. Hi, I just downloaded this add-in today and tried to use the one sample t-test (as described on your website). However, every time I enter the data range and the correct settings, I get this response when I click OK: "A run time error has occurred. The analysis tool will be aborted. Unable to set the HorizontalAlignment property of the Range class".

    What does this mean and how do I fix it so I can use the one sample t-test function?

    Thanks!

    Reply
    • Jane,
      If you send me an Excel file with your data I will try to figure out what is going wrong. You can get my email address from the Contact Us webpage.
      Charles

      Reply
  31. Dear Sir,

    Currently I'm working on a school project on forecasting the employee turnover of an organisation. To do this, I'm using time series data of the employee turnover over the past 7 years. It is annual data from 2009 to 2015. To make a good prediction it is also important to add some explanatory variables.

    I want to use multiple regression with ARMA errors. But I'm not sure how to use these errors to make a prediction. Do I need to make a prediction of each of the explanatory variables on its own first, then add these predicted values to the model to predict the employee turnover?
    Or will the model do all the predictions of the dependent and independent variables and give me a predicted value of the dependent variable (turnover)?
    Or do you recommend using another model to do the forecasting with all the explanatory variables?

    I am really having a hard time putting all the variables into a model that gives me a good prediction. Also, the internal data has only a few observations, whereas the external data has more history. How can I combine these two datasets in a multiple regression?

    Is it statistically correct to use only 7 observations to do forecasting?

    Can you please help me out, sir.

    Thanks very much in advance

    Reply
    • SB,
      At present I only support time series for one variable. See Time Series Analysis.
      I will eventually add multivariate time series capabilities.
      Regarding your last question, you could make a forecast based on limited data; of course, the confidence interval of the prediction will be larger.
      Charles

      Reply
  32. hi doc,
    I must say I am inspired and helped by the write-up on statistics. I may say I am unfamiliar with statistics and am doing research which involves statistics. Yes, I have used simple Excel; now I am finding it difficult to analyze the p-values and do a scientific analysis. I have sent you my results of the p-values by email. Thanks in anticipation.

    Reply
  33. Hi Charles,

    Appreciate your contributions to the statistics community.

    I am trying to regress roughly 150 dependent variables (150 mutual funds' time series returns) against roughly 8 independent variables (the time series of various factors, like the return from simply buying cheap stocks, or stocks that have gone up recently, etc.). Thoughts on ways to do this without drinking heavily?

    Reply
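
    [Note: one way to run the 150 regressions above in a single pass is a short VBA loop over Excel's built-in LINEST. The sheet names and ranges here are illustrative: 120 months of factor returns in Data!B2:I121 (8 columns) and fund returns starting in column K, one fund per column:

        ' Regress each of the 150 fund-return columns on the same 8 factor
        ' columns; LINEST returns the slopes (in reverse column order)
        ' followed by the intercept, i.e. 9 coefficients per fund.
        Sub BatchRegressions()
            Dim X As Range, Y As Range, j As Long, coefs As Variant
            Set X = Worksheets("Data").Range("B2:I121")
            For j = 0 To 149
                Set Y = Worksheets("Data").Range("K2:K121").Offset(0, j)
                coefs = Application.WorksheetFunction.LinEst(Y, X)
                Worksheets("Results").Cells(j + 2, 1).Resize(1, 9).Value = coefs
            Next j
        End Sub

    No heavy drinking required, though a Results sheet must exist beforehand.]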
  34. I don't have the ability to interpolate on my Excel (Mac version) and have the following values: my W is 0.984, n = 30, and my p-value is between 0.9 (.983) and 0.95 (.985). I don't have the interpolating feature in Excel, and the equation you had on that page left me a little confused as to what went where to calculate it correctly. Some help would be much appreciated 🙂
    Thanks!

    Reply
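
    [Note on the interpolation question above: with W = .984 lying between the tabled values .983 (p = .90) and .985 (p = .95), linear interpolation gives

        p ≈ 0.90 + (0.984 - 0.983)/(0.985 - 0.983)*(0.95 - 0.90) = 0.925

    or, as an Excel formula, =0.9+(0.984-0.983)/(0.985-0.983)*(0.95-0.9), which returns 0.925.]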
  35. Hi,
    I am doing research about GPAs at my school. In my data sheet I have 50 samples; under each label I have fewer than 30 samples. I tried to use "t-test: Two-Sample Assuming Unequal Variances" because I don't have the exact variances, but I also know that I cannot use t-test if I have more than 30 samples. Since I don't know the population variances I cannot use the z-test either. Which test should I use in this case?
    Thank you in advance.

    Reply
    • Nur,

      Are you saying that you have 50 samples and each sample has fewer than 30 elements?

      The following statement that you made “I also know that I cannot use t-test if I have more than 30 samples” is not true. You can use the t-test for samples with more than 30 elements as well as for samples with fewer than 30 elements. You also don’t need to know the variances to use this test. I suggest that you look at the following webpage, especially the subpages about hypothesis testing

      t Distribution

      Charles

      Reply
  36. Hi, I want to say thank you for the add-in functions you have provided for us!
    However, today when I used the function nchisq_dist(9629.82651550155,0.06365952,9414.72191791204,TRUE), it returned 0, which seems incorrect. I input the same numbers in MATLAB to verify, and MATLAB returns 0.8659. Since I don't have the password for the add-in, could you please help me out?

    Reply
    • Grant,
      Yes, there are limitations to the use of the NCHISQ_DIST function that I have provided. In Excel, CHISQ.DIST generates an error value when df < 1. Since NCHISQ_DIST uses the Excel function CHISQ.DIST, it is not surprising that it does not return the correct value when df < 1. There are also limitations when x or lambda are large. I will try to explain these limitations on the website. Thanks for identifying this problem.
      Charles

      Reply
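
      [Note: for values as large as those quoted above, a first-order normal approximation to the noncentral chi-square, with mean df + lambda and variance 2(df + 2*lambda), is a workable stopgap:

          =NORM.DIST(9629.82651550155,0.06365952+9414.72191791204,SQRT(2*(0.06365952+2*9414.72191791204)),TRUE)

      This returns about 0.866, close to the MATLAB value 0.8659.]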
  37. Hello Charles!
    I'm trying to compare two population proportions to see if there is a significant difference between them. I'm reading everywhere that I have to use the z-score to do so, but I can't seem to find any good information on the quickest way to do this in Excel. My goal would be to have a p-value in the end, telling me whether the difference between the two population proportions is significant.
    Could you help me with this? Is there a page on your site on which this is explained?
    Thank you so much! Your website is very helpful!
    Isabel

    Reply
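
    [Note: the two-proportion z-test asked about above needs only ordinary worksheet formulas. With successes x1, x2 in B1, B2 and sample sizes n1, n2 in B3, B4 (layout illustrative):

        B6 (pooled proportion):  =(B1+B2)/(B3+B4)
        B7 (z statistic):        =(B1/B3-B2/B4)/SQRT(B6*(1-B6)*(1/B3+1/B4))
        B8 (two-tailed p-value): =2*(1-NORM.S.DIST(ABS(B7),TRUE))

    A p-value below the chosen alpha indicates a significant difference between the proportions.]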
  38. Dear sir,
    Let's suppose that someone uses a questionnaire with questions like "How much do you think that X can influence your life?" and the participants have to answer by choosing a number from 1 to 10. Could the median be a meaningful measure in a case like this? Or should the researcher just stick to the mode?

    Thank you very much for your time.

    Reply
    • Maria,
      This all depends on the details of what you are trying to analyze. Usually the mean is used, sometimes the median, and only rarely is the mode used. But which you should use really depends on your specific research. It also depends on the test that you use. E.g. if you decide you want to analyze the data with the t test (assuming that this makes sense for your analysis and that the assumptions for the t test are met), then you probably want to focus on the mean. If it turns out that the assumptions for the t test are not met, then it would be common to use the median (and some nonparametric test) to do the analysis.
      Charles

      Reply
      • Let's say that the researcher wants to explore the ideas of some people about how environmental problems influence their own lives. So he has a questionnaire with e.g. 10 questions. Each question has the same form ("How much do you think that X can influence your life?") and the same response options. These options range from 1 to 10, where 1 means "no influence" and 10 means "great influence". So 1-10 are not "real" numbers; they are just the participants' way of expressing what they think.

        In this case, when the researcher knows that e.g. 5 participants have ticked the option "1" and 10 ticked "2", etc., and does not plan to run any test at all, is there real meaning for him in estimating the median? I mean, if he just wants to describe this kind of data, is the median really applicable and meaningful?

        Thank you very much once more.

        Reply
      • Good day Dr,

        First of all I would like to express my appreciation for the invaluable Real Statistics package, which has helped a lot of people already.

        The subject of the matter is that I am carrying out research on "The viability of crowdfunding to meet the funding needs of SMEs", and my main focus is on the supply side of funds, the investors. I am using a questionnaire to gather the data on the investors, and the online questionnaire is at the following link: https://docs.google.com/forms/d/1pibVVwzBqyrL4SQ9mcmNzCiArwzXp0VKWcy8J_vJe2Y/viewform

        After gathering the data and testing the viability, I decided to consider whether the potential investors would choose to invest or not (the propensity to invest). This constitutes my dependent variable. The independent variables include investment capacity, which is a function of income and investment pledge, expected return surplus, and payback period; there are other variables that I have considered, such as internet connectivity, social media presence, and access to a payment system, as variables of consideration for crowdfunding, etc.

        I thought the logistic model was going to be the most appropriate model to use, considering the binary nature of most of my data, and I chose it.

        However, I have been following the video that you shared on YouTube on the logistic regression model. After coding my data, I ran the analysis, but I got errors in Excel about the data format.

        May you please help with how I can address the error issue and where I should look. I WOULD BE VERY, VERY HAPPY if you could assist on the overall project by offering comments, insights and direction as well. I believe this project is of great importance to the SMEs in my home country and the nation as a whole, and I would like it to be as informative as possible.

        Accompanying this mail is the Excel sheet with the responses and the data that I am trying to analyse.

        I am in much anticipation of your response.

        Reply
  39. Hi,

    Do you know if it's possible to do the 3-parameter exponential decay fit y = y0 + a*exp(-b*x), fitting for y0, a and b, in Excel? I know I can do this in Origin or SigmaPlot, but it would make things easier if I could also integrate this into my Excel sheet for the calculations I need.

    Thanks!

    Reply
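
    [Note: the model above, y = y0 + a*exp(-b*x), can be fitted in plain Excel with Solver, much as described for the Weibull fit earlier in this thread. Assuming x in A2:A51, y in B2:B51, and trial values y0, a, b in F1:F3 (layout illustrative):

        C2: =$F$1+$F$2*EXP(-$F$3*A2)    model prediction (copy down)
        D2: =(B2-C2)^2                  squared residual (copy down)
        F5: =SUM(D2:D51)                sum of squared errors

    Run Solver to minimize F5 by changing F1:F3; starting values such as y0 ≈ MIN(y) and a ≈ MAX(y) - MIN(y) help convergence.]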
  40. Hi, Dr. Zaiontz!
    Have you given any thought to doing more with the Fast Fourier Transform (FFT) already provided in Excel?
    They really have not explained what to do with it after it generates the complex form. I have managed to compute the resulting magnitude (amplitude) and frequency from the complex form given, but need help on how to interpret what I have… presumably the frequency spectrum. OK, so I locate the dominant frequency after I graph the magnitude and frequency… now what? How do I go about combining the various frequencies to get a better fit, and how do I come up with the parameters for the various cos and sin terms? Is there a shortcut without having to construct it individually for each frequency?
    Puzzled?

    Reply
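
    [Note on the FFT question above: for a real signal of N samples (N a power of 2 for Excel's tool) taken Δt apart, each complex output X_k for k = 1 … N/2 - 1 yields one cosine term with ordinary worksheet functions:

        frequency:  f_k = k/(N*Δt)
        amplitude:  =2*IMABS(X_k)/N
        phase:      =IMARGUMENT(X_k)

    The bin contributes amplitude*COS(2*PI()*f_k*t + phase), and summing the terms for the dominant bins, plus the mean X_0/N, reconstructs an approximation to the signal, so there is no need to fit each cosine separately.]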
  41. I recently consulted with a statistician who informed me that I need to use Generalized Estimating Equations (GEE) on the data that I've been gathering.

    Does this excel add-in allow for GEE calculations?

    Reply
    • Sorry Mike, but the Real Statistics website and software doesn’t yet support GEE. This will be supported shortly.
      Charles

      Reply
  42. Prof,
    I have a dependent variable and four independent variables.
    To do regression analysis using Excel 2013, should I select all FOUR independent variables together under Input X Range, or should I do a regression of the dependent variable against each independent variable separately, thus four times in all?
    Thanks
    Tom

    Reply
    • Tom,
      It all depends on what you want to accomplish, but I suggest that you perform the regression with all four independent variables and see which ones are not significant. These are candidates for elimination from the model.
      Charles

      Reply
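
      [Note: for the single regression suggested above, the four independent variables must occupy adjacent columns so they can be selected together as the Input X Range. Equivalently, the array formula below returns everything at once (ranges illustrative):

          =LINEST(A2:A101,B2:E101,TRUE,TRUE)

      entered over a 5-row by 5-column range; the first row holds the coefficients and the third row includes R-squared, which helps judge the contribution of the variables.]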
  43. I want to determine whether two sets of data (n = 12, for both) are statistically significantly different at P <=0.001 (two-tailed test). Should I run Mann-Whitney U test on the raw data, or should I first do a normality check and convert the data to a normal distribution? (say, by log, square-root, or exponential conversion).

    Reply
    • Craig,
      If the data is normally distributed then generally you should use the t test instead of Mann-Whitney. You should first check to see whether the data in each sample is normally distributed. If the departure from normality is not too severe then you can use the t test. Even relatively symmetric data should be ok. If the data is very skewed then you can use the Mann-Whitney test provided the distributions of the two samples are relatively similar (even if skewed).
      Charles

      Reply
      • Very helpful — thanks! Follow-up question: If the data ARE skewed (i.e., not normally distributed) and I use Mann-Whitney, would I use it on the raw data, or on the data that are transformed by whichever transformation brings their distribution closest to normality?

        Reply
  44. Hello Charles,

    I am having trouble finding a relatively recent academic reference to support the assertion that ‘MANOVA is not very sensitive to violations of multivariate normality provided there aren’t any (or at least many) outliers’. I have a large sample (n=955) with no outliers, but the assumption of multivariate normality is violated. I can proceed with the analyses if I can cite evidence that the MANOVA will remain robust. Could you recommend a citation?

    Kind regards,
    Sharon

    Reply

Leave a Comment