Unplanned Comparisons

Introduction

A number of unplanned comparisons are available. We will review the most useful tests here. Although many tests are available, it is important to avoid the temptation to perform multiple tests and select the results that are most favorable to whatever you are trying to prove. You should select one of these tests, and stick with it, possibly using another test to gain some further insight. Fortunately, there are criteria to follow that make one test more appropriate than another. We will review these issues below.

Since multiple (usually pairwise) comparisons are performed, a key objective of all these tests is to control familywise error. Although the Bonferroni and Dunn/Sidák correction factors can be used, since we are considering unplanned tests, we must assume that all pairwise tests will be made (or at least taken into account). For k groups, this results in m = C(k, 2) = k(k–1)/2 tests. For an experiment-wise error of α we need to use α/m as the alpha for each test (Bonferroni) or 1 – (1 – α)^(1/m) (Dunn/Sidák). Because m grows quickly with k, these corrections tend to be too conservative.
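
To get a feel for how quickly the per-test alpha shrinks, here is a minimal Python sketch of the two corrections described above. The function name per_test_alpha is only illustrative and is not part of the Real Statistics software.

```python
from math import comb


def per_test_alpha(k: int, alpha: float = 0.05) -> dict:
    """Per-comparison alpha when all pairwise tests among k groups are counted."""
    m = comb(k, 2)  # number of pairwise comparisons, k(k-1)/2
    return {
        "m": m,
        "bonferroni": alpha / m,              # alpha / m
        "sidak": 1 - (1 - alpha) ** (1 / m),  # 1 - (1 - alpha)^(1/m)
    }


# The per-test alpha drops sharply as the number of groups grows
for k in (3, 5, 10):
    print(k, per_test_alpha(k))
```

With 10 groups there are already 45 pairwise comparisons, so each individual test has to be run at roughly alpha = .001.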

More useful tests are Tukey’s HSD and REGWQ. These tests are designed only for pairwise comparisons (i.e. no complex contrasts). We also describe extensions to Tukey’s HSD test (Tukey-Kramer and Games and Howell) where the sample sizes or variances are unequal. We also describe the Scheffé test, which can be used for non-pairwise comparisons.

General guidelines

  • Tukey’s HSD test is usually the safe choice. It is a good choice for comparing large numbers of means
  • REGWQ test is even better (i.e. has more power) for comparing all pairs of means, but should not be used when group sizes are different
  • Benjamini-Hochberg is best when sample sizes are very different or where there are a very large number of tests
  • Games-Howell is useful when uncertain about whether population variances are equivalent.
  • Dunnett’s test is useful when you only want to make comparisons with a single control group
  • Hsu’s MCB test is useful when you only want to make comparisons with the group with either the highest or lowest mean
  • Fisher’s LSD test has high power but at the cost of not protecting against familywise error

Reference

Howell, D. C. (2010) Statistical methods for psychology (7th ed.). Wadsworth, Cengage Learning.
https://labs.la.utexas.edu/gilden/files/2016/05/Statistics-Text.pdf

100 thoughts on “Unplanned Comparisons”

  1. Hello, Charles.
    Although it’s not recommended to use multiple tests at the same time, I tested Real Statistics release 8.8 for One Factor Anova, selecting almost all options in the Data Analysis Tool: Omnibus test options (ANOVA, Brown-Forsythe, Kruskal-Wallis, Random Factor, Welch’s, Levene’s), ANOVA’s Follow Up Options (contrasts, Dunnett’s, HSU MCB MAX, Tukey HSD, Scheffe, Games-Howell, REG-WQ, Pairwise t-test), Kruskal-Wallis Follow Up Options (contrasts, Steel, Pairwise MW, Nemenyi, Schaich-Hamerle, Pairwise MW exact, Dunn, Conover), Alpha correction for contrasts (Bonferroni #1). I obtained results without computation problems, but there’s a bug in the presentation of the data: the Pairwise t test table is placed on top of the HSU MCB test table and the D-TEST (MAX) table, which modifies the final results and shows a “#¡VALOR!” error in the D-TEST (MAX) table cells (Excel 365, Spanish version).
    It seems to be only a location reference problem for the Pairwise t test table; correcting it should solve the bug.
    Here’s the data I used for testing Real Statistics Data Analysis Tool:

    Self-esteem level, ages 10-12, FEMALES (n1=20)
    34
    32
    35
    41
    38
    29
    26
    52
    41
    47
    47
    28
    37
    44
    42
    41
    51
    40
    40
    33

    Self-esteem level, ages 10-12, MALES (n2=25)
    52
    52
    45
    24
    38
    45
    31
    52
    54
    53
    34
    50
    53
    48
    47
    57
    52
    52
    43
    38
    43
    56
    47
    43
    53

    Self-esteem level, ages 13-16, FEMALES (n3=26)
    43
    44
    33
    53
    32
    41
    38
    32
    38
    57
    35
    29
    31
    56
    36
    42
    50
    44
    39
    30
    37
    55
    56
    29
    51
    38

    Self-esteem level, ages 13-16, MALES (n4=28)
    38
    45
    46
    48
    40
    39
    38
    54
    35
    48
    44
    53
    34
    48
    49
    43
    44
    54
    43
    50
    56
    54
    45
    54
    32
    49
    57
    44

    I’ll be waiting for your comments.
    Thank you.

    William Agurto.

    Reply
    • William,
      Thanks for catching this error. I will be issuing a bug-fix release in a day or two that will correct this problem.
      Thanks again for your help in improving the quality of the Real Statistics software.
      Charles

      Reply
      • Dear Charles,

        I had exactly that question, and the reason for using SNK instead of Tukey, would be because SNK test is deemed to have more statistical power. I understand that the reason for having more statistical power is because SNK is a sequential test, and therefore not all pairwise comparisons are tested. Could you please clarify that? Thanks in advance,

        Wilson

        Reply
  2. Hello. Good day.
    Thanks for the great website and resources.
    May I just ask what is the most appropriate statistical analysis if I have to compare one group (the group that received peer tutoring sessions) and 4 other groups (groups that were not peer tutored)?
    I want to determine if the peer tutored students performed better than those who were not peer tutored. The group sizes are the following:

    Peer Tutored: 45
    Non-Peer Tutored A: 46
    Non-Peer Tutored B: 46
    Non-Peer Tutored C: 48
    Non-Peer Tutored D: 48

    Could you please also show me how to do the statistical analysis using Real Statistics?

    Thank you very much.
    Mark

    Reply
  3. If I have a lot of test lines, lets say 250. I have done ANOVA but then I want to proceed with a Post Hoc test on these 250 test lines? Is there a way I can do Post Hoc on these many lines?

    Reply
    • Charo,
      You should be able to use these post-hoc tests even with 250 groups (I presume that you are referring to 250 groups or treatments), but this is a lot of groups and so the familywise error correction will be very big.
      In order to get reasonable results, you may consider using pairwise t tests and the Benjamini-Hochberg method.
      Charles
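
      As a rough illustration of the Benjamini-Hochberg method mentioned in this reply, the following Python sketch applies the step-up rule to a list of p-values from pairwise t tests. The function name and the sample p-values are made up for illustration; note that this procedure controls the false discovery rate rather than the familywise error rate.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected
    by the Benjamini-Hochberg step-up rule at false discovery rate q."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank r (1-based) such that p_(r) <= r * q / m
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            cutoff = rank

    # Reject every hypothesis whose rank is at or below the cutoff
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff:
            reject[i] = True
    return reject


# Illustrative p-values only (not taken from any real data set)
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(example))
```

      With 250 groups there would be 31,125 pairwise p-values, which this simple loop still handles easily.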

      Reply
  4. Hi, Charles, I want to know the value ranges of k, df, α, and p in QCRIT(k, df, α) and QINV(p, k, df). When I input, for example, QCRIT(40, 300, 0.1) or QINV(0.5, 400, 2000), no value is returned. Why? Thank you very much!

    Reply
    • Jian-Hua,
      The QCRIT function is based on the table of critical values. No tables are currently supported which take the values k = 40 and alpha = .10, and so =QCRIT(40,300,.1) does not return a value. Note that =QCRIT(40,300,.05) is supported. Also =QINV(.1,40,300) is supported and should approximate the value for =QCRIT(40,300,.1). This value is 5.2469. Note that the actual critical table value for alpha = .1 and k = 40 is 5.313 for df = 120 and 5.202 for df = infinity.
      QDIST and QINV don’t return values in extreme cases. =QDIST(5.892,400,2000) does return the value .5, and so theoretically =QINV(.5,400,2000) should take the value 5.892. I have checked this with another program and see that the inverse value is 5.809 instead, which may indicate that 5.892 is a reasonable estimate.
      Charles
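
      The two table values quoted in this reply (5.313 for df = 120 and 5.202 for df = infinity) can be combined by harmonic interpolation, i.e. linear interpolation in 1/df, to estimate the value at df = 300. The Python sketch below assumes that this is what harmonic interpolation means here; the function name is hypothetical and the code is not the Real Statistics implementation.

```python
import math


def harmonic_interpolate(df, df_lo, value_lo, df_hi, value_hi):
    """Interpolate a tabulated critical value linearly in 1/df.

    df_hi may be math.inf, in which case 1/df_hi is treated as 0.
    """
    def inv(d):
        return 0.0 if math.isinf(d) else 1.0 / d

    # Fraction of the way from the df_hi entry towards the df_lo entry
    weight = (inv(df) - inv(df_hi)) / (inv(df_lo) - inv(df_hi))
    return value_hi + weight * (value_lo - value_hi)


# Table values quoted above for alpha = .1 and k = 40
q = harmonic_interpolate(300, 120, 5.313, math.inf, 5.202)
print(round(q, 4))
```

      The result is about 5.2464, which is close to the value 5.2469 reported above for =QINV(.1,40,300).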

      Reply
  5. Hi Charles,
    I have run a Games-Howell on my data since sample sizes are different and homogeneity of variance assumptions are not met. There is one comparison that I just can’t make sense of and am wondering if you can help. I hope you don’t mind me asking another question.

    Here are 6 groups, followed by the mean and then the sample size:
    BT, 0.11, 20
    BL, 0.44, 9
    MS, 0.21, 3
    UCF, 0.19, 4
    HW , 0.14, 2

    Here are the comparisons followed by the p-value:
    BT:BL, .06
    BT:MS, .72
    BT:UCF, .01
    BT:HW, .99

    I am confused about why all of these comparisons are found to be similar, except for the BT:UCF. The means of these two groups are closer together than some of the other groups which are found to be similar.

    Thank you for your time,
    Alissa

    Reply
    • Alissa,
      p-value = .01 indicates that there is statistically significant difference between BT and UCF (since .01 <.05) assuming that you are employing an alpha = .05 significance level. None of the other comparisons shows a significant difference, although BT:BL is close. Charles

      Reply
      • Hi Charles,

        Thank you for your response. I am trying to figure out why BT and UCF are not similar since all the other comparisons are even though some of the means are further apart than BT and UCF?

        BT has the lowest mean, but is found similar to BL with the highest mean, so why is it not similar to UCF which also has a lower mean?

        Thank you,
        Alissa

        Reply
  6. Hello Charles,

    Finally, I think that Dunnett’s test is more appropriate for my data. Does the test run one-tailed or two-tailed comparisons?

    Thanks in advance

    Reply
  7. Hello Charles,

    First, a BIG thank you for this tool and all the explanations given. I have a question for you (sorry, I’m a newbie to stats) about how to interpret the output of Tukey’s test: for a given alpha (0.05), the pairwise comparisons a-b, a-c, a-d, etc. have p-values of 0.9, 0.03, 0.002, etc. respectively. So, can I say that a-c is significant at * p<0.05 and a-d at ** p<0.01?

    Thanks in advance

    Reply
  8. Hello Charles

    very useful all the information that you publish, this made me understand many things.
    However I want to ask you about the value of “c” in the table that Games Howell throws, since, for my data that column is empty, therefore all the cells below that use the value “c” give me error.

    thank you very much

    Reply
    • Ana,
      You need to manually fill in this column by placing a +1 in one of the cells and a -1 in another of the cells. These two cells determine which pairwise comparison you want to make. To make multiple such comparisons, just change which cells get the +1 and -1.
      Charles

      Reply
  9. Dear Charles,

    after a one-way unbalanced ANOVA on four groups, I did a post-hoc Scheffé test with experiment-wise alpha = 0.05.
    This is because I also want to compare groups of means, not only make pairwise comparisons.
    But after applying Scheffé to a pair of the above means with the latest Real Statistics pack, I obtain the following result:
    F-stat = 7.23, F-crit = 8.12, p-value = 0.000215, lower = -38.8, upper = 1.12
    and for another pair of means:
    F-stat = 3.72, F-crit = 8.12, p-value = 0.014, lower = -25.9, upper = 4.98

    So, while both comparisons are not significant, because F-stat < F-crit, the p-values look as if they were significant! I cannot understand this; could you please explain it to me?

    Thank you very much
    Best Regards
    Piero

    Reply
      • Charles,

        I understand this. But in my design I have 4 groups (A,B,C,D) of different sizes, and I need all pairwise comparisons (I can use Tukey-Kramer for this), but I should test also:
        A vs (B+C+D)
        This latter can’t be done with Tukey-Kramer, for this reason I was thinking to use Scheffé or maybe Games-Howell, I don’t know which is better for unbalanced design.

        Do you think it’s fine to present in my study Tukey-Kramer results for pairwise comparisons, but to use a different test for the latter more complex contrast?

        Thank you very much for all your suggestions

        Regards
        Piero

        Reply
        • Piero,
          Games-Howell is useful when the variances of the groups are different. If the sample sizes are different but the homogeneity of variances assumption holds then you can use Tukey-Kramer instead.
          Based on the info you have provided, Tukey-Kramer would probably be the correct choice if you only had pairwise comparisons.
          If you have a lot of non-pairwise comparisons, then Scheffé is probably the best choice. If you have many pairwise comparisons and only one non-pairwise comparison you might consider using Tukey-Kramer for the pairwise comparisons and Bonferroni/Contrasts for the non-pairwise comparisons.
          Charles

          Reply
  10. Hi Charles,

    Thanks so much for the amazing resources you have created, between all of the web pages and the RealStats Excel package. I’m following along with this example using the RealStats Excel functions (in Excel 2016 on a Mac), and for some reason, “=QCRIT(4,44,0.05,2)” returns a different value for me than what you show on this page. (3.7767272 vs. 3.7775). Any ideas about this?

    Reply
    • Hi Stephanie,
      =QCRIT(4,44,.05,2,FALSE) has value 3.7775, while =QCRIT(4,44,.05,2) has the value you calculated. The first formula uses linear interpolation, while the second uses harmonic interpolation (the default).
      I have now updated the webpage to make this clearer. Thanks for bringing this to my attention. I appreciate your help in improving the website.
      Charles

      Reply
  11. Thanks for these programs! I’ve installed the program for Mac 2016 and I’m doing a One-way ANOVA with Tukey HSD. Regardless of whether the groups have the same N value or the individual cell values, the output returns a Q-test with std err, lower, upper, x-crit, and Cohen d of zero. The df and q-crit are reported. Any reason why this is or what I’m doing wrong?

    Reply
  12. Hi, when you say: “Since the difference between the means for women taking the drug and women in the control group is 5.83 – 3.83 = 1.75 and 1.75 is smaller than 1.8046, we conclude that the difference is not significant”

    What does that imply? Is this different? And how do I select the best combination among these six man/woman categories?

    Reply
    • Sam,
      Not significant means that you cannot conclude that the population means are different.
      If you are using a post hoc test such as Tukey HSD, then you can do any combination of categories that are interesting to you (without increasing the experimentwise error rate).
      Charles

      Reply
  13. Hi Charles,
    Thank you for the excellent program for everybody.
    When I read the outcomes by Scheffé’s test using RealStats.xlam, the P value was lower than other program. The cell for P value by Scheffé’s test did not include the coefficient of 1/dfB. (Sorry for not using a small capital for B.) When the F value was multiplied by this coefficient, the P value obtained was exactly same as those by other programs. I would be happy, if you could have a chance to check.
    Best regards,
    Kats

    Reply
  14. Hi Charles,

    I have just discovered your add-in and website and am pleasantly surprised that it’s so well detailed. I was playing around with the add-in and am confused by the output I see for the Tukey HSD after the one-way ANOVA. I don’t see the pairwise significance table shown in “Fig 1” or the table shown in “Fig 2” on the Unplanned Comparisons webpage. I only see the table shown in “Fig 3”. I guess another way of describing what I’m seeing (using your Worksheet example #2), is that I don’t see the Tukey’s HSD table, but only the Tukey HSD/Kramer table. Am I missing something here? Thanks.

    Reply
    • Jamie,
      In the current implementation (as shown in Figure 3), you need to make the pairwise comparisons one at a time. For whichever pairwise comparisons you desire, you place a 1 and a -1 in the shaded area corresponding to the two pairs you want to compare.
      Charles

      Reply
      • Thanks for the quick reply Charles. I see now how it works though I’m afraid I don’t find it very intuitive. As a friendly suggestion, it would be nicer to have the pairwise table generated automatically as in Fig. 1.

        On a different note, I have noticed that when I close Excel and reopen, that RealStats no longer appears under the Add-In menu, but when I go to the Add-ins options it is still checked. Then I need to uncheck, hit ok, and then recheck to reload it…and then it works again. Perhaps this is a bug? I’m using Office 2013 and don’t seem to have this trouble with other add-ins. Cheers.

        Reply
  15. Dear Charles,

    Your website really opens my mind! I tried analysing my data with Welch’s test followed by the Games and Howell test, but no p-value was shown. Did I miss something? Is there a formula like QDIST to calculate the p-value in this test?

    Regards,
    Long

    Reply
    • In the latest release of the Real Statistics software (Rel 4.11) the p-value is given for the Games-Howell test. In any case, you can use the QDIST function.
      Charles

      Reply
  16. Good morning,
    Since the tool for the Dunnett’s test shows only if there is significance or not (with “yes” or “no”), how can I know the value of the p-values in a Dunnett’s test? Is there a way to do that?
    Thank you very much!
    Kind regards
    Davide

    Reply
    • Davide,

      The best I can do at present is make an estimate based on the Table of Critical Values for Dunnett’s Test as shown in
      Dunnett’s Test Table

      E.g. suppose k = 10 and df = 20. From the table I see that the critical values at .01, .05 and .10 are respectively 3.694, 2.946 and 2.600. Thus if the test statistic is 3.2, then I know that the p-value is somewhere between .01 and .05; a rough estimate is something like .03. I can make this more precise by using the DCRIT function. In fact, I see that DCRIT(10,20,.03) = 3.070667, which is lower than 3.2. Thus I need a value lower than .03 (but higher than .01). I try .02 next. Since DCRIT(10,20,.02) = 3.2265, which is higher than 3.2, I need a value higher than .02 (but lower than .03).

      I can iterate this “divide and conquer” approach to whatever degree of accuracy I desire. I settle for .0212 since DCRIT(10,20,.0212) = 3.200038. Note that this approach uses harmonic interpolation, as explained on the webpage
      Interpolation

      I had planned to create a Real Statistics function that would do all of this for you for the previous release, but I ran out of time. I will likely put it in the next release.

      Charles
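
      The manual search described in this reply is a bisection on alpha. The Python sketch below shows the idea under the assumption that the critical value decreases as alpha increases. Since DCRIT is only available inside Excel, a two-sided standard normal critical value is used as a stand-in here; the function names are hypothetical.

```python
from statistics import NormalDist


def p_value_by_bisection(stat, crit, lo=0.01, hi=0.05, tol=1e-6):
    """Find alpha with crit(alpha) == stat, assuming crit decreases as alpha grows."""
    if not (crit(hi) <= stat <= crit(lo)):
        raise ValueError("stat is not bracketed by crit(lo) and crit(hi)")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if crit(mid) > stat:
            lo = mid  # critical value still above the statistic: p-value > mid
        else:
            hi = mid  # critical value at or below the statistic: p-value <= mid
    return (lo + hi) / 2


def stand_in_crit(alpha):
    """Stand-in for DCRIT(k, df, alpha): two-sided standard normal critical value."""
    return NormalDist().inv_cdf(1 - alpha / 2)


p_est = p_value_by_bisection(2.2, stand_in_crit)
print(round(p_est, 4))
```

      In a worksheet, the same loop would call DCRIT(10,20,alpha) in place of stand_in_crit, starting from the bracket [.01, .05] read off the table.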

      Reply
  17. Dear Professor,

    On another page I wrote a suggestion about reporting the p-value.
    But I need to calculate it…

    I tried to follow what you wrote on this page, but I do not obtain the values.

    Example:
    q-stat df
    -18.74358529 4285

    Also, there are 7 samples/groups.
    When I try it, I do not obtain any p-value; it says there is an error in the formula.
    It seems to me that my values are too big, because if I reduce the df I obtain values.

    Thanks
    José

    Reply
  18. Brilliant site, thank you! I typed in your data and ran the add-in function for this; however, the DCRIT function, namely =DCRIT(COUNT(I3:I6),H10,L1) in my case, returns #VALUE!. I cannot get the function to work with different parameters either. Note, other RealStats functions are working in Excel during this instance as well. Thoughts?

    Reply
    • Dustin,
      Glad you like the site.
      The formula on the referenced webpage is =DCRIT(COUNT(I4:I7),H11,L2). Your formula should work provided you have shifted all the data one row up. If so, then I suggest that you send me an Excel file with your data so that I can see what went wrong.
      Charles

      Reply
      • I had the same problem.
        Also with my data the function DCRIT results in an error. But I found out that it is because the size of my dataset (i.e. “k”) is bigger than 240 (it is 294) and so the formula doesn’t recognize it as a number of the Dunnett’s table and return an error. Eventually, I put manually the right number (i.e. the number of df lower, 240) and the formula works as it should 🙂

        Anyway, it’s an awesome tool! Thank you very much!

        Have a nice day

        Reply
        • Davide,
          Last week I also came across this error in the DCRIT function. It will be corrected in the next release of the software, which should be available in the next couple of days.
          Charles

          Reply
  19. Dear Charles,

    As I’m a beginner in all of this, I’m a bit confused about the output of the Tukey HSD test. How can I know at which level of significance two groups are different? (based on the P-values, I want to add the appropriate ‘*, **, ***’ symbols to my charts. However, I can’t find the P-values associated with each pairwise comparison)

    Kind regards

    Reply
    • Francis,
      You can calculate the p-value by using the QDIST function. E.g. for Figure 3, p-value = =QDIST(3.663207,4,44) = .060185.
      Charles

      Reply
      • I am showing a significant between group difference via ANOVA and a significant difference between 2 of my 3 groups on post hoc Tukey HSD. However, I do not see a p-value in the generated output chart.
        When I try the QDIST function mentioned above “=QDIST(q-stat,# of levels,df)”, the formula outputs “1”.
        Can anybody help with this?

        Reply
        • Austin,
          If you are using one of the latest releases of Real Statistics, you should see a p-value in the output. This won’t be the case if you are using the Mac version.
          If the QDIST function outputs 1, then p-value = 1 and you clearly don’t have a significant result.
          Charles

          Reply
  20. Hello,

    I’ve been trying to use both contrasts and Tukey HSD for one-way ANOVA, and I got different results from the two analyses: the contrast gives a significant result for a pairwise comparison while Tukey HSD does not. Can you enlighten me as to why this happens?

    Reply
    • Fahmi,
      It is not so uncommon to find that different tests for the same situation give conflicting results. Each test uses its own set of assumptions and has its own advantages and disadvantages.
      First and foremost, you need to make sure that each test is suitable for the analysis that you are undertaking (and that the assumptions of that test are met). In your situation, Tukey HSD is used for multiple pairwise comparisons, while contrasts can be used for comparisons that are not pairwise. While you can use Tukey’s HSD to perform multiple pairwise comparisons, you cannot use it to compare more than two variables at a time (although you can use contrasts for this).
      Charles

      Reply
    • Felix,
      The version of the test in the Real Statistics Resource Pack assumes that the sizes of all the groups are equal. The test can be modified to handle unequal sample sizes. This is done in the same manner as Tukey-Kramer modifies the Tukey HSD test to handle unequal sample sizes.
      Charles

      Reply
  21. Hi Charles,
    I’ve run several one-way ANOVAs with Tukey’s HSD and consistently come up with the problem of the q statistic coming up with #DIV/0! in the spreadsheet. I’m pretty confident in my setup so I’m not sure why. Here’s some dummy data that gives this result, if needed:
    all gut tentacle Neo
    4 1 3 0
    3 2 1 2
    5 3 2 1
    3 2 1 1
    5 3 2 2
    10 5 4 2
    2 1 2 2
    3 1 2 1

    P value is significant and visually I can see where the significant differences are, but can’t seem to convince the formula to report the q-stat correctly.

    Reply
    • Anna,
      You need to fill in the c column in the output, by placing a +1 in any one of the highlighted cells and a -1 in another (these are the contrast coefficients). E.g. in your problem, if you place +1 next to Group 1 and -1 next to Group 3, you would compare Group 1 with Group 3. You can compare any pair of groups without inflating the experimentwise error (which is the point of Tukey HSD).
      Charles

      Reply
  22. Charles,
    I want to write my own Excel program for computing the critical q value in Tukey’s test.
    Problem is that I cannot find the statistic q in the formulas provided by Excel.
    Finding q-crit via q-crit = 2^0.5 * t-crit doesn’t seem to work for me since q = f(alpha, k, dfW) and t = f(alpha, dfW).
    I went to your Studentized Range q Table, which of course gave all the values necessary to perform Tukey’s HSD test, but I feel this is not as practical as having it formulated on the same Excel sheet together with the ANOVA and ICC computations. Could you help me find a solution? I realize that your answer may very well be that working with the Real Statistics Data Analysis Tool would be a far better solution, but I want to do it my own way…
    Thank you very much, you have always been a great support.
    Erik

    Reply
  23. Thanks for this fantastic resource. I have a question about confidence intervals for differences between means when variances of groups are heterogenous – that is, cases where you’d be using Welch’s test and the Games and Howell test. Would that CI be calculated in the standard way from the standard error and t-statistic, but with the SE and degrees of freedom for t taken from the Games and Howell calculations? Or does the difference in variances introduce an additional wrinkle that needs to be considered?

    Thanks again,
    Alistair

    Reply
      • Thank you. I also had a question about a possible typo. In the introduction to this section, you write “Games-Howell is useful when uncertain about whether population variances are equivalent.”, but in the Games-Howell section, the first sentence reads “A better alternative to Tukey-Kramer when sample sizes are unequal is Games and Howell. ” I’m guessing the second of those is also supposed to say “variances” instead of “sample sizes,” but just wanted to double-check.

        Reply
        • Alistair,
          You are correct. I had corrected this on my offline copy of the site, but not on the website itself. Thanks very much for catching this error and helping to improve the website. Much appreciated.
          I have now made the correction to the referenced webpage.
          Charles

          Reply
  24. Evening, Charles.
    Thanks for your wonderful website.
    It helps me a lot with my final project.
    However, I have a question for you regarding how to produce a table like Figure 1.
    Could you please tell me in detail how to create the table?

    Thanks!

    Reply
  25. Hi, thanks for all this useful help and resources. I have tried to match the Tukey HSD results with Minitab output and with formulas from books. With the books I get almost the same std error and confidence intervals as the Real Statistics output. I don’t quite understand the c^2/n part.

    With Minitab, they use the t test version of Tukey, and the formulas for the family error rate and the standard error of differences yield different results from the formulas used on your website for Tukey HSD. Minitab says they use 0.007 for the individual error alpha if you go for a 0.05 family error rate in the case of 10 comparisons, but it doesn’t follow the 1-(1-alpha)^c family error formula or explain how they get that number.
    These are the numbers I am working with; they are the ages of workers from 3 plants. The first row contains labels, not data.
    1 2 3
    29 32 25
    27 33 24
    30 31 24
    27 34 25
    28 30 26
    Help!! :)) thank you very much.

    Reply
    • Marcy,

      I don’t use Minitab so that it is difficult for me to comment about what results they get. I have checked my results with another popular software product and the results for your data are exactly the same.

      Regarding .007 for individual alpha for familywise alpha of .05 with 10 comparisons, this doesn’t seem to have anything to do with the data that you have included in your comment since you only have 3 groups.

      Charles

      Reply
  26. Hello,

    Above it says:”Example 1: Analyze the data from Example 1 of Confidence Interval for ANOVA using Tukey’s HSD test to compare the population means of women taking the drug and the control group taking the placebo.”

    However if I follow the link it says:”Example 1: Find the confidence intervals for each of the methods in Example 3 of Basic Concepts for ANOVA.”

    Following that link it says:”Example 3: Repeat the analysis for Example 2 where the last participant in group 1 and the last two participants in group 4 leave the study before their reading tests were recorded.”

    Example 2 on the same page says:”Example 2: A school district uses four different methods of teaching their students how to read and wants to find out if there is any significant difference between the reading scores achieved using the four methods. It creates a sample of 8 students for each of the four methods.”

    I cannot find any examples comparing women taking drugs vs. placebos or that has 44 degrees of freedom. Do you mean this page?

    https://real-statistics.com/one-way-analysis-of-variance-anova/planned-comparisons/

    Reply
    • David,
      Yes, the example should refer to Example 3 on the Planned Comparisons webpage. I have just changed the link.
      Thanks for finding this mistake.
      Charles

      Reply
  27. Thanks for the great website!

    How would I apply the Tukey HSD test to a two factor ANOVA? I’ve successfully used your add-on to use the Tukey following a single way ANOVA, but am at a loss on how to get it to apply to the other ANOVAs.

    Thank you!

    Reply
  28. Charles:

    There’s a little bug in Real Statistics 3.2 when using Single Factor Anova (first option of Analysis of Variance menu):
    When I used 4 samples (different sizes: 19, 17, 15 and 18), I got an error in the Random Factor table: there seems to be an error in calculating the estimated group variance. For my example, that value is negative. For that reason I obtained “#NUM!” in the lower and upper bounds for the mean. Perhaps it is necessary to use the ABS Excel function in the numerator to get a positive number?

    The values I use for my example are the following.

    Thank you.

    William.

    G1  G2  G3  G4
    78  81  83  47
    45  59  32  90
    77  58  33  37
     2  48  84  15
    45  43  35  28
    46  98  68   5
    30  16  56  84
    96   6  21  36
    44  99   2   2
    31  18  31   4
    72  37  86  49
    28   6  39  36
    29   2  17   2
    68  67  53  65
    51   6  24  77
    15  50      79
    99  84      86
    73          94
    78

    Reply
    • William,
      It is quite possible for the value to be negative, in which case the remaining values will be meaningless. I will explain this once I update the webpage. The value for df can also be less than one, in which case there will be an error since Excel’s chi-square function will round df down to zero.
      Charles

      Reply
      • Thank you for your answer, Charles. I didn’t know that value could be negative and df could be less than one.

        Regards.

        William.

        Reply
        • William,
          Obviously in these cases the estimates generated by the formulas are not particularly useful since a negative variance cannot exist and a df of less than 1 (even if not rounded off) will produce a huge confidence interval.
          Charles

          Reply
  29. Hello,
    Thank you for this excellent tool to further my statistic ability when using excel. I’m currently experiencing a small difficulty in performing my own analysis, however.

    When I analyze my data set, the shaded grey column labeled “c” in the Tukey output is empty. Because of this, a lot of the remaining outputs cannot be calculated.

    Additionally, there is no column with an individual “yes” or “no” for whether each group is different. Based on the ANOVA p-value, there is a significant difference, but I’m unable to discern between which groups.

    Thank you,
    Addie

    Reply
  30. Is there any way I tell it to read replicates within a treatment across a row instead of down columns (as in the option to group by row or column in excel’s regular ANOVA analysis) without transposing the data (i.e. all my data has the dependent variables listed down the columns and replicate measurements across the same row)?

    Reply
    • David,
      Unfortunately the Real Statistics ANOVA data analysis tool currently only accepts two formats: the format that you referenced and the standard format (similar to that used by tools such as SPSS). The only solution I can think of is to transpose the data.
      Charles

      Reply
  31. Ok. Here is my question again, but better asked (I hope). Yes, Tukey HSD tells me whether there is a significant difference in a pair of means. Now I have multiple yes/no answers. How can I now group them according to their differences? Example: group A (treatments 1 and 2) is significantly different from group B (treatments 3 and 4, 5, and 6). How can I tell if there is a group C (treatments 5 and 6) and/or more groups of statistically different treatments just by looking at the yes/no answers? My goal is to make a table with the means of the treatments from largest to smallest in a row in excel. I would like to underline the treatments that are within the same group. How can I do this?

    Reply
    • I just figured it out! I wasn’t showing any differences because all of my contrast values were 1 instead of 1 and -1. Can you explain why we need to use these values, how we determine which we should use, and how to come up with the value?

      Reply
      • Elizabeth,

        I am pleased that you figured this out yourself. How you set the contrast values depends on what you want to test. What is important is that the sum of the contrast values is equal to 0 for any single test. Any variable which is not part of the analysis gets a value of 0.

        For example, if you want to compare variable x2 with variable x4 just set the 2nd contrast value to 1 and the 4th to -1 (or vice versa). If you want to compare x1 with the average of the values of x2 and x3, just set the 1st contrast to 1 and the 2nd and 3rd to -.5.

        Suppose x1 represents men who live in the US, x2 represents women who live in the US, x3 represents men who live in Canada and x4 represents women who live in Canada. If you want to compare men vs. women set x1 = x3 = .5 and x2 = x4 = -.5 (alternatively you could set x1 = x3 = -1 and x2 = x4 = +1 and get the same result). If instead you want to compare people in the US with Canadians set x1 = x2 = .5 and x3 = x4 = -.5.

        I tend to use contrast values so that the positive contrasts add up to 1 and the negative contrasts add up to -1, but this isn’t essential.

        Charles
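
        The rules in this reply can be sketched in a few lines of Python: check that the contrast coefficients sum to zero, then form the weighted sum of the group means. The function name and the group means below are hypothetical and serve only as an illustration.

```python
def contrast_estimate(means, coeffs, tol=1e-9):
    """Return sum(c_i * mean_i) after checking that the coefficients sum to zero."""
    if len(means) != len(coeffs):
        raise ValueError("need one coefficient per group")
    if abs(sum(coeffs)) > tol:
        raise ValueError("contrast coefficients must sum to zero")
    return sum(c * m for c, m in zip(coeffs, means))


# Hypothetical group means for x1..x4 (illustrative only)
means = [10.0, 12.0, 11.0, 15.0]

pairwise = contrast_estimate(means, [0, 1, 0, -1])               # x2 vs x4
one_vs_avg = contrast_estimate(means[:3], [1, -0.5, -0.5])       # x1 vs average of x2 and x3
men_vs_women = contrast_estimate(means, [0.5, -0.5, 0.5, -0.5])  # (x1, x3) vs (x2, x4)
print(pairwise, one_vs_avg, men_vs_women)
```

        Setting every coefficient to 1, as in the question above, fails the sum-to-zero check, which is why no comparison was produced in that case.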

        Reply
