Hypothesis Testing

Central to statistical analysis is the notion of hypothesis testing. We now review hypothesis testing (via null and alternative hypotheses) and consider the related topics of confidence intervals, effect size, statistical power, and sample size requirements.

The concepts introduced in this part of the website may seem somewhat abstract, but they will become clearer as more concrete examples are presented on other web pages.

10 thoughts on “Hypothesis Testing”

  1. Hi Charles,
I’m trying to do a hypothesis test on whether the means of the monthly returns from two different investment strategies are equal. On the one hand, given their correlation due to exposure to the same underlying market risk, a t-test for paired observations would make sense. On the other hand, (i) the two samples are of different sizes and (ii) I’m not sure I can assume the variances are equal (or do I need to test for this first before deciding on the form of the means test?). So maybe a t-test using a pooled estimate of the population variance would make sense. Any advice you can provide would be greatly appreciated. Thanks.

    • Hello Jeff,
For the reasons you give, a paired t-test probably makes sense. Since you have monthly data, I would expect your samples to be of equal size. If not, you probably have some missing data in one or both samples for some months. In that case, you probably just need to drop the months with missing data (see the sketch below). I am assuming here that there is no pattern to the missing data, i.e. that it is missing completely at random.
      Charles
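      A minimal sketch of this approach in Python (the return figures are hypothetical and scipy is assumed to be available; this is an illustration, not part of the original exchange):

          import numpy as np
          from scipy import stats

          # Hypothetical monthly returns for the two strategies; np.nan marks a missing month.
          a = np.array([0.012, 0.030, np.nan, -0.004, 0.021, 0.008])
          b = np.array([0.015, 0.028, 0.010, -0.001, np.nan, 0.006])

          # Keep only the months observed for both strategies
          # (assumes the missing months are missing completely at random).
          mask = ~np.isnan(a) & ~np.isnan(b)
          t_stat, p_value = stats.ttest_rel(a[mask], b[mask])
          print(f"paired t = {t_stat:.3f}, p = {p_value:.3f}")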

  2. Charles

Because the researcher does not want to publish worthless results, and to guarantee that his finding is clearly different from the null hypothesis H0 already found in the literature, he wisely chooses a very small alpha (5%, or even 1%) so that the Type I error rate is kept appropriately low.
    However, the usual jargon “non-significant result” can hardly be synonymous with acceptance of the null hypothesis. It is much better, IMO, to say that “there is not sufficient evidence” to reject H0. Accordingly, “non-rejection interval” would be preferable to the current “acceptance interval” (a small sketch follows below).
    Would you, Charles, be so kind as to comment?

    Luis
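    A minimal sketch of what such a “non-rejection interval” looks like for a two-sided z-test of a mean (all numbers hypothetical; scipy assumed):

        from scipy import stats

        # Hypothetical two-sided z-test of H0: mu = mu0 with known sigma and sample size n.
        mu0, sigma, n, alpha = 100.0, 15.0, 36, 0.05
        se = sigma / n ** 0.5
        z_crit = stats.norm.ppf(1 - alpha / 2)  # about 1.96 for alpha = 0.05

        # Sample means inside this interval give no sufficient evidence against H0;
        # "non-rejection interval" avoids implying that H0 is thereby accepted.
        lower, upper = mu0 - z_crit * se, mu0 + z_crit * se
        print(f"do not reject H0 for sample means in [{lower:.2f}, {upper:.2f}]")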

  3. Charles

The parametric null hypothesis H0: p = 0 (or whatever value) shouldn’t be taken literally. In fact, in carrying out the calculations we assume that it is true. However, in the case of a non-significant result we cannot go beyond the statement that there is not sufficient evidence to reject the null hypothesis: to say that we accept it is structurally an error, and to state that the null is true is even worse.
    Otherwise we fall into the catastrophic conclusion (J. Cohen *) that the null hypothesis can never be found exactly true, because even the last decimal place could differ from the assumed value.

    A silly argument for trying to invalidate NHST, I’d say.

    Luis

  4. Charles

Yes, of course, you are right. My fragmentary note was only meant to stress the “necessary but not sufficient” character of NHST. I bet there are still people persuaded that a “non-significant” result implies acceptance of the null. I have heard “barbaric” conclusions about these issues, like, for example:
    “The significance tests are a completely foolish exercise because the null condition never occurs. That is, we pose H0: p = p0 and we know that this condition is impossible because the population parameter will likely differ from the proposed p0 in at least one decimal place.”
    My thought:
    One cannot think of H0: p = p0 as an algebraic equality. My (humble) interpretation is something like this: we try to obtain evidence that “Ha: p not equal to p0” is strongly supported. Performing the test (supposing the null true), if the statistic falls outside the “rejection interval” (see the sketch below), we conclude that there is not sufficient evidence to reject H0. We are simply not allowed, at all, to state that H0 is true. In fact, we reserve a high probability (usually 95%) for non-rejection, so we deliberately abandon any intention of pronouncing on the truthfulness of the null.

    And so on. I think that a great number of Real Statistics readers are well aware that one is dealing with likelihood, never with certainty: NHST is truly a game.

    Charles, I beg your pardon for the babbling,

    Luis
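    A minimal sketch of the procedure described above, as a one-proportion z-test in Python (p0, n, and the success count are hypothetical; scipy assumed):

        from scipy import stats

        # Hypothetical two-sided one-proportion z-test of H0: p = p0 at alpha = 0.05.
        p0, n, successes, alpha = 0.50, 200, 112, 0.05
        p_hat = successes / n
        se = (p0 * (1 - p0) / n) ** 0.5  # standard error computed under H0
        z = (p_hat - p0) / se
        z_crit = stats.norm.ppf(1 - alpha / 2)

        if abs(z) > z_crit:
            print(f"z = {z:.2f}: reject H0")
        else:
            # Outside the rejection region: not sufficient evidence to reject H0,
            # which is not the same as concluding that H0 is true.
            print(f"z = {z:.2f}: not sufficient evidence to reject H0")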

  5. Dear Charles

What do we really get through NHST?

Logic tells us that once a sufficient set of preconditions is fulfilled, a conclusion is unequivocally proved. By contrast, necessary conditions must evidently occur, but they are unable, by themselves, to prove the conclusion we want to establish.
    Unfortunately, NHSTs are such that a non-significant result does not mean, at all, that the null hypothesis is true; in other words, it must occur as a necessary condition, but it is insufficient to establish the truth of the null. Therefore, following the Neyman-Pearson theory, if we want to arrive at an indisputable choice between H0 and Ha (null or alternative hypothesis), we must impose the rather odd condition that no third alternative exists.

    Luis

    • Luis,
What you get via the NHST approach is the probability that the observed data (or data more extreme) would occur given that the null hypothesis is true, i.e. a conditional probability (see the sketch below). This approach is a bit disappointing for those of us who would simply prefer to know the probability that the null hypothesis is true. In any case, the NHST approach is the one that is commonly used (although the Bayesians look at things a little differently).
      Charles
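      This conditional probability can be made concrete with a small simulation in Python (the observed t statistic and sample size are hypothetical; numpy assumed):

          import numpy as np

          rng = np.random.default_rng(0)

          # Hypothetical observed one-sample t statistic and sample size.
          t_obs, n = 2.1, 20

          # Simulate the sampling distribution of t assuming H0 is true, then
          # estimate P(|t| >= |t_obs| given H0) -- a statement about the data
          # under H0, not the probability that H0 itself is true.
          sims = rng.standard_normal((100_000, n))
          t_sim = sims.mean(axis=1) / (sims.std(axis=1, ddof=1) / np.sqrt(n))
          p_value = np.mean(np.abs(t_sim) >= abs(t_obs))
          print(f"estimated p-value = {p_value:.3f}")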

