Two Sample t-Test Proof

Property 1: Let x̄ and ȳ be the sample means of two independent samples of size nx and ny respectively. If x and y are normal, or nx and ny are sufficiently large for the Central Limit Theorem to hold, and x and y have the same variance, then the random variable

$$t = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{s\sqrt{\frac{1}{n_x} + \frac{1}{n_y}}}$$

has distribution T(nx + ny – 2) where

$$s^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$$
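To make the statement concrete, here is a minimal sketch that computes the pooled standard deviation s and the statistic t for two small samples. The function name `pooled_t_statistic`, its `mean_diff` parameter (the hypothesized value of µx – µy), and the data are all illustrative assumptions, not part of the original article.

```python
import math


def pooled_t_statistic(x, y, mean_diff=0.0):
    """Return (t, df, s) for the equal-variance two-sample t statistic.

    mean_diff is the hypothesized value of mu_x - mu_y (an assumed parameter name).
    """
    nx, ny = len(x), len(y)
    x_bar = sum(x) / nx
    y_bar = sum(y) / ny
    # Unbiased sample variances (divide by n - 1).
    sx2 = sum((v - x_bar) ** 2 for v in x) / (nx - 1)
    sy2 = sum((v - y_bar) ** 2 for v in y) / (ny - 1)
    df = nx + ny - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    s2 = ((nx - 1) * sx2 + (ny - 1) * sy2) / df
    s = math.sqrt(s2)
    t = ((x_bar - y_bar) - mean_diff) / (s * math.sqrt(1 / nx + 1 / ny))
    return t, df, s


# Made-up illustrative data, not taken from the article.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0]
t, df, s = pooled_t_statistic(x, y)
print(t, df, s)
```

For this made-up data the sample variances are 2.5 and 4, so the pooled variance is 18/6 = 3, and the statistic is about -0.79 with 6 degrees of freedom.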

Proof: Let σ be the common standard deviation of x and y. Then x̄ – ȳ has a normal distribution with mean µx – µy and standard deviation

$$\sigma\sqrt{\frac{1}{n_x} + \frac{1}{n_y}}$$
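The standard deviation of x̄ – ȳ can be checked with a short variance calculation. This sketch assumes the two samples are independent of each other, which the two-sample setting implies:

$$\operatorname{Var}(\bar{x} - \bar{y}) = \operatorname{Var}(\bar{x}) + \operatorname{Var}(\bar{y}) = \frac{\sigma^2}{n_x} + \frac{\sigma^2}{n_y} = \sigma^2\left(\frac{1}{n_x} + \frac{1}{n_y}\right)$$

Taking the square root gives the standard deviation $\sigma\sqrt{1/n_x + 1/n_y}$.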

Defining z as follows, we know that z has distribution N(0, 1).

$$z = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sigma\sqrt{\frac{1}{n_x} + \frac{1}{n_y}}}$$

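We also know that $(n_x - 1) s_x^2/\sigma^2$ has distribution $\chi^2(n_x - 1)$ and $(n_y - 1) s_y^2/\sigma^2$ has distribution $\chi^2(n_y - 1)$. Since the two samples are independent, the sum of these two chi-square variables is again chi-square, with the degrees of freedom added, and so

$$u = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{\sigma^2}$$

has distribution $\chi^2(n_x + n_y - 2)$.

Defining $t = z\sqrt{m}/\sqrt{u}$, where $m = n_x + n_y - 2$, it follows by Property A of Basic Concepts of t Distribution that t has distribution $T(m)$. Here we use the fact that z and u are independent, since for normal samples the sample means are independent of the sample variances. Since $u/m = s^2/\sigma^2$, we have

$$t = \frac{z}{\sqrt{u/m}} = \frac{z\sigma}{s} = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{s\sqrt{\frac{1}{n_x} + \frac{1}{n_y}}}$$

where s is defined as in the statement of the property.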
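As a rough numerical sanity check of the property (not a substitute for the proof), the sketch below simulates the statistic for normal data with a common variance and compares its sample mean and variance with the known values for T(m), namely mean 0 and variance m/(m – 2) when m > 2. The sample sizes, means, σ, seed, and function names are arbitrary assumptions chosen for illustration.

```python
import math
import random


def pooled_t(x, y, mu_x, mu_y):
    """Pooled two-sample t statistic, centered at the true mean difference."""
    nx, ny = len(x), len(y)
    x_bar = sum(x) / nx
    y_bar = sum(y) / ny
    # Sums of squared deviations equal (n - 1) times the sample variances.
    ssx = sum((v - x_bar) ** 2 for v in x)
    ssy = sum((v - y_bar) ** 2 for v in y)
    s = math.sqrt((ssx + ssy) / (nx + ny - 2))
    return ((x_bar - y_bar) - (mu_x - mu_y)) / (s * math.sqrt(1 / nx + 1 / ny))


def simulate_t(n_sims, nx, ny, mu_x, mu_y, sigma, seed=0):
    """Draw n_sims pairs of normal samples and return the t statistics."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_sims):
        x = [rng.gauss(mu_x, sigma) for _ in range(nx)]
        y = [rng.gauss(mu_y, sigma) for _ in range(ny)]
        values.append(pooled_t(x, y, mu_x, mu_y))
    return values


# Arbitrary illustrative settings: different means, same sigma.
ts = simulate_t(20000, nx=5, ny=7, mu_x=10.0, mu_y=12.0, sigma=3.0)
m = 5 + 7 - 2
sample_mean = sum(ts) / len(ts)
sample_var = sum((t - sample_mean) ** 2 for t in ts) / (len(ts) - 1)
print(sample_mean, sample_var, m / (m - 2))
```

If the property holds, the simulated mean should be close to 0 and the simulated variance close to 10/8 = 1.25, up to sampling noise.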

7 thoughts on “Two Sample t-Test Proof”

  1. Dear Charles,
    Is there any way to do a t-test when the variance of one sample is unknown?
    I want to compare the mean of my sample (collected in 2020) with the mean of another sample (from the same population, i.e. country) collected twenty years earlier by a colleague who did not report the variance in their paper.

    I appreciate your advice.

    Regards,
    Ali

    • Ali,
      You need the variance to perform a two-sample t-test. You could perform a one-sample t-test against the mean of the sample whose variance you are missing. This would work provided that the mean of this sample (i.e. the sample whose variance you don’t have) is a good estimate of the population mean. Not great, but it may be all that you have.
      Charles

    • Thank you for catching this error. I have now corrected the mistake.
      I appreciate your helping to make the website more accurate and easy to follow.
      Charles

  2. Hi Charles,

    Can you guide me on interpreting the p-value and result when two different sets of hypotheses are stated for the same case?
    Way1: H0 = there is no change in awareness level of TG
    Ha = there is a significant change in awareness level of TG

    Way2: H0 = awareness level has increased among TG post campaign
    Ha = there is no change in awareness level

    Can you please suggest how two different null hypotheses for the same case can change the result and its interpretation?

    • Hello,
      It sounds like Way1 will require a two-tailed test (in Ha, awareness can increase or decrease), while Way2 requires a one-tailed test (you seem to be ruling out the case where awareness decreases).
      Charles
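The one-sample fallback suggested in the reply to Ali can be sketched as follows: treat the previously reported mean as a fixed reference value and test the new sample against it. The function name `one_sample_t` and all numbers here are hypothetical, chosen only to illustrate the calculation.

```python
import math


def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t-test of the sample mean against mu0."""
    n = len(sample)
    mean = sum(sample) / n
    # Unbiased sample variance (divide by n - 1).
    var = sum((v - mean) ** 2 for v in sample) / (n - 1)
    t = (mean - mu0) / math.sqrt(var / n)
    return t, n - 1


# Hypothetical data: a new sample and a mean reported without its variance.
sample_2020 = [4.0, 6.0, 8.0, 10.0, 12.0]
reported_mean = 6.0
t_one, df_one = one_sample_t(sample_2020, reported_mean)
print(t_one, df_one)
```

This treats the reported mean as if it were the population mean, so it ignores the sampling error in the older study, which is the limitation noted in the reply.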
