Noncentral t Distribution

Basic Concepts

The t distribution characterizes how the t-test statistic is distributed when the null hypothesis is assumed to be true. The noncentral t distribution instead shows how the t-test statistic is distributed when the alternative hypothesis is assumed to be true (i.e. when the null hypothesis is assumed to be false). As such, it is useful for calculating the statistical power and minimum sample size of t-tests.

Definition

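Definition 1: The noncentral t distribution, abbreviated as T(k,δ), has the following cumulative distribution function F(t), written as F_{k,δ}(t) when necessary, where k = the degrees of freedom and δ = the noncentrality parameter:

$$F(t) = \Phi(-\delta) + \frac{1}{2}\sum_{m=0}^{\infty}\left[p_m\, I_r\!\left(m+\tfrac{1}{2},\tfrac{k}{2}\right) + q_m\, I_r\!\left(m+1,\tfrac{k}{2}\right)\right]$$

when t ≥ 0, where Φ is the cumulative distribution function of the standard normal distribution, i.e.

Φ(z) = NORM.S.DIST(z, TRUE)

and Ir(a,b) is the cumulative distribution function of the beta distribution

Ir(a,b) = BETA.DIST(r,a,b, TRUE)

where

$$r = \frac{t^2}{k+t^2}, \qquad p_m = \frac{e^{-\delta^2/2}\,(\delta^2/2)^m}{m!}, \qquad q_m = \frac{\delta\, e^{-\delta^2/2}\,(\delta^2/2)^m}{\sqrt{2}\;\Gamma(m+3/2)}$$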

Algorithm for cdf

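Since the pdf of the Poisson distribution with mean δ²/2 can be expressed as

$$f(m) = \frac{e^{-\delta^2/2}\,(\delta^2/2)^m}{m!}$$

we note that the coefficients p_m and q_m in the series of Definition 1 satisfy

$$p_m = f(m), \qquad q_m = \frac{\delta}{\sqrt{2}}\cdot\frac{\Gamma(m+1)}{\Gamma(m+3/2)}\, f(m)$$

and so we can express the cdf of the noncentral t distribution as

$$F(t) = \Phi(-\delta) + \frac{1}{2}\sum_{m=0}^{\infty} f(m)\left[I_r\!\left(m+\tfrac{1}{2},\tfrac{k}{2}\right) + \frac{\delta}{\sqrt{2}}\cdot\frac{\Gamma(m+1)}{\Gamma(m+3/2)}\, I_r\!\left(m+1,\tfrac{k}{2}\right)\right]$$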

where Γ is the gamma function. Since Γ(m+1)/Γ(m+3/2) can be calculated by the formula =EXP(GAMMALN(m+1)-GAMMALN(m+3/2)), the cdf of the noncentral t distribution can be approximated in Excel by a finite sum of terms using POISSON.DIST, BETA.DIST, NORM.S.DIST, and GAMMALN.

The more terms used in the finite sum, the better the precision, although, after a certain point, overflow errors will be encountered. For this reason, the Real Statistics functions described below use a somewhat more sophisticated approach.

When t < 0, the noncentral t distribution is defined by reflection as

$$F_{k,\delta}(t) = 1 - F_{k,-\delta}(-t)$$
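The truncated-series approach described in this section can be sketched in Python using only the standard library. This is an illustration, not the Real Statistics implementation: it assumes the standard series Φ(−δ) + ½ Σ f(m)·[I_r(m+½, k/2) + (δ/√2)·Γ(m+1)/Γ(m+3/2)·I_r(m+1, k/2)] with r = t²/(k+t²) and f the Poisson pdf with mean δ²/2, together with the reflection 1 − F_{k,−δ}(−t) for t < 0. The function names, the fixed number of terms, and the continued-fraction routine for the beta cdf are assumptions made for the example.

```python
from math import exp, lgamma, log, log1p, sqrt
from statistics import NormalDist


def _beta_cf(a, b, x, max_iter=500, eps=1e-15):
    """Continued fraction used by beta_cdf (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        even = m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0))
        for num in (even, odd):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:
            break
    return h


def beta_cdf(x, a, b):
    """Regularized incomplete beta I_x(a, b), i.e. BETA.DIST(x, a, b, TRUE)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def nct_cdf(t, k, delta, terms=200):
    """cdf of T(k, delta) from the first `terms` terms of the series."""
    if t < 0.0:
        # Reflection for negative t
        return 1.0 - nct_cdf(-t, k, -delta, terms)
    lam = delta * delta / 2.0          # Poisson mean
    r = t * t / (k + t * t)
    total = 0.0
    for m in range(terms):
        if lam > 0.0:
            pois = exp(-lam + m * log(lam) - lgamma(m + 1.0))
        else:
            pois = 1.0 if m == 0 else 0.0
        # Gamma(m+1)/Gamma(m+3/2), as in EXP(GAMMALN(m+1)-GAMMALN(m+3/2))
        ratio = exp(lgamma(m + 1.0) - lgamma(m + 1.5))
        total += pois * (beta_cdf(r, m + 0.5, k / 2.0)
                         + delta / sqrt(2.0) * ratio * beta_cdf(r, m + 1.0, k / 2.0))
    return NormalDist().cdf(-delta) + 0.5 * total
```

As a check, nct_cdf(4.5, 10, 4) can be compared with the value NT_DIST(4.5,10,4,TRUE) = .60368 quoted later in the article. Working in logs via lgamma postpones, but does not remove, the underflow and overflow problems that a fixed-length sum runs into for large δ.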

Algorithm for pdf

The probability density function (pdf) of the noncentral t distribution can be calculated as follows:

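$$f(t) = \begin{cases} \dfrac{k}{t}\left[F_{k+2,\delta}\!\left(t\sqrt{1+\tfrac{2}{k}}\right) - F_{k,\delta}(t)\right] & \text{if } t \neq 0 \\[2ex] \dfrac{\Gamma\!\left(\frac{k+1}{2}\right)}{\sqrt{\pi k}\;\Gamma\!\left(\frac{k}{2}\right)}\, e^{-\delta^2/2} & \text{if } t = 0 \end{cases}$$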
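A minimal Python sketch of the pdf calculation follows. It assumes the standard two-case formula: f(t) = (k/t)·[F_{k+2,δ}(t·√(1+2/k)) − F_{k,δ}(t)] for t ≠ 0, and f(0) = Γ((k+1)/2)/(√(πk)·Γ(k/2))·exp(−δ²/2). The cdf is passed in as a callable, so any implementation can be plugged in. The names nct_pdf, cdf and zero_tol are hypothetical, and the default cutoff of 1e-8 for switching to the t = 0 branch is an assumption (the difference of two nearly equal cdf values divided by a tiny t loses precision). The closed-form central t cdf for 1 and 3 degrees of freedom is included only to demonstrate the call.

```python
from math import atan, exp, lgamma, pi, sqrt


def nct_pdf(t, k, delta, cdf, zero_tol=1e-8):
    """pdf of T(k, delta), computed from a supplied cdf(t, k, delta)."""
    if abs(t) < zero_tol:
        # t = 0 branch (also used very close to 0 to avoid cancellation)
        log_ratio = lgamma((k + 1) / 2.0) - lgamma(k / 2.0)
        return exp(log_ratio - delta * delta / 2.0) / sqrt(pi * k)
    upper = cdf(t * sqrt(1.0 + 2.0 / k), k + 2, delta)
    return (k / t) * (upper - cdf(t, k, delta))


def central_t_cdf(t, k, delta):
    """Stand-in cdf: closed forms of the central t cdf for k = 1 and k = 3."""
    if delta != 0 or k not in (1, 3):
        raise ValueError("demo cdf supports only delta = 0 and k in {1, 3}")
    if k == 1:
        return 0.5 + atan(t) / pi
    u = t / sqrt(3.0)
    return 0.5 + (atan(u) + u / (1.0 + u * u)) / pi


# With delta = 0 and k = 1 the result is the Cauchy density 1 / (pi * (1 + t^2))
cauchy_at_1 = nct_pdf(1.0, 1, 0.0, central_t_cdf)
cauchy_at_0 = nct_pdf(0.0, 1, 0.0, central_t_cdf)
```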

Characteristics

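The mean (for k > 1) and variance (for k > 2) of the distribution are

$$E[T] = \delta\,\sqrt{\frac{k}{2}}\;\frac{\Gamma\!\left(\frac{k-1}{2}\right)}{\Gamma\!\left(\frac{k}{2}\right)}$$

$$\mathrm{Var}[T] = \frac{k\,(1+\delta^2)}{k-2} - \frac{k\,\delta^2}{2}\left[\frac{\Gamma\!\left(\frac{k-1}{2}\right)}{\Gamma\!\left(\frac{k}{2}\right)}\right]^2$$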
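The moments can be evaluated directly with a log-gamma function. The sketch below assumes the standard formulas E[T] = δ·√(k/2)·Γ((k−1)/2)/Γ(k/2) for k > 1 and Var[T] = k(1+δ²)/(k−2) − E[T]² for k > 2; the function names are illustrative.

```python
from math import exp, lgamma, sqrt


def nct_mean(k, delta):
    """Mean of T(k, delta); defined only for k > 1."""
    if k <= 1:
        raise ValueError("the mean exists only for k > 1")
    gamma_ratio = exp(lgamma((k - 1) / 2.0) - lgamma(k / 2.0))
    return delta * sqrt(k / 2.0) * gamma_ratio


def nct_variance(k, delta):
    """Variance of T(k, delta); defined only for k > 2."""
    if k <= 2:
        raise ValueError("the variance exists only for k > 2")
    return k * (1.0 + delta * delta) / (k - 2.0) - nct_mean(k, delta) ** 2
```

With δ = 0 these reduce to the central t values: a mean of 0 and a variance of k/(k−2).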

The shape of the noncentral t distribution is similar to that of the central t distribution (i.e. the ordinary t distribution). The noncentrality parameter indicates how much the distribution is shifted to the right (when δ > 0) or to the left (when δ < 0). When δ = 0, the noncentral t distribution is identical to the central t distribution, and so T(k,0) = T(k).

Graphs

The chart in Figure 1 shows the graphs of the noncentral t distribution with 10 degrees of freedom for δ = 0, 2, 4, and 6.

Figure 1 – Noncentral t pdf by noncentrality parameter

The chart in Figure 2 shows the graphs of the noncentral t distribution with δ = 2 and the degrees of freedom = 1, 3, 5, 10.

Figure 2 – Noncentral t pdf by degrees of freedom

Worksheet Functions

Real Statistics Functions: The Real Statistics Resource Pack supplies the following worksheet functions:

NT_DIST(t, df, δ, cum, iter, prec). If cum = TRUE then the value of the cdf of the noncentral t distribution T(k,δ) at t is returned, where k = df, while if cum = FALSE then the value of the noncentral t pdf at t is returned.

NT_INV(p, df, δ, iter, iter0, prec) = the inverse of the cdf of the noncentral t distribution T(k,δ) at p, i.e. the value of t such that NT_DIST(t, df, δ, TRUE, iter, prec) = p.

NT_NCP(p, df, t, iter, iter0, prec) = the value of the noncentrality parameter δ such that the cdf of the noncentral t distribution T(k,δ) at t is p, i.e. NT_DIST(t, df, δ, TRUE, iter, prec) = p.

Here iter = the maximum number of terms used from the infinite sum (default 1000) and prec = the maximum acceptable error in the estimate of the infinite sum, unless the iteration limit is reached first (default 0.000000000001). iter0 = the number of iterations used in calculating NT_INV or NT_NCP by binary search (default 40).
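The binary search controlled by iter0 can be pictured with the following minimal Python sketch. It is not the Real Statistics code: the function name solve_by_bisection, the explicit bracket [lo, hi], and the stand-in cdf Φ(t − δ) (the large-df limit of the noncentral t cdf) are assumptions chosen to keep the example self-contained. The same routine covers both searches, because the cdf is increasing in t (the NT_INV case) and decreasing in δ (the NT_NCP case).

```python
from statistics import NormalDist


def solve_by_bisection(f, target, lo, hi, iterations=40):
    """Find x in [lo, hi] with f(x) = target, for a monotone function f."""
    f_lo, f_hi = f(lo), f(hi)
    if (f_lo - target) * (f_hi - target) > 0:
        raise ValueError("target is not bracketed by [lo, hi]")
    increasing = f_hi >= f_lo
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        # Keep the half of the interval that still contains the solution
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def large_df_cdf(t, delta):
    """Stand-in for the noncentral t cdf: its limit as df grows without bound."""
    return NormalDist().cdf(t - delta)


# Analogue of NT_INV: solve for t given p and delta
t_root = solve_by_bisection(lambda t: large_df_cdf(t, 2.0), 0.975, -50.0, 50.0)
# Analogue of NT_NCP: solve for delta given p and t
ncp_root = solve_by_bisection(lambda d: large_df_cdf(2.0, d), 0.5, -50.0, 50.0)
```

Each iteration halves the bracket, so 40 iterations shrink an initial interval of width 100 to roughly 1e-10.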

Note that NT_DIST(4.5,10,4,FALSE) = .25497 and NT_DIST(4.5,10,4,TRUE) = .60368, which is consistent with the values shown in the green curve of Figure 1.

References

Steiger, J. H. and Fouladi, R. T. (1997) Noncentrality interval estimation and the evaluation of statistical models
http://www.statpower.net/Steiger%20Biblio/Steiger&Fouladi97.PDF

Scholz, F. W. (2008) Applications of the Noncentral t-Distribution
https://faculty.washington.edu/fscholz/DATAFILES498B2008/NoncentralT.pdf

Krishnamoorthy, K. (2006) Handbook of statistical distributions with applications. Chapman and Hall
https://www.academia.edu/41846183/Handbook_of_Statistical_Distributions_with_Applications

Benton, D. and Krishnamoorthy, K. (2003) Computing discrete mixtures of continuous distributions: noncentral chisquare, noncentral t and the distribution of the square of the sample multiple correlation coefficient. Computational Statistics & Data Analysis 43, 249-267
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3142698

20 thoughts on “Noncentral t Distribution”

  1. Hello

    I was using the Noncentral t and F functions. Maybe you can explain something. When I review the relation between the F() and t() distribution functions with noncentrality (d) = 0, the probability t(x^0.5, v2, true) will equal (1 + F(x,1,v2,true))/2. For example, F(9,1,4,true) = 0.960058 and t(9^0.5, 4, true) = 0.980029, and (1 + F()) / 2 = 0.980029. So this holds for d = 0.

    However, in Abramowitz and Stegun’s book (Handbook of Mathematical Functions, Eq. 26.6.19), they show the relationship between the F and t distributions with a nonzero noncentrality parameter (d) to be given by:

    F(x,1,v2,true,d) = t(x^0.5, v2, true, d^0.5)

    All I can find in the references is that this relation “holds” for all values of d.

    I ran the calculations for various d {0…11} using your spreadsheet and found that for small values of d (e.g. d = 1), the relation does not hold, but as d increases, the values converge to the relationship.

    Nf_Dist(9,1,4,TRUE,1) = 0.899141 nt_dist(3,4,True,1) = 0.934475. Delta = 0.00218
    Nf_Dist(9,1,4,TRUE,3) = 0.772714 nt_dist(3,4,True,1.73205081) = 0.772991. Delta = 0.00028
    Etc.

    It appears that as “d” increases the values converge to the Abramowitz relation defined in equation 26.6.19.

    Maybe I did something wrong, but do you confirm this, and is there an explanation as to why this relation does not hold for small noncentrality parameters and “converges” at higher values?

    Thanks!

    Dan

  2. Hi Charles
    I want to use the NT_INV function to calculate the deviate t (also called the tolerance factor k) with both probability and confidence 95% for increasing df.
    The NT_INV(0.95;df;1.6445) provides me with the following results
    df             NT_INV         k95,95
    2              26.5460546     26.26
    3              8.377973059    7.655
    4              5.9566958      5.145
    5              5.076619865    4.202
    6              4.62913949     3.707
    7              4.359690547    3.399
    8              4.180055166    3.188
    9              4.051878325    3.031
    10             3.955872695    2.911
    1000           3.294991017
    10000          3.29023463
    100000         3.289759982
    1000000        3.289712527
    10000000       3.289707781
    100000000      3.289707307
    1000000000     3.289707259
    10000000000    3.289707254
    1E+11          3.289707254
    1E+12          3.289707254
    1E+13          3.289707254
    1E+14          3.289707254    1.6448
    The right-hand column shows the expected k95,95 tolerance factors as found in the handbooks.
    The NT_INV does not asymptotically go to 1.6448 for df->infinity as would be expected.
    Am I doing something wrong? Or is it a bug?

      • Dear Charles
        Indeed, it seems twice the expected value.
        My colleague tried it as well and came to the same results.
        I’ve tried to work with adapted df, but that made it even worse.
        Are you able to solve it? I can send you the VB6 algorithm I use in the Hyginist program (http://www.tsac.nl/hyginist.html)

        Maybe my layout was confusing
        N       NT_INV         literature
        2       26.5460546     26.26
        3       8.377973059    7.655
        4       5.9566958      5.145
        5       5.076619865    4.202
        inf.    3.289707254    1.6445

      • A colleague suggested using, in Excel, =NT_INV(NT_DIST;df;NORM.S.INV(p)*SQRT(N))/SQRT(N)
        so with p the desired probability, the value of the noncentrality parameter δ=NORM.S.INV(p)*SQRT(N) and NT_DIST the desired confidence.
        And then divide the resulting NT_INV() by SQRT(N).
        This gives a result which corresponds with the tables in the literature

        Please confirm this is the right approach
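The formula quoted in this comment has the shape of the usual one-sided normal tolerance factor, k = t'(confidence; N − 1, z_p·√N)/√N, where t' is a noncentral t quantile and z_p = NORM.S.INV(p). The Python sketch below illustrates that calculation under the assumption df = N − 1. It does not use the Real Statistics functions: the cdf is computed from the integral representation P(T ≤ t) = E[Φ(t·U/√k − δ)], with U the square root of a chi-square variable with k degrees of freedom, evaluated by Simpson's rule, and every function name here is hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def chi_pdf(u, k):
    """Density of U = sqrt(V), where V is chi-square with k degrees of freedom."""
    if u <= 0.0:
        return sqrt(2.0 / pi) if (k == 1 and u == 0.0) else 0.0
    return exp((k - 1) * log(u) - u * u / 2.0
               - (k / 2.0 - 1.0) * log(2.0) - lgamma(k / 2.0))


def nct_cdf_integral(t, k, delta, steps=2000):
    """cdf of T(k, delta) by Simpson's rule on E[Phi(t*U/sqrt(k) - delta)]."""
    upper = sqrt(k) + 10.0      # the chi density is negligible beyond this
    h = upper / steps           # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        u = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * _STD_NORMAL.cdf(t * u / sqrt(k) - delta) * chi_pdf(u, k)
    return total * h / 3.0


def nct_inv(p, k, delta, iterations=50):
    """Quantile of T(k, delta) by bracket expansion followed by bisection."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    lo, hi = -1.0, 1.0
    while nct_cdf_integral(lo, k, delta) > p:
        lo *= 2.0
    while nct_cdf_integral(hi, k, delta) < p:
        hi *= 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if nct_cdf_integral(mid, k, delta) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def tolerance_factor(n, coverage=0.95, confidence=0.95):
    """One-sided normal tolerance factor for sample size n (df = n - 1)."""
    delta = _STD_NORMAL.inv_cdf(coverage) * sqrt(n)
    return nct_inv(confidence, n - 1, delta) / sqrt(n)
```

For n = 10 with 95% coverage and 95% confidence, tolerance_factor(10) comes out close to the handbook value of 2.911 listed in the thread.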

  3. Hello Charles,

    I was looking at your “CI Functions for Effect Size d” article, and I cannot find the NT_NCP function (Noncentral t Distribution) on Excel (version 16.54). I need to calculate Confidence Intervals for Cohen’s d when only means and their standard deviations of two groups are given.

    What should I do about this?

  4. Hello Charles,

    I left a post on this topic already but may have done so on the wrong page and have simplified the question a bit. It seems more appropriate to post here:

    Some texts and websites state the non-central t can be used to estimate confidence intervals of percentiles. For instance, the left -3 sigma tail of a standard Normal distribution has area = 0.00135 or 0.135% and the following Matlab command illustrates what is supposed to work to find the left 95% confidence tail (2.5% cumulative area) for the 0.135% tail if sampled many times (10 million) with 10,000 samples each:

    n=10^4
    Z=-3.00
    C=-1.96
    [nctinv(1-normcdf(C,0,1), n-1, -1*sqrt(n)*Z) * 1 / sqrt(n), nctinv(normcdf(C, 0,1), n-1, -1*sqrt(n)*Z) * 1/sqrt(n)]

    NonCentrality Parameter = Delta / [Sigma/Sqrt(n)] = Z*Sqrt(n) = -3*Sqrt(n)
    StdErrorOfEstimateForMean = SEM = 1/Sqrt(n) and texts use this value.
    This produced (-3.0466, -2.9546) which is too narrow.

    And Hald (1952) shows a formula that works very well and matches Monte Carlo:
    Variance = 1/NORM.DIST(-3,0,1,FALSE)^2*NORM.DIST(-3,0,1,TRUE)*(1-NORM.DIST(-3,0,1,TRUE))/10000.
    (It also worked for the 95% limits of the 95% tails and the 95% limits for the Median)

    StdDev = Sqrt(Variance)
    Left 95% Tail Of 0.135% Percentile = -3 – 1.96 * StdDev = -3.1624 (MC showed -3.17)
    Right 95% Tail Of 0.135% Percentile = -3 + 1.96 * StdDev = -2.8376 (MC showed -2.84)

    Can you please show how to properly use the noncentral t to get similar results, especially using Excel for the non-central as you did for other problems?

    Thanks,
    Bruce

    • Hello Bruce,
      I responded to your earlier comment yesterday. I repeat my response as follows.
      Thank you for your kind words about the Real Statistics website.
      From your comment, I understand that the simulation that you are proposing is based on the non-central t distribution.
      The Real Statistics software supports the non-central t distribution. In particular, it provides the inverse function NT_INV(p, df, ncp).
      For any given value of df and ncp, you can obtain a random value from the stated non-central t distribution by using the formula
      =NT_INV(RAND(), df, ncp)
      You can then generate as many random values as you like and estimate the desired parameter(s) and get confidence intervals. I don’t know a priori how big a sample is required to achieve the accuracy that you are looking for, but the good news about Monte Carlo simulations is that you can estimate the confidence interval and with a little experimentation you should be able to make an educated guess to the sample size required.
      I don’t know of a closed-form solution, but perhaps the Real Statistics function =NT_DIST(x,df,ncp,cum) could be useful.
      Charles

  5. Charles,

    What does the “t” stand for in NT_DIST(t, df, sigma, cum, m)?

    Is it the Tstat derived from the problem? # of tails? You never specify.

    Also, if it’s the Tstat derived from the problem, from what I can tell this ought to be the same as the NCP, which most of the time will give an answer of approximately 0.5.

    Thank you

    • Or rather, if you use the same value for t and sigma in the NT_DIST function then it will usually give an answer that’s approximately 0.5.

    • Okay, sorry if I’m overcrowding things here, but upon further inspection it looks like if you use the 2-tailed Tcrit for the NT_DIST function this gives answers that are almost identical to the statistical power option in the RealStats plug in.

      Am I correct here?

      Thanks.

    • Jonathan,
      The formula is NT_DIST(t, df, ncp, cum, m), where ncp is the noncentrality parameter. The t is the same as the t in T.DIST(t,df,cum). In fact, when ncp = 0 then NT_DIST(t,df,0,cum) = T.DIST(t,df,cum).
      Charles

  6. Hi, Charles

    I have tried to use the function NT_INV for the purpose of calculating one-side tolerance intervals, with the parameters p=0.95, df=50, delta=11.63, m=170. It did not provide a solution. Actually it did not work with df larger than 50 and delta larger than 11. Please help.

    Sam

    • Sam,

      Some observations:

      1. For some values of p, df and delta the value for m must be less than 170. E.g. for your example, if you change m = 170 to m = 168, the function will generate the correct answer. I need to improve the function to avoid this problem.

      2. There are solutions for values of df larger than 50 or delta larger than 11. E.g. NT_INV(.95,60,12,165) = 14.79959488, NT_INV(.95,100,12,150) = 14.3688301.

      3. But the function doesn’t seem to be able to find all such solutions. E.g. NT_INV(.95,60,20,m) does not find the right value, which I believe is about 24. I need to fix this.

      Thanks for identifying this problem.

      Charles

  7. Hi,

    I have the following doubt about this distribution:
    How small must t be before the pdf should be calculated as if t = 0? For example, if we have t=1E-10 and use the algorithm for x not zero, we can introduce a distortion in the graph that is noticeable in certain cases. We can say that the second algorithm must be used not only for t=0 but 'in the vicinity of 0'.
    Have you encountered this problem, and do you have any idea of how to define its limits?

    António Teixeira

    • António,

      Excellent point. I have checked the pdf values for t = E-5, E-6, E-7, E-8, E-9, E-10, 0 with df = 1 to 20 and ncp = 4, 3, 2, 1, .5, .1, .01, all carried out to 8 decimal places.

      For ncp = 3, the value at t = 0 is always the same as the value at E-9. For 7 values of df the pdf value at t = E-10 is higher than that at t = E-9 (this theoretically shouldn’t happen); the difference is at most .00000003. For 5 values of df the pdf value at t = E-10 is lower than that at t = E-9; the difference is at most .00000002.

      For the other values of ncp, usually the pdf value at t = 0 is equal to that of t = E-8, E-9 or somewhere in between, although occasionally at E-7. There seems to be more distortion at E-10 where fairly often the pdf value at E-10 is higher than at E-9, although sometimes this starts to happen (although not for all values of df) at E-9 or E-8.

      Based on this analysis, I would say that for ncp >= 1 the second value of the pdf could be used for t < E-8. For ncp < 1 perhaps this should be for t < E-7.

      Charles
