Analysis of Skewness and Kurtosis

Guidelines

Since the skewness and kurtosis of the normal distribution are zero, both statistics should be close to zero for data that follow a normal distribution. Here kurtosis means excess kurtosis (kurtosis minus 3), which is what Excel's KURT function returns.

  • A rough measure of the standard error of the skewness is \sqrt{6/n} where n is the sample size.
  • A rough measure of the standard error of the kurtosis is \sqrt{24/n} where n is the sample size.

If the absolute value of the skewness for the data is more than twice the standard error, this indicates that the data are not symmetric, and therefore not normal. Similarly, if the absolute value of the kurtosis for the data is more than twice the standard error, this is also an indication that the data are not normal.
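
The factor of two in this rule of thumb is roughly the two-tailed critical value of the standard normal distribution at alpha = .05, which is 1.96 (NORMSINV(1-.05/2) in Excel). A minimal Python sketch of that calculation, using only the standard library; the variable names are illustrative and not part of the original worksheet:

```python
from statistics import NormalDist

alpha = 0.05  # significance level assumed by the rule of thumb

# Two-tailed critical value of the standard normal distribution,
# the Python counterpart of Excel's NORMSINV(1 - alpha/2).
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

print(round(z_crit, 2))  # 1.96
```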

Example

Example 1: Use the above guidelines to gain more evidence as to whether the data in Example 1 of Graphical Tests for Normality and Symmetry are normally distributed.

As we can see from Graphical Tests for Normality and Symmetry, the skewness is SKEW(A4:A23) = .23 (cell D13) with standard error SQRT(6/COUNT(A4:A23)) = .55 (cell D16). Since .23 < 2*.55 = 1.10, the skewness is acceptable for a normal distribution. Also, the kurtosis is KURT(A4:A23) = -1.53 (cell D14) with a standard error of SQRT(24/COUNT(A4:A23)) = 1.10 (cell D17). Since |-1.53| = 1.53 < 2*1.10 = 2.20, the kurtosis is also acceptable for a normal distribution.
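
The check in Example 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `rule_of_thumb` is hypothetical, and the inputs are the rounded values quoted above (n = 20, skewness .23, kurtosis -1.53) rather than the raw worksheet data.

```python
import math

def rule_of_thumb(n, skew, kurt):
    """Compare |skew| and |kurt| with twice their rough standard errors."""
    se_skew = math.sqrt(6 / n)   # rough standard error of the skewness
    se_kurt = math.sqrt(24 / n)  # rough standard error of the (excess) kurtosis
    return {
        "se_skew": se_skew,
        "se_kurt": se_kurt,
        "skew_ok": abs(skew) <= 2 * se_skew,
        "kurt_ok": abs(kurt) <= 2 * se_kurt,
    }

# Hypothetical call using the rounded figures from Example 1.
result = rule_of_thumb(n=20, skew=0.23, kurt=-1.53)
print(round(result["se_skew"], 2), result["skew_ok"])
print(round(result["se_kurt"], 2), result["kurt_ok"])
```

Both flags come out true for these inputs, in line with the conclusion above that neither statistic rules out normality.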

Jarque-Bera Test

Related to the above approach is the Jarque-Bera (JB) test for normality, which tests the null hypothesis that data from a sample of size n with skewness skew and kurtosis kurt come from a normally distributed population. This test is based on the following property, which holds when the null hypothesis is true.

JB = n \left( \frac{skew^2}{6} + \frac{kurt^2}{24} \right) \sim \chi^2(2)

For Example 1

JB = 20 \left( \frac{.23^2}{6} + \frac{(-1.53)^2}{24} \right) = 2.13

based on using the Excel worksheet functions SKEW and KURT to calculate the sample skewness and kurtosis values. Since CHISQ.DIST.RT(2.13, 2) = .345 > .05, we conclude there isn’t sufficient evidence to rule out the data coming from a normal population.
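
The same calculation can be sketched in Python. This is an illustration rather than the worksheet's own method: the function name `jb_from_moments` is hypothetical, the inputs are the rounded Example 1 values, and the p-value uses the fact that a chi-square distribution with 2 degrees of freedom has right-tail probability exp(-x/2), which is what CHISQ.DIST.RT(x, 2) evaluates.

```python
import math

def jb_from_moments(n, skew, kurt):
    """JB statistic and p-value from sample size, skewness and excess kurtosis."""
    jb = n * (skew ** 2 / 6 + kurt ** 2 / 24)
    # Right tail of the chi-square distribution with 2 degrees of freedom.
    p_value = math.exp(-jb / 2)
    return jb, p_value

# Rounded values quoted for Example 1.
jb, p_value = jb_from_moments(n=20, skew=0.23, kurt=-1.53)
print(round(jb, 2), round(p_value, 3))  # 2.13 0.345
```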

The JB test can also be performed using the population values of skewness and kurtosis, as calculated by the SKEWP (or SKEW.P) and KURTP functions (instead of SKEW and KURT).

JB = n \left( \frac{SKEWP^2}{6} + \frac{KURTP^2}{24} \right) = 1.93

Since CHISQ.DIST.RT(1.93, 2) = .382 > .05, once again we conclude there isn’t sufficient evidence to rule out the data coming from a normal population.

Worksheet Functions

Real Statistics Functions: The Real Statistics Resource Pack supplies the following functions.

JARQUE(R1, pop) = the Jarque-Bera test statistic JB for the data in the range R1

JBTEST(R1, pop) = p-value of the Jarque-Bera test on the data in R1

If pop = TRUE (default), the population version of the test is used; otherwise the sample version of the test is used. Any empty cells or cells containing non-numeric data are ignored.

For Example 1, we see that JARQUE(A4:A23) = 1.93 and JBTEST(A4:A23) = .382. Similarly, JARQUE(A4:A23, FALSE) = 2.13 and JBTEST(A4:A23, FALSE) = .345.
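
As a cross-check outside Excel, both versions of the test can be sketched in Python. This is an illustrative reimplementation, not the Real Statistics code: the names `skew_kurt`, `jarque` and `jbtest` are hypothetical, the sample formulas are the bias-adjusted ones used by Excel's SKEW and KURT (written here in terms of the population values), and the data are the 25 values posted by a reader in the comments below, since the Example 1 data are not reproduced on this page.

```python
import math

def skew_kurt(data, pop=True):
    """Return (skewness, excess kurtosis); population or sample version."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    g1 = m3 / m2 ** 1.5        # population skewness
    g2 = m4 / m2 ** 2 - 3      # population excess kurtosis
    if pop:
        return g1, g2
    # Sample versions, algebraically equivalent to Excel's SKEW and KURT.
    skew = g1 * math.sqrt(n * (n - 1)) / (n - 2)
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
    return skew, kurt

def jarque(data, pop=True):
    """JB statistic: n * (skew^2 / 6 + kurt^2 / 24)."""
    skew, kurt = skew_kurt(data, pop)
    return len(data) * (skew ** 2 / 6 + kurt ** 2 / 24)

def jbtest(data, pop=True):
    """p-value of the JB test (chi-square right tail, 2 degrees of freedom)."""
    return math.exp(-jarque(data, pop) / 2)

# Series quoted in the reader comments on this page.
data = [
    26.83946269, 26.95131935, 8.371060164, 10.40495872, 18.38858378,
    20.12905135, 24.2843167, 1.76670796, 20.19191695, 41.06557085,
    16.09877032, 13.34390071, 0.426210193, 28.31166689, 11.89051087,
    109.3641761, 25.50859431, 61.26802436, 32.5178008, 66.58119511,
    41.27546773, 14.67351611, 2.048435245, 28.01590722, 44.93746991,
]

print(jarque(data), jbtest(data))                # population version
print(jarque(data, False), jbtest(data, False))  # sample version
```

For this series the sketch gives a population JB of about 26.69 and a sample JB of about 38.28, the two values reported in that comment thread.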

D’Agostino-Pearson Test

The D’Agostino-Pearson test of normality is also based on testing the skewness and kurtosis. This test is more accurate and so is more commonly used than the JB test. See D’Agostino-Pearson Test for details.

Reference

Wikipedia (2012) Jarque-Bera test
https://en.wikipedia.org/wiki/Jarque%E2%80%93Bera_test

39 thoughts on “Analysis of Skewness and Kurtosis”

  1. In every statistics book, JB is calculated with ((C-3)^2)/24, not just (C^2)/24.
    The same is true for the JB test statistic on the wikipedia page you refer to.

  2. dear charles,
    could you tell me a reference for the calculation of the standard error of the skewness and of the kurtosis?
    kind regards,
    Adrian

  3. Hi, could someone tell me what the ‘absolute’ skew and kurtosis values are in terms of SPSS output please? I understand you get the z-scores from doing skew/skew.error and the same with kurtosis, but I do not know what the ‘absolute’ values are, and I am trying to follow guidance from Kim (2013), where it says to use absolute skew and kurtosis values with a cut-off of >2 and >7 respectively for normality.

    Thank you!

    Reference:
    Kim, H. Y. (2013). Statistical notes for clinical researchers: assessing normal distribution (2) using skewness and kurtosis. Restorative dentistry & endodontics, 38(1), 52.

  4. Hello!
    Are skewness and kurtosis dependent? How to interpret the results when skewness says it is a normal distribution but kurtosis says the opposite?

    • Tania,
      There are distributions where the skewness is near zero but the kurtosis is significantly different from zero, and there are other distributions where the kurtosis is near zero but the skewness is significantly different from zero. For data that come from a normally distributed population, neither the skewness nor the kurtosis should be significantly different from zero.
      Charles

  5. hello and thank you!
    I only have one more, general, question:
    are there rough measures of the standard errors of mean value and std. deviation too? (i mean is there a proportionality formula like: sqrt(c/n)? )

  6. What does it indicate if the skewness and kurtosis value is given as …. and no value… can that be ignored or what can be done

  7. I have carried out a study and tested my data for normality, and I discovered that one of the variables has a missing value for the kurtosis. What is the problem, or how should this missing kurtosis value be interpreted?

    • How big is the data set for this variable? If the one missing value is missing at random, then you should be able to ignore this issue and simply test for kurtosis without the missing value.
      Charles

  8. Thanks for this, Charles it is really useful.

    One quick thing I don’t really understand, sorry if it’s too basic. To reject the null hypothesis (Ho) we look at the value of the chi square distribution with two degrees of freedom of the JB statistic i.e. CHISQ.DIST.RT(JB statistic, 2). If this figure is bigger than the significance level then we can’t reject Ho. In the example for a 5% significance level (or 95% confidence interval) we can’t reject the hypothesis that the data follow a normal distribution, as CHISQ.DIST.RT(1.93, 2) = .382 > .05.

    If rather than using a 5% significance level we use a 95% we will reject Ho. Does this make sense? With a lower confidence interval we reject Ho?

    Thanks in advance for your help

    • Hello Antonio,
      If CHISQ.DIST.RT(1.93, 2) = .382 > .05, then don’t reject the null hypothesis that the data comes from a normally distributed population. This means that we don’t have sufficient evidence to conclude that the data are not normally distributed.
      We never use an alpha value bigger than or equal to 50%, and so 95% is not used (except that a confidence level of 95% is the same as a significance level of 1-.95 = .05).
      Charles

  9. I think there is something wrong with this formula
    for example for this series
    26.83946269
    26.95131935
    8.371060164
    10.40495872
    18.38858378
    20.12905135
    24.2843167
    1.76670796
    20.19191695
    41.06557085
    16.09877032
    13.34390071
    0.426210193
    28.31166689
    11.89051087
    109.3641761
    25.50859431
    61.26802436
    32.5178008
    66.58119511
    41.27546773
    14.67351611
    2.048435245
    28.01590722
    44.93746991

    the JARQUE(R1)=38.28239095
    but if we use an array formula like this:
    =COUNT(A2:A26)*(((((SUM((A2:A26-AVERAGE(A2:A26))^3)/COUNT(A2:A26))/((SUM((A2:A26-AVERAGE(A2:A26))^2)/COUNT(A2:A26))^1.5))^2)/6)+((((SUM((A2:A26-AVERAGE(A2:A26))^4)/COUNT(A2:A26))/((SUM((A2:A26-AVERAGE(A2:A26))^2)/COUNT(A2:A26))^2)-3))^2)/24)
    + CTRL + SHIFT + ENTER
    the answer will be: 26.69155055
    Also, I am completely sure that this formula gives the Jarque–Bera test statistic.

      • Then there is something wrong (a bug) in the Excel formula, since I calculated the SKEW, KURT and JB with “EViews 9.5” and my array formula turned out to give the correct answer!

          • EViews 9.5:
            SKEW= 1.769081
            KURT= 3.620125
            JB= 26.69155

            Excel regular formula:
            =SKEW(A2:A26) = 1.884063081
            =SKEW.P(A2:A26) =1.769080723
            =KURT(A2:A26) = 4.748928357
            Note: there is no KURT.P!!!

            Excel array formula:
            for SKEW
            =((SUM((A2:A26-AVERAGE(A2:A26))^3)/COUNT(A2:A26))/((SUM((A2:A26-AVERAGE(A2:A26))^2)/COUNT(A2:A26))^1.5))
            + CTRL + SHIFT + ENTER
            =1.769080723

            for KURT
            =((SUM((A2:A26-AVERAGE(A2:A26))^4)/COUNT(A2:A26))/((SUM((A2:A26-AVERAGE(A2:A26))^2)/COUNT(A2:A26))^2)-3)
            + CTRL + SHIFT + ENTER
            =3.620124598

          • Soharb,
            Thanks for sending me this information. It looks like if we use the population values of skewness and kurtosis then we get the result that you have seen from EViews.
            In particular, the Real Statistics Resource Pack has functions SKEWP and KURTP. If these functions are used then the formula =COUNT(A2:A26)*(SKEWP(A2:A26)^2/6+KURTP(A2:A26)^2/24) yields the result 26.69155.
            Thanks for bringing this up. I will revise the JARQUE and JBTEST functions in the next release of the software.
            Charles

  10. Hi and congrats for the great initiative.

    When you refer to kurtosis, do you mean the excess kurtosis (i.e. kurt-3) or the outright kurtosis? For example, when I perform the “D’Agostino-Pearson Test” as described in the relevant section (i.e. using outright kurtosis) I get results suggesting rejection of the null hypothesis, even if I use Kurt=3, Skew=0, which are the standard values for the normal distribution.

    Thank you.

  11. Thank you very much for this information. I have gained a lot from it. It would be appreciated if you could please attend to the question of zohreh of february 28, 2016 @ 9.31pm. I would also like the name of the person for reference. Thank you.
    david

    • David,

      As I wrote in response to that comment

      “We often use alpha = .05 as the significance level for statistical tests. The critical value for a two-tailed test of the normal distribution with alpha = .05 is NORMSINV(1-.05/2) = 1.96, which is approximately 2 standard deviations (i.e. standard errors) from the mean. This is the source of the rule of thumb that you are referring to.

      The Jarque-Bera and D’Agostino-Pearson tests for normality are more rigorous versions of this rule of thumb.”

      Thus, it is difficult to attribute this rule of thumb to one person, since this goes back to the beginning of statistics, or at least the use of the value 1.96. You will find this value of 1.96 in any elementary book on statistics.

      Charles

  12. Thank you very much! The Real Statistics Functions are really of great help.
    However, I came across a problem: JBTEST, as well as DPTEST, doesn’t allow ranges expressed in array form. For example, the expression =jbtest(IF(INDIRECT(“G”&6):INDIRECT(“G”&10)0,INDIRECT(“AE”&6):INDIRECT(“AE”&10))) is not recognized by Excel and the result is #VALUE!. By comparing it with another expression, =jbtest(INDIRECT(“AE”&6):INDIRECT(“AE”&10)), in Evaluate Formula, I found that JBTEST can only read data in the form “Am:Bn”, not data expressed as a set of values like “0.1, 0.2, …”. Is there any solution to this? I have to deal with ranges in which certain values should not be included in the test.
    Thank you again!

    • Denny,
      The current implementation of these functions supports only arrays which are ranges. I have just changed this so that they should support any arrays. I will include these changes in the next release of the software. I hope to issue this release in the next few days.
      Charles

  13. Salaam
    May you please cite the reference for “If the absolute value of the skewness for the data is more than twice the standard error this indicates that the data are not symmetric, and therefore not normal”. I need it. Thanks.

    • We often use alpha = .05 as the significance level for statistical tests. The critical value for a two-tailed test of the normal distribution with alpha = .05 is NORMSINV(1-.05/2) = 1.96, which is approximately 2 standard deviations (i.e. standard errors) from the mean. This is the source of the rule of thumb that you are referring to.

      The Jarque-Bera and D’Agostino-Pearson tests for normality are more rigorous versions of this rule of thumb.

      Charles

      • Thanks for replying. I’ve heard that one way to check normality is to divide the skewness by its standard error; if the result falls within the range +-1.96, then normality is satisfied. Using this formula, my data proved to be not normal. I then used the other formula to which you referred, “If the absolute value of the skewness for the data is more than twice the standard error this indicates that the data are not symmetric, and therefore not normal”, and my data turned out to be normal. As I want to use the latter procedure in my study, I need to cite the name of the person whose opinion I will use. By reference I meant based on whose opinion “If the absolute value of the skewness for the data is … Will you please provide the name of the person?
        Many thanks…

