Symmetry, Skewness and Kurtosis

We consider a random variable x and a data set S = {x1, x2, …, xn} of size n which contains possible values of x. The data set can represent either the population being studied or a sample drawn from the population.

Looking at S as representing a distribution, the skewness of S is a measure of symmetry while kurtosis is a measure of the relative size of the tails of this distribution.

Symmetry and Skewness

Definition 1: Skewness is a measure of symmetry. If the distribution represented by S is perfectly symmetric then the skewness of S is zero (the converse need not hold: an asymmetric distribution can also have zero skewness). If the skewness is negative, then the distribution is skewed to the left, while if the skewness is positive then the distribution is skewed to the right (see Figure 2 below).

Excel calculates the skewness of a sample S using the formula:

$$\text{skewness} = \frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s}\right)^3$$

where x̄ is the mean and s is the standard deviation of S. To avoid division by zero, this formula requires that n > 2.
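The calculation can be sketched in Python as a cross-check on a worksheet result. This is a minimal illustration that assumes the formula documented for Excel's SKEW function (the sum of cubed standardized deviations, scaled by n/((n-1)(n-2))); the function name sample_skewness is hypothetical and is not part of Excel or the Real Statistics Resource Pack.

```python
import math

def sample_skewness(data):
    """Sample skewness, following the formula documented for Excel's SKEW."""
    n = len(data)
    if n < 3:
        raise ValueError("sample skewness requires n > 2")
    mean = sum(data) / n
    # Sample standard deviation (divides by n - 1)
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    if s == 0:
        raise ValueError("standard deviation is zero")
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in data)

S = [2, 5, -1, 3, 4, 5, 0, 2]   # the data set used in Example 1
print(round(sample_skewness(S), 2))  # -0.43
```

The explicit checks mirror the requirement that n > 2 and guard against a data set whose values are all equal.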

When a distribution is symmetric, the mean = the median. When the distribution is positively skewed, typically the mean > the median, and when the distribution is negatively skewed, typically the mean < the median, although these inequalities are rules of thumb rather than guarantees.
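A quick way to explore this relationship is to compute the mean and the median side by side. The sketch below uses a hypothetical right-skewed data set (not taken from the article) together with the data set S of Example 1; the helper name mean_minus_median is an assumption made for illustration.

```python
import statistics

def mean_minus_median(data):
    """Return mean(data) - median(data); a positive value suggests a right skew."""
    return statistics.mean(data) - statistics.median(data)

right_skewed = [1, 2, 2, 3, 3, 3, 4, 10]   # hypothetical data with one large value
example_1 = [2, 5, -1, 3, 4, 5, 0, 2]      # data set S from Example 1

print(mean_minus_median(right_skewed))  # 0.5
print(mean_minus_median(example_1))     # 0.0
```

For the Example 1 data the mean and the median are both 2.5 even though the sample skewness is negative, so the comparison is best read as a rough indicator rather than a test.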

Worksheet Functions

Excel Functions: Excel provides the SKEW function as a way to calculate the skewness of S, i.e. if R1 is a range in Excel containing the data elements in S then SKEW(R1) = the skewness of S.

There is also a population version of the skewness given by the formula

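$$\text{population skewness} = \frac{1}{n} \sum_{i=1}^{n} \left(\frac{x_i - \mu}{\sigma}\right)^3$$

where μ is the mean and σ is the population standard deviation of S.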

Excel users starting with Excel 2013 can employ the function SKEW.P to obtain the population skewness.

Users of Excel before Excel 2013 can use the formula

=SKEW(R1)*(n-2)/SQRT(n*(n-1))

instead of SKEW.P(R1) where R1 contains the data in S = {x1, …, xn} and n = COUNT(R1).
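The conversion factor can be checked numerically. The following sketch assumes that the population skewness is the average cubed deviation divided by the cube of the population standard deviation, and that SKEW follows its documented sample formula; both function names are hypothetical.

```python
import math

def sample_skewness(data):
    """Sample skewness, following the formula documented for Excel's SKEW."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in data)

def population_skewness(data):
    """Population skewness: mean cubed deviation over sigma cubed."""
    n = len(data)
    mean = sum(data) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    return sum(((x - mean) / sigma) ** 3 for x in data) / n

S = [2, 5, -1, 3, 4, 5, 0, 2]
n = len(S)
# Same adjustment as the worksheet formula =SKEW(R1)*(n-2)/SQRT(n*(n-1))
converted = sample_skewness(S) * (n - 2) / math.sqrt(n * (n - 1))

print(round(population_skewness(S), 2))  # -0.34
print(round(converted, 2))               # -0.34
```

Both routes give the same value because the two formulas differ only in the factor applied to the sum of cubed deviations.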

Real Statistics Function: Alternatively, you can calculate the population skewness using the SKEWP(R1) function, which is contained in the Real Statistics Resource Pack.

Example

Example 1: Suppose S = {2, 5, -1, 3, 4, 5, 0, 2}. The skewness of S = -0.43, i.e. SKEW(R1) = -0.43 where R1 contains the data in S. Since this value is negative, the curve representing the distribution is skewed to the left (i.e. the fatter part of the curve is on the right). Also SKEW.P(R1) = -0.34. See Figure 1.

Figure 1 – Examples of skewness and kurtosis

Observation: SKEW(R1) and SKEW.P(R1) ignore any empty cells or cells with non-numeric values.
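When reproducing these calculations outside Excel, the same filtering has to be done explicitly. A minimal sketch, assuming the cell contents arrive as a Python list in which empty cells are None or empty strings; the helper name numeric_only is hypothetical, and treating booleans as non-numeric is an assumption of this sketch.

```python
def numeric_only(cells):
    """Keep only numeric entries, dropping empty cells, text and booleans."""
    return [c for c in cells
            if isinstance(c, (int, float)) and not isinstance(c, bool)]

cells = [2, None, "n/a", 5, "", True, 0]
print(numeric_only(cells))  # [2, 5, 0]
```

Note that a zero is a numeric value and is kept; only empty or non-numeric entries are dropped.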

Kurtosis

Definition 2: Kurtosis provides a measurement of the extremities (i.e. tails) of the distribution of data, and therefore indicates the presence of outliers.

Excel calculates the kurtosis of a sample S as follows:

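$$\text{kurtosis} = \frac{n(n+1)}{(n-1)(n-2)(n-3)} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s}\right)^4 - \frac{3(n-1)^2}{(n-2)(n-3)}$$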

where x̄ is the mean and s is the standard deviation of S. To avoid division by zero, this formula requires that n > 3.
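As with skewness, the sample kurtosis can be reproduced in a few lines of Python. This sketch assumes the formula documented for Excel's KURT function (a scaled sum of fourth powers of the standardized deviations, minus a correction term); the function name sample_kurtosis is hypothetical.

```python
def sample_kurtosis(data):
    """Sample (excess) kurtosis, following the formula documented for Excel's KURT."""
    n = len(data)
    if n < 4:
        raise ValueError("sample kurtosis requires n > 3")
    mean = sum(data) / n
    s2 = sum((x - mean) ** 2 for x in data) / (n - 1)   # sample variance
    if s2 == 0:
        raise ValueError("standard deviation is zero")
    fourth = sum((x - mean) ** 4 for x in data) / s2 ** 2
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction

S = [2, 5, -1, 3, 4, 5, 0, 2]
print(round(sample_kurtosis(S), 2))  # -0.94
```

The correction term is what makes the result an excess kurtosis, i.e. a value near zero for normally distributed data.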

Observation: It is commonly thought that kurtosis provides a measure of peakedness (or flatness), but this is not true. Kurtosis pertains to the extremities and not to the center of a distribution.

Worksheet Functions

Excel Function: Excel provides the KURT function as a way to calculate the kurtosis of a sample S, i.e. if R1 contains the data elements in S then KURT(R1) = the kurtosis of S.

Observation: The population kurtosis is calculated via the formula

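$$\text{population kurtosis} = \frac{1}{n} \sum_{i=1}^{n} \left(\frac{x_i - \mu}{\sigma}\right)^4 - 3$$

where μ is the mean and σ is the population standard deviation of S.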

You can obtain the population kurtosis by using the Excel formula

=(KURT(R1)*(n-2)*(n-3)/(n-1)-6)/(n+1)

Real Statistics Function: Excel does not provide a population kurtosis function, but you can use the following Real Statistics function for this purpose:

KURTP(R1, excess) = kurtosis of the distribution for the population in R1. If excess = TRUE (default) then 3 is subtracted from the result (the usual approach so that a normal distribution has a kurtosis of zero).

KURT(R1) and KURTP(R1) ignore any empty cells or cells with non-numeric values.
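The relationship between the sample and population versions can be verified numerically. The sketch below assumes the population kurtosis is the average fourth power of the standardized deviations (using the population standard deviation), with 3 subtracted when an excess value is requested; the excess argument only mimics the description of KURTP above, and both function names are hypothetical.

```python
def sample_kurtosis(data):
    """Sample (excess) kurtosis, following the formula documented for Excel's KURT."""
    n = len(data)
    mean = sum(data) / n
    s2 = sum((x - mean) ** 2 for x in data) / (n - 1)
    fourth = sum((x - mean) ** 4 for x in data) / s2 ** 2
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

def population_kurtosis(data, excess=True):
    """Population kurtosis; subtracts 3 when excess is True."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n       # population variance
    m4 = sum((x - mean) ** 4 for x in data) / n        # fourth central moment
    k = m4 / var ** 2
    return k - 3 if excess else k

S = [2, 5, -1, 3, 4, 5, 0, 2]
n = len(S)
# Same adjustment as the worksheet formula =(KURT(R1)*(n-2)*(n-3)/(n-1)-6)/(n+1)
converted = (sample_kurtosis(S) * (n - 2) * (n - 3) / (n - 1) - 6) / (n + 1)

print(round(population_kurtosis(S), 3))  # -1.114
print(round(converted, 3))               # -1.114
```

Calling population_kurtosis(S, excess=False) returns the value before 3 is subtracted, which is about 1.886 for this data set.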

Example 2: Suppose S = {2, 5, -1, 3, 4, 5, 0, 2}. The kurtosis of S = -0.94, i.e. KURT(R1) = -0.94 where R1 contains the data in sample S. If S is instead a population, then the kurtosis is KURTP(R1) = -1.114. See Figure 1.

Graphical Illustration

We now look at an example of these concepts using the chi-square distribution.

Figure 2 – Example of skewness and kurtosis

Figure 2 contains the graphs of two chi-square distributions (with different degrees of freedom df). We study the chi-square distribution elsewhere, but for now, note the following values for the kurtosis and skewness:

Figure 3 – Comparison of skewness and kurtosis

Both curves are asymmetric and skewed to the right (i.e. the fat part of the curve is on the left). This is consistent with the fact that the skewness for both curves is positive. However, the blue curve (df = 5) is more skewed to the right, which is consistent with the fact that the skewness of the blue curve is larger.
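The theoretical values for a chi-square distribution follow from standard closed-form expressions: the skewness is sqrt(8/df) and the excess kurtosis is 12/df. The sketch below evaluates them for df = 5 and, purely as an assumed comparison value, df = 10; the function names are hypothetical.

```python
import math

def chi_square_skewness(df):
    """Theoretical skewness of a chi-square distribution with df degrees of freedom."""
    return math.sqrt(8 / df)

def chi_square_excess_kurtosis(df):
    """Theoretical excess kurtosis of a chi-square distribution."""
    return 12 / df

for df in (5, 10):   # df = 10 is an assumed value chosen for comparison
    print(df, round(chi_square_skewness(df), 3), round(chi_square_excess_kurtosis(df), 3))
# 5 1.265 2.4
# 10 0.894 1.2
```

Both quantities are positive and shrink as df grows, so the curve with fewer degrees of freedom is the more strongly right-skewed one.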

Examples Workbook

Click here to download the Excel workbook with the examples described on this webpage.

69 thoughts on “Symmetry, Skewness and Kurtosis”

    • Nasreen,
      It depends on what you mean by grouped data. The Real Statistics Resource Pack provides various approaches for doing this, but again it depends on what you mean by grouped data.
      Charles

  1. Do skewness and kurtosis have a statistical table? Please, I want to learn more about how they are applied and calculated. Thanks.

  2. Can you further explain what you mean by the extremities (i.e. tails) of the distribution of data, and how they provide an indication of the presence of outliers?

    My reading suggested that kurtosis is about the peakedness of the data.

    • Hello Shazia,
      1. The extremities are simply the highest and lowest data values.
      In many distributions (e.g. the normal distribution) there is no highest or lowest value; the left tail (where the lower values lie) goes on and on (towards minus infinity), but for intervals of a fixed size on the left tail there are fewer and fewer values the farther to the left you go (and certainly far fewer values than in the middle of the distribution). You can see this on the typical bell curve of the normal distribution. The situation is similar on the right tail (where the higher values lie). It goes on towards plus infinity and for any given interval size there are fewer and fewer values the farther you go to the right.
      When you look at a finite number of values (e.g. in a finite sample), any value that is much smaller or much bigger than the other values is a potential outlier. There is no precise definition of an outlier. It is a judgement call as to whether some value is an outlier, although there are guidelines (as explained on the website).
      2. Older references often state that kurtosis is an indication of peakedness. This is not correct.
      Charles

  3. Hello, I have a set S of 93 data points, stored in a range R and formatted as percentages. When I apply SKEW(R), the answer is also displayed as a percentage, say 9.2%; if I convert it to number format it turns out to be 0.09. What does this mean? Is the data moderately skewed because the value is within ±0.5, or how should I interpret a result given in percentages? (I have negative percentages in my data set, and the mean is less than the median, which suggests negative skewness, yet the skewness is 0.09 once converted from percentage to number format. So what is the problem?)

    • Hello, it is difficult for me to figure out what is going on without seeing your data. If you can send me an Excel file with your data, I will try to figure out what is happening.
      Charles

  4. In terms of financial time series data, would the measure of Skew and Kurtosis for a single position indicate which GARCH (or other) model to use in calculating its conditional volatility? I know this is slightly off topic, so no worries if the answer isn’t forthcoming.

  5. Thank you Charles for your well-described functions of Skew and Kurt. My question is how these two measures can help me interpret the normality of my data. For example, are there certain ranges in which we can be certain that our data is not normal? The kurtosis of my data is 1.90 and the skewness is 1.67. How could these two numbers help me know whether running a t-test would be meaningful on this dataset?

    Thank you in advance

    • Kath,
      I am not sure I know what you mean by grouped and ungrouped data. Say you have a range of data A1:C10 in Excel, where the data for each of three groups is the data in each of the columns in the range. Then the overall skewness can be calculated by the formula =SKEW(A1:C10), but the skewness for each group can be calculated by the formulas =SKEW(A1:A10), =SKEW(B1:B10) and =SKEW(C1:C10).
      Charles

    • Hafiz,
      The distribution is skewed to the left. Skewness of -.999 (i.e. about -1) is usually consistent with data that is normally distributed (skewness = zero), but whether the data is normally distributed depends on other factors as well.
      Charles

  6. Hi Charles,
    How do I incorporate weights in the skewness calculation? Say the value 5 appears 3 times, 8 appears 2 times and 9 appears once. I have the formula SKEW(5, 8, 9) using cell references, but would like the calculation to be SKEW(5, 5, 5, 8, 8, 9).
    Kind regards,
    Maree

    • Pranjal Srivastava,
      To test for symmetry algebraically about the y axis you take the equation y = f(x) and substitute -x for x and see whether you get the same equation back. Similarly, you can test for symmetry about the x axis or about the origin.
      In the referenced webpage, I am not testing for 100% symmetry. I am testing whether the data is symmetric enough that I can use one of the standard statistical tests.
      Charles

  7. Please let me know: if we have a data set of sizes with volume percentages and want to calculate skewness and kurtosis, do I need to divide the data set into classes of the same size, or are classes of different sizes okay?

  8. Sir, if the value of the skewness is zero, it means that the distribution is symmetric. If the value falls within -0.49 < SK < 0.49 (since -0.49 and 0.49 are 0 when rounded off), may I say that the distribution may still be symmetric?

    How about kurtosis: if the value is within 2.50 < KU < 3.49 (since 2.50 and 3.49 are 3 when rounded off), may I say that the distribution may still be mesokurtic?

    Thank you very much

    • Xiaobin,
      The two statistics that you reference are completely different from the measurement that I have described. I have never used the measures that you have referenced. I presume that they measure skewness and are easier to calculate than the standard measurement (which is the one that I describe) and so are less accurate.
      See the following webpage for further explanation:
      https://en.wikipedia.org/wiki/Skewness
      Charles

  9. Hello,
    The kurtosis value of my data is above 2 (+3). I think it should be between -2 and +2. How can I change it to obtain normality?

  10. Hey Charles

    Say you had a bunch of returns data and wished to check the skewness of that data. In this instance, which would be appropriate – Skew() or Skew.P()

    I would imagine Skew() because Skew.P() refers to a population and you don’t have the population here, you merely have a bunch of return data don’t you. OR when dealing with financial returns do you assume that the data you have is the population?

  11. I would like two suggestions.
    1. I have 1000 dollars that I want to distribute over 12 months in such a way that the peak is 1.6 times the average (using a normal distribution curve).
    2. As far as I know, the peak of the bell curve is attained at the mean (i.e. at 6.5 months). But if I want the peak at 40% of the period (i.e. at 12*40/100 months) while the peak still remains 1.6 times the average (i.e. peak = 1.6*100/12), then what will the distribution be?

    • The peak is usually considered to be the high point in the curve, which for a normal distribution occurs at the mean. Thus, I don’t know what it means for the peak to be 1.6 times the average (which is the mean). Please explain what you mean by the peak?
      Charles

  12. Based on my experience of teaching statistics, you can use the Pearson coefficient of skewness, which is (mean - mode) divided by the standard deviation, or use 3(mean - median) divided by the standard deviation. Most books use the first formula for ungrouped data and the second formula for grouped data.

    • Namo,
      I am not sure what you mean by a graphic illustration. I have tried to do this with the graph of the chi-square distribution, which was done using Excel (see the details in the Examples Workbook, which you can download for free).
      Charles

