On this webpage, we briefly touch upon using the Chi-square, Kolmogorov-Smirnov and Shapiro-Wilk tests to determine whether data is normally distributed.
Chi-square Test
The chi-square goodness of fit test can be used to determine whether data adequately fit a specified distribution. In particular, in Example 4 of Goodness of Fit, we show how to test whether data fit a Poisson distribution. In a similar fashion, we can test whether data fit a normal distribution.
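As a rough illustration of how such a test can be set up for the normal distribution, here is a minimal Python sketch. It is not the procedure from the Goodness of Fit page: the function name chi_square_normal_gof, the choice of five equal-probability bins and the sample values are all assumptions made for this example. The mean and standard deviation are estimated from the data, so two further degrees of freedom are subtracted.

```python
from statistics import NormalDist, mean, stdev

def chi_square_normal_gof(data, bins=5):
    """Chi-square goodness-of-fit statistic for normality (illustrative sketch).

    Uses `bins` equal-probability intervals of a normal distribution
    fitted to the data, so every expected count is n / bins.
    """
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    # Interior bin edges are the fitted quantiles at 1/bins, 2/bins, ...
    edges = [fitted.inv_cdf(j / bins) for j in range(1, bins)]
    observed = [0] * bins
    for x in data:
        # The bin index is the number of edges that x exceeds
        observed[sum(x > e for e in edges)] += 1
    expected = n / bins
    stat = sum((o - expected) ** 2 / expected for o in observed)
    # bins - 1, minus 2 because the mean and sd were estimated from the data
    df = bins - 3
    return stat, df, observed

# Hypothetical sample of 30 values (expected count per bin is 30 / 5 = 6)
sample = [4.1, 5.3, 4.8, 6.0, 5.5, 4.9, 5.1, 3.8, 5.7, 6.3,
          4.4, 5.0, 5.2, 4.6, 5.9, 5.4, 4.7, 6.1, 3.9, 5.6,
          4.3, 5.8, 4.5, 5.0, 6.5, 4.2, 5.3, 4.9, 5.1, 5.5]

stat, df, observed = chi_square_normal_gof(sample, bins=5)
print(stat, df, observed)
```

The statistic is then compared with a chi-square critical value for df degrees of freedom. Equal-probability bins are used here only because they keep every expected count the same, which makes it easy to check that each one is large enough.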
For additional information and some examples click here.
Kolmogorov-Smirnov (KS) Test
The KS test is a general test that can be used to determine whether sample data is consistent with any specific distribution. In particular, it can be used to check for normality, but it tends to be less powerful than tests specifically designed to check for normality.
It has the advantage over the chi-square test that it can be used for small samples and does not require that the expected frequencies be at least 5.
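The KS statistic is the largest vertical gap between the empirical distribution function of the sample and the distribution function being tested. The following Python sketch computes it for a fully specified normal distribution. The function name ks_statistic, the sample values and the hypothesized mean of 5 and standard deviation of 1 are assumptions made for this example.

```python
from statistics import NormalDist

def ks_statistic(data, dist):
    """Largest gap between the empirical CDF of data and the CDF of dist."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = dist.cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x,
        # so the gap has to be checked on both sides of the jump
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Hypothetical sample, tested against an assumed N(5, 1) distribution
sample = [4.1, 5.3, 4.8, 6.0, 5.5, 4.9, 5.1, 3.8, 5.7, 6.3]
d = ks_statistic(sample, NormalDist(mu=5.0, sigma=1.0))
print(d)
```

The value of d is then compared with a critical value from the KS table for the sample size in question. Note that this sketch assumes the mean and standard deviation are specified in advance rather than estimated from the sample.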
For additional information and some examples click here.
Lilliefors Test
The Lilliefors Test is a version of the Kolmogorov-Smirnov test that is designed specifically to test for a normal distribution.
For additional information and some examples click here.
Shapiro-Wilk (SW) Test
The SW test is specifically designed to test the null hypothesis that data are sampled from a normal distribution. This test has the following characteristics:
- The SW test is more powerful than the KS test and is the test that we recommend for testing normality, except when there are a number of tied data values.
- The mean and variance do not need to be specified in advance.
- In essence, the SW test provides a correlation between the raw data and the values that would be expected if the observations followed a normal distribution. The SW statistic tests if this correlation is different from 1 (see Basic Concepts of Correlation).
- The SW test is a relatively powerful test of non-normality and can detect small departures from normality even with small sample sizes. This may make it more sensitive than we need (i.e. data that fail the SW test may still be suitable for the test under consideration).
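The correlation idea described above can be illustrated with a short Python sketch. This is not the SW statistic itself, which uses specially derived weights. Instead, the sketch correlates the sorted data with approximate expected normal scores, much as a normal probability plot does. The function name normal_scores_correlation, the use of Blom's plotting positions and the sample values are assumptions made for this example.

```python
import math
from statistics import NormalDist

def normal_scores_correlation(data):
    """Correlation between sorted data and approximate expected normal scores."""
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()
    # Blom's approximation to the expected standard normal order statistics
    scores = [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    mean_x = sum(xs) / n
    mean_s = sum(scores) / n
    sxy = sum((x - mean_x) * (s - mean_s) for x, s in zip(xs, scores))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sss = sum((s - mean_s) ** 2 for s in scores)
    return sxy / math.sqrt(sxx * sss)

# Hypothetical sample
sample = [4.1, 5.3, 4.8, 6.0, 5.5, 4.9, 5.1, 3.8, 5.7, 6.3]
r = normal_scores_correlation(sample)
print(r)
```

A value of r close to 1 indicates that the sorted data line up well with what would be expected under normality. How far below 1 the value has to fall before normality is rejected depends on the sample size and requires the appropriate tables or approximations.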
We provide two approaches: the original algorithm of Shapiro-Wilk (limited to samples of size 3 to 50) and an expanded algorithm due to J.P. Royston which supports samples of size 12 to 5,000. Both approaches are supported by the Real Statistics Resource Pack.
For additional information and some examples of the original approach click here.
For additional information and some examples of the expanded approach click here.
Jarque-Bera Test
A normal distribution has skewness and excess kurtosis of zero. This fact is the basis of a simple test of normality called the Jarque-Bera test.
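The Jarque-Bera statistic combines the two quantities as JB = n/6 * (S^2 + K^2/4), where S is the sample skewness and K is the sample excess kurtosis. For large samples it is approximately chi-square distributed with 2 degrees of freedom. Here is a minimal Python sketch. The function name jarque_bera and the sample values are assumptions made for this example, and the sketch uses the simple moment-based versions of skewness and kurtosis, which may differ slightly from the values produced by other software.

```python
import math

def jarque_bera(data):
    """Jarque-Bera statistic and its large-sample chi-square(2) p-value."""
    n = len(data)
    m = sum(data) / n
    # Central moments of order 2, 3 and 4 (each divided by n)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3
    jb = n / 6 * (skew ** 2 + excess_kurt ** 2 / 4)
    # For 2 degrees of freedom the chi-square right tail is exp(-x / 2)
    p_value = math.exp(-jb / 2)
    return jb, p_value

# Hypothetical sample
sample = [4.1, 5.3, 4.8, 6.0, 5.5, 4.9, 5.1, 3.8, 5.7, 6.3,
          4.4, 5.0, 5.2, 4.6, 5.9, 5.4, 4.7, 6.1, 3.9, 5.6]
jb, p_value = jarque_bera(sample)
print(jb, p_value)
```

The chi-square approximation is a large-sample result, so the p-value from this sketch should be treated with caution for small samples.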
For additional information and some examples click here.
D’Agostino-Pearson Test
The D’Agostino-Pearson Test also uses the fact that a normal distribution has zero skewness and excess kurtosis. This test is more accurate than the Jarque-Bera test mentioned above.
For additional information and some examples click here.
Anderson-Darling (AD) Test
The Anderson-Darling test can be used to test for a variety of distributions, and so you need to specify that you are testing for normality. This can be accomplished by using the Real Statistics formula ADTEST(R1,"norm") where R1 contains the data that you want to test.
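For readers who want to see what the AD statistic measures, here is a minimal Python sketch of the calculation for the normal case. It is not the Real Statistics ADTEST function. The function name anderson_darling_normal and the sample values are assumptions made for this example. The mean and standard deviation are estimated from the data, and the adjustment factor in the last step is the small-sample correction usually attributed to Stephens.

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(data):
    """Anderson-Darling statistic for normality with estimated mean and sd."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    cdf = [fitted.cdf(x) for x in xs]
    # Pair the i-th smallest value with the i-th largest, weighted by 2i - 1
    s = sum((2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1 - cdf[n - i]))
            for i in range(1, n + 1))
    a2 = -n - s / n
    # Small-sample adjustment for the case where both parameters are estimated
    a2_adjusted = a2 * (1 + 0.75 / n + 2.25 / n ** 2)
    return a2, a2_adjusted

# Hypothetical sample
sample = [4.1, 5.3, 4.8, 6.0, 5.5, 4.9, 5.1, 3.8, 5.7, 6.3, 4.4, 5.0]
a2, a2_adjusted = anderson_darling_normal(sample)
print(a2, a2_adjusted)
```

Larger values of the statistic indicate a poorer fit to the normal distribution, with the tails of the distribution weighted more heavily than in the KS test. The adjusted value is the one normally compared with tabulated critical values.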
For additional information and some examples click here.
References
Wikipedia (2012) Normality test
https://en.wikipedia.org/wiki/Normality_test
Ghasemi, A., Zahediasl, S. (2012) Normality tests for statistical analysis: a guide for non-statisticians
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3693611/
Datatab (2024) Normality test
https://datatab.net/tutorial/test-of-normality
Hernandez, H. (2021) Testing for normality: what is the best method?
https://www.academia.edu/47769079/Testing_for_Normality_What_is_the_Best_Method
Hi Charles,
Apart from log transforming a data set and performing a test for normality on it, what other tests can be used to show that the distribution of a data set is log-normal?
Thanks
Newman
Newman,
All the tests for normality can be converted into a test for log-normality.
In particular, you can use the Anderson-Darling test.
Charles
Dr. Zaiontz,
Mostly just wanted to say “thank you so much!” for all your hard work putting XRealStats package together, updated, and keeping it supported… Truly a fantastic resource. I might add, I’ve had multiple undergrad and grad stats classes and none compare to the helpfulness of your Excel Add-In. I’ve thought for years that going the route of teaching stats through Excel was a better way to go for most people rather than bundled with having to learn a new computer program… R, SAS, SPSS, MiniTab, etc. So glad to have discovered your work just when I had to dust off what little I remembered for a Proceedings paper I’m publishing (you could make an argument for co-authorship!)… and preparing to analyze my dissertation project.
Anyway, just a thought here… after the Shapiro-Wilk Test indicates significant departure from normality, it may be helpful to offer some built in transformations (in case you haven’t already and I just haven’t found them yet) and some background discussions/justifications/examples. Just a thought from the peanut gallery.
Thanks again and hope you’re doing okay in the midst of everything going on right now.
What is the difference between a test for symmetry and a test for normality?
Elsa,
You can use the SKEWTEST for symmetry and the Shapiro-Wilk test for normality. Normality implies symmetry, but data can be symmetric without being normally distributed.
Charles
What are the assumptions of a test for symmetry?
What are its advantages and disadvantages?
Hello Elsa,
You can test for symmetry using the Box Plot or Histogram graph. Alternatively you can test whether the skewness is zero. See the following webpage for the skewness test:
https://real-statistics.com/tests-normality-and-symmetry/statistical-tests-normality-symmetry/dagostino-pearson-test/
I don’t know of any assumptions for using this test, except that the data elements are selected randomly.
I don’t know of any disadvantages. The advantage is that you have a pretty clear and simple test for symmetry.
Charles
Good day, Mr. Charles. What is a test for symmetry?
What is the purpose of a test for symmetry?
What is the implication of a test for symmetry?
Hello Elsa,
You can test for symmetry using the Box Plot or Histogram graph. Alternatively you can test whether the skewness is zero. See the following webpage for the skewness test:
https://real-statistics.com/tests-normality-and-symmetry/statistical-tests-normality-symmetry/dagostino-pearson-test/
The main reason that you want to test for symmetry is that symmetry is an assumption for other tests. Also, some tests that require that the data be normally distributed are quite robust to violations of normality, especially when the data is at least symmetrically distributed.
Charles
Very Insightful, thank you very much! 🙂