Shapiro-Wilk Original Test

Basic Concepts

We present the original approach to performing the Shapiro-Wilk test. This approach is limited to samples of between 3 and 50 elements. A revised approach, which uses the algorithm of J. P. Royston and can handle samples with up to 5,000 elements (or even more), is described in Shapiro-Wilk Expanded Test.

The basic approach used in the Shapiro-Wilk (SW) test for normality is as follows:

Arrange the data in ascending order so that x₁ ≤ … ≤ x_n.
Calculate SS as follows:

$$SS = \sum_{i=1}^{n} (x_i - \bar{x})^2$$

If n is even, let m = n/2, while if n is odd let m = (n–1)/2.
Calculate b as follows, taking the a_i weights from Table 1 (based on the value of n) in the Shapiro-Wilk Tables. Note that if n is odd, the median data value is not used in the calculation of b.

$$b = \sum_{i=1}^{m} a_i \,(x_{n+1-i} - x_i)$$

Calculate the test statistic W = b² ⁄ SS
Find the value in Table 2 of the Shapiro-Wilk Tables (for the given value of n) that is closest to W, interpolating if necessary. The probability level that corresponds to this value is the p-value for the test.

For example, suppose W = .975 and n = 10. Based on Table 2 of the Shapiro-Wilk Tables the p-value for the test is somewhere between .90 (W = .972) and .95 (W = .978). You can estimate this p-value using interpolation (see Interpolation).
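
For this example, linear interpolation works out as follows. This is only a sketch of the simplest interpolation scheme; the symbols W_lo, W_hi, p_lo and p_hi are our own notation for the two bracketing entries in Table 2 and do not appear in the tables themselves.

```latex
p \approx p_{lo} + \frac{W - W_{lo}}{W_{hi} - W_{lo}}\,(p_{hi} - p_{lo})
  = .90 + \frac{.975 - .972}{.978 - .972}\,(.95 - .90)
  = .90 + .5 \times .05
  = .925
```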

Examples

Example 1: A random sample of 12 people is taken from a large population. The ages of the people in the sample are shown in column A of the worksheet in Figure 1. Is this data normally distributed?

Figure 1 – Shapiro-Wilk test for Example 1

We begin by sorting the data in column A using Data > Sort & Filter|Sort (see Sorting and Filtering) or the Real Statistics QSORT function (see Sorting and Removing Duplicates), putting the results in column B. We next look up the coefficient values for n = 12 (the sample size) in Table 1 of the Shapiro-Wilk Tables, putting these values in column E.

Corresponding to each of these 6 coefficients a₁,…,a₆, we calculate the values x₁₂ – x₁, …, x₇ – x₆, where x_i is the ith data element in sorted order. E.g. since x₁ = 35 and x₁₂ = 86, we place the difference 86 – 35 = 51 in cell H5 (the same row as the cell containing coefficient a₁). Column I contains the product of the coefficients and difference values. E.g. cell I5 contains the formula =E5*H5. The sum of these values is b = 44.1641, which is found in cell I11 (and again in cell E14).

We next calculate SS as DEVSQ(B4:B15) = 2008.667 (cell E13). Thus W = b² ⁄ SS = 44.1641^2/2008.667 = .971026 (cell E15).
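
The worksheet calculation in Figure 1 can also be sketched in Python. This is an illustration under stated assumptions, not part of the Real Statistics Resource Pack: the function name shapiro_wilk_w is hypothetical, and the six coefficients are the n = 12 values as we read them from Table 1 of the Shapiro-Wilk Tables, so check them against the table before reusing them.

```python
def shapiro_wilk_w(sample, half_coeffs):
    """Return (b, SS, W) for the original Shapiro-Wilk procedure (hypothetical helper)."""
    x = sorted(sample)                      # x_1 <= ... <= x_n
    n = len(x)
    m = n // 2                              # n/2 if n is even, (n-1)/2 if n is odd
    if len(half_coeffs) != m:
        raise ValueError("expected n // 2 coefficients a_1..a_m")
    mean = sum(x) / n
    ss = sum((v - mean) ** 2 for v in x)    # same quantity as Excel's DEVSQ
    # Pair the largest with the smallest value, and so on inward;
    # for odd n the median is never touched.
    b = sum(a * (x[n - 1 - i] - x[i]) for i, a in enumerate(half_coeffs))
    return b, ss, b * b / ss


ages = [65, 61, 63, 86, 70, 55, 74, 35, 72, 68, 45, 58]       # Example 1 data
A_N12 = [0.5475, 0.3325, 0.2347, 0.1586, 0.0922, 0.0303]     # assumed Table 1 values, n = 12

b, ss, w = shapiro_wilk_w(ages, A_N12)
print(round(b, 4), round(ss, 3), round(w, 6))  # 44.1641 2008.667 0.971026
```

The values agree with cells E14, E13 and E15 of Figure 1.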

p-value using interpolation

We now look for .971026 when n = 12 in Table 2 of the Shapiro-Wilk Tables and find that the p-value lies between .50 and .90. The W value for .5 is .943 and the W value for .9 is .973.

Interpolating .971026 between these values (using linear interpolation), we arrive at p-value = .873681. Since p-value = .87 > .05 = α, we retain the null hypothesis that the data are normally distributed. Because this p-value is based on linear interpolation, it is not very accurate, but the important point is that it is much higher than α, so the conclusion does not depend on the precision of the interpolation.
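
The linear interpolation for Example 1 can be sketched as follows. The helper interpolate_p is a hypothetical name of ours, and it implements plain linear interpolation only, which is not necessarily the scheme that SWTEST uses by default.

```python
def interpolate_p(w, w_lo, p_lo, w_hi, p_hi):
    """Linearly interpolate a p-value between two bracketing Table 2 entries."""
    if not w_lo <= w <= w_hi:
        raise ValueError("W lies outside the two bracketing table entries")
    frac = (w - w_lo) / (w_hi - w_lo)       # position of W between the two entries
    return p_lo + frac * (p_hi - p_lo)


# Example 1: n = 12, W = .971026; Table 2 gives W = .943 at p = .50 and W = .973 at p = .90.
p = interpolate_p(0.971026, 0.943, 0.50, 0.973, 0.90)
print(round(p, 4))  # 0.8737
```

When W falls below the smallest tabulated value for the given n, no interpolation is possible and the helper raises an error; in that case all that can be said is that the p-value is below the smallest tabulated probability level.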

Comparison with other tests

Example 2: Using the SW test, determine whether the data in Example 1 of Graphical Tests for Normality and Symmetry (repeated in column A of Figure 2) are normally distributed.

Figure 2 – Shapiro-Wilk test for Example 2

As we can see from the analysis in Figure 2, p-value = .0432 < .05 = α, and so we reject the null hypothesis at the 5% significance level and conclude that the data are not normally distributed. This is quite different from the result of the KS test that we found in Example 2 of Kolmogorov-Smirnov Test, but it is consistent with the QQ plot shown in Figure 5 of that webpage.

Real Statistics Support

Real Statistics Function: The Real Statistics Resource Pack contains the following functions.

SHAPIRO(R1, FALSE) = the Shapiro-Wilk test statistic W for the data in R1

SWTEST(R1, FALSE, interp) = p-value of the Shapiro-Wilk test on the data in R1

SWCoeff(n, j, FALSE) = the jth coefficient for samples of size n

SWCoeff(R1, C1, FALSE) = the coefficient corresponding to cell C1 within sorted range R1

SWPROB(n, W, FALSE, interp) = p-value of the Shapiro-Wilk test for a sample of size n for test statistic W

The functions SHAPIRO and SWTEST ignore all empty and non-numeric cells. The range R1 in SWCoeff(R1, C1, FALSE) should not contain any empty or non-numeric cells.

When performing the table lookup, the default is to use the recommended type of interpolation (interp = TRUE). To use linear interpolation, set interp to FALSE. See Interpolation for details.

For Example 1 of Chi-square Test for Normality, SHAPIRO(A4:A15, FALSE) = .874 and SWTEST(A4:A15, FALSE, FALSE) = SWPROB(15,.874,FALSE,FALSE) = .0419 (referring to the worksheet in Figure 2 of Chi-square Test for Normality).

Note that SHAPIRO(R1, TRUE), SWTEST(R1, TRUE), SWCoeff(n, j, TRUE), SWCoeff(R1, C1, TRUE), and SWPROB(n, W, TRUE) refer to the results using the Royston algorithm, as described in Shapiro-Wilk Expanded Test.

For compatibility with the Royston version of SWCoeff, when j ≤ n/2 then SWCoeff(n, j, FALSE) = the negative of the value of the jth coefficient for samples of size n found in the Shapiro-Wilk Tables. When j = (n+1)/2, SWCoeff(n, j, FALSE) = 0 and when j > (n+1)/2, SWCoeff(n, j, FALSE) = -SWCoeff(n, n–j+1, FALSE).
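
The sign convention above can be sketched in Python as follows. The helper full_coeffs is hypothetical and only illustrates the rule as stated here; it is not the actual SWCoeff implementation, and the n = 12 coefficients are assumed to be the Table 1 values. With the full signed vector c, the quantity b is a single weighted sum over all the sorted values, which equals the paired-difference form used in Example 1.

```python
def full_coeffs(half_coeffs, n):
    """Expand the tabulated a_1..a_m into the signed vector c_1..c_n (hypothetical helper)."""
    if len(half_coeffs) != n // 2:
        raise ValueError("expected n // 2 tabulated coefficients")
    lower = [-a for a in half_coeffs]        # j <= n/2: negative of the table value
    middle = [0.0] if n % 2 == 1 else []     # j = (n+1)/2: zero, odd n only
    upper = list(reversed(half_coeffs))      # j > (n+1)/2: minus the mirrored coefficient
    return lower + middle + upper


ages = sorted([65, 61, 63, 86, 70, 55, 74, 35, 72, 68, 45, 58])  # Example 1 data
a_n12 = [0.5475, 0.3325, 0.2347, 0.1586, 0.0922, 0.0303]         # assumed Table 1 values

c = full_coeffs(a_n12, len(ages))
b = sum(cj * xj for cj, xj in zip(c, ages))   # one weighted sum over all sorted values
mean = sum(ages) / len(ages)
ss = sum((x - mean) ** 2 for x in ages)
w = b * b / ss
print(round(b, 4), round(w, 6))  # 44.1641 0.971026
```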

Examples Workbook

Click here to download the Excel workbook with the examples described on this webpage.

Reference

Shapiro, S.S. & Wilk, M.B. (1965) An analysis of variance for normality (complete samples). Biometrika, Vol. 52, No. 3/4.

http://webspace.ship.edu/pgmarr/Geo441/Readings/Shapiro%20and%20Wilk%201965%20-%20An%20Analysis%20of%20Variance%20Test%20for%20Normality.pdf

174 thoughts on “Shapiro-Wilk Original Test”

Kristie

March 10, 2021 at 2:07 pm

Hi Charles,

I am doing a dissertation and have run outlier tests and histograms at a 90% confidence level. Should my Shapiro-Wilk test also be at the 90% confidence level, or the usual 95%?

Thank you
Reply
- Charles
  
  March 10, 2021 at 3:41 pm
  
  Usually, you use 95% for both tests. If you choose 90% then make sure that you state this clearly in your thesis.
  Charles
  Reply
Oscar Reyes Almora

February 4, 2021 at 8:57 pm

Hello Charles,
When I perform the SW test in SPSS with the data from Examples 1 and 2, the p-values do not match. What could explain the difference?
Reply
- Charles
  
  February 5, 2021 at 8:43 am
  
  Oscar,
  Perhaps SPSS is using the Royston version of the test. This is described at
  https://www.real-statistics.com/tests-normality-and-symmetry/statistical-tests-normality-symmetry/shapiro-wilk-expanded-test/
  Charles
  Reply
  - Pabitra pradhan
    
    March 10, 2021 at 8:01 am
    
    Hello, what is the real meaning of W and the p-value?
    Reply
    - Charles
      
      March 10, 2021 at 8:28 am
      
      W is just a statistic that is useful in calculating the p-value. To gain more insight, you will need to read the original paper from Shapiro and Wilk. See Bibliography
      See the following for a definition of the p-value:
      Hypothesis Testing
      Charles
      Reply
Jesus Tamez

October 25, 2020 at 8:19 pm

A question Mr. Charles …

Which interpolation method ( https://www.real-statistics.com/statistics-tables/interpolation/ ) do you recommend for Table 2 of the original SW method ( https://www.real-statistics.com/statistics-tables/shapiro-wilk-table/ )?

Thanks a lot!
Reply
- Charles
  
  October 26, 2020 at 9:16 am
  
  Hello Jesus,
  I believe that I am using log interpolation, but it is better to use the Royston version of the SW test and avoid the whole issue of interpolation.
  Charles
  Reply
  - Jesus Tamez
    
    October 26, 2020 at 3:40 pm
    
    thanks and regards! …
    Reply
Cleonir

September 19, 2020 at 3:18 pm

Thank you very much, Charles. I followed your Excel examples in WPS Spreadsheet and the results are fine. One question: have you written a paper about this?
Thank you…
Reply
- Charles
  
  September 19, 2020 at 4:01 pm
  
  Cleonir,
  Thank you for your kind remarks.
  I have had the intention to write a book about this and other statistics subjects but supporting this website and the Real Statistics software tends to take up the spare time that I have.
  Charles
  Reply
Newman

August 14, 2020 at 1:49 pm

Hi Charles,
Thank you for the great work you are doing.
I am dealing with a set of data that has failed the Shapiro-Wilk’s test for normality (ie. the p-value is less than 0.05). I transformed the data by evaluating its natural logarithms and conducted the Shapiro-Wilk’s test on it. (hoping that the data would be lognormal). It still fails this test for log normality. According to the literature, such data sets should be lognormally distributed. What other statistical tests can I use to test whether my data is lognormal?
Thank you in anticipation
Reply
- Charles
  
  August 14, 2020 at 7:12 pm
  
  Hello Newman,
  I assume that you found that the set of ln(x) where x is in your original sample was not normally distributed per the Shapiro-Wilk test. Unless there are a lot of ties, this tends to be the best test for normality. You could try the Anderson-Darling test, but if the original data is truly lognormally distributed, then the SW test should confirm this. What is the p-value of the test?
  Charles
  Reply
James Picksley

August 3, 2020 at 2:10 pm

Hi,

With the example data (65,61,63,86,70,55,74,35,72,68,45,58) I get the following p-value from both R and a python function, 0.9216, as opposed to the 0.873681129
you get. I can reproduce your value of 0.873681129
in spreadsheet calculations. Do you have any idea why there’s a discrepancy, please?

Thanks,

James. Picksley
Reply
- Charles
  
  August 3, 2020 at 5:51 pm
  
  Hello James,
  They are using the Royston version of the Shapiro-Wilk test. This is also the default version in Real Statistics. If the data is in range A1:A12, then SWTEST(A1:A12) = .9216, while SWTEST(A1:A12,FALSE) = .8737. The problem with the calculation in the original version of the SW test is that the interpolation that is being used is probably not so accurate.
  Charles
  Reply
  - James
    
    August 4, 2020 at 11:33 am
    
    Hi Charles,
    
    Thanks for the reply. I’d just worked it out and then came here to see you’d put a response up! I spotted it because I have a set of data where W in the original test is outside of the range of table 2, so I wasn’t getting a valid result, so I ran it through the extended version and got a match with R and python.
    
    Sorry to have wasted your time. Thanks very much for this website. I’ve found it very useful over the last few years.
    
    Cheers,
    
    James.
    Reply
    - Charles
      
      August 4, 2020 at 5:24 pm
      
      James,
      No problem. Glad that I could clarify things.
      Charles
      Reply
Ruben

June 8, 2020 at 7:34 am

First of all, greetings to Charles from here in Bolivia, and thanks for this publication. For many years I could not run the Shapiro-Wilk test on data with more than 50 samples; now it can be done with this expanded test, and I managed to work through all the formulas. I have just one observation. In the end, the conclusion about the normality hypothesis is confusing: for the example with 12 samples, the original test says no and the expanded test says yes. Both should agree; in this case the normality of the data would be accepted, or perhaps I am mistaken. I usually work with a 5% significance level rather than with the 95% confidence level, since the test itself reports its result as a significance level. In this case I get 0.078 rounded, but if it is subtracted from 1, that would be the value to compare against the confidence level. Perhaps I am being too blunt, but thank you very much for your contribution.
Reply
- Charles
  
  June 8, 2020 at 9:09 am
  
  Hello Ruben,
  Thank you for your kind remarks.
  Both the original and expanded versions of the Shapiro-Wilk test should give similar results. If the p-value < alpha then you have evidence that the data does not come from a normally distributed population.
  Charles
  Reply
David

June 2, 2020 at 6:13 pm

Charles, Can Shapiro-Wilk (SW) be used on datasets that have tied values? The State has directed me to use SW for groundwater monitoring data, and there are often tied values in groundwater data. I’ve been told there is additional data manipulation that must be done to use SW on data with tied values. I am hoping you can tell me what these additional manipulations are. Here’s an example data set: 0.075, 0.077, 0.1, 0.1, 0.1, 0.15, 0.19, 0.2, 0.2, 0.23, 0.27, 0.28, 0.28, 0.29, 0.3, 0.33, 0.34, 0.35, 0.37, 0.37, 0.44, 0.52, 0.56, 0.58. I use excel to calculate W and get W=0.9437 (without accounting for ties).
Thank you.
Reply
- Charles
  
  June 3, 2020 at 5:37 pm
  
  Hello David,
  There is a method for correcting ties in the SW test, but I am not familiar with it. The following paper describes the process:
  https://www.tandfonline.com/doi/abs/10.1080/00949658908811146?journalCode=gscs20
  Charles
  Reply
Kai

April 23, 2020 at 4:19 pm

Hi Charles,
I’m testing a bunch of my data for my dissertation so I can do further analysis.
On my last data set my W value came out being super low at 0.6927 (n=12).

As the W values in the chart don’t go down that low does this just mean that I accept the null hypothesis and my data isn’t normally distributed?
Reply
- Charles
  
  April 24, 2020 at 8:16 am
  
  Hi Kai,
  Since W is lower than the lowest value on the table this means that p-value < .01, which means that you reject the null hypothesis that your data is normally distributed.
  Charles
  Reply
Marc Hampton

April 3, 2020 at 7:42 pm

Hi there,

I am doing a Shapiro Wilk for n=15 data – however my w value comes out above 1. therefore i cannot continue with the calculations as shown.
Reply
- Charles
  
  April 4, 2020 at 10:56 am
  
  Hello Marc,
  If you email me an Excel file with your data and calculations, I will try to figure out what went wrong.
  Charles
  Reply
  - Ethan
    
    June 9, 2020 at 12:15 pm
    
    Hey Charles,
    Can I send you an email with two questions about this? I am trying to do the same thing, but with a few differences.
    Reply
    - Charles
      
      June 9, 2020 at 1:17 pm
      
      Yes
      Reply
  - Angie
    
    August 2, 2020 at 5:04 pm
    
    Hello Charles,
    I also have a problem with my n = 24 data, where the W value exceeds 1 (1.159902956), but I cannot figure out what is wrong.
    
    Angie
    Reply
    - Charles
      
      August 3, 2020 at 10:05 am
      
      Hello Angie,
      If you email me an Excel file with your data I will take a look at it.
      Charles
      Reply
Mercedes

January 16, 2020 at 5:15 pm

Thank you very much for your help!
I came across this paper with tables for W. Just wanted to share 🙂
Rudolph S. Parrish (1992) New tables of coefficients and percentage points for the w test for normality, Journal of Statistical Computation and Simulation, 41:3-4, 169-185,
DOI: 10.1080/00949659208811399
https://sci-hub.tw/10.1080/00949659208811399
Thank you again!
Reply
- Charles
  
  January 16, 2020 at 7:30 pm
  
  Hello Mercedes,
  Thank you very much for sharing this. So far, I haven’t been able to connect to the site, but I will try again tomorrow.
  Charles
  Reply
  - Mercedes
    
    January 20, 2020 at 2:23 pm
    
    It is a link to the paper trough Sci-hub, but maybe you cannot connect from your country. Sometimes the links do not work everywhere.
    Reply
    - Mercedes
      
      January 20, 2020 at 2:24 pm
      
      Sorry through 🙂
      Reply
Vickey

January 10, 2020 at 2:53 am

Hi Charles,
Thank you for sharing. I have some questions about the normality test in Excel.
I ran the normality test in Excel and SPSS, but the W value and p-value are different.
In Excel: W = 0.953, p-value = 0.367, while in SPSS: statistic = 0.952, p-value = 0.273.
I don’t understand why they are different.
I hope for your reply!
thanks
Reply
- Charles
  
  January 10, 2020 at 10:37 am
  
  See the response to your previous comment.
  Charles
  Reply
Vickey

January 9, 2020 at 5:14 pm

Hello Charles,
Thanks for sharing. It really helps me a lot.
However, I have a problem.
I ran the Shapiro-Wilk test in Excel and SPSS, but the W values are not equal.
My example: n = 25. The W value calculated by Excel is 0.953, while the W value calculated by SPSS is 0.952. The p-values are not equal either: using linear interpolation I get 0.367, but the p-value in SPSS is 0.273.
I don’t know why they are not equal.
Reply
- Charles
  
  January 10, 2020 at 10:36 am
  
  That the W value is different by .001 is not so surprising since some sort of approximation is used. The difference in p-values is likely due to the choice of interpolation techniques. The Real Statistics software (for SWPROB and SWTEST) doesn’t use linear interpolation and in fact returns a value of .293. This too is an estimate. I don’t know whether the SPSS or Real Statistics estimate is better, but both give values that support the assumption of normality.
  Charles
  Reply
Daman

April 19, 2019 at 12:21 pm

Hello Charles,

I have three queries:
1. I am working on three variables, EI, CSS and PT (1 independent, 2 dependent). Do I need to check normality for the total or for each individual construct?
2. After checking normality using the SW test (N=551), I get EI sig=.054, CSS sig=.056 and PT=.251. Please suggest whether I should go with it or drop it.
3. When I calculate the same for the total, EI+CSS+PT sig=.213.

I also checked the skewness and kurtosis values; they are falling the acceptable limits.
If possible, kindly give some references too.
Please shed some light and give your suggestions.
Reply
- Charles
  
  April 19, 2019 at 4:15 pm
  
  Hello Daman,
  1. It depends on what hypothesis you are testing and what test you are using.
  2. The first two are marginal, but probably close enough. The third is clearly out of range for normality. Some tests are pretty robust to violations of normality and so it depends on the shape of the distribution as to whether you can still use that test.
  3. Probably not normal, but it is unlikely that your test will require this measurement.
  4. Depends on whether you are saying “falling into the acceptable limits” or “failing”. If skewness and kurtosis are falling within the acceptable limits, then it is likely that your data is sufficiently normally distributed. See
  d’Agostino-Pearson Test
  5. References: See the tutorial at Testing for Normality
  Charles
  Reply
  - Candy
    
    February 15, 2020 at 8:30 am
    
    Dear Charles,
    Thanks for your great work!
    I am new to the data analysis function in Excel. I encountered a problem when performing an analysis. I have a data set with a sample size of 239, and the p-value of the Shapiro-Wilk test is displayed as non-numerical (e.g. 6.08116E-08), whereas the p-value of the Pearson test is displayed normally. I am using Excel Professional Plus 2010. Could you help me find the cause of the problem? Thanks!
    Reply
    - Charles
      
      February 15, 2020 at 9:04 am
      
      Hello Candy,
      6.08116E-08 is equivalent to .0000000608116, which is a very small number.
      Charles
      Reply
skkoo

April 12, 2019 at 9:52 pm

Please help me. I don’t understand how the p-value is calculated. In my case W = 0.957575962; that value is between 0.9 and 0.95 (n = 13). What is the p-value? Please tell me the method for calculating the p-value. Thanks.
Reply
- Charles
  
  April 14, 2019 at 10:50 am
  
  Hello Soko,
  Based on the table at https://real-statistics.com/statistics-tables/shapiro-wilk-table/ if W = .974 then the p-value = .90, while if W = .979 then the p-value = .95. Since W = .957575962 is between W = .945 and W = .974, the p-value for your test is between .50 and .90, probably a lot closer to .50 than .90 since .957575962 is closer to .945 than to .974.
  Assuming that you have set your significance level at alpha = .05, no matter which value between p-value .50 and .90 you choose, you don’t have a significant result (since any such value is much higher than .05) and so you are safe to assume that your data is normally distributed.
  To get a more exact result for the p-value you can use interpolation. The various approaches are described on the following webpage:
  https://real-statistics.com/statistics-tables/interpolation/
  I believe that the SWTEST function in the Real Statistics Resource Pack uses log interpolation, but linear interpolation (the simplest type) will give good enough results.
  Charles
  Reply
Matt

March 28, 2019 at 11:09 am

Hi, I may have missed this small detail, but maybe you would be kind enough to give me some help… I’m using R to calculate the SW test on my data which is 21 and 25 samples. However, I’m having a hard time figuring out how to actually report the results in my paper… is there a good protocol/precedent/format that makes it sound nice and succinct? (I think this is what Sundar was asking also.)
Do I just say something like: After running a SW test for normality (W = 0.96, p = 0.41), there is no indication that the data set is not normally distributed. (Do I need to include degrees of freedom, or some other numbers in there?)
Thanks.

From R:
> shapiro.test(eAp)
Shapiro-Wilk normality test
data: eAp
W = 0.95957, p-value = 0.4059
Reply
- Charles
  
  March 28, 2019 at 3:49 pm
  
  Matt,
  I don’t know whether there is an approved approach. I would simply say that based on the Shapiro-Wilk test, the normality assumption is met. If you want you can insert (p = 0.41).
  Charles
  Reply
Statistical Bill

January 17, 2019 at 6:22 am

Come on Charles answer me.

So, which table is better with small samples, the original or the extended?
Reply
- Charles
  
  January 17, 2019 at 11:36 am
  
  Probably the original table, but the results should be similar.
  Charles
  Reply
  - Statistical bill
    
    January 18, 2019 at 5:38 am
    
    I agree; however, in your example here, with 12 samples, they aren’t very close. If you’re going to use exponential estimates to expand Shapiro’s table, I think you need at least 6 exponentials to do a proper job. The worst case is small samples.

    When using exponential estimates, Excel’s limit appears to be about 6 exponentials before the 18-digit precision fails.

    prof bill - btw, I really appreciate your Excel examples and list your links for my computer-wise students. There’s nothing like your examples anywhere on the internet. thx Dr. Dude!
    Reply
    - Charles
      
      January 18, 2019 at 8:14 am
      
      I am pleased that you and your students are getting value from the Real Statistics website and examples.
      Yes, for small samples, the original version should be better.
      Charles
      Reply
Mai78

October 24, 2018 at 4:49 pm

Worked like a charm! Thanks for the explanation and resources!
Reply
Martina

September 23, 2018 at 6:39 pm

Hi,
I don’t know how to calculate b. Is there a specific formula in Excel?
Thanx!
Reply
- Charles
  
  September 24, 2018 at 9:23 pm
  
  Martina,
  =SUM(I5:I10)
  Charles
  Reply
Fernando lopes

August 20, 2018 at 3:59 am

Hi, greetings from Brazil,

My name is Fernando. Thanks for the explanation of the Shapiro-Wilk normality test. I use it for method validation in the pharmaceutical industry.

I’d like to know how you found the p-value in Excel for Shapiro-Wilk.

Best regards. Thank you for your help in this matter.
Reply
- Charles
  
  August 20, 2018 at 9:02 am
  
  Fernando,
  Thank you for your kind remarks.
  The p-value comes from the table shown on the following webpage:
  https://real-statistics.com/statistics-tables/shapiro-wilk-table/
  This is based on the work done by Shapiro and Wilk.
  Charles
  Reply
Daniel

August 14, 2018 at 11:31 pm

Hi,

I am attempting to use the SWTEST and/or SWPROB functions described above after installing your RealStatistics add-in. Unfortunately, I am receiving errors (The SHAPIRO function works fine, though). I have screenshots of the errors, however, I am unable to paste them into this message. Please advise.
Reply
- Charles
  
  August 15, 2018 at 9:18 am
  
  Daniel,
  If you send me an Excel file with your data and test results (at least until you get the error message), I will try to figure out what is going on.
  Charles
  Reply
Sundar

July 3, 2018 at 10:29 am

Sir,
I have result Shapiri-wilk test analysis statistics and P-value . My result is 0.-19 and P-value is 0.18. Then what solution is this result. Please kindly reply to How is write interpretation.
Reply
Sundar rajan

July 3, 2018 at 9:16 am

Sir,
I have result Shapiri-wilk test analysis statistics and P-value . My result is 0.-19 and P-value is 0.18. Then what solution is this result. Please kindly reply
Reply
- Charles
  
  July 3, 2018 at 9:54 am
  
  Sundar,
  As explained in Example 1, since p = .19 > .05 = alpha, the result indicates that the normality assumption is satisfied.
  In your comment you say that you got a result of 0.-19. I don’t understand what this means.
  Charles
  Reply
  - Sundar
    
    July 3, 2018 at 10:31 am
    
    How is write interpretation. Thant only sir
    Reply
  - Sundar
    
    July 3, 2018 at 10:37 am
    
    Dear sir, run test value -1.39 and p- value 0.16 . Each value -4.95, -5.72. Sir i want this details. Run test value minus value correct or incorrect. Please tell me sir
    Reply
    - Charles
      
      July 3, 2018 at 11:05 am
      
      Sundar,
      Sorry, but I don’t understand your messages. If you send me an Excel file with your data and analysis, I will try to help you further.
      Charles
      Reply
ananas

June 21, 2018 at 9:15 pm

Hi,

Can you help me interpret this Shapiro-Wilk Statistic df Sig.
,918** 51 ,002
by age?
Reply
- Charles
  
  June 22, 2018 at 12:11 am
  
  Sorry, but I don’t know what ,918** 51 ,002 is referring to. How to interpret the results from the Shapiro Wilk test carried out by Real Statistics is explained on the webpage.
  Charles
  Reply
Giovanni

May 12, 2018 at 4:32 pm

Hi: Can I fix a p-value of 0.001 to prove normality?
Reply
- Charles
  
  May 12, 2018 at 6:33 pm
  
  Giovanni,
  You can use alpha = .001, but generally alpha = .05 is used.
  Charles
  Reply
Patricia Padula Lopes

May 10, 2018 at 4:40 am

Could you tell the references you used?
Reply
- Charles
  
  May 10, 2018 at 7:25 am
  
  Patricia,
  The reference is to the Shapiro-Wilk paper. See the Bibliography webpage.
  Charles
  Reply
Julian Kaljuvee

March 30, 2018 at 12:54 pm

Hi Charles,

If one gets a value for W = b²/SS = 0.837 < 0.884 (with n=24) which is not in p-value tables, how would you handle that situation? Would this imply that there has been a calculation error or is automatically a reject? Many thanks for putting together this helpful web site!
Reply
- Charles
  
  March 30, 2018 at 3:30 pm
  
  Julian,
  Since the smallest value for n = 24 is .884 (at alpha = .01), this means that p-value < .01, which is usually interpreted as significantly different from normality.
  Charles
  Reply
Giacomo Tabarelli

July 22, 2017 at 9:43 am

Hi, could you explain me why you use that b formula instead of the “standard” formula used on wikipedia for calculate W? Is there any difference? Thanks
Reply
- Charles
  
  July 22, 2017 at 9:57 am
  
  Giacomo,
  It should be equivalent to the formula shown in Wikipedia. I can’t recall whether I used the version in the original Shapiro-Wilk paper or elected to use the approach that I did to emphasize the symmetry aspect of the calculation.
  Charles
  Reply
Stefan S.

July 18, 2017 at 10:53 am

Dear Charles,
first I would like to say that the Add-in seems great however I did fail to follow your example by calculating it with the RealStat Add-in for Excel 2016.

I’m using the “example 1” data set “age”.

Using the add-in I got:

W 0.971066437
p-value 0.921648864
alpha 0.05
normal yes

These results are different from your manual calculations which I could follow and got the same results.

Do you have any idea what the reason is?
I would love to use the add-in but I need to be sure it is working the right way.

Best regards,
Stefan
Reply
- Charles
  
  July 18, 2017 at 3:06 pm
  
  Stefan,
  There are two versions of the Shapiro-Wilk test: the original version, which is described on the referenced webpage, and Royston’s version, which is described on the webpage https://real-statistics.com/tests-normality-and-symmetry/statistical-tests-normality-symmetry/shapiro-wilk-expanded-test/
  The add-in value that you describe uses Royston’s version. In fact, if you look at the output for W from the add-in, it will contain the formula =SHAPIRO(A4:A15). If you change the formula to =SHAPIRO(A4:A15,FALSE) you will get the value of W as calculated by Shapiro and Wilk’s original algorithm (the same is true for the p-value, which is calculated by SWTEST).
  The original version works well for smaller samples, but doesn’t support larger samples. This is the advantage of the Royston version.
  Charles
  Reply
Jared

March 15, 2017 at 8:15 pm

My W value is 1.273573913 for 22 samples. I can’t find a table that goes that high, and an online calculator gave me an error. What does this mean?
Reply
- Charles
  
  March 15, 2017 at 10:12 pm
  
  Jared,
  It could mean that you made an error in calculating W. What is the data in your sample?
  Charles
  Reply
Ulrike Kaiser

February 15, 2017 at 1:32 pm

Hi, Charles,

thank you for your very helpful site.
My sample consists of 5 cases (i.e 37;105;110;150;216), resulting W = 0,9762. I want to do the SW-Test with a probability of error of 5%.
Do I have to compare my calculated W with W(p=0,95)=0,986 or with W(p=0,05)=0,762?
Thank you very much for your answer!

Ulrike
Reply
- Charles
  
  February 16, 2017 at 9:25 am
  
  Ulrike,
  As described on the referenced webpage, if W =.971, then p = .874 (via interpolation between .5 and .9). Since .874 > .05, then we conclude that we don’t have evidence to reject the hypothesis that the data is normally distributed.
  Another way to look at this is that if W =.971 >= .762 (the W value at .05), then the data is considered to be normally distributed.
  Charles
  Reply
  - Ulrike Kaiser
    
    February 16, 2017 at 9:46 am
    
    Thank you very much for your answer!
    Meanwhile I downloaded also your AddIn for Excel; that will help me a lot for my work! What a great offer!
    
    Best regards,
    
    Ulrike
    Reply
Moutaz

January 3, 2017 at 7:57 pm

Hi admin

This is an excellent explanation of the Shapiro-Wilk test. It saved me lots of time. However, I still have a question about this test: how are the weight values calculated? What do they mean?

Thank you
Reply
- Charles
  
  January 4, 2017 at 8:19 am
  
  Moutaz,
  You need to read the original Shapiro-Wilk paper. See Bibliography.
  Charles
  Reply
Denis

November 26, 2016 at 9:39 pm

Thank you very much for the excellent explanation!
Reply
Marissa

November 7, 2016 at 9:25 pm

For n=4, my calculated value of W is 0.677. The smallest critical value for 0.01 when n=4 is 0.687. How do I interpret this result given that my W value isn’t even within any range given? I’ve double checked my data and don’t see any typos in my data recording or calculations.
Reply
- Charles
  
  November 8, 2016 at 7:03 am
  
  Marissa,
  This means that the p-value is less than .01.
  Charles
  Reply
Marthen R Pellokila

October 27, 2016 at 4:16 pm

Thank you. It is really helpful
Reply
Magnus Friborg

June 17, 2016 at 3:06 pm

I tried this on a sample of 41. I got a W = 0,90728. According to the table, the closest value is 0,92 (p = 0,01) – none are lower with the same sample size. Do I just use this value or should some measure be taken?
Also, I need to make sure that I understand the method correctly. The p-value i get from interpolating is the actual p-value and has to be lower than a threshold value (say p = 0,05) in order to reject the null hypothesis – correct?

Thanks in advance
Reply
- Charles
  
  June 17, 2016 at 3:29 pm
  
  Magnus,
  Yes, the approach you are using is correct. Since .90728 < .92, you can deduce that p < .01. In fact, if you use the Real Statistics formula =SWPROB(41,.90728) you get the p-value = .002739. Since this is much lower than .05, you do indeed reject the null hypothesis that the data is normally distributed.
  Charles
  Reply
  - Magnus Friborg
    
    June 21, 2016 at 12:26 pm
    
    Thank you very much.
    I have another issue though. Which is more reliable (and under what conditions), the QQ plot or the SW test? I seem to get a rejection of the null hypothesis using SW, but the QQ plot shows very small deviations – or so it appears to me. Is the SW test very sensitive for large (e.g. n = 40) samples?
    Reply
    - Charles
      
      June 21, 2016 at 1:56 pm
      
      Magnus,
      I find it easier to use the SW test since it is easier to interpret its results, but both are fairly accurate. Also, since most tests are fairly robust to violations of normality, either test can show whether the data is really departing from normality. Both tests can be used with large samples.
      Charles
      Reply
JohnM

May 10, 2016 at 1:11 am

My entire population is just 30 values. Can the Shapiro-Wilk test also be applied to a population rather than just a sample?
Am I correct in assuming that it is simply a test for symmetry? My situation is that I have hundreds of datasets of 30 values and I find that even if the dataset is symmetrical the distribution of the values can be a long way from the 68-95-99.7 probability bell-curve.
For example, for one dataset, the number of entries in 1Sd bins from -2sd to 2sd is … 7,4,13,5, which produces a SW p-value of 0.43. In contrast to this distribution the “68-95-99.7” probability curve suggests that a population of 30 should be either 5, 10, 10, 4 or 4, 10, 10, 5.
Is it good practice to identify those datasets where the distribution is a long way from 68-95-99.7? If so, how is that done?

Thanks in advance.
Reply
- Charles
  
  May 10, 2016 at 12:24 pm
  
  John,
  You can use the Shapiro-Wilk test for a population. Shapiro-Wilk tests for normality, not just symmetry.
  Charles
  Reply
  - JohnM
    
    May 12, 2016 at 5:29 am
    
    Thanks Charles.
    Another question that might interest other readers. I’m using your Excel method and I’ve written a Fortran subroutine to calculate the p_value. With the same input data they give the same results (as they should).
    When I put the same data into http://contchart.com/goodness-of-fit.aspx I get a different p-value for the Shapiro-Wilk test.
    Before I contact that website to ask them to check their processing, do you have any thoughts on the matter?
    Reply
    - Charles
      
      May 12, 2016 at 8:16 am
      
      John,
      I have also checked my results with other programs and they match.
      Charles
      Reply
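The expected bin counts JohnM quotes for a dataset of 30 values can be checked with a few lines of Python. This is only a sketch using the standard library; the function name `expected_bin_counts` and the choice of bin edges at whole standard deviations are assumptions made for illustration, not part of Real Statistics.

```python
from statistics import NormalDist


def expected_bin_counts(n, edges):
    """Expected number of values per bin for n draws from a standard normal.

    edges are bin boundaries in units of standard deviations from the mean.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        n * (std_normal.cdf(hi) - std_normal.cdf(lo))
        for lo, hi in zip(edges, edges[1:])
    ]


# Bins from -2 SD to +2 SD, one SD wide, for a dataset of 30 values
counts = expected_bin_counts(30, [-2, -1, 0, 1, 2])
print([round(c, 1) for c in counts])  # [4.1, 10.2, 10.2, 4.1]
```

These are averages over many datasets. With only 30 values, the observed counts in an individual dataset can differ noticeably from them even when the underlying population is normal.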
Salman Ahmed

March 12, 2016 at 11:10 am

Can you explain how to do the following:
Interpolating .971026 between these values (using linear interpolation)
Reply
- Charles
  
  March 12, 2016 at 12:10 pm
  
  Salman,
  Please look at the following webpage:
  Interpolation
  Charles
  Reply
Kevin L

March 6, 2016 at 12:00 am

Thank you very much for your excellent explanation and excel workbooks!
Reply
Stefano

February 20, 2016 at 11:47 am

Dear Dr. Zaiontz,

Thank you very much for your great tool.
I recently downloaded the latest Release (3.5.3) for the Mac version of Excel. In this one, the SWTEST function apparently gives a #VALUE! output with range size greater than 3. Is there a way to fix this? If not, where may I find and download a previous Release?
I thank you in advance for your attention.

Stefano
Reply
- Charles
  
  February 25, 2016 at 10:11 pm
  
  Dear Stefano,
  I don’t think I made any changes to this function since the previous release. In any case, if you send me an Excel file with your data and function results I will try to figure out what is causing this. You can send the file to my email address, which you can find at Contact Us.
  Charles
  Reply
Pri

January 26, 2016 at 11:17 am

These are the W values I have got from a raw data of response times for n=18.
1,012157199 0,996684879 0,824085184 0,960953212 1,006536182
Most of these values of W are out of range from the (n/p)table. Does that mean I have some calculation errors? If not, then how do I interpret the data?
Reply
- Charles
  
  January 27, 2016 at 10:45 am
  
  Pri,
  
  Since W = 0,824085184 is less than the smallest value in the table for n = 18 (the value at p = .01), it just means that p < .01. In fact, I calculate that the p-value = 0,003394 using the Royston approximation that is described elsewhere on the website. This means that your data is likely not normally distributed. Similarly, W = 0,996684879 is greater than the largest value in the table for n = 18 (the value at p = .99). This just means that the p-value is larger than .99, and so your data is probably normally distributed. The value W = 0,9609532124 is not in the table, but you know that it occurs between the values for p = .5 and p = .9. You can interpolate (as described on the referenced webpage) to come up with an approximate p-value of .59, but in any case the value is much higher than .05, and so the random sample probably comes from a population that is normally distributed. The cases where W > 1 are causes for concern, since I believe the value for W can’t exceed 1. There is a good chance that you have made a calculation error.
  
  Charles
  Reply
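The three cases in the reply above (W below the table, W above the table, W between two table entries) can be sketched in Python as follows. The function name `p_from_table` is made up for illustration, and the row of critical values for n = 18 is typed in here as an assumption; check it against Table 2 of the Shapiro-Wilk Tables before relying on it.

```python
def p_from_table(w, row):
    """Bracket or interpolate the p-value for w from one row of Table 2.

    row is a list of (p, critical_w) pairs sorted by increasing p.
    Returns ("below", p_min), ("above", p_max) or ("approx", p).
    """
    if not 0 < w <= 1:
        raise ValueError("W must lie in (0, 1]; check the calculation")
    p_min, w_min = row[0]
    p_max, w_max = row[-1]
    if w < w_min:
        return ("below", p_min)  # p-value is less than p_min
    if w > w_max:
        return ("above", p_max)  # p-value is greater than p_max
    for (p_lo, w_lo), (p_hi, w_hi) in zip(row, row[1:]):
        if w_lo <= w <= w_hi:
            if w_hi == w_lo:
                return ("approx", p_lo)
            fraction = (w - w_lo) / (w_hi - w_lo)
            return ("approx", p_lo + fraction * (p_hi - p_lo))
    raise ValueError("row must be sorted by increasing critical value")


# Assumed Table 2 row for n = 18 (verify against the published table)
ROW_N18 = [(0.01, 0.858), (0.02, 0.874), (0.05, 0.897), (0.10, 0.914),
           (0.50, 0.956), (0.90, 0.978), (0.95, 0.982), (0.98, 0.986),
           (0.99, 0.988)]

for w in (0.824085184, 0.960953212, 0.996684879):
    print(w, p_from_table(w, ROW_N18))
```

A value such as W = 1,012157199 raises an error instead of returning a p-value, which mirrors the advice to recheck the calculation whenever W exceeds 1.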
Soira

November 21, 2015 at 6:37 pm

Hello Dr. Zaiontz,
I really appreciate your examples and web page on real statistics using Excel. I tried the Shapiro-Wilk test on my data (n=10). However, I have many variables, so I am testing normality for each of them. For one of the variables, I got W=0.5679, and when I referred to the Shapiro-Wilk test sheet, I could not find the p-value. Could something be wrong with my data itself? Or is there an extended table? Please help.
Thanks
Reply
- Charles
  
  November 24, 2015 at 10:55 am
  
  Soira,
  
  Since the value for W is less than the critical value at p = .01, you can conclude from the table that the p-value is less than .01.
  
  Alternatively, you can use the Royston version of Shapiro-Wilk test. See the webpage
  https://real-statistics.com/tests-normality-and-symmetry/statistical-tests-normality-symmetry/shapiro-wilk-expanded-test/
  
  In this case, you can calculate the p-value as SWPROB(10,.5679) = 2.3E-05.
  
  Charles
  Reply
  - Soira
    
    November 25, 2015 at 11:19 am
    
    thank you Charles
    Reply
Joana

October 23, 2015 at 1:34 pm

Hi Charles,

Thanks for the information on the website. It is really useful. However when I applied the Shapiro test to my data it gave me an error. This error does not happen for larger samples (mine is 4) like 5 or 6. Is there a limitation to the excel function that does not allow small samples to be tested with this function?

Thanks
Reply
- Charles
  
  October 23, 2015 at 3:21 pm
  
  It looks like it should work for samples of size at least 5.
  Charles
  Reply
  - Joana
    
    December 1, 2015 at 2:17 pm
    
    Hi Charles,
    
    I tried the Shapiro test on my data again and surprisingly it worked for a sample of size 3 but still not 4… Just thought I should let you know.
    
    Thanks for the website
    
    Joana
    Reply
    - Charles
      
      December 2, 2015 at 10:06 am
      
      Joana,
      Thanks for finding this bug.
      The original test for sample size of 4 does work (setting the second argument in the SHAPIRO or SWTEST function to False). The Royston version of the test has the bug when the sample size is 4.
      I will provide a fix in the next release.
      Thanks again for helping me improve the accuracy of the software.
      Charles
      Reply
Tony O

August 18, 2015 at 10:19 pm

I have gone through your explanation and I found it very rewarding and useful. However, I would appreciate an example for a sample whose size is odd rather than even, as in your two examples.

Regards
Reply
- Charles
  
  August 23, 2015 at 7:51 am
  
  Tony,
  The sample in the second example has an odd number of elements. The middle element is not used.
  Charles
  Reply
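To see why the middle element drops out for an odd sample size, here is a minimal Python sketch of the W statistic. The function name `shapiro_wilk_w` is an assumption for illustration; the weights must be supplied by the caller from Table 1 of the Shapiro-Wilk Tables. The only weight hard-coded in the demo is the one for n = 3, which is 1/sqrt(2), about 0.7071.

```python
def shapiro_wilk_w(data, weights):
    """Compute W = b^2 / SS for the original Shapiro-Wilk test.

    weights are the a_1..a_m coefficients for this sample size,
    where m = n/2 for even n and (n-1)/2 for odd n.
    """
    x = sorted(data)
    n = len(x)
    m = n // 2  # integer division covers both the even and the odd case
    if len(weights) != m:
        raise ValueError(f"expected {m} weights for n = {n}")
    # Pair the largest with the smallest, the second largest with the second
    # smallest, and so on. For odd n the median x[m] is never paired.
    b = sum(a * (x[n - 1 - i] - x[i]) for i, a in enumerate(weights))
    mean = sum(x) / n
    ss = sum((v - mean) ** 2 for v in x)
    if ss == 0:
        raise ValueError("all data values are equal, so W is undefined")
    return b * b / ss


# Odd sample size n = 3: only x3 - x1 enters b; the median (4) does not
print(round(shapiro_wilk_w([2, 4, 9], [0.7071]), 3))  # 0.942
```

The median still contributes to SS, so it affects W through the denominator only.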
Jerry Oppong Adutwum

March 13, 2015 at 4:55 pm

I want to know what happens if data fails the SW test?
Is there any way out?
Reply
- Charles
  
  March 13, 2015 at 10:04 pm
  
  Jerry,
  If data is not normally distributed, then for tests that assume normality you can
  1. use a nonparametric test that doesn’t require normality
  2. transform the data so that the resulting data is sufficiently normal
  In addition, some tests that require normality (e.g. the t test) are sufficiently robust that as long as the data is symmetric the test will usually be ok (although even in these cases, the Mann-Whitney nonparametric test should give similar results).
  Charles
  Reply
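As an illustration of the second option (transforming the data), the sketch below applies a log transformation to a made-up right-skewed sample and compares the sample skewness before and after. The data values and the `skewness` helper are hypothetical, the log transformation is only one common choice (and requires positive data), and the transformed data should still be rechecked with a normality test.

```python
import math


def skewness(values):
    """Population-style sample skewness: m3 / m2^(3/2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed sample (not taken from the webpage)
raw = [1, 2, 2, 3, 3, 4, 5, 8, 13, 40]
logged = [math.log(v) for v in raw]

print("skewness before:", round(skewness(raw), 2))
print("skewness after log transform:", round(skewness(logged), 2))
```

A skewness closer to zero after the transformation suggests the data has become more symmetric, which is what matters most for the robust tests mentioned in the reply.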
Gurumani

January 11, 2015 at 3:29 am

Thank you, Dr. Zaiontz. I am learning a lot from your useful website. When I tried Real Stat for the Shapiro-Wilk test on the data given in the two examples, I got different W and p values from those given in the examples, as follows:
W=b^2/SS 0.971025924 | W 0.971122526
0.5 0.943 | p-value 0.922200674
0.9 0.973 | alpha 0.05
p-value 0.873679 | normal yes

W=b^2/SS 0.873965213 | W 0.874012
0.02 0.855 | p-value 0.03866
0.05 0.881 | alpha 0.05
p-value 0.041882692 | normal no
Could you please explain why the difference? Have I committed any mistake in the calculations?
Reply
- Charles
  
  January 14, 2015 at 11:04 am
  
  I don’t know why you get different results. If you send me a spreadsheet with your calculations I will try to understand why there is a difference.
  Charles
  Reply
sundar

December 8, 2014 at 6:43 am

How do I perform the Durbin-Watson test using Excel or SPSS? Please send step-by-step instructions to my email address.
Reply
Swanand Rishi

October 18, 2014 at 8:35 am

Example 1 is well explained. However, my linearly interpolated value of Wc (p-value) comes out to be 0.89999 instead of 0.876681. The interpolation coefficient is 0.075 per probability of .1, between 0.5 and 0.9. Hence, for an approximate difference of 0.002 in W (0.973 - 0.971), the p-value = 0.89999. Please correct me if I am wrong.
Reply
- Charles
  
  October 20, 2014 at 8:22 pm
  
  The calculation I used was to interpolate between the table values .973 – .943 = .03 and .9 – .5 = .4. So the answer is .9 – .002/.03 * .4 = .873.
  In any case, the value is far more than .05. Note that you can get a more exact value (which doesn’t require interpolation) by using the Royston approximation, as described on the webpage https://real-statistics.com/tests-normality-and-symmetry/statistical-tests-normality-symmetry/shapiro-wilk-expanded-test/
  Charles
  Reply
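The interpolation arithmetic in the reply above can be reproduced with a short Python sketch. The function name `interpolate_p` is an assumption for illustration; the table values are the ones quoted in this thread (W = .943 at p = .5 and W = .973 at p = .9).

```python
def interpolate_p(w, w_lo, p_lo, w_hi, p_hi):
    """Linearly interpolate a p-value for w between two table entries.

    (w_lo, p_lo) and (w_hi, p_hi) are the neighbouring table entries,
    with w_lo <= w <= w_hi.
    """
    if not w_lo <= w <= w_hi:
        raise ValueError("w must lie between the two table values")
    fraction = (w - w_lo) / (w_hi - w_lo)
    return p_lo + fraction * (p_hi - p_lo)


# W = .971 lies between .943 (p = .5) and .973 (p = .9)
print(round(interpolate_p(0.971, 0.943, 0.5, 0.973, 0.9), 3))  # 0.873
```

Starting from the lower table entry and moving up gives the same result as starting from .9 and subtracting .002/.03 * .4, since both describe the same straight line between the two table points.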
Shreya

October 16, 2014 at 7:54 am

Hi Charles,
I found this webpage very useful and it guided me well. Thank you very much. But I would like to know something. How would you rank this test with respect to the A-D and K-S tests?
Shreya
Reply
- Charles
  
  October 16, 2014 at 10:54 am
  
  Hi Shreya,
  I would use SW over KS. I have not used AD and so don’t have an opinion.
  Charles
  Reply
Julien

July 7, 2014 at 10:39 pm

Hi Charles,
Thanks a lot for this web page!!

You said that the function SWTEST ignores all empty and non-numeric cells. Are you sure? Because if I add empty cells at the end of the range R1, the p-value is different.

Also, what is the difference between the original Shapiro-Wilk test and the Royston algorithm, and when do you use one or the other? (Meaning that I don’t know if in the SWTEST I have to write “FALSE” or “TRUE”.)

Thank you very much!
Julien
Reply
- Charles
  
  July 8, 2014 at 7:41 am
  
  Hi Julien,
  
  I just retested the SWTEST and SHAPIRO functions by adding empty and non-numeric cells at the beginning, end and in the middle of the range. The results are all the same. Which version of Excel are you using?
  
  If the values you are looking for are found in the table then you might as well use the original algorithm (although the results using the Royston algorithm are quite similar). Otherwise you should use the Royston algorithm. I tend to use the Royston algorithm always since in that case I don’t need to make any decisions.
  
  Charles
  Reply
  - Julien
    
    July 8, 2014 at 2:06 pm
    
    I use Microsoft Excel for Mac 2011 in English
    Reply
    - Charles
      
      July 8, 2014 at 2:58 pm
      
      Julien,
      Which version of the Real Statistics Resource Pack do you have? You can find this out by entering =VER() in any cell. If it is not one of the latest releases (Release 2.15) then this could account for the problem.
      Charles
      Reply
      - Julien
        
        July 9, 2014 at 11:38 pm
        
        Hi Charles,
        
        It’s the release 2.10.1
      - Charles
        
        July 10, 2014 at 1:13 pm
        
        Julien,
        This is the latest version of the software for the Mac, but it doesn’t contain some of the features that I have added for Windows. In particular WTEST only returns the one-tailed version of the test. You just need to double the value to get the p-value for the two-tailed test. I hope to get a new version for the Mac out soon (as soon as I can get a Mac computer to test it on).
        Charles
    - Charles
      
      July 9, 2014 at 6:46 am
      
      Julien,
      Now I understand the problem. I have not yet updated the Mac version of the software with the latest features. This is why some of the arguments don’t work and why some of the functions don’t handle missing data the same way. My problem is that I don’t have a Mac myself and need to borrow one to test and update the software.
      Charles
      Reply