Shapiro-Wilk Original Test

Basic Concepts

We present the original approach to performing the Shapiro-Wilk Test. This approach is limited to samples of between 3 and 50 elements. A revised approach using the algorithm of J. P. Royston, which can handle samples with up to 5,000 elements (or even more), is described in Shapiro-Wilk Expanded Test.

The basic approach used in the Shapiro-Wilk (SW) test for normality is as follows:

  • Arrange the data in ascending order so that x1 ≤ … ≤ xn.
  • Calculate SS as follows:

    SS = \sum_{i=1}^{n} (x_i - \bar{x})^2, where \bar{x} is the sample mean

  • If n is even, let m =  n/2, while if n is odd let m = (n–1)/2
  • Calculate b as follows, taking the a_i weights from Table 1 (based on the value of n) in the Shapiro-Wilk Tables. Note that if n is odd, the median data value is not used in the calculation of b.

    b = \sum_{i=1}^{m} a_i (x_{n+1-i} - x_i)

  • Calculate the test statistic W = b^2 / SS
  • Find the values in Table 2 of the Shapiro-Wilk Tables (for the given value of n) that are closest to W. The corresponding column gives the p-value for the test, interpolating between columns if necessary.
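
The steps above can be sketched in Python as follows. This is only an illustration, not the Real Statistics implementation: the function name sw_statistic is hypothetical, and the caller is assumed to supply the m coefficients a_1, …, a_m copied from Table 1 for the given n.

```python
def sw_statistic(data, half_coeffs):
    """Return (b, SS, W) for the original Shapiro-Wilk test.

    half_coeffs holds a_1..a_m taken from Table 1 for n = len(data);
    this sketch does not look the coefficients up itself.
    """
    x = sorted(data)                      # x_1 <= ... <= x_n
    n = len(x)
    m = n // 2                            # n/2 if n is even, (n-1)/2 if n is odd
    if len(half_coeffs) != m:
        raise ValueError("expected %d coefficients, got %d" % (m, len(half_coeffs)))
    mean = sum(x) / n
    ss = sum((xi - mean) ** 2 for xi in x)
    # Pair the i-th smallest value with the i-th largest one;
    # for odd n the median is never touched.
    b = sum(a * (x[n - 1 - i] - x[i]) for i, a in enumerate(half_coeffs))
    return b, ss, b * b / ss
```

For instance, sw_statistic([2.0, 1.0, 3.0], [0.7071]) uses the single coefficient for n = 3. Note that the sketch divides by SS, so it fails if all data values are identical.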

For example, suppose W = .975 and n = 10. Based on Table 2 of the Shapiro-Wilk Tables the p-value for the test is somewhere between .90 (W = .972) and .95 (W = .978). You can estimate this p-value using interpolation (see Interpolation).
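
As a rough sketch of this kind of estimate, the following Python helper performs plain linear interpolation between two bracketing table entries. The function name interpolate_p is hypothetical, and linear interpolation is only one of the possible approaches.

```python
def interpolate_p(w, w_lo, p_lo, w_hi, p_hi):
    """Linearly interpolate a p-value between two bracketing Table 2 entries."""
    if not w_lo <= w <= w_hi:
        raise ValueError("W must lie between the two table values")
    frac = (w - w_lo) / (w_hi - w_lo)     # position of W between the entries
    return p_lo + frac * (p_hi - p_lo)

# Values quoted in the text for n = 10: W = .972 at p = .90, W = .978 at p = .95.
p = interpolate_p(0.975, 0.972, 0.90, 0.978, 0.95)
print(round(p, 3))
```

Since W = .975 lies halfway between the two table values, the estimate lands halfway between .90 and .95.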

Examples

Example 1: A random sample of 12 people is taken from a large population. The ages of the people in the sample are shown in column A of the worksheet in Figure 1. Is this data normally distributed?

Figure 1 – Shapiro-Wilk test for Example 1

We begin by sorting the data in column A using Data > Sort & Filter|Sort (see Sorting and Filtering) or the Real Statistics QSORT function (see Sorting and Removing Duplicates), putting the results in column B. We next look up the coefficient values for n = 12 (the sample size) in Table 1 of the Shapiro-Wilk Tables, putting these values in column E.

Corresponding to each of these 6 coefficients a1,…,a6, we calculate the values x12 – x1, …, x7 – x6, where xi is the ith data element in sorted order. E.g. since x1 = 35 and x12 = 86, we place the difference 86 – 35 = 51 in cell H5 (the same row as the cell containing coefficient a1). Column I contains the product of the coefficients and difference values. E.g. cell I5 contains the formula =E5*H5. The sum of these values is b = 44.1641, which is found in cell I11 (and again in cell E14).

We next calculate SS as DEVSQ(B4:B15) = 2008.667 (cell E13). Thus W = b^2/SS = 44.1641^2/2008.667 = .971026 (cell E15).
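
The worksheet calculation can be cross-checked outside Excel. The Python sketch below mirrors the columns described above. It assumes the twelve ages are the ones quoted by a reader in the comments further down, and that the six coefficients are the n = 12 entries of Table 1; check both against the worksheet and the table before relying on them.

```python
# Ages from Example 1 (as listed by a reader in the comments on this page).
ages = [65, 61, 63, 86, 70, 55, 74, 35, 72, 68, 45, 58]
# Coefficients a_1..a_6 for n = 12; assumed to match Table 1, verify before reuse.
a = [0.5475, 0.3325, 0.2347, 0.1586, 0.0922, 0.0303]

x = sorted(ages)                                        # column B
n = len(x)
diffs = [x[n - 1 - i] - x[i] for i in range(len(a))]    # x12-x1, ..., x7-x6 (column H)
products = [ai * d for ai, d in zip(a, diffs)]          # column I
b = sum(products)
mean = sum(x) / n
ss = sum((xi - mean) ** 2 for xi in x)                  # same quantity as DEVSQ
w = b ** 2 / ss
print(diffs, round(b, 4), round(ss, 3), round(w, 6))
```

The printed values should agree with cells I11, E13 and E15 of the worksheet up to rounding.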

p-value using interpolation

We now look for .971026 when n = 12 in Table 2 of the Shapiro-Wilk Tables and find that the p-value lies between .50 and .90. The W value for .5 is .943 and the W value for .9 is .973.

Interpolating .971026 between these values (using linear interpolation), we arrive at p-value = .873681. Since this p-value is based on linear interpolation, it is not very accurate, but the important thing is that it is much higher than the alpha value: since p-value = .87 > .05 = α, we retain the null hypothesis that the data are normally distributed.
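
Written out, the linear interpolation uses only the two bracketing table entries quoted above (W = .943 at p = .50 and W = .973 at p = .90):

```latex
p \approx 0.50 + \frac{0.971026 - 0.943}{0.973 - 0.943}\,(0.90 - 0.50)
  = 0.50 + 0.9342 \times 0.40
  \approx 0.8737
```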

Comparison with other tests

Example 2: Using the SW test, determine whether the data in Example 1 of Graphical Tests for Normality and Symmetry (repeated in column A of Figure 2) are normally distributed.

Figure 2 – Shapiro-Wilk test for Example 2

As we can see from the analysis in Figure 2, p-value = .0432 < .05 = α, and so we reject the null hypothesis at the 5% significance level and conclude that the data are not normally distributed. This is quite different from the result using the KS test that we found in Example 2 of Kolmogorov-Smirnov Test, but consistent with the QQ plot shown in Figure 5 of that webpage.

Real Statistics Support

Real Statistics Function: The Real Statistics Resource Pack contains the following functions.

SHAPIRO(R1, FALSE) = the Shapiro-Wilk test statistic W for the data in R1

SWTEST(R1, FALSE, interp) = p-value of the Shapiro-Wilk test on the data in R1

SWCoeff(n, j, FALSE) = the jth coefficient for samples of size n

SWCoeff(R1, C1, FALSE) = the coefficient corresponding to cell C1 within sorted range R1

SWPROB(n, W, FALSE, interp) = p-value of the Shapiro-Wilk test for a sample of size n for test statistic W

The functions SHAPIRO and SWTEST ignore all empty and non-numeric cells. The range R1 in SWCoeff(R1, C1, FALSE) should not contain any empty or non-numeric cells.

When performing the table lookup, the default is to use the recommended type of interpolation (interp = TRUE). To use linear interpolation, set interp to FALSE. See Interpolation for details.

For Example 1 of Chi-square Test for Normality, SHAPIRO(A4:A15, FALSE) = .874 and SWTEST(A4:A15, FALSE, FALSE) = SWPROB(15,.874,FALSE,FALSE) = .0419 (referring to the worksheet in Figure 2 of Chi-square Test for Normality).

Note that SHAPIRO(R1, TRUE), SWTEST(R1, TRUE), SWCoeff(n, j, TRUE), SWCoeff(R1, C1, TRUE), and SWPROB(n, W, TRUE) refer to the results using the Royston algorithm, as described in Shapiro-Wilk Expanded Test.

For compatibility with the Royston version of SWCoeff, when j ≤ n/2 then SWCoeff(n, j, FALSE) = the negative of the value of the jth coefficient for samples of size n found in the Shapiro-Wilk Tables. When j = (n+1)/2, SWCoeff(n, j, FALSE) = 0 and when  j > (n+1)/2, SWCoeff(n, j, FALSE) = -SWCoeff(n, n–j+1, FALSE).
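
The sign convention just described can be illustrated with a short Python sketch. The function name full_coefficients is hypothetical and this is not the add-in's own code. It expands the m table values into n signed coefficients, so that b becomes a single sum of coefficient times sorted value. The data and n = 12 coefficients are assumed to be those of Example 1.

```python
def full_coefficients(n, half_coeffs):
    """Expand table values a_1..a_m into n signed coefficients."""
    m = n // 2
    if len(half_coeffs) != m:
        raise ValueError("need exactly n // 2 table coefficients")
    lower = [-a for a in half_coeffs]            # j <= n/2: negated table value
    middle = [0.0] if n % 2 else []              # j = (n+1)/2, only when n is odd
    upper = list(reversed(half_coeffs))          # j > (n+1)/2: mirrored, sign flipped
    return lower + middle + upper

# Example 1 data and n = 12 coefficients (assumed; verify against Table 1).
ages = [65, 61, 63, 86, 70, 55, 74, 35, 72, 68, 45, 58]
a12 = [0.5475, 0.3325, 0.2347, 0.1586, 0.0922, 0.0303]
c12 = full_coefficients(12, a12)
b = sum(c * x for c, x in zip(c12, sorted(ages)))
print(round(b, 4))
```

With the signed coefficients, b is the same value as the one obtained from the paired differences, because each pair contributes a_i·x_{n+1-i} − a_i·x_i.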

Reference

Shapiro, S.S. & Wilk, M.B. (1965) An analysis of variance test for normality (complete samples). Biometrika, Vol. 52, No. 3/4, pp. 591–611.

http://webspace.ship.edu/pgmarr/Geo441/Readings/Shapiro%20and%20Wilk%201965%20-%20An%20Analysis%20of%20Variance%20Test%20for%20Normality.pdf

174 thoughts on “Shapiro-Wilk Original Test”

  1. Hi Charles,

    I am doing a dissertation and have run outlier tests and histograms at a 90% confidence level. Should my Shapiro-Wilk test also be at the 90% confidence level, or the usual 95%?

    Thank you

  2. Hello Charles,
    When performing the SW test with the data from example 1 and 2 with SPSS the results of p-value do not match, what could be the difference?

  3. Thank you very much Charles. I follow your examples for Excel in WPS Spreadsheet and the results are fine. One question, do you write some paper about this?
    Thank you…

    • Cleonir,
      Thank you for your kind remarks.
      I have had the intention to write a book about this and other statistics subjects but supporting this website and the Real Statistics software tends to take up the spare time that I have.
      Charles

  4. Hi Charles,
    Thank you for the great work you are doing.
    I am dealing with a set of data that has failed the Shapiro-Wilk’s test for normality (ie. the p-value is less than 0.05). I transformed the data by evaluating its natural logarithms and conducted the Shapiro-Wilk’s test on it. (hoping that the data would be lognormal). It still fails this test for log normality. According to the literature, such data sets should be lognormally distributed. What other statistical tests can I use to test whether my data is lognormal?
    Thank you in anticipation

    • Hello Newman,
      I assume that you found that the set of ln(x) where x is in your original sample was not normally distributed per the Shapiro-Wilk test. Unless there are a lot of ties, this tends to be the best test for normality. You could try the Anderson-Darling test, but if the original data is truly lognormally distributed, then the SW test should confirm this. What is the p-value of the test?
      Charles

  5. Hi,

    With the example data (65,61,63,86,70,55,74,35,72,68,45,58) I get the following p-value from both R and a Python function, 0.9216, as opposed to the 0.873681129 you get. I can reproduce your value of 0.873681129 in spreadsheet calculations. Do you have any idea why there’s a discrepancy, please?

    Thanks,

    James. Picksley

    • Hello James,
      They are using the Royston version of the Shapiro-Wilk test. This is also the default version in Real Statistics. If the data is in range A1:A12, then SWTEST(A1:A12) = .9216, while SWTEST(A1:A12,FALSE) = .8737. The problem with the calculation in the original version of the SW test is that the interpolation that is being used is probably not so accurate.
      Charles

      • Hi Charles,

        Thanks for the reply. I’d just worked it out and then came here to see you’d put a response up! I spotted it because I have a set of data where W in the original test is outside of the range of table 2, so I wasn’t getting a valid result, so I ran it through the extended version and got a match with R and python.

        Sorry to have wasted your time. Thanks very much for this website. I’ve found it very useful over the last few years.

        Cheers,

        James.

  6. First of all, greetings to Charles from here in Bolivia, and thanks for this publication. For many years I could not run the Shapiro-Wilk test on data with more than 50 samples. Now it is possible with this expanded test, and I managed to work through all the formulas. I have just one observation. In the end, the normality hypothesis is confusing: for the example with 12 samples, the original test says no and the expanded one says yes. Both should coincide; in this case the normality of the data would be accepted, or perhaps I am mistaken. I usually work at the 5% significance level rather than at the 95% confidence level, since the test itself reports its result as a significance level. In this case I get 0.078 rounded, but if it is subtracted from 1, it should be compared at the confidence level. Perhaps I am being very crude, but thank you very much for your contribution.

    • Hello Ruben,
      Thank you for your kind remarks.
      Both the original and expanded versions of the Shapiro-Wilk test should give similar results. If the p-value < alpha then you have evidence that the data does not come from a normally distributed population.
      Charles

  7. Charles, Can Shapiro-Wilk (SW) be used on datasets that have tied values? The State has directed me to use SW for groundwater monitoring data, and there are often tied values in groundwater data. I’ve been told there is additional data manipulation that must be done to use SW on data with tied values. I am hoping you can tell me what these additional manipulations are. Here’s an example data set: 0.075, 0.077, 0.1, 0.1, 0.1, 0.15, 0.19, 0.2, 0.2, 0.23, 0.27, 0.28, 0.28, 0.29, 0.3, 0.33, 0.34, 0.35, 0.37, 0.37, 0.44, 0.52, 0.56, 0.58. I use excel to calculate W and get W=0.9437 (without accounting for ties).
    Thank you.

  8. Hi Charles,
    I’m testing a bunch of my data for my dissertation so I can do further analysis.
    On my last data set my W value came out being super low at 0.6927 (n=12).

    As the W values in the chart don’t go down that low does this just mean that I accept the null hypothesis and my data isn’t normally distributed?

    • Hi Kai,
      Since W is lower than the lowest value on the table this means that p-value < .01, which means that you reject the null hypothesis that your data is normally distributed.
      Charles

  9. Hi there,

    I am doing a Shapiro Wilk for n=15 data – however my w value comes out above 1. therefore i cannot continue with the calculations as shown.

  10. Thank you very much for your help!
    I came across this paper with tables for W. Just wanted to share 🙂
    Rudolph S. Parrish (1992) New tables of coefficients and percentage points for the w test for normality, Journal of Statistical Computation and Simulation, 41:3-4, 169-185,
    DOI: 10.1080/00949659208811399
    https://sci-hub.tw/10.1080/00949659208811399
    Thank you again!

  11. Hi Charles,
    Thank you for your sharing. I have some questions about the normality test in excel.
    I do the normality test in excel and spss. But the w value and p value are different.
    In excel: w value= 0.953, p value=0.367, while in spss: statistic value=0.952, p value=0.273.
    I don’t understand why they are different?
    Hope your reply!
    thanks

  12. Hello Charles,
    Thanks for your sharing. It’s really help me a lot.
    While I have a problem.
    I do the shapiro-wilk test in excel and spss. But the w values are not equal.
    my example: n=25, w value calculated by excel is 0.953, while w value calculated by spss is 0.952, and also the p value is not equal, I used Linear Interpolation to calculate, and the p value calculated by linear interpolation is 0.367, but the p value in spss is 0.273.
    I don’t know why they are not equal.

    • That the W value is different by .001 is not so surprising since some sort of approximation is used. The difference in p-values is likely due to the choice of interpolation techniques. The Real Statistics software (for SWPROB and SWTEST) doesn’t use linear interpolation and in fact returns a value of .293. This too is an estimate. I don’t know whether the SPSS or Real Statistics estimate is better, but both give values that support the assumption of normality.
      Charles

  13. Hello Charles,

    I have three queries:
    1. I am working on three variables EI, CSS and PT(1 independent , 2 dependent
    ). do i need to to check the normality in totality or of individual construct.
    2. After calculating normality using SW test (N=551) EI sig=.054, CSS sig=.056 and PT=.251. plz suggest should i go with it or drop.
    3. When i am calculating the same in totality EI+CSS+PT sig=.213

    also checked for skewness and kurtosis values they are falling the acceptable limits.
    if possible kindly give some references too.
    Plz throw some light and give ur suggestions

    • Hello Daman,
      1. It depends on what hypothesis you are testing and what test you are using.
      2. The first two are marginal, but probably close enough. The third is clearly out of range for normality. Some tests are pretty robust to violations of normality and so it depends on the shape of the distribution as to whether you can still use that test.
      3. Probably not normal, but it is unlikely that your test will require this measurement.
      4. Depends on whether you are saying “falling into the acceptable limits” or “failing”. If skewness and kurtosis are falling within the acceptable limits, then it is likely that your data is sufficiently normally distributed. See d’Agostino-Pearson Test
      5. References: See the tutorial at Testing for Normality
      Charles

      • Dear Charles,
        Thanks for your great work!
        I am new to the data analysis function in Excel. I encountered one problem when performing an analysis. I have a set of data with a sample size of 239, and the p-value of the Shapiro-Wilk test is displayed as non-numerical (e.g. 6.08116E-08). However, the p-value of the Pearson test is displayed normally. I am using Excel Professional Plus 2010. Could you help me find the cause of the problem? Thanks!

  14. pls help me. i dont understand how p value calculated? in my case w=0.957575962, that value between 0.9 and 0.95 (n=13). p value is how much? pls tell me method that is how calculate p value thanks

    • Hello Soko,
      Based on the table at https://real-statistics.com/statistics-tables/shapiro-wilk-table/ if W = .974 then the p-value = .90, while if W = .979 then the p-value = .95. Since W = .957575962 is between W = .945 and W = .974, the p-value for your test is between .50 and .90, probably a lot closer to .50 than .90 since .957575962 is closer to .945 than to .974.
      Assuming that you have set your significance level at alpha = .05, no matter which value between p-value .50 and .90 you choose, you don’t have a significant result (since any such value is much higher than .05) and so you are safe to assume that your data is normally distributed.
      To get a more exact result for the p-value you can use interpolation. The various approaches are described on the following webpage:
      https://real-statistics.com/statistics-tables/interpolation/
      I believe that the SWTEST function in the Real Statistics Resource Pack uses log interpolation, but linear interpolation (the simplest type) will give good enough results.
      Charles

  15. Hi, I may have missed this small detail, but maybe you would be kind enough to give me some help… I’m using R to calculate the SW test on my data which is 21 and 25 samples. However, I’m having a hard time figuring out how to actually report the results in my paper… is there a good protocol/precedent/format that makes it sound nice and succinct? (I think this is what Sundar was asking also.)
    Do I just say something like: After running a SW test for normality W=0.96, p = 0.41, there is no indication that the data set is not normally distributed. (Do I need to included degrees of freedom, or some other #s in there?)
    Thanks.

    From R:
    > shapiro.test(eAp)
    Shapiro-Wilk normality test
    data: eAp
    W = 0.95957, p-value = 0.4059

    • Matt,
      I don’t know whether there is an approved approach. I would simply say that based on the Shapiro-Wilk test, the normality assumption is met. If you want you can insert (p = 0.41).
      Charles

      • I agree; however, in your example here-with 12 samples-they aren’t very close. If you’re going to uses exponential estimates to expand Shapiro’s table, I think you need at least 6 exponentials to do a proper job. The worst is small samples.

        When using exponential estimates, Excels limit appears to be about 6 exponentials before the 18 digit precision fails.

        prof bill -btw, I really appreciate your Excel examples and list your links to my computer wise students. There’s nothing like your examples any where on the internet. thx Dr. Dude!

        • I am pleased that you and your students are getting value from the Real Statistics website and examples.
          Yes, for small samples, the original version should be better.
          Charles

  16. Hi Dear from brazil ,

    My name is Fernando. Thanks for the explanation of the Shapiro-Wilk normality test; I use it for method validation in the pharmaceutical industry.

    I’d like to know how you found the p-value in Excel for Shapiro-Wilk.

    Best regards, and thank you for your help in this matter.

  17. Hi,

    I am attempting to use the SWTEST and/or SWPROB functions described above after installing your RealStatistics add-in. Unfortunately, I am receiving errors (The SHAPIRO function works fine, though). I have screenshots of the errors, however, I am unable to paste them into this message. Please advise.

    • Daniel,
      If you send me an Excel file with your data and test results (at least until you get the error message), I will try to figure out what is going on.
      Charles

  18. Sir,
    I have a Shapiro-Wilk test result with a statistic and a p-value. My result is 0.-19 and the p-value is 0.18. How should I interpret this result? Please kindly reply.

    • Sundar,
      As explained in Example 1, since p = .19 > .05 = alpha, the result indicates that the normality assumption is satisfied.
      In your comment you say that you got a result of 0.-19. I don’t understand what this means.
      Charles

    • Sorry, but I don’t know what ,918** 51 ,002 is referring to. How to interpret the results from the Shapiro Wilk test carried out by Real Statistics is explained on the webpage.
      Charles

  20. Hi Charles,

    If one gets a value for W = b2/SS = 0.837 < 0.884 (with n=24) which is not in p-value tables, how would you handle that situation? Would this imply that there has been a calculation error or is automatically a reject? Many thanks for putting together this helpful web site!

    • Julian,
      Since the smallest value for n = 24 is .884 (at alpha = .01), this means that p-value < .01, which is usually interpreted as significantly different from normality.
      Charles

  21. Hi, could you explain me why you use that b formula instead of the “standard” formula used on wikipedia for calculate W? Is there any difference? Thanks

    • Giacomo,
      It should be equivalent to formula shown in Wikipedia. I can’t recall whether I used the version in the original Shapiro-Wilk paper or elected to use the approach that I did to emphasize the symmetry aspect of the calculation.
      Charles

  22. Dear Charles,
    first I would like to say that the Add-in seems great however I did fail to follow your example by calculating it with the RealStat Add-in for Excel 2016.

    I´m using the the “example 1” data set “age”.

    Using the add-in I got:

    W 0.971066437
    p-value 0.921648864
    alpha 0.05
    normal yes

    These results are different from your manual calculations which I could follow and got the same results.

    Do you have any idea what the reason is?
    I would love to use the add-in but I need to be sure it is working the right way.

    Best regards,
    Stefan

    • Stefan,
      There are two versions of the Shapiro-Wilk test: the original version, which is described on the referenced webpage, and Royston’s version, which is described on the webpage https://real-statistics.com/tests-normality-and-symmetry/statistical-tests-normality-symmetry/shapiro-wilk-expanded-test/
      The add-in value that you describe uses Royston’s version. Actually, if you look at the output for W from the add-in, it will contain the formula =SHAPIRO(A4:A15). If you change the formula to =SHAPIRO(A4:A15,FALSE) you will get the value of W as calculated by Shapiro-Wilk’s original algorithm (the same is true for the p-value, which is calculated by SWTEST).
      The original version works well for smaller samples, but doesn’t support larger samples. This is the advantage of the Royston version.
      Charles

  23. My W value is 1.273573913 for 22 samples. I can’t find a table that goes that high, and an online calculator gave me an error. What does this mean?

  24. Hi, Charles,

    Thank you for your very helpful site.
    My sample consists of 5 cases (i.e 37;105;110;150;216), resulting W = 0,9762. I want to do the SW-Test with a probability of error of 5%.
    Do I have to compare my calculated W with W(p=0,95)=0,986 or with W(p=0,05)=0,762?
    Thank you very much for your answer!

    Ulrike

    • Ulrike,
      As described on the referenced webpage, if W =.971, then p = .874 (via interpolation between .5 and .9). Since .874 > .05, then we conclude that we don’t have evidence to reject the hypothesis that the data is normally distributed.
      Another way to look at this is that if W =.971 >= .762 (the W value at .05), then the data is considered to be normally distributed.
      Charles

  25. Hi admin

    This is an excellent explanation of the Shapiro-Wilk test. It saved me lots of time. However, I still have a question about this test: how are the weight values calculated? What do they mean?

    Thank you

  26. For n=4, my calculated value of W is 0.677. The smallest critical value for 0.01 when n=4 is 0.687. How do I interpret this result given that my W value isn’t even within any range given? I’ve double checked my data and don’t see any typos in my data recording or calculations.

  27. I tried this on a sample of 41. I got a W = 0,90728. According to the table, the closest value is 0,92 (p = 0,01) – none are lower with the same sample size. Do I just use this value or should some measure be taken?
    Also, I need to make sure that I understand the method correctly. The p-value i get from interpolating is the actual p-value and has to be lower than a threshold value (say p = 0,05) in order to reject the null hypothesis – correct?

    Thanks in advance

    • Magnus,
      Yes, the approach you are using is correct. Since .90728 < .92, you can deduce that p < .01. In fact, if you use the Real Statistics formula =SWPROB(41,.90728) you get the p-value = .002739. Since this is much lower than .05, you do indeed reject the null hypothesis that the data is normally distributed.
      Charles

      • Thank you very much.
        I have another issue though. What is more reliable (and under what conditions), QQ plot or SW-test? I seem to get a rejection of the null hypothesis using SW, but the QQ plot shows very small deviations – or so it appears to me. Is the SW test very sensitive with large (e.g. n = 40) samples?

        • Magnus,
          I find it easier to use the SW test since it is easier to interpret its results, but both are fairly accurate. Also, since most tests are fairly robust to violations of normality, either test can show whether the data is really departing from normality. Both tests can be used with large samples.
          Charles

  28. My entire population is just 30 values. Can the Shapiro-Wilk test also be applied to a population rather than just a sample?
    Am I correct in assuming that it is simply a test for symmetry? My situation is that I have hundreds of datasets of 30 values and I find that even if the dataset is symmetrical the distribution of the values can be a long way from the 68-95-99.7 probability bell-curve.
    For example, for one dataset, the number of entries in 1Sd bins from -2sd to 2sd is … 7,4,13,5, which produces a SW p-value of 0.43. In contrast to this distribution the “68-95-99.7” probability curve suggests that a population of 30 should be either 5, 10, 10, 4 or 4, 10, 10, 5.
    Is it good practice to identify those datasets where the distribution is a long way from 68-95-99.7? If so, how is that done?

    Thanks in advance.

    • John,
      You can use the Shapiro-Wilk test for a population. Shapiro-Wilk tests for normality, not just symmetry.
      Charles

      • Thanks Charles.
        Another question that might interest other readers. I’m using your Excel method and I’ve written a Fortran subroutine to calculate the p_value. With the same input data they give the same results (as they should).
        When I put the same data into http://contchart.com/goodness-of-fit.aspx I get a different p-value for the Shapiro-Wilks test.
        Before I contact that website to ask them to check their processing, do you have any thoughts on the matter?

  29. Dear Dr. Zaiontz,

    Thank you very much for your great tool.
    I recently downloaded the latest Release (3.5.3) for the Mac version of Excel. In this one, the SWTEST function apparently gives a #VALUE! output with range size greater than 3. Is there a way to fix this? If not, where may I find and download a previous Release?
    I thank you in advance for your attention.

    Stefano

    • Dear Stefano,
      I don’t think I made any changes to this function since the previous release. In any case, if you send me an Excel file with your data and function results I will try to figure out what is causing this. You can send the file to my email address, which you can find at Contact Us.
      Charles

  30. These are the W values I have got from a raw data of response times for n=18.
    1,012157199 0,996684879 0,824085184 0,960953212 1,006536182
    Most of these values of W are out of range from the (n/p)table. Does that mean I have some calculation errors? If not, then how do I interpret the data?

    • Pri,

      Since W = 0,824085184 is less than the smallest value in the table for n = 18 and p = .01, it just means that p < .01. Actually, I calculate that the p-value = 0,003394 using the Royston approximation that is described elsewhere on the website. This means that your data is likely not normally distributed.

      Similarly, W = 0,996684879 is greater than the largest value in the table for n = 18 and p = .99. This just means that the p-value is larger than .99, and so your data is probably normally distributed.

      The value W = 0,9609532124 is not in the table, but you know that it occurs between the values p = .5 and p = .9. You can interpolate (as described on the referenced webpage) to come up with an approximate p-value of .59, but in any case the value is much higher than .05, and so the random sample probably comes from a population that is normally distributed.

      Now the cases where W > 1 are causes for concern since I believe the value for W can’t exceed 1. There is a good chance that you have made a calculation error.

      Charles

  31. Hello Dr. Zaiontz,
    I really appreciate your examples and web page on real statistics using excel. I tried Shapiro-Wilk test on my data (n=10),however, I have got many variables, so I am testing the normality for each of the variables. So for one of the data, I got W=0.5679 and I referred the Wilk Test sheet, I could not get the P-values. Could something be wrong with my data itself? Or is there an extended table? Please help.
    Thanks

  32. Hi Charles,

    Thanks for the information on the website. It is really useful. However when I applied the Shapiro test to my data it gave me an error. This error does not happen for larger samples (mine is 4) like 5 or 6. Is there a limitation to the excel function that does not allow small samples to be tested with this function?

    Thanks

      • Hi Charles,

        I tried again the Shapiro test on my data and surprisingly it work for a sample size 3 but still not 4… Just thought I should let you know.

        Thanks for the website

        Joana

        • Joana,
          Thanks for finding this bug.
          The original test for sample size of 4 does work (setting the second argument in the SHAPIRO or SWTEST function to False). The Royston version of the test has the bug when the sample size is 4.
          I will provide a fix in the next release.
          Thanks again for helping me improve the accuracy of the software.
          Charles

  33. I have gone through your explanation and I found very rewarding and useful. However, will appreciate an example for sample that is odd and not even like your two examples.

    Regards

    • Jerry,
      If data is not normally distributed, then for tests that assume normality you can
      1. use a nonparametric test that doesn’t require normality
      2. transform the data so that the resulting data is sufficiently normal
      In addition, some tests that require normality (e.g. the t test) are sufficiently robust that as long as the data is symmetric the test will usually be ok (although even in these cases, the Mann-Whitney nonparametric test should give similar results).
      Charles

  34. Thank you Dr. I am learning a lot from your useful website. When I tried Real Stat for the Shapiro-Wilk test on the data given in the two examples, I got W and p-values different from those given in the examples, as follows:

    Example 1, calculation as on this page: W = b^2/SS = 0.971025924; table entries p = 0.5 at W = 0.943 and p = 0.9 at W = 0.973; p-value = 0.873679
    Example 1, Real Statistics output: W = 0.971122526; p-value = 0.922200674; alpha = 0.05; normal = yes

    Example 2, calculation as on this page: W = b^2/SS = 0.873965213; table entries p = 0.02 at W = 0.855 and p = 0.05 at W = 0.881; p-value = 0.041882692
    Example 2, Real Statistics output: W = 0.874012; p-value = 0.03866; alpha = 0.05; normal = no

    Could you please explain why the difference? Have I made a mistake in the calculations?

    • I don’t know why you get different results. If you send me a spreadsheet with your calculations I will try to understand why there is a difference.
      Charles

  35. The example 1 is well explained. However, my linearly interpolated value of Wc (p-value) comes out to be 0.89999 instead of 0.876681. The interpolation coeffcient is 0.075 per probability of .1, between 0.5 and 0.9. Hence for approx. diff. of 0.002 in W (0,973-0,971), p value = 0.89999. Pl. correct me if wrong.

  36. Hi Charles,
    I found this webpage is very useful and it guided me so well. Thank you very much. But I would like to know something..How will you rank this test with respect to A-D and K-S test?
    Shreya

  37. Hi Charles,
    Thanks a lot for this web page!!

    You said that the function SWTEST ignores all empty and non-numeric cells. Are you sure? If I add empty cells at the end of the range R1, the p-value is different.

    Also, what is the difference between the original Shapiro-Wilk test and the Royston algorithm, and when do you use one or the other? (Meaning that I don’t know whether I have to write “FALSE” or “TRUE” in SWTEST.)

    Thank you very much!
    Julien

    • Hi Julien,

      I just retested the SWTEST and SHAPIRO functions by adding empty and non-numeric cells at the beginning, end and in the middle of the range. The results are all the same. Which version of Excel are you using?

      If the values you are looking for are found in the table then you might as well use the original algorithm (although the results using the Royston algorithm are quite similar). Otherwise you should use the Royston algorithm. I tend to use the Royston algorithm always since in that case I don’t need to make any decisions.

      Charles

        • Julien,
          Which version of the Real Statistics Resource Pack do you have? You can find this out by entering =VER() in any cell. If it is not one of the latest releases (Release 2.15) then this could account for the problem.
          Charles

          • Julien,
            This is the latest version of the software for the Mac, but it doesn’t contain some of the features that I have added for Windows. In particular WTEST only returns the one-tailed version of the test. You just need to double the value to get the p-value for the two-tailed test. I hope to get a new version for the Mac out soon (as soon as I can get a Mac computer to test it on).
            Charles

        • Julien,
          Now I understand the problem. I have not yet updated the Mac version of the software with the latest features. This is why some of the arguments don’t work and why some of the functions don’t handle missing data the same way. My problem is that I don’t have a Mac myself and need to borrow one to test and update the software.
          Charles

