Basic Concepts
A key assumption in regression is that the error terms are independent of each other. On this webpage, we present a simple test to determine whether there is autocorrelation (aka serial correlation), i.e. where there is a (linear) correlation between the error term for one observation and the next. This is especially relevant with time series data where the data are sequenced by time.
The Durbin-Watson test uses the following statistic:
$$d = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$$
where the ei = yi – ŷi are the residuals, n = the number of elements in the sample, and k = the number of independent variables.
d takes on values between 0 and 4. A value of d = 2 means there is no autocorrelation. A value substantially below 2 (and especially a value less than 1) means that the data are positively autocorrelated, i.e. on average, each residual is close to the next one. A value of d substantially above 2 means that the data are negatively autocorrelated, i.e. on average, each residual is far from the next one.
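The calculation can be sketched in a few lines of Python, assuming the usual definition of d as the sum of squared differences between successive residuals divided by the sum of squared residuals. The function name durbin_watson and the two residual lists are made up for illustration; they are not the data from Figure 1.

```python
def durbin_watson(residuals):
    """Return the Durbin-Watson statistic d for a sequence of residuals."""
    e = list(residuals)
    if len(e) < 2:
        raise ValueError("need at least two residuals")
    # Numerator: squared differences between successive residuals (i = 2..n).
    numerator = sum((e[i] - e[i - 1]) ** 2 for i in range(1, len(e)))
    # Denominator: sum of squared residuals (i = 1..n).
    denominator = sum(x * x for x in e)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator


# Illustrative residuals only (hypothetical values).
alternating = [1.0, -1.0, 1.0, -1.0]  # sign flips every step
constant = [1.0, 1.0, 1.0, 1.0]       # each residual equals the next

print(durbin_watson(alternating))  # 3.0, above 2: negative autocorrelation
print(durbin_watson(constant))     # 0.0, below 2: positive autocorrelation
```

The two toy inputs show the two extremes described above: residuals that keep flipping sign push d above 2, while residuals that stay close to their predecessor push d toward 0.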
Example
Example 1: Find the Durbin-Watson statistic for the data in Figure 1.
Figure 1 – Durbin-Watson Test
The d statistic (cell J3) is 0.725951, but what does this tell us about the autocorrelation?
Hypothesis Testing
The Durbin-Watson statistic can also be tested for significance using the Durbin-Watson Table. For each value of alpha (.01 or .05) and each value of the sample size n (from 6 to 2000) and each value of the number of independent variables k (from 1 to 20), the table contains a lower and upper critical value (dL and dU).
Since most regression problems involving time-series data show a positive autocorrelation, we usually test the null hypothesis H0: the autocorrelation ρ ≤ 0 (which we believe is ρ = 0) versus the alternative hypothesis H1: ρ > 0, using the following criteria:
If d < dL reject H0 : ρ ≤ 0 (and so accept H1 : ρ > 0)
If d > dU do not reject H0 : ρ ≤ 0 (presumably ρ = 0)
If dL < d < dU test is inconclusive
Note that if d > 2 then we should test for negative autocorrelation instead of positive autocorrelation. To do this, simply test 4–d for positive autocorrelation as described above.
For Example 1, with α = .05, we know that n = 11 and k = 2. From the Durbin-Watson Table, we see that dL = .75798 and dU = 1.60439. Since d = 0.72595 < .75798 = dL, we reject the null hypothesis, and conclude that there is a significant positive autocorrelation.
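The decision rule can be written as a small Python function. This is only a sketch: the name dw_decision is hypothetical, the critical values must still be looked up in the Durbin-Watson Table, and treating d exactly equal to dL or dU as inconclusive is an assumption, since the criteria above only cover strict inequalities.

```python
def dw_decision(d, d_lower, d_upper):
    """Apply the Durbin-Watson decision rule given critical values dL and dU."""
    if not 0 <= d <= 4:
        raise ValueError("d must lie between 0 and 4")
    if d <= 2:
        stat, kind = d, "positive"
    else:
        # For d > 2, test 4 - d for positive autocorrelation instead,
        # which amounts to testing d for negative autocorrelation.
        stat, kind = 4 - d, "negative"
    if stat < d_lower:
        return f"reject H0: significant {kind} autocorrelation"
    if stat > d_upper:
        return "do not reject H0"
    # Assumption: values equal to dL or dU are treated as inconclusive.
    return "inconclusive"


# Example 1: d = 0.72595 with dL = .75798 and dU = 1.60439 (alpha = .05).
print(dw_decision(0.72595, 0.75798, 1.60439))
# prints: reject H0: significant positive autocorrelation
```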
Worksheet Functions
Real Statistics Function: The following two versions of the DURBIN function are available in the Real Statistics Resource Pack.
DURBIN(R1) = the Durbin-Watson statistic d where R1 is a column vector containing residuals
DURBIN(R1, R2) = the Durbin-Watson statistic d where R1 is an m × n range containing X data and R2 is an m × 1 column vector containing Y data.
DLowerCRIT(n, k, α, h) = lower critical value of the Durbin-Watson statistic for samples of size n (6 to 2,000) based on k independent variables (1 to 20) for α = .01, .025 or .05 (default). If h = TRUE (default) harmonic interpolation is used; otherwise linear interpolation is used.
DUpperCRIT(n, k, α, h) = upper critical value of the Durbin-Watson statistic for samples of size n (6 to 2,000) based on k independent variables (1 to 20) for α = .01, .025 or .05 (default). If h = TRUE (default) harmonic interpolation is used; otherwise linear interpolation is used.
Actually, the DURBIN function is an array function, described as follows:
DURBIN(R1, R2, lab, α): returns a column range with the values d, dL, dU and sig where R1 is an m × n range containing X data and R2 is an m × 1 column vector containing Y data.
DURBIN(R1, k, lab, α): returns a column range with the values d, dL, dU and sig where R1 is a column vector containing residuals and k = the # of independent variables (default = 2).
Here α = .01, .025 or .05 (default). If lab = TRUE (default = FALSE) then an extra column of labels is added to the output.
Note that the functions DLowerCRIT and DUpperCRIT support a much larger range of values of n than the Durbin-Watson Table. Also these functions support α = .01, .025 and .05 (actually any value between .01 and .05 by using interpolation), while the table only provides values for α = .01 and .05.
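The difference between the two interpolation modes can be illustrated in Python. This sketch assumes that harmonic interpolation means interpolating linearly on the 1/n scale; that is a common definition, but it is not confirmed here as the exact method used by DLowerCRIT and DUpperCRIT. The function name interpolate_critical is hypothetical, and the values 1.0 and 2.0 are placeholders, not entries from the Durbin-Watson Table.

```python
def interpolate_critical(n, n_lo, v_lo, n_hi, v_hi, harmonic=True):
    """Interpolate a critical value at n between tabulated sizes n_lo and n_hi."""
    if n_lo >= n_hi or not n_lo <= n <= n_hi:
        raise ValueError("require n_lo <= n <= n_hi with n_lo < n_hi")
    if harmonic:
        # Assumed definition: weight measured on the 1/n scale.
        weight = (1 / n_lo - 1 / n) / (1 / n_lo - 1 / n_hi)
    else:
        # Linear interpolation: weight measured on the n scale.
        weight = (n - n_lo) / (n_hi - n_lo)
    return v_lo + weight * (v_hi - v_lo)


# Placeholder critical values 1.0 (at n = 100) and 2.0 (at n = 150).
linear = interpolate_critical(125, 100, 1.0, 150, 2.0, harmonic=False)
harmonic = interpolate_critical(125, 100, 1.0, 150, 2.0, harmonic=True)
print(linear, harmonic)
```

With these placeholders, the linear result sits exactly halfway between the two tabulated values, while the harmonic result lies closer to the value at the larger sample size.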
Observation: Referring to Figure 1, we can calculate the statistic d = 0.72595 using either one of the formulas =DURBIN(G4:G14) or =DURBIN(B4:C14,D4:D14). In fact, if we highlight the range I3:J6, enter either of these formulas, and then press Ctrl-Shift-Enter, the result will be the same as shown in range I3:J6 of Figure 1.
Data Analysis Tool
Real Statistics Data Analysis Tool: The Linear Regression data analysis tool provided by the Real Statistics Resource Pack also supports the Durbin-Watson Test as described next.
To conduct the test in Example 1, press Ctrl-m and double click on the Linear Regression data analysis tool. Now fill in the dialog box that appears as shown in Figure 2.
Figure 2 – Durbin-Watson data analysis
The output is similar to that generated by the formula
=DURBIN(B4:C14,D4:D14,TRUE,O24)
Test for Large Samples
For n > 200, you can test the null hypothesis that there is no autocorrelation by noting that based on this null hypothesis the following test statistic has a standard normal distribution
where df = n – k – 1 (n = sample size and k = # of independent variables). Thus, if ABS(z) > NORM.S.INV(1 – α/2), then you reject the null hypothesis and conclude that there is autocorrelation.
For sample sizes 200 ≥ n ≥ 50, you can use the following test statistic instead
where c = 0.0026991244575048159 + 3.11878179323157 / df
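The supporting quantities for the large-sample test can be sketched in Python using only the standard library. The sketch covers df, the constant c, and the two-sided rejection rule; it takes z as an input and does not compute z from d. The function names are hypothetical. The critical value is the upper α/2 quantile of the standard normal distribution, which in Excel is NORM.S.INV(1 – α/2).

```python
from statistics import NormalDist


def degrees_of_freedom(n, k):
    """df = n - k - 1 for sample size n and k independent variables."""
    return n - k - 1


def c_term(df):
    """Constant c used for sample sizes between 50 and 200."""
    return 0.0026991244575048159 + 3.11878179323157 / df


def reject_no_autocorrelation(z, alpha=0.05):
    """Two-sided test: reject H0 when |z| exceeds the upper alpha/2 quantile."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit


df = degrees_of_freedom(104, 3)  # hypothetical n and k, giving df = 100
print(df, c_term(df))
print(reject_no_autocorrelation(2.5))  # True: 2.5 exceeds about 1.96
print(reject_no_autocorrelation(1.0))  # False
```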
References
Wikipedia (2018) Durbin-Watson statistic
https://en.wikipedia.org/wiki/Durbin%E2%80%93Watson_statistic
Lee, M-Y. (2016) On the Durbin Watson statistic based on a Z-test in large samples, Int. J. Computational Economics and Econometrics, Vol. 6, No. 1, pp.114-121.
https://ideas.repec.org/a/ids/ijcome/v6y2016i1p114-121.html
Hey, I am really glad and thankful that I found your site because you explain all of this so simply that even I can fathom advanced statistics. I have a question. I have many ideas for econometric models using regression, but I am not sure at which point I should check which kind of regression is appropriate for a given model. Is there a procedure? For example, should I start each model in linear form and then, based on an analysis of the errors, check other forms (assuming that the overall error is too high at the beginning)? Or should I try to find the best regression at the beginning of modeling? I appreciate your help.
Hello Peter,
Excellent question. Unfortunately, I don’t have a check list or flowchart that helps determine which type of forecasting model to use. I will look into this for a future addition.
Charles
Thanks!
Charles,
What can be done if there is significant autocorrelation, as shown in this case by the Durbin-Watson test as well as the runstest? The regression model seems good, given the significant intercept (p=4.6E-05) and slopes for Rainfall (p=3.6E-06) and Temp (p=0.010). In other words, how serious is significant autocorrelation on the reliability of a model? Thanks.
Hello Dave,
If a significant autocorrelation is detected, you can use the FGLS, Cochrane-Orcutt, or Newey-West approaches to deal with it. See the following web page for more details:
https://real-statistics.com/multiple-regression/autocorrelation/
Charles
Thanks Charles. The Newey-West approach makes the most sense to me.
Hello, I am Pipit.
I need a little bit of help with the Durbin-Watson test.
n = 268
k = 4
durbin-watson test: d = 1,326
Hello Pipit,
The lower critical value for n = 250 (which is pretty close to 268) and k = 4 is 1.676. Your d is less than this value.
Charles
Charles,
Will you be implementing a tool for the Durbin H statistic?
Hello Professor,
I have large datasets with sample sizes ranging from 10k to 20k (n = 10k to 20k) and k = 5 (including 3 dummies). For simplicity of interpretation, I drop the intercept. After I run weighted least squares (WLS) regression, I find negative autocorrelation (DW statistic ranging from 2 to 2.1). For WLS, I find we cannot apply the Cochrane-Orcutt procedure. Do you have any suggestions for addressing the serial correlation issue?
Regards,
Karthik
Hello Karthik,
Why can’t you use the Cochrane-Orcutt procedure?
Charles
Hello Professor,
Thank you for your reply.
I use R. However, it appears that existing packages do not support Cochrane-Orcutt estimation for weighted least squares regressions.
“Error in lmtest::dwtest(reg) : weighted regressions are not supported”.
Karthik
Cochrane-Orcutt is supported by Real Statistics. I don’t use R and so don’t know how you access this capability in R.
Charles
Hello, I need a little bit of help with a Durbin-Watson test!
n=132
k=12
durbin watson test: d=1,16
Do I have positive or no autocorrelation?
Hello Ebba,
You need to use the table of critical values described on this webpage. You can find the table at
https://www.real-statistics.com/statistics-tables/durbin-watson-table/
You need to interpolate between the values n = 100 and n = 150.
Charles