ARIMA Differencing

Basic Concepts

In order to create a stationary process, differencing may be necessary. For example, the graph of closing Dow Jones indices for October 2015 (see Example 1 of Stationary Process), as shown in Figure 1, clearly shows an increasing trend. Differencing is a way to eliminate such trends.


Figure 1 – Dow Jones Indices for October 2015

First-order differencing addresses linear trends and employs the transformation z_i = y_i - y_{i-1}. Second-order differencing addresses quadratic trends and employs a first-order difference of a first-order difference, namely z_i = (y_i - y_{i-1}) - (y_{i-1} - y_{i-2}), which is equivalent to z_i = y_i - 2y_{i-1} + y_{i-2}.
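The effect of these two transformations can be checked on toy data. The following is a minimal sketch in plain Python; the helper name `difference` and the two made-up series are assumptions for illustration and are not part of the original example.

```python
def difference(y):
    """First-order difference: z[i] = y[i] - y[i-1]."""
    return [y[i] - y[i - 1] for i in range(1, len(y))]

# Linear trend: one round of differencing leaves a constant.
linear = [3 + 2 * t for t in range(8)]
d1_linear = difference(linear)

# Quadratic trend: the first difference is still linear,
# the second difference is constant.
quadratic = [t * t for t in range(8)]
d1_quadratic = difference(quadratic)
d2_quadratic = difference(d1_quadratic)

# The expanded form y[i] - 2*y[i-1] + y[i-2] gives the same second difference.
d2_direct = [quadratic[i] - 2 * quadratic[i - 1] + quadratic[i - 2]
             for i in range(2, len(quadratic))]

print(d1_linear)     # [2, 2, 2, 2, 2, 2, 2]
print(d1_quadratic)  # [1, 3, 5, 7, 9, 11, 13]
print(d2_quadratic)  # [2, 2, 2, 2, 2, 2]
```

Each round of differencing shortens the series by one element, which is why the second difference has six values rather than seven.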

Taking first-order differences for the data in Figure 1 results in the chart on the right. The trend seems to have been eliminated.

ARIMA

An autoregressive integrated moving average (ARIMA) process (also known as a Box-Jenkins process) adds differencing to an ARMA process. An ARMA(p,q) process with d-order differencing is called an ARIMA(p,d,q) process. Thus, for example, an ARIMA(2,1,0) process is an AR(2) process with first-order differencing.
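One way to read the ARIMA(2,1,0) example is that the first differences of the series follow an AR(2) process. The sketch below illustrates this with simulated data: it generates an AR(2) series, integrates it once, and confirms that first-order differencing recovers the AR(2) series. The coefficients, seed, and starting level are arbitrary assumptions chosen for illustration.

```python
import random

random.seed(0)

# Assumed AR(2) coefficients, for illustration only.
phi1, phi2 = 0.5, -0.3
n = 200

# Step 1: simulate a zero-mean AR(2) series w.
w = [0.0, 0.0]
for _ in range(n):
    w.append(phi1 * w[-1] + phi2 * w[-2] + random.gauss(0, 1))
w = w[2:]  # drop the two start-up zeros

# Step 2: integrate once (cumulative sum from an arbitrary level) to get y.
y = [10.0]
for wt in w:
    y.append(y[-1] + wt)

# Step 3: first-order differencing of y recovers w.
z = [y[i] - y[i - 1] for i in range(1, len(y))]
max_err = max(abs(a - b) for a, b in zip(z, w))
print(len(z), max_err < 1e-9)
```

Here `y` plays the role of the ARIMA(2,1,0) series and `z` the AR(2) series obtained after differencing once.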

It is important not to over-difference since this can cause you to use an incorrect model. Some rules-of-thumb indicating that you may have differenced too many times are:

  • The lag-1 autocorrelation of the differenced series is less than -0.5
  • Differencing increases the variance
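Both warning signs can be seen by differencing a series that is already stationary. The sketch below differences simulated white noise; in theory the differenced series has twice the variance and a lag-1 autocorrelation of exactly -0.5, so sample values should land close to those figures. The helper names, seed, and sample size are assumptions for illustration.

```python
import random
import statistics

def difference(y):
    """First-order difference: z[i] = y[i] - y[i-1]."""
    return [y[i] - y[i - 1] for i in range(1, len(y))]

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of x."""
    m = statistics.fmean(x)
    num = sum((x[i] - m) * (x[i - 1] - m) for i in range(1, len(x)))
    den = sum((v - m) ** 2 for v in x)
    return num / den

random.seed(1)
noise = [random.gauss(0, 1) for _ in range(5000)]  # already stationary
over = difference(noise)                           # over-differenced

var_before = statistics.pvariance(noise)
var_after = statistics.pvariance(over)
r1 = lag1_autocorr(over)

print(var_after > var_before)  # differencing increased the variance
print(round(r1, 2))            # close to the theoretical -0.5
```

Because the sample autocorrelation fluctuates around -0.5, a real analysis would treat a value near or below that level as a hint to revisit the differencing order rather than as a strict test.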

Rules-of-thumb

An AR(p) or MA(q) process has a unit root if the sum of the non-constant coefficients is 1.

Additional rules-of-thumb:

  • If an AR(p) process has a unit root then the level of differencing should be increased
  • If an MA(q) process has a unit root then the level of differencing should be decreased
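For the AR case, the sum rule is the same as saying that the characteristic polynomial 1 - φ_1 z - ... - φ_p z^p equals zero at z = 1. The sketch below checks this for a few coefficient sets. It assumes the convention y_t = φ_1 y_{t-1} + ... + φ_p y_{t-p} + ε_t; the function names and the tolerance are hypothetical, and with estimated coefficients a sum that is merely close to 1 is what one would look for.

```python
def ar_char_poly_at_one(phi):
    """Value of 1 - phi_1*z - ... - phi_p*z**p at z = 1."""
    return 1.0 - sum(phi)

def has_unit_root(phi, tol=1e-8):
    """True if the AR coefficients sum to 1 (within tol)."""
    return abs(ar_char_poly_at_one(phi)) < tol

print(has_unit_root([1.0]))        # True: a random walk
print(has_unit_root([0.6, 0.4]))   # True: coefficients sum to 1
print(has_unit_root([0.5, -0.3]))  # False: coefficients sum to 0.2
```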

Assume that column A contains a time series of size n starting in cell A1. Suppose we want to place the 1st order differences (of size n-1) in column B starting in cell B2, the 2nd order differences (of size n-2) in column C starting in cell C3, and so on, up to the 7th order differences (of size n-7) in column H starting in cell H8. The formulas to use in these starting cells are as follows:

  • B2: A2-A1
  • C3: A3-2*A2+A1
  • D4: A4-3*A3+3*A2-A1
  • E5: A5-4*A4+6*A3-4*A2+A1
  • F6: A6-5*A5+10*A4-10*A3+5*A2-A1
  • G7: A7-6*A6+15*A5-20*A4+15*A3-6*A2+A1
  • H8: A8-7*A7+21*A6-35*A5+35*A4-21*A3+7*A2-A1

For each column, you need to highlight the range down to the nth cell in that column and press Ctrl-D to get the other values. Note that for the 7th order difference, the coefficients used are C(7,0) = 1, C(7,1) = 7, C(7,2) = 21, etc.
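The coefficient pattern in these formulas is the binomial coefficients C(d,k) with alternating signs. The sketch below builds the coefficients for any order d and checks that applying them directly gives the same result as differencing d times in a row. The function names and the test series are assumptions for illustration.

```python
from math import comb

def diff_coefficients(d):
    """Coefficients of y[i], y[i-1], ..., y[i-d] in the d-th order difference."""
    return [(-1) ** k * comb(d, k) for k in range(d + 1)]

def difference_direct(y, d):
    """d-th order difference using the binomial coefficients in one pass."""
    coef = diff_coefficients(d)
    return [sum(c * y[i - k] for k, c in enumerate(coef))
            for i in range(d, len(y))]

def difference_repeated(y, d):
    """d-th order difference by taking first differences d times."""
    for _ in range(d):
        y = [y[i] - y[i - 1] for i in range(1, len(y))]
    return y

print(diff_coefficients(7))  # [1, -7, 21, -35, 35, -21, 7, -1]

series = [(k ** 3) % 17 for k in range(12)]  # arbitrary integer series
print(all(difference_direct(series, d) == difference_repeated(series, d)
          for d in range(1, 8)))  # True
```

The list printed for d = 7 matches the coefficients in the formula for cell H8.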

Worksheet Function

Real Statistics Function: The Real Statistics Resource Pack provides the following array function.

ADIFF(R1, d) – takes the time series in the n × 1 range R1 and outputs an (n–d) × 1 range containing the data in R1 differenced d times

Example 1: Find the 1st, 2nd, 3rd, and 4th differences for the data in column A of Figure 1.


Figure 1 – Differencing

Here cell B4 contains the formula =A5-A4, cell C4 contains the formula =B5-B4 (or A6-2*A5+A4), cell D4 contains the formula =C5-C4 (or A7-3*A6+3*A5-A4) and cell E4 contains the formula =D5-D4 (or A8-4*A7+6*A6-4*A5+A4).

Range G4:G22 contains the array formula =ADIFF($A$4:$A$23,G3). If we highlight the range G4:J22 and press Ctrl-R, we get the result shown on the right side of Figure 1.
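For readers working outside Excel, the behavior described for ADIFF can be approximated in a few lines. The function `adiff` below is a hypothetical stand-in written for illustration, not the Real Statistics implementation, and the 20-point series is made up rather than taken from the figure.

```python
def adiff(series, d):
    """Hypothetical stand-in for ADIFF(R1, d): difference `series` d times."""
    if d < 0 or d >= len(series):
        raise ValueError("d must satisfy 0 <= d < len(series)")
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out

# Made-up 20-point series (not the data from the figure).
data = [float((7 * k) % 11 + k) for k in range(20)]

# One column per differencing order, as in the worksheet layout.
columns = {d: adiff(data, d) for d in (1, 2, 3, 4)}
lengths = {d: len(col) for d, col in columns.items()}
print(lengths)  # {1: 19, 2: 18, 3: 17, 4: 16}
```

Each column is one element shorter than the previous one, matching the (n–d) × 1 output size.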

References

Nau, R. (2020) Identifying the order of differencing in an ARIMA model
https://people.duke.edu/~rnau/411arim2.htm

12 thoughts on “ARIMA Differencing”

  1. Hello.

    Thank you for this amazing explanation, it has really helped me. I wanted to find out how one can conduct a test for stationarity using an equation such as this one: Y_t = -0.48 Y_{t-1} + U_t + 0.72 U_{t-1}.

    Thank you.

  2. ” Thus, for example, an ARIMA(2,0,1) process is an AR(2) process with first-order differencing.”

    I thought an ARIMA(2,0,1) process was an AR(2) and MA(1) process, and 0 order/degree of differencing is needed for the series to be stationary.

    If you wrote “an AR(2) process with first-order differencing.” , then isn’t it just ARIMA(2,1,0) ???

    Sorry I am a bit confused, please help. Thank you

    • Hello Eason,
      Yes, you are correct. I used ARIMA(p,q,d) instead of ARIMA(p,d,q). I have now changed this on this webpage and a few other webpages. Thanks for catching this mistake.
      Charles

  3. Hi,

    In your statement above, you mentioned:
    “z_i = (y_i - y_{i-1}) - (y_{i-1} - y_{i-2}), which is equivalent to z_i = y_i - y_{i-2}”

    Shouldn’t it be:
    z_i = y_i - 2y_{i-1} + y_{i-2}?

    Kindly correct me if I’m wrong.

    Thanks,
    Jiawei

  4. When I entered the array formula =ADIFF($A$4:$A$23,G3) in range G4:G22 in Excel, every cell returned the value 2.690575.

    The values do not come out as shown below:
    2.690575
    -0.901482
    1.212705
    2.015852
    1.209541
    -0.197216
    -1.046158
    2.044927
    0.89878
    -0.92262
    0.05102
    0.64771
    -0.73725
    0.26199
    -0.653847
    1.513367
    1.44655
    0.75421
    -1.17511

  5. Thanks for your clear explanation. It is really helpful to me as a student who wants to fully understand how to implement the ARIMA model using toy data, without relying on commercial solvers such as STATA, R and so on. I wasn’t sure whether I had properly formulated ARIMA models using the coefficients I obtained, so I was glad to see the detailed forecasting steps presented on your website. Thank you!

  6. Hi Charles, thanks for your tools and explanations. I have been looking for something like this for a long time. A question about differencing:
    Your example uses n-order differencing for data with a trend. If a data set has both trend and seasonality, I think K-step differencing may be needed as well, but how can I identify the right differencing using your ARIMA model?

    • Chenny,
      The webpage describes n-step differencing for ARIMA, but I have not yet included seasonality. I have described seasonality for linear regression and for Holt-Winters.
      Charles

