Autoregressive Processes Basic Concepts

In a simple linear regression model, the dependent variable is modeled as a linear function of the independent variable plus a random error term.

y_i = β_0 + β_1 x_i + ε_i

A first-order autoregressive process, denoted AR(1), takes the form

y_i = φ_0 + φ_1 y_{i-1} + ε_i

Thinking of the subscript i as representing time, we see that the value of y at time i is a linear function of y at time i−1 plus a fixed constant and a random error term. As in the ordinary linear regression model, we assume that the error terms are independently and normally distributed with zero mean and constant variance σ², and that each error term is independent of the earlier y values. Thus

ε_i ∼ N(0, σ²) for all i, with ε_i and ε_j independent for i ≠ j

cov(ε_i, y_j) = 0 for all j < i

It turns out that such a process is stationary when |φ1| < 1, and so we will make this assumption as well. Note that if φ1 = 1 we have a random walk.

Similarly, a second-order autoregressive process, denoted AR(2), takes the form

y_i = φ_0 + φ_1 y_{i-1} + φ_2 y_{i-2} + ε_i

and a pth-order autoregressive process, AR(p), takes the form

y_i = φ_0 + φ_1 y_{i-1} + φ_2 y_{i-2} + ⋯ + φ_p y_{i-p} + ε_i
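For concreteness, this recursion is easy to simulate. The sketch below is our own Python (the original article uses Excel), and the function name and signature are made up for illustration.

```python
import random

def simulate_ar(phi0, phis, n, sigma=1.0, rng=None):
    """Simulate n values of y_i = phi0 + phis[0]*y_{i-1} + ... + phis[p-1]*y_{i-p} + eps_i,
    starting from y_0 = y_{-1} = ... = 0, with eps_i ~ N(0, sigma^2)."""
    rng = rng or random.Random()
    history = [0.0] * len(phis)   # most recent value first
    ys = []
    for _ in range(n):
        y = phi0 + sum(phi * past for phi, past in zip(phis, history)) + rng.gauss(0.0, sigma)
        ys.append(y)
        history = [y] + history[:-1]
    return ys
```

With sigma = 0 the recursion is deterministic: for φ_0 = 5 and φ_1 = 0.4 the first three values are 5, 7, 7.8.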

Property 1: The mean of the yi in a stationary AR(p) process is

μ = φ_0 / (1 − φ_1 − φ_2 − ⋯ − φ_p)

Proof: click here

Property 2: The variance of the yi in a stationary AR(1) process is

var(y_i) = σ² / (1 − φ_1²)

Proof: click here

Property 3: The lag h autocorrelation in a stationary AR(1) process is

ρ_h = φ_1^h

Proof: click here

Example 1: Simulate a sample of 100 elements from the AR(1) process

y_i = 5 + 0.4 y_{i-1} + ε_i

where ε_i ∼ N(0, 1), and calculate the ACF.

Thus φ0 = 5, φ1 = .4 and σ = 1. We simulate the independent ε_i by using the Excel formula =NORM.INV(RAND(),0,1) or =NORM.S.INV(RAND()) in column B of Figure 1 (only the first 20 of the 100 values are displayed).

The value of y_1 is calculated by placing the formula =5+0.4*0+B4 in cell C4 (i.e. we arbitrarily assign the value zero to y_0). The other y_i values are calculated by placing the formula =5+0.4*C4+B5 in cell C5, highlighting the range C5:C103 and pressing Ctrl-D.

By Properties 1 and 2, the theoretical values for the mean and variance are μ = φ_0/(1−φ_1) = 5/(1−.4) = 8.33 (cell F22) and

var(y_i) = σ²/(1 − φ_1²) = 1/(1 − .4²) = 1.19

(cell F23). These compare to the actual time series values of ȳ = AVERAGE(C4:C103) = 8.23 (cell I22) and s² = VAR.S(C4:C103) = 1.70 (cell I23).
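The same experiment can be sketched in a few lines of Python (our own code, not part of the original Excel example); with a larger sample the agreement with Properties 1 and 2 is tighter.

```python
import random
from statistics import fmean, variance

rng = random.Random(42)              # fixed seed for reproducibility
phi0, phi1, sigma = 5.0, 0.4, 1.0
n = 10_000

y, ys = 0.0, []                      # arbitrarily start from y_0 = 0, as in the example
for _ in range(n):
    y = phi0 + phi1 * y + rng.gauss(0.0, sigma)
    ys.append(y)

mu = phi0 / (1 - phi1)               # Property 1: theoretical mean 8.33
var = sigma**2 / (1 - phi1**2)       # Property 2: theoretical variance 1.19
print(fmean(ys), variance(ys))       # should land close to mu and var
```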

The time series ACF values are shown for lags 1 through 15 in column F. These are calculated from the y values using the sample autocorrelation formula. Note that the ACF value at lag 1 is .394376. Based on Property 3, the population ACF value at lag 1 is ρ_1 = φ_1 = .4. Theoretically, the values ρ_h = φ_1^h = .4^h should get smaller and smaller as h increases (as shown in column G of Figure 1).
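The sample ACF itself is straightforward to compute. The helper below is our own code (not from the article) and checks the lag-h estimates against the theoretical φ_1^h:

```python
import random
from statistics import fmean

def sample_acf(xs, h):
    """Sample autocorrelation at lag h: lag-h autocovariance divided by the variance."""
    m, n = fmean(xs), len(xs)
    cov = sum((xs[i] - m) * (xs[i + h] - m) for i in range(n - h)) / n
    var = sum((x - m) ** 2 for x in xs) / n
    return cov / var

# simulate the AR(1) process y_i = 5 + 0.4*y_{i-1} + eps_i from Example 1
rng = random.Random(1)
y, ys = 0.0, []
for _ in range(10_000):
    y = 5 + 0.4 * y + rng.gauss(0.0, 1.0)
    ys.append(y)

for h in (1, 2, 3):
    print(h, sample_acf(ys, h), 0.4 ** h)   # estimate vs theoretical rho_h
```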


Figure 1 – Simulated AR(1) process

The graph of the y values is shown on the left of Figure 2. As you can see, no particular pattern is visible. The graph of ACF for the first 15 lags is shown on the right side of Figure 2. As you can see, the actual and theoretical values for the first two lags agree, but after that, the ACF values are small but not particularly consistent.


Figure 2 – Graphs of simulated AR(1) process and ACF

Observation: Based on Property 3, for 0 < φ1 < 1, the theoretical values of ACF converge to 0. If φ1 is negative, -1 < φ1 < 0, then the theoretical values of ACF also converge to 0, but alternate in sign between positive and negative.

Property 4: For any stationary AR(p) process, the autocovariance at lag k > 0 can be calculated as

γ_k = φ_1 γ_{k-1} + φ_2 γ_{k-2} + ⋯ + φ_p γ_{k-p}

Similarly, the autocorrelation at lag k > 0 can be calculated as

ρ_k = φ_1 ρ_{k-1} + φ_2 ρ_{k-2} + ⋯ + φ_p ρ_{k-p}

Here we assume that γ_h = γ_{−h} and ρ_h = ρ_{−h} if h < 0, and that ρ_0 = 1.

These are known as the Yule-Walker equations.

Proof: click here
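In code, the Yule-Walker equations give a simple recursion for the autocorrelations. A minimal sketch for the AR(2) case (our own code; the parameter values are chosen only for illustration):

```python
def ar2_acf(phi1, phi2, max_lag):
    """Autocorrelations of a stationary AR(2) process via the Yule-Walker recursion
    rho_k = phi1*rho_{k-1} + phi2*rho_{k-2}, with rho_0 = 1 and rho_1 = phi1/(1 - phi2)."""
    rho = [1.0, phi1 / (1 - phi2)]
    for k in range(2, max_lag + 1):
        rho.append(phi1 * rho[k - 1] + phi2 * rho[k - 2])
    return rho[:max_lag + 1]

phi1, phi2 = 0.5, 0.3
rho = ar2_acf(phi1, phi2, 5)
print(rho[2], phi1**2 / (1 - phi2) + phi2)   # matches the closed form for rho_2
```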

Property 5: The Yule-Walker equations also hold when k = 0, provided we add a σ² term to the sum. This is equivalent to

γ_0 = φ_1 γ_1 + φ_2 γ_2 + ⋯ + φ_p γ_p + σ²

Observation: In the AR(1) case, we have

ρ_0 = 1

ρ_1 = φ_1 ρ_0 = φ_1

ρ_k = φ_1 ρ_{k-1}, and so ρ_k = φ_1^k

Similarly, γ_k = φ_1 γ_{k-1}, and so γ_k = φ_1^k γ_0

and, by Property 5, γ_0 = φ_1 γ_1 + σ² = φ_1² γ_0 + σ²

Solving for γ_0 yields γ_0 = σ² / (1 − φ_1²)

In the AR(2) case, we have ρ_0 = 1 and

ρ_1 = φ_1 ρ_0 + φ_2 ρ_{−1} = φ_1 + φ_2 ρ_1

Solving for ρ_1 yields ρ_1 = φ_1 / (1 − φ_2)

Also ρ_2 = φ_1 ρ_1 + φ_2 ρ_0 = φ_1² / (1 − φ_2) + φ_2

We can also calculate the variance as follows:

γ_0 = φ_1 γ_1 + φ_2 γ_2 + σ² = (φ_1 ρ_1 + φ_2 ρ_2) γ_0 + σ²

Solving for γ_0 yields

γ_0 = σ² / (1 − φ_1 ρ_1 − φ_2 ρ_2) = (1 − φ_2) σ² / [(1 + φ_2)((1 − φ_2)² − φ_1²)]

This value can be re-expressed algebraically as described in Property 7 below.

Property 6: The following hold for a stationary AR(2) process:

ρ_0 = 1

ρ_1 = φ_1 / (1 − φ_2)

ρ_2 = φ_1² / (1 − φ_2) + φ_2

Proof: Follows from Property 4, as shown above.

Property 7: The variance of the y_i in a stationary AR(2) process is

var(y_i) = (1 − φ_2) σ² / [(1 + φ_2)(1 − φ_1 − φ_2)(1 + φ_1 − φ_2)]

Proof: click here for an alternative proof.
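As a numerical sanity check (the parameter values are arbitrary), the Property 7 formula agrees with the variance obtained by combining Property 5 with the Property 6 autocorrelations:

```python
phi1, phi2, sigma = 0.5, 0.3, 1.0

# Property 6: autocorrelations at lags 1 and 2
rho1 = phi1 / (1 - phi2)
rho2 = phi1**2 / (1 - phi2) + phi2

# Property 5 rearranged: gamma_0 = sigma^2 / (1 - phi1*rho1 - phi2*rho2)
g0_yw = sigma**2 / (1 - phi1 * rho1 - phi2 * rho2)

# Property 7 closed form for the variance
g0_p7 = (1 - phi2) * sigma**2 / ((1 + phi2) * (1 - phi1 - phi2) * (1 + phi1 - phi2))

print(g0_yw, g0_p7)   # the two expressions agree
```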

16 thoughts on “Autoregressive Processes Basic Concepts”

  1. Hi Charles,

Do you have any explanation for why the theoretical variance is lower than the sample (actual) variance? By Property 1 I can understand why the theoretical mean of the AR(1) gets bigger when phi_1 increases. I cannot draw a similar conclusion for the theoretical variance from Property 2 (one would actually expect an increase in the theoretical variance as phi_1 increases in absolute value).

    • Dear Ibrahima,
The formula shows that var(y_i) does increase as phi_1 increases from 0 to 1. Say sigma = 1; then when phi_1 = .5, var(y_i) = 1.3333, but when phi_1 = .6, var(y_i) = 1.5625.
      Charles

2. I have data that can be described by the model ARIMA(0,1,0). I am not able to calculate the lag and there is no constant in the equation. Please tell me how to go about this.

  3. This is very clear and very helpful!

    To assess the effect of a chronic medical condition from which I suffer, each day I give myself a score out of 10, and have been doing so every day for over six years. Though I have a degree in mathematics with a fair proportion of statistics (and a PhD in statistics, in a very specialised area), I know little about time series, but your pages and Excel functions are really helping me to make a start on analysing my data. The next step for me is to formulate a model taking account of the discrete nature of my data.

    Keep up the excellent work!

4. Can the method of maximum likelihood estimation be applied to the estimation of the phi parameters using the Yule-Walker equations?
