2. ARMA

This part is based on H and BD.
1 MA

1.1 MA(1)

Let $\varepsilon_t$ be WN with variance $\sigma^2$ and consider the zero-mean process
$$Y_t = \varepsilon_t + \theta\varepsilon_{t-1} \quad (1)$$
where $\theta$ is a constant. This time series is called a first-order moving average process, denoted MA(1).

1.1.1 Moments

The expectation of $Y_t$ is given by
$$E(Y_t) = E(\varepsilon_t + \theta\varepsilon_{t-1}) = E(\varepsilon_t) + \theta E(\varepsilon_{t-1}) = 0 \quad (2)$$
Clearly, with a constant term $\mu$ in (1) the expectation would be $\mu$. Everything that follows also works for non-zero-mean processes.
The variance is given by
$$\gamma_0 = E(Y_t^2) = E(\varepsilon_t + \theta\varepsilon_{t-1})^2 = E(\varepsilon_t^2) + \theta^2 E(\varepsilon_{t-1}^2) + 2\theta E(\varepsilon_t\varepsilon_{t-1}) = \sigma^2 + \theta^2\sigma^2 = (1 + \theta^2)\sigma^2 \quad (3)$$
The first autocovariance is
$$E(Y_tY_{t-1}) = E[(\varepsilon_t + \theta\varepsilon_{t-1})(\varepsilon_{t-1} + \theta\varepsilon_{t-2})] = E(\varepsilon_t\varepsilon_{t-1}) + \theta E(\varepsilon_t\varepsilon_{t-2}) + \theta E(\varepsilon_{t-1}^2) + \theta^2 E(\varepsilon_{t-1}\varepsilon_{t-2}) = \theta\sigma^2$$
Higher autocovariances are all zero: $E(Y_tY_{t-j}) = 0$ for $j > 1$. Since the mean and the autocovariances are not functions of time, the process is stationary regardless of the value of $\theta$. Moreover, the process is ergodic for the first moment since $\sum_{j=-\infty}^{\infty}|\gamma_j| < \infty$. If $\varepsilon_t$ is also Gaussian, then the process is ergodic for all moments. The $j$-th autocorrelation is
$$\rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{\theta\sigma^2}{(1 + \theta^2)\sigma^2} = \frac{\theta}{1 + \theta^2}$$
for $j = 1$ and zero for $j > 1$.
Figure 1 displays the autocorrelation function of $Y_t = \varepsilon_t + 0.8\varepsilon_{t-1}$.

[Figure 1]
Note that the first-order autocorrelation can be plotted as a function of $\theta$, as in Figure 2.

[Figure 2]
Note that:

1. Positive values of $\theta$ induce positive first-order autocorrelation, while negative values induce negative autocorrelation.
2. The largest possible value of $\rho_1$ is $0.5$ (at $\theta = 1$) and the smallest is $-0.5$ (at $\theta = -1$).
3. For any value of $\rho_1$ in $[-0.5, 0.5]$ there are two values of $\theta$ that produce the same autocorrelation, because $\rho_1$ is unchanged if we replace $\theta$ with $1/\theta$:
$$\frac{1/\theta}{1 + (1/\theta)^2} = \frac{1/\theta}{1 + (1/\theta)^2}\cdot\frac{\theta^2}{\theta^2} = \frac{\theta}{\theta^2 + 1} \quad (4)$$

So the processes $Y_t = \varepsilon_t + 0.5\varepsilon_{t-1}$ and $Y_t = \varepsilon_t + 2\varepsilon_{t-1}$ generate the same autocorrelation function, as the simulation sketch below illustrates.
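The observational equivalence in point 3 is easy to check numerically. The following is a minimal sketch (not from the source; the simulation settings, such as Gaussian WN and the sample size, are assumptions) that simulates both processes and compares their sample first-order autocorrelations with the common theoretical value $\rho_1 = 0.4$.

```python
import numpy as np

# Simulate Y_t = e_t + 0.5 e_{t-1} and Y_t = e_t + 2 e_{t-1} and compare
# their sample first-order autocorrelations with the common theoretical
# value rho_1 = theta / (1 + theta^2) = 0.4.
rng = np.random.default_rng(0)
T = 200_000
eps = rng.normal(0.0, 1.0, T + 1)          # WN(0, 1)

def ma1(theta):
    # Y_t = eps_t + theta * eps_{t-1}
    return eps[1:] + theta * eps[:-1]

for theta in (0.5, 2.0):
    y = ma1(theta)
    rho1_hat = np.corrcoef(y[1:], y[:-1])[0, 1]
    rho1_theory = theta / (1 + theta**2)
    print(f"theta={theta}: sample rho_1={rho1_hat:.4f}, theory={rho1_theory:.4f}")
```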
1.2 MA(q)

A $q$-th order moving average process, denoted MA(q), is characterized by
$$Y_t = \varepsilon_t + \theta_1\varepsilon_{t-1} + \theta_2\varepsilon_{t-2} + \ldots + \theta_q\varepsilon_{t-q} \quad (5)$$
where $\varepsilon_t$ is WN and the $\theta_i$'s are any real numbers.

1.2.1 Moments

The mean of (5) is
$$E(Y_t) = E(\varepsilon_t) + \theta_1E(\varepsilon_{t-1}) + \theta_2E(\varepsilon_{t-2}) + \ldots + \theta_qE(\varepsilon_{t-q}) = 0$$
The variance of (5) is
$$\gamma_0 = E(Y_t^2) = E(\varepsilon_t + \theta_1\varepsilon_{t-1} + \theta_2\varepsilon_{t-2} + \ldots + \theta_q\varepsilon_{t-q})^2 = (1 + \theta_1^2 + \theta_2^2 + \ldots + \theta_q^2)\sigma^2$$
because all the terms involving the expected value of products of different $\varepsilon_j$'s are zero by the WN assumption. The autocovariances are
$$\gamma_j = (\theta_j + \theta_{j+1}\theta_1 + \theta_{j+2}\theta_2 + \ldots + \theta_q\theta_{q-j})\sigma^2, \quad j = 1, 2, \ldots, q$$
and zero for $j > q$.
[Figure 3]
Example Consider an MA(2). Then
$$\gamma_0 = (1 + \theta_1^2 + \theta_2^2)\sigma^2$$
$$\gamma_1 = E(Y_tY_{t-1}) = E(\varepsilon_t\varepsilon_{t-1}) + \theta_1E(\varepsilon_t\varepsilon_{t-2}) + \theta_2E(\varepsilon_t\varepsilon_{t-3}) + \theta_1E(\varepsilon_{t-1}^2) + \theta_1^2E(\varepsilon_{t-1}\varepsilon_{t-2}) + \theta_1\theta_2E(\varepsilon_{t-1}\varepsilon_{t-3}) + \theta_2E(\varepsilon_{t-2}\varepsilon_{t-1}) + \theta_2\theta_1E(\varepsilon_{t-2}^2) + \theta_2^2E(\varepsilon_{t-2}\varepsilon_{t-3}) = (\theta_1 + \theta_2\theta_1)\sigma^2$$
$$\gamma_2 = E(Y_tY_{t-2}) = \theta_2E(\varepsilon_{t-2}^2) = \theta_2\sigma^2 \quad (6)$$
The MA(q) process is covariance stationary for any value of the $\theta_i$'s. Moreover, the process is ergodic for the first moment since $\sum_{j=-\infty}^{\infty}|\gamma_j| < \infty$. If $\varepsilon_t$ is also Gaussian, then the process is ergodic for all moments.
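The autocovariance formula above translates directly into code. Here is a small sketch (the function name is my own) computing the theoretical autocovariances of an MA(q) from its coefficients, checked against the MA(2) expressions in (6).

```python
import numpy as np

def ma_autocov(thetas, sigma2=1.0, max_lag=None):
    """Theoretical autocovariances of an MA(q) process
    Y_t = eps_t + thetas[0]*eps_{t-1} + ... + thetas[q-1]*eps_{t-q}."""
    theta = np.r_[1.0, np.asarray(thetas, dtype=float)]   # theta_0 = 1
    q = len(theta) - 1
    max_lag = q if max_lag is None else max_lag
    gamma = np.zeros(max_lag + 1)
    for j in range(min(q, max_lag) + 1):
        # gamma_j = sigma^2 * sum_k theta_{j+k} * theta_k  (zero for j > q)
        gamma[j] = sigma2 * np.dot(theta[j:], theta[:len(theta) - j])
    return gamma

# MA(2) check against (6)
t1, t2, s2 = 0.4, 0.3, 2.0
print(ma_autocov([t1, t2], s2))                                   # computed
print([(1 + t1**2 + t2**2) * s2, (t1 + t2 * t1) * s2, t2 * s2])   # closed form
```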
1.3 MA(∞)

The MA(∞) can be thought of as the limit of an MA(q) process for $q \to \infty$:
$$Y_t = \sum_{j=0}^{\infty}\theta_j\varepsilon_{t-j} = \theta_0\varepsilon_t + \theta_1\varepsilon_{t-1} + \theta_2\varepsilon_{t-2} + \ldots \quad (7)$$
with absolutely summable coefficients, $\sum_{j=0}^{\infty}|\theta_j| < \infty$. If the MA coefficients are square summable (implied by absolute summability), i.e. $\sum_{j=0}^{\infty}\theta_j^2 < \infty$, then the above infinite sum generates a mean-square convergent random variable.

1.3.1 Moments

The mean of the process is
$$E(Y_t) = \sum_{j=0}^{\infty}\theta_jE(\varepsilon_{t-j}) = 0 \quad (8)$$
The variance is
$$\gamma_0 = E(Y_t^2) = E\Big(\sum_{j=0}^{\infty}\theta_j\varepsilon_{t-j}\Big)^2 = E(\theta_0\varepsilon_t + \theta_1\varepsilon_{t-1} + \theta_2\varepsilon_{t-2} + \ldots)^2 = (\theta_0^2 + \theta_1^2 + \theta_2^2 + \ldots)\sigma^2 = \sigma^2\sum_{j=0}^{\infty}\theta_j^2 \quad (9)$$
Again, all the terms involving the expected value of products of different $\varepsilon_j$'s are zero because of the WN assumption. The autocovariances are
$$\gamma_j = E(Y_tY_{t-j}) = (\theta_j\theta_0 + \theta_{j+1}\theta_1 + \theta_{j+2}\theta_2 + \theta_{j+3}\theta_3 + \ldots)\sigma^2$$
The process is stationary since it has finite and constant first and second moments. Moreover, an MA(∞) with absolutely summable coefficients has absolutely summable autocovariances, $\sum_{j=-\infty}^{\infty}|\gamma_j| < \infty$, so it is ergodic for the mean. If the $\varepsilon$'s are Gaussian, it is ergodic for all moments.
1.4 Invertibility and Fundamentalness

Invertibility An MA(q) process defined by the equation $Y_t = \theta(L)\varepsilon_t$ is said to be invertible if there exists a sequence of constants $\{\pi_j\}_{j=0}^{\infty}$ such that $\sum_{j=0}^{\infty}|\pi_j| < \infty$ and $\sum_{j=0}^{\infty}\pi_jY_{t-j} = \varepsilon_t$.

Proposition An MA process defined by the equation $Y_t = \theta(L)\varepsilon_t$ is invertible if and only if $\theta(z) \neq 0$ for all $z \in \mathbb{C}$ such that $|z| \leq 1$.

Fundamentalness An MA process $Y_t = \theta(L)\varepsilon_t$ is fundamental if and only if $\theta(z) \neq 0$ for all $z \in \mathbb{C}$ such that $|z| < 1$. The value of $\varepsilon_t$ associated with the invertible representation is sometimes called the fundamental innovation for $Y_t$. In the borderline case $|\theta| = 1$ the process is non-invertible but still fundamental.

Example: MA(1). Consider
$$Y_t = \varepsilon_t + \theta\varepsilon_{t-1} \quad (10)$$
where $\varepsilon_t$ is $WN(0, \sigma^2)$. The invertibility condition requires $|\theta| < 1$. If this is the case, then
$$(1 - \theta L + \theta^2L^2 - \theta^3L^3 + \ldots)Y_t = \varepsilon_t$$
which can be viewed as an AR(∞) representation.
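A quick way to check invertibility in practice is to compute the roots of the MA polynomial. A minimal sketch (the helper name is mine; `np.roots` expects coefficients from the highest power down):

```python
import numpy as np

def is_invertible(thetas):
    """Check whether theta(z) = 1 + theta_1 z + ... + theta_q z^q
    has all roots strictly outside the unit circle."""
    coeffs = np.r_[1.0, np.asarray(thetas, dtype=float)]   # [1, theta_1, ..., theta_q]
    roots = np.roots(coeffs[::-1])    # reverse: highest-degree coefficient first
    return np.all(np.abs(roots) > 1.0)

print(is_invertible([0.5]))   # True:  root of 1 + 0.5z is z = -2
print(is_invertible([2.0]))   # False: root of 1 + 2z is z = -0.5
```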
Problem: Consider the following MA(1):
$$\tilde{Y}_t = (1 + \tilde{\theta}L)\tilde{\varepsilon}_t \quad (11)$$
where $\tilde{\varepsilon}_t$ is $WN(0, \tilde{\sigma}^2)$. Moreover, suppose that the parameters of this new MA(1) are related to those of (10) as follows:
$$\tilde{\theta} = \theta^{-1}, \qquad \tilde{\sigma}^2 = \theta^2\sigma^2$$
Let us derive the first two moments of the two processes. First, $E(Y_t) = E(\tilde{Y}_t) = 0$. For $Y_t$:
$$E(Y_t^2) = \sigma^2(1 + \theta^2), \qquad E(Y_tY_{t-1}) = \theta\sigma^2$$
For $\tilde{Y}_t$:
$$E(\tilde{Y}_t^2) = \tilde{\sigma}^2(1 + \tilde{\theta}^2), \qquad E(\tilde{Y}_t\tilde{Y}_{t-1}) = \tilde{\theta}\tilde{\sigma}^2$$
However, note that given the above restrictions
$$\tilde{\sigma}^2(1 + \tilde{\theta}^2) = \theta^2\sigma^2\left(1 + \frac{1}{\theta^2}\right) = \left(\frac{\theta^2 + 1}{\theta^2}\right)\theta^2\sigma^2 = (\theta^2 + 1)\sigma^2$$
and
$$\tilde{\theta}\tilde{\sigma}^2 = \frac{\theta^2\sigma^2}{\theta} = \theta\sigma^2$$
That is, the first two moments of the two processes are identical. Note that if $|\theta| < 1$ then $|\tilde{\theta}| > 1$. In other words, for any invertible MA representation we can find a non-invertible representation with identical first and second moments. This can also be seen by showing that the two processes share the same autocovariance generating function. Recall that the autocovariance generating function is
$$g_Y(z) = \sum_{j=-\infty}^{\infty}\gamma_jz^j$$
In the case of (10) the autocovariance generating function is
$$g_Y(z) = \theta\sigma^2z^{-1} + \sigma^2(1 + \theta^2) + \theta\sigma^2z = \sigma^2(\theta z^{-1} + 1 + \theta^2 + \theta z) = \sigma^2(1 + \theta z)(1 + \theta z^{-1}) \quad (12)$$
while for (11) the autocovariance generating function is
$$g_{\tilde{Y}}(z) = \tilde{\sigma}^2(1 + \tilde{\theta}z)(1 + \tilde{\theta}z^{-1}) = \sigma^2\theta^2(1 + \theta^{-1}z)(1 + \theta^{-1}z^{-1})$$
$$= \sigma^2(\theta + z)(\theta + z^{-1}) = \sigma^2(\theta^2 + \theta z + \theta z^{-1} + 1) = \sigma^2(1 + \theta z)(1 + \theta z^{-1}) \quad (13)$$
which is identical to the previous one.

Now let's see how to recover the invertible representation from a non-invertible one.

Example: MA(2) case:
$$Y_t = \varepsilon_t + \theta_1\varepsilon_{t-1} + \theta_2\varepsilon_{t-2}$$
Its autocovariance generating function can be written as
$$g_Y(z) = \sigma^2(1 + \theta_1z + \theta_2z^2)(1 + \theta_1z^{-1} + \theta_2z^{-2}) = \sigma^2(1 - \lambda_1z)(1 - \lambda_2z)(1 - \lambda_1z^{-1})(1 - \lambda_2z^{-1})$$
where $\lambda_1$ and $\lambda_2$ are the reciprocals of the roots of the MA polynomial. Suppose $|\lambda_2| > 1$ (the corresponding root of the MA polynomial is smaller than one in absolute value) while $|\lambda_1| < 1$ (the corresponding root is larger than one). The process is non-invertible. We ask: how can we recover the invertible MA? The answer is
$$Y_t = (1 - \lambda_1L)(1 - \lambda_2^{-1}L)\tilde{\varepsilon}_t$$
where the variance of $\tilde{\varepsilon}_t$ is $\sigma^2\lambda_2^2$. The autocovariance generating function of such a process is
$$g_{\tilde{Y}}(z) = \sigma^2\lambda_2^2(1 - \lambda_1z)(1 - \lambda_2^{-1}z)(1 - \lambda_1z^{-1})(1 - \lambda_2^{-1}z^{-1})$$
$$= \sigma^2(1 - \lambda_1z)(1 - \lambda_1z^{-1})(\lambda_2 - z)(\lambda_2 - z^{-1})$$
$$= \sigma^2(1 - \lambda_1z)(1 - \lambda_1z^{-1})(\lambda_2^2 - \lambda_2z - \lambda_2z^{-1} + 1)$$
$$= \sigma^2(1 - \lambda_1z)(1 - \lambda_1z^{-1})(1 - \lambda_2z)(1 - \lambda_2z^{-1})$$
which is identical to the one of the original process. Summarizing:

- Flip the "wrong" roots.
- Multiply the variance of the original shock by the square of each flipped $\lambda$ (here $\lambda_2^2$).

We have seen an MA(2) as an example, but this procedure extends to a general MA(q).
General method Consider the MA process $Y_t = \theta(L)\varepsilon_t$, written as $\theta(L) = \prod_{j=1}^{q}(1 - \lambda_jL)$, and suppose $\lambda_{s+1}, \ldots, \lambda_q$ correspond to roots smaller than one in absolute value (i.e. $|\lambda_j| > 1$ for $j = s+1, \ldots, q$), the other roots being larger than one. Let
$$\tilde{\theta}(L) = \theta(L)\prod_{j=s+1}^{q}\frac{1 - \lambda_j^{-1}L}{1 - \lambda_jL}$$
and let
$$\tilde{\varepsilon}_t = \prod_{j=s+1}^{q}\frac{1 - \lambda_jL}{1 - \lambda_j^{-1}L}\,\varepsilon_t$$
Then
$$Y_t = \tilde{\theta}(L)\tilde{\varepsilon}_t$$
is fundamental and $\tilde{\varepsilon}_t$ is WN; see BD, Example 3.5.2.
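A sketch of this root-flipping procedure in code (the helper is my own construction, assuming a real-coefficient MA polynomial; complex $\lambda$'s come in conjugate pairs, so the flipped polynomial stays real):

```python
import numpy as np

def flip_to_invertible(thetas, sigma2=1.0):
    """Flip the 'wrong' roots of theta(z) = 1 + theta_1 z + ... + theta_q z^q
    and rescale the shock variance so that the autocovariance generating
    function is unchanged. Returns (new_thetas, new_sigma2)."""
    coeffs = np.r_[1.0, np.asarray(thetas, dtype=float)]
    roots = np.roots(coeffs[::-1])              # roots z_j of theta(z)
    lambdas = 1.0 / roots                       # theta(z) = prod_j (1 - lambda_j z)
    bad = np.abs(lambdas) > 1.0                 # non-invertible factors
    scale = np.prod(np.abs(lambdas[bad]) ** 2)  # variance multiplier
    lambdas[bad] = 1.0 / lambdas[bad]           # flip lambda -> 1/lambda
    # Rebuild theta~(z) = prod_j (1 - lambda_j z): a polynomial with roots
    # 1/lambda_j, normalized so the constant term equals 1.
    new_coeffs = np.poly(1.0 / lambdas)
    new_coeffs = (new_coeffs / new_coeffs[-1])[::-1].real  # constant term first
    return new_coeffs[1:], sigma2 * scale

# Non-invertible MA(1): Y_t = e_t + 2 e_{t-1}  ->  theta = 0.5, variance * 4
print(flip_to_invertible([2.0], sigma2=1.0))
```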
1.5 Wold's decomposition theorem

Here is a very powerful result, known as Wold's decomposition theorem.

Theorem Any zero-mean covariance stationary process $Y_t$ can be represented in the form
$$Y_t = \sum_{j=0}^{\infty}\theta_j\varepsilon_{t-j} + k_t \quad (14)$$
where:

1. $\theta_0 = 1$;
2. $\sum_{j=0}^{\infty}\theta_j^2 < \infty$;
3. $\varepsilon_t$ is the error made in forecasting $Y_t$ on the basis of a linear function of lagged $Y_t$ (the fundamental innovation);
4. the value of $k_t$ is uncorrelated with $\varepsilon_{t-j}$ for any $j$ and can be perfectly predicted from a linear function of past values of $Y$.

The term $k_t$ is called the linearly deterministic component of $Y_t$. If $k_t = 0$, the process is called purely non-deterministic. The result is very powerful since it holds for any covariance stationary process.
However, the theorem does not imply that (14) is the true representation of the process. For instance, the process could be stationary but non-linear or non-invertible. If the true system is generated by a nonlinear difference equation, say $Y_t = g(Y_{t-1}, \ldots, Y_{t-p}) + \eta_t$, then obviously, when we fit a linear approximation as in the Wold theorem, the shock we recover, $\varepsilon_t$, will be different from $\eta_t$. If the model is non-invertible, then the true shock will not be the Wold shock.
2 AR

2.1 AR(1)

A first-order autoregression, denoted AR(1), satisfies the following difference equation:
$$Y_t = \phi Y_{t-1} + \varepsilon_t \quad (15)$$
where again $\varepsilon_t$ is WN. When $|\phi| < 1$, the solution to (15) is
$$Y_t = \varepsilon_t + \phi\varepsilon_{t-1} + \phi^2\varepsilon_{t-2} + \phi^3\varepsilon_{t-3} + \ldots \quad (16)$$
(16) can be viewed as an MA(∞) with $\psi_j = \phi^j$. When $|\phi| < 1$ the autocovariances are absolutely summable since the MA coefficients are absolutely summable:
$$\sum_{j=0}^{\infty}|\psi_j| = \sum_{j=0}^{\infty}|\phi|^j = \frac{1}{1 - |\phi|}$$
This ensures that the MA representation exists and that the process is stationary and ergodic for the mean. Recall that $\sum_{j=0}^{\infty}\phi^j$ is a geometric series converging to $1/(1 - \phi)$ if $|\phi| < 1$.
2.2 Moments

The mean is $E(Y_t) = 0$. The variance is
$$\gamma_0 = E(Y_t^2) = \sigma^2\sum_{j=0}^{\infty}\phi^{2j} = \frac{\sigma^2}{1 - \phi^2}$$
The $j$-th autocovariance is
$$\gamma_j = E(Y_tY_{t-j}) = \frac{\phi^j\sigma^2}{1 - \phi^2}$$
The $j$-th autocorrelation is
$$\rho_j = \phi^j$$
To find the moments of the AR(1) we can use a different strategy, working directly with the AR representation and the assumption of stationarity. For the mean, note that
$$E(Y_t) = \phi E(Y_{t-1}) + E(\varepsilon_t)$$
Given the stationarity assumption, $E(Y_t) = E(Y_{t-1})$ and therefore $(1 - \phi)E(Y_t) = 0$. The $j$-th autocovariance is
$$\gamma_j = E(Y_tY_{t-j}) = \phi E(Y_{t-1}Y_{t-j}) + E(\varepsilon_tY_{t-j}) = \phi\gamma_{j-1} = \phi^j\gamma_0$$
Similarly, for the variance,
$$E(Y_t^2) = \phi E(Y_{t-1}Y_t) + E(\varepsilon_tY_t) = \phi\gamma_1 + \sigma^2$$
so that
$$\gamma_0 = \phi^2\gamma_0 + \sigma^2 \quad\Rightarrow\quad \gamma_0 = \sigma^2/(1 - \phi^2)$$
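As a quick numerical sanity check, here is an illustrative sketch (with assumed settings: Gaussian WN, $\phi = 0.8$, and a burn-in period of my choosing) that simulates an AR(1) and compares the sample autocorrelations with the theoretical $\rho_j = \phi^j$:

```python
import numpy as np

rng = np.random.default_rng(1)
phi, sigma, T = 0.8, 1.0, 100_000

# Simulate Y_t = phi*Y_{t-1} + eps_t with a burn-in so the effect
# of the arbitrary initial condition dies out.
eps = rng.normal(0.0, sigma, T + 1_000)
y = np.zeros_like(eps)
for t in range(1, len(eps)):
    y[t] = phi * y[t - 1] + eps[t]
y = y[1_000:]

for j in range(1, 5):
    rho_hat = np.corrcoef(y[j:], y[:-j])[0, 1]
    print(f"j={j}: sample rho_j={rho_hat:.4f}, theory phi^j={phi**j:.4f}")
```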
[Figure 4]
[Figure 5]
2.3 AR(2)

A second-order autoregression, denoted AR(2), satisfies
$$Y_t = \phi_1Y_{t-1} + \phi_2Y_{t-2} + \varepsilon_t \quad (17)$$
where again $\varepsilon_t$ is WN. Using the lag operator,
$$(1 - \phi_1L - \phi_2L^2)Y_t = \varepsilon_t \quad (18)$$
The difference equation is stable provided that the roots of the polynomial $1 - \phi_1z - \phi_2z^2$ lie outside the unit circle. When this condition is satisfied, the process is covariance stationary and the inverse of the autoregressive operator is
$$\psi(L) = (1 - \phi_1L - \phi_2L^2)^{-1} = \psi_0 + \psi_1L + \psi_2L^2 + \psi_3L^3 + \ldots$$
with $\sum_{j=0}^{\infty}|\psi_j| < \infty$.
2.3.1 Moments

To find the moments of the AR(2) we can proceed as before. The mean satisfies
$$E(Y_t) = \phi_1E(Y_{t-1}) + \phi_2E(Y_{t-2}) + E(\varepsilon_t)$$
Again, by stationarity $E(Y_t) = E(Y_{t-j})$ and therefore $(1 - \phi_1 - \phi_2)E(Y_t) = 0$. The $j$-th autocovariance is given by
$$\gamma_j = E(Y_tY_{t-j}) = \phi_1E(Y_{t-1}Y_{t-j}) + \phi_2E(Y_{t-2}Y_{t-j}) + E(\varepsilon_tY_{t-j}) = \phi_1\gamma_{j-1} + \phi_2\gamma_{j-2}$$
Similarly, the $j$-th autocorrelation is
$$\rho_j = \phi_1\rho_{j-1} + \phi_2\rho_{j-2}$$
In particular, setting $j = 1$,
$$\rho_1 = \phi_1 + \phi_2\rho_1 \quad\Rightarrow\quad \rho_1 = \phi_1/(1 - \phi_2)$$
and setting $j = 2$, $\rho_2 = \phi_1\rho_1 + \phi_2$,
so that
$$\rho_2 = \frac{\phi_1^2}{1 - \phi_2} + \phi_2$$
For the variance, the equation
$$E(Y_t^2) = \phi_1E(Y_{t-1}Y_t) + \phi_2E(Y_{t-2}Y_t) + E(\varepsilon_tY_t)$$
can be written as
$$\gamma_0 = \phi_1\gamma_1 + \phi_2\gamma_2 + \sigma^2$$
where the last term comes from the fact that
$$E(\varepsilon_tY_t) = \phi_1E(\varepsilon_tY_{t-1}) + \phi_2E(\varepsilon_tY_{t-2}) + E(\varepsilon_t^2)$$
and $E(\varepsilon_tY_{t-1}) = E(\varepsilon_tY_{t-2}) = 0$. Note that $\gamma_j/\gamma_0 = \rho_j$. Therefore the variance can be rewritten as
$$\gamma_0 = \phi_1\rho_1\gamma_0 + \phi_2\rho_2\gamma_0 + \sigma^2 = (\phi_1\rho_1 + \phi_2\rho_2)\gamma_0 + \sigma^2 = \left[\frac{\phi_1^2}{1 - \phi_2} + \phi_2\frac{\phi_1^2}{1 - \phi_2} + \phi_2^2\right]\gamma_0 + \sigma^2$$
which solves to
$$\gamma_0 = \frac{(1 - \phi_2)\sigma^2}{(1 + \phi_2)[(1 - \phi_2)^2 - \phi_1^2]} \quad (19)$$
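These AR(2) formulas are easy to verify numerically. A small sketch (illustrative parameter values of my choosing) checks the closed form (19) against $\gamma_0 = \sigma^2\sum_j\psi_j^2$, with the $\psi_j$ obtained from the recursion implied by $(1 - \phi_1L - \phi_2L^2)\psi(L) = 1$:

```python
import numpy as np

phi1, phi2, sigma2 = 0.5, 0.3, 1.0

# gamma_0 from the closed form (19)
gamma0_closed = (1 - phi2) * sigma2 / ((1 + phi2) * ((1 - phi2)**2 - phi1**2))

# gamma_0 = sigma^2 * sum_j psi_j^2 via the MA(infinity) weights,
# which satisfy psi_j = phi1*psi_{j-1} + phi2*psi_{j-2}
psi = np.zeros(10_000)
psi[0], psi[1] = 1.0, phi1
for j in range(2, len(psi)):
    psi[j] = phi1 * psi[j - 1] + phi2 * psi[j - 2]
gamma0_ma = sigma2 * np.sum(psi**2)

print(gamma0_closed, gamma0_ma)   # the two values should agree
```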
2.4 AR(p)

A $p$-th order autoregression, denoted AR(p), satisfies
$$Y_t = \phi_1Y_{t-1} + \phi_2Y_{t-2} + \ldots + \phi_pY_{t-p} + \varepsilon_t \quad (20)$$
where again $\varepsilon_t$ is WN. Using the lag operator,
$$(1 - \phi_1L - \phi_2L^2 - \ldots - \phi_pL^p)Y_t = \varepsilon_t \quad (21)$$
The difference equation is stable provided that the roots of the polynomial $1 - \phi_1z - \phi_2z^2 - \ldots - \phi_pz^p$ lie outside the unit circle. When this condition is satisfied, the process is covariance stationary and the inverse of the autoregressive operator is
$$\psi(L) = (1 - \phi_1L - \phi_2L^2 - \ldots - \phi_pL^p)^{-1} = \psi_0 + \psi_1L + \psi_2L^2 + \psi_3L^3 + \ldots$$
with $\sum_{j=0}^{\infty}|\psi_j| < \infty$.
2.4.1 Moments

The mean is $E(Y_t) = 0$. The $j$-th autocovariance is
$$\gamma_j = \phi_1\gamma_{j-1} + \phi_2\gamma_{j-2} + \ldots + \phi_p\gamma_{j-p}, \quad j = 1, 2, \ldots$$
The variance is
$$\gamma_0 = \phi_1\gamma_1 + \phi_2\gamma_2 + \ldots + \phi_p\gamma_p + \sigma^2$$
Dividing the autocovariances by $\gamma_0$, one obtains the Yule-Walker equations
$$\rho_j = \phi_1\rho_{j-1} + \phi_2\rho_{j-2} + \ldots + \phi_p\rho_{j-p}$$
2.4.2 Finding the roots of $(1 - \phi_1z - \phi_2z^2 - \ldots - \phi_pz^p)$

An easy way to find the roots of the polynomial is the following. Define two new vectors $Z_t = [Y_t, Y_{t-1}, \ldots, Y_{t-p+1}]'$ and $v_t = [\varepsilon_t, 0_{1\times(p-1)}]'$, and a new matrix
$$F = \begin{bmatrix} \phi_1 & \phi_2 & \phi_3 & \cdots & \phi_p \\ 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & \cdots & 0 & 1 & 0 \end{bmatrix}$$
Then $Z_t$ satisfies the AR(1)
$$Z_t = FZ_{t-1} + v_t$$
The roots of the polynomial $(1 - \phi_1z - \phi_2z^2 - \ldots - \phi_pz^p)$ coincide with the reciprocals of the eigenvalues of $F$.
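A minimal sketch of this check with numpy (illustrative AR(2) coefficients): build the companion matrix $F$ and confirm that the reciprocals of its eigenvalues are the roots of the AR polynomial.

```python
import numpy as np

phis = np.array([0.5, 0.3])        # AR coefficients phi_1, ..., phi_p
p = len(phis)

# Companion matrix F
F = np.zeros((p, p))
F[0, :] = phis
F[1:, :-1] = np.eye(p - 1)

eigs = np.linalg.eigvals(F)
print("1/eigenvalues of F:", np.sort_complex(1.0 / eigs))

# Direct roots of 1 - phi_1 z - ... - phi_p z^p (highest power first for np.roots)
poly = np.r_[-phis[::-1], 1.0]
print("polynomial roots:  ", np.sort_complex(np.roots(poly)))
```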
2.5 Causality and stationarity

Causality An AR(p) process defined by the equation $(1 - \phi_1L - \phi_2L^2 - \ldots - \phi_pL^p)Y_t = \phi(L)Y_t = \varepsilon_t$ is said to be causal if there exists a sequence of constants $\{\psi_j\}_{j=0}^{\infty}$ such that $\sum_{j=0}^{\infty}|\psi_j| < \infty$ and $Y_t = \sum_{j=0}^{\infty}\psi_j\varepsilon_{t-j}$.

Proposition An AR process $\phi(L)Y_t = \varepsilon_t$ is causal if and only if $\phi(z) \neq 0$ for all $z$ such that $|z| \leq 1$.

Stationarity The AR(p) is stationary if and only if $\phi(z) \neq 0$ for all $z$ such that $|z| = 1$.

Here we focus on AR processes that are causal and stationary.

Example Consider the process $Y_t = \phi Y_{t-1} + \varepsilon_t$, where $\varepsilon_t$ is WN and $|\phi| > 1$. Clearly the process is not causal. However, we can rewrite the process as
$$Y_t = \frac{1}{\phi}Y_{t+1} - \frac{1}{\phi}\varepsilon_{t+1}$$
or, using the forward operator $F = L^{-1}$,
$$Y_t = \frac{1}{\phi}FY_t - \frac{1}{\phi}F\varepsilon_t \quad\Longrightarrow\quad \left(1 - \frac{1}{\phi}F\right)Y_t = -\frac{1}{\phi}F\varepsilon_t$$
$$Y_t = -\left(1 - \frac{1}{\phi}F\right)^{-1}\frac{1}{\phi}F\varepsilon_t$$
which is a mean-square convergent random variable. Using the lag operator, it is easy to see that
$$(1 - \phi L) = ((\phi L)^{-1} - 1)(\phi L) = (1 - (\phi L)^{-1})(-\phi L) \quad (22)$$
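As an illustrative sketch (my own construction, truncating the infinite forward sum at a finite horizon), one can build the non-causal stationary solution $Y_t = -\sum_{j=1}^{\infty}\phi^{-j}\varepsilon_{t+j}$ implied by the forward representation and verify that it satisfies $Y_t = \phi Y_{t-1} + \varepsilon_t$ up to truncation error:

```python
import numpy as np

rng = np.random.default_rng(2)
phi, T, H = 2.0, 1_000, 60          # |phi| > 1; H = truncation horizon

eps = rng.normal(size=T + H + 1)

# Non-causal solution Y_t = -sum_{j>=1} phi^{-j} eps_{t+j}, truncated at H
w = -phi ** (-np.arange(1, H + 1))
y = np.array([w @ eps[t + 1: t + H + 1] for t in range(T + 1)])

# Check the AR(1) equation Y_t = phi*Y_{t-1} + eps_t
resid = y[1:] - (phi * y[:-1] + eps[1:T + 1])
print("max |residual|:", np.abs(resid).max())   # tiny, of order phi^{-H}
```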
3 ARMA

3.1 ARMA(p,q)

An ARMA(p,q) process includes both autoregressive and moving average terms:
$$Y_t = \phi_1Y_{t-1} + \phi_2Y_{t-2} + \ldots + \phi_pY_{t-p} + \varepsilon_t + \theta_1\varepsilon_{t-1} + \ldots + \theta_q\varepsilon_{t-q} \quad (23)$$
where again $\varepsilon_t$ is WN. Using the lag operator,
$$(1 - \phi_1L - \phi_2L^2 - \ldots - \phi_pL^p)Y_t = (1 + \theta_1L + \theta_2L^2 + \ldots + \theta_qL^q)\varepsilon_t \quad (24)$$
Provided that the roots of the polynomial $1 - \phi_1z - \phi_2z^2 - \ldots - \phi_pz^p$ lie outside the unit circle, the ARMA process can be written as
$$Y_t = \psi(L)\varepsilon_t, \qquad \psi(L) = \frac{1 + \theta_1L + \theta_2L^2 + \ldots + \theta_qL^q}{1 - \phi_1L - \phi_2L^2 - \ldots - \phi_pL^p}$$
Stationarity of the ARMA process depends only on the AR parameters: again, stationarity is implied by the roots of the AR polynomial lying outside the unit circle.
The variance of the process is
$$\gamma_0 = E(Y_t^2) = \phi_1E(Y_{t-1}Y_t) + \phi_2E(Y_{t-2}Y_t) + \ldots + \phi_pE(Y_{t-p}Y_t) + E(\varepsilon_tY_t) + \theta_1E(\varepsilon_{t-1}Y_t) + \ldots + \theta_qE(\varepsilon_{t-q}Y_t)$$
$$= \phi_1\gamma_1 + \phi_2\gamma_2 + \ldots + \phi_p\gamma_p + \sigma^2(\psi_0 + \theta_1\psi_1 + \ldots + \theta_q\psi_q)$$
where we used $E(\varepsilon_{t-j}Y_t) = \sigma^2\psi_j$ and $\gamma_j = \sigma^2(\psi_j\psi_0 + \psi_{j+1}\psi_1 + \ldots)$. The $j$-th autocovariance is, for $j \leq q$,
$$\gamma_j = E(Y_tY_{t-j}) = \phi_1\gamma_{j-1} + \phi_2\gamma_{j-2} + \ldots + \phi_p\gamma_{j-p} + \sigma^2(\theta_j\psi_0 + \theta_{j+1}\psi_1 + \ldots + \theta_q\psi_{q-j})$$
while for $j > q$ the autocovariances follow the same recursion as in the pure AR case,
$$\gamma_j = E(Y_tY_{t-j}) = \phi_1\gamma_{j-1} + \phi_2\gamma_{j-2} + \ldots + \phi_p\gamma_{j-p}$$
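Since the formulas above involve the MA(∞) weights $\psi_j$, here is a small sketch (the helper name is mine) of the standard recursion for computing them, which follows from matching coefficients in $\psi(L)(1 - \phi_1L - \ldots - \phi_pL^p) = \theta(L)$: $\psi_j = \theta_j + \sum_{i=1}^{\min(j,p)}\phi_i\psi_{j-i}$, with $\theta_0 = 1$ and $\theta_j = 0$ for $j > q$.

```python
import numpy as np

def arma_psi_weights(phis, thetas, n):
    """First n MA(infinity) weights psi_0, ..., psi_{n-1} of an ARMA(p,q):
    psi_j = theta_j + sum_{i=1}^{min(j,p)} phi_i * psi_{j-i},
    with theta_0 = 1 and theta_j = 0 for j > q."""
    phis, thetas = np.asarray(phis, float), np.asarray(thetas, float)
    psi = np.zeros(n)
    for j in range(n):
        theta_j = 1.0 if j == 0 else (thetas[j - 1] if j <= len(thetas) else 0.0)
        ar_part = sum(phis[i - 1] * psi[j - i] for i in range(1, min(j, len(phis)) + 1))
        psi[j] = theta_j + ar_part
    return psi

# ARMA(1,1) with phi=0.5, theta=0.4: psi_0=1, psi_j = phi^{j-1}(phi+theta) for j>=1
print(arma_psi_weights([0.5], [0.4], 5))   # [1.0, 0.9, 0.45, 0.225, 0.1125]
```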
There is a potential problem of redundant parametrization with ARMA processes. Consider a simple WN, $Y_t = \varepsilon_t$, and multiply both sides by $(1 - \rho L)$ to get
$$(1 - \rho L)Y_t = (1 - \rho L)\varepsilon_t$$
an ARMA(1,1) whose AR and MA polynomials share the common factor $(1 - \rho L)$. Both representations are valid; however, it is important to avoid such parametrizations, since we would get into trouble when estimating the parameters.
3.2 ARMA(1,1)

The ARMA(1,1) satisfies
$$(1 - \phi L)Y_t = (1 + \theta L)\varepsilon_t \quad (25)$$
where again $\varepsilon_t$ is WN, and
$$\psi(L) = \frac{1 + \theta L}{1 - \phi L}$$
Here we have
$$\gamma_0 = \phi E(Y_{t-1}Y_t) + E(\varepsilon_tY_t) + \theta E(\varepsilon_{t-1}Y_t) = \phi\sigma^2(\psi_1\psi_0 + \psi_2\psi_1 + \ldots) + \sigma^2 + \theta\psi_1\sigma^2$$
$$\gamma_1 = \phi E(Y_{t-1}^2) + E(\varepsilon_tY_{t-1}) + \theta E(\varepsilon_{t-1}Y_{t-1}) = \phi\gamma_0 + \theta\sigma^2$$
$$\gamma_2 = \phi\gamma_1 \quad (26)$$
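To close, a sketch (illustrative settings of my choosing, with Gaussian WN) verifying (26): solving the two equations $\gamma_0 = \phi\gamma_1 + \sigma^2(1 + \theta\psi_1)$ with $\psi_1 = \phi + \theta$ and $\gamma_1 = \phi\gamma_0 + \theta\sigma^2$ gives $\gamma_0 = \sigma^2(1 + 2\phi\theta + \theta^2)/(1 - \phi^2)$, which we compare with sample autocovariances from a long simulated path.

```python
import numpy as np

rng = np.random.default_rng(3)
phi, theta, sigma2, T = 0.5, 0.4, 1.0, 200_000

# Theoretical gamma_0, gamma_1, gamma_2 from the equations above
gamma0 = sigma2 * (1 + 2 * phi * theta + theta**2) / (1 - phi**2)
gamma1 = phi * gamma0 + theta * sigma2
gamma2 = phi * gamma1

# Simulate (1 - phi L) Y_t = (1 + theta L) eps_t
eps = rng.normal(0.0, np.sqrt(sigma2), T + 1)
y = np.zeros(T + 1)
for t in range(1, T + 1):
    y[t] = phi * y[t - 1] + eps[t] + theta * eps[t - 1]
y = y[1:]

print("gamma_0:", gamma0, "sample:", y.var())
print("gamma_1:", gamma1, "sample:", np.cov(y[1:], y[:-1])[0, 1])
print("gamma_2:", gamma2, "sample:", np.cov(y[2:], y[:-2])[0, 1])
```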