5. Partial Autocorrelation Function of MA(1) Process:


For the MA(1) process $y_t = \epsilon_t + \theta_1 \epsilon_{t-1}$, recall that $\rho(1) = \theta_1/(1+\theta_1^2)$ and $\rho(\tau) = 0$ for $\tau = 2, 3, \cdots$. The first partial autocorrelation coefficient is:

\[
\phi_{1,1} = \rho(1) = \frac{\theta_1}{1+\theta_1^2} \neq 0.
\]

For $k = 2$, solve the Yule-Walker type system for the last coefficient:

\[
\begin{pmatrix} 1 & \rho(1) \\ \rho(1) & 1 \end{pmatrix}
\begin{pmatrix} \phi_{2,1} \\ \phi_{2,2} \end{pmatrix}
=
\begin{pmatrix} \rho(1) \\ \rho(2) \end{pmatrix}
\quad\Longrightarrow\quad
\phi_{2,2}
= \frac{\begin{vmatrix} 1 & \rho(1) \\ \rho(1) & \rho(2) \end{vmatrix}}
       {\begin{vmatrix} 1 & \rho(1) \\ \rho(1) & 1 \end{vmatrix}}
= \frac{-\rho(1)^2}{1-\rho(1)^2}
= \frac{-\theta_1^2}{1+\theta_1^2+\theta_1^4} \neq 0,
\]

using $\rho(2) = 0$.

For $k = 3$:

\[
\begin{pmatrix} 1 & \rho(1) & \rho(2) \\ \rho(1) & 1 & \rho(1) \\ \rho(2) & \rho(1) & 1 \end{pmatrix}
\begin{pmatrix} \phi_{3,1} \\ \phi_{3,2} \\ \phi_{3,3} \end{pmatrix}
=
\begin{pmatrix} \rho(1) \\ \rho(2) \\ \rho(3) \end{pmatrix}
\]

\[
\phi_{3,3}
= \frac{\begin{vmatrix} 1 & \rho(1) & \rho(1) \\ \rho(1) & 1 & \rho(2) \\ \rho(2) & \rho(1) & \rho(3) \end{vmatrix}}
       {\begin{vmatrix} 1 & \rho(1) & \rho(2) \\ \rho(1) & 1 & \rho(1) \\ \rho(2) & \rho(1) & 1 \end{vmatrix}}
= \frac{\begin{vmatrix} 1 & \rho(1) & \rho(1) \\ \rho(1) & 1 & 0 \\ 0 & \rho(1) & 0 \end{vmatrix}}
       {\begin{vmatrix} 1 & \rho(1) & 0 \\ \rho(1) & 1 & \rho(1) \\ 0 & \rho(1) & 1 \end{vmatrix}}
= \frac{\rho(1)^3}{1-2\rho(1)^2} \neq 0,
\]

using $\rho(2) = \rho(3) = 0$.

For $k = 4$:

\[
\begin{pmatrix} 1 & \rho(1) & \rho(2) & \rho(3) \\ \rho(1) & 1 & \rho(1) & \rho(2) \\ \rho(2) & \rho(1) & 1 & \rho(1) \\ \rho(3) & \rho(2) & \rho(1) & 1 \end{pmatrix}
\begin{pmatrix} \phi_{4,1} \\ \phi_{4,2} \\ \phi_{4,3} \\ \phi_{4,4} \end{pmatrix}
=
\begin{pmatrix} \rho(1) \\ \rho(2) \\ \rho(3) \\ \rho(4) \end{pmatrix}
\]

\[
\phi_{4,4}
= \frac{\begin{vmatrix} 1 & \rho(1) & \rho(2) & \rho(1) \\ \rho(1) & 1 & \rho(1) & \rho(2) \\ \rho(2) & \rho(1) & 1 & \rho(3) \\ \rho(3) & \rho(2) & \rho(1) & \rho(4) \end{vmatrix}}
       {\begin{vmatrix} 1 & \rho(1) & \rho(2) & \rho(3) \\ \rho(1) & 1 & \rho(1) & \rho(2) \\ \rho(2) & \rho(1) & 1 & \rho(1) \\ \rho(3) & \rho(2) & \rho(1) & 1 \end{vmatrix}}
= \frac{\begin{vmatrix} 1 & \rho(1) & 0 & \rho(1) \\ \rho(1) & 1 & \rho(1) & 0 \\ 0 & \rho(1) & 1 & 0 \\ 0 & 0 & \rho(1) & 0 \end{vmatrix}}
       {\begin{vmatrix} 1 & \rho(1) & 0 & 0 \\ \rho(1) & 1 & \rho(1) & 0 \\ 0 & \rho(1) & 1 & \rho(1) \\ 0 & 0 & \rho(1) & 1 \end{vmatrix}}
= \frac{-\rho(1)^4}{1-3\rho(1)^2+\rho(1)^4} \neq 0,
\]

using $\rho(\tau) = 0$ for $\tau = 2, 3, 4$. As a result, $\phi_{k,k} \neq 0$ for all $k = 1, 2, \cdots$. That is, the partial autocorrelation function of the MA(1) process does not truncate; it decays gradually.
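As a numerical check of these determinant computations, the PACF can also be generated recursively by the Durbin-Levinson algorithm. The sketch below is our own illustration, not part of the original notes; the function name durbin_levinson_pacf is ours.

import numpy as np

def durbin_levinson_pacf(rho, K):
    """PACF phi_{k,k}, k=1..K, from autocorrelations rho[0..K] (rho[0]=1)."""
    phi = np.zeros((K + 1, K + 1))
    pacf = np.zeros(K + 1)
    phi[1, 1] = pacf[1] = rho[1]
    for k in range(2, K + 1):
        num = rho[k] - np.sum(phi[k - 1, 1:k] * rho[k - 1:0:-1])
        den = 1.0 - np.sum(phi[k - 1, 1:k] * rho[1:k])
        phi[k, k] = pacf[k] = num / den
        for j in range(1, k):
            phi[k, j] = phi[k - 1, j] - phi[k, k] * phi[k - 1, k - j]
    return pacf[1:]

theta1 = 0.5
rho = np.zeros(6)
rho[0] = 1.0
rho[1] = theta1 / (1.0 + theta1**2)   # rho(1); rho(tau) = 0 for tau >= 2
print(durbin_levinson_pacf(rho, 5))
# phi_{1,1} = 0.4, phi_{2,2} = -0.1905, phi_{3,3} = 0.0941, ...: nonzero at every lag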

6. Likelihood Function of MA(1) Process: The autocovariance functions are: $\gamma(0) = (1+\theta_1^2)\sigma_\epsilon^2$, $\gamma(1) = \theta_1\sigma_\epsilon^2$, and $\gamma(\tau) = 0$ for $\tau = 2, 3, \cdots$. The joint distribution of $y_1, y_2, \cdots, y_T$ is:

\[
f(y_1, y_2, \cdots, y_T) = (2\pi)^{-T/2} |V|^{-1/2} \exp\Big( -\frac{1}{2} Y' V^{-1} Y \Big),
\]

where

\[
Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_T \end{pmatrix}, \qquad
V = \sigma_\epsilon^2
\begin{pmatrix}
1+\theta_1^2 & \theta_1 & 0 & \cdots & 0 \\
\theta_1 & 1+\theta_1^2 & \theta_1 & \ddots & \vdots \\
0 & \theta_1 & \ddots & \ddots & 0 \\
\vdots & \ddots & \ddots & 1+\theta_1^2 & \theta_1 \\
0 & \cdots & 0 & \theta_1 & 1+\theta_1^2
\end{pmatrix}.
\]
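To evaluate this likelihood numerically, one can build the tridiagonal $V$ directly and use dense linear algebra. A minimal sketch under the definitions above (our own code, not from the notes; ma1_loglik is a hypothetical name, and the dense solve is O(T^3), so this is only for moderate T):

import numpy as np

def ma1_loglik(y, theta1, sigma2):
    """Exact Gaussian log-likelihood of an MA(1) process via the T x T
    covariance matrix V: (1+theta1^2)*sigma2 on the diagonal and
    theta1*sigma2 on the first off-diagonals."""
    T = len(y)
    V = sigma2 * ((1 + theta1**2) * np.eye(T)
                  + theta1 * (np.eye(T, k=1) + np.eye(T, k=-1)))
    sign, logdet = np.linalg.slogdet(V)
    quad = y @ np.linalg.solve(V, y)
    return -0.5 * (T * np.log(2 * np.pi) + logdet + quad)

# Example: evaluate at theta1 = 0.5, sigma2 = 1 for a short simulated series
rng = np.random.default_rng(0)
e = rng.standard_normal(201)
y = e[1:] + 0.5 * e[:-1]
print(ma1_loglik(y, 0.5, 1.0))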

7. MA(1) + drift: $y_t = \mu + \epsilon_t + \theta_1 \epsilon_{t-1}$

Mean of MA(1) Process: $y_t = \mu + \theta(L)\epsilon_t$, where $\theta(L) = 1 + \theta_1 L$. Taking the expectation, $E(y_t) = \mu + \theta(L)E(\epsilon_t) = \mu$.

Example: MA(2) Model: $y_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2}$

1. Autocovariance Function of MA(2) Process:

\[
\gamma(\tau) = \begin{cases}
(1+\theta_1^2+\theta_2^2)\sigma_\epsilon^2, & \tau = 0, \\
(\theta_1+\theta_1\theta_2)\sigma_\epsilon^2, & \tau = 1, \\
\theta_2\sigma_\epsilon^2, & \tau = 2, \\
0, & \text{otherwise.}
\end{cases}
\]

2. Let $-1/\beta_1$ and $-1/\beta_2$ be the two solutions of $x$ from $\theta(x) = 0$. For the invertibility condition, both $\beta_1$ and $\beta_2$ should be less than one in absolute value. Then, the MA(2) model is represented as:

\[
y_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2}
\]

\[
\phantom{y_t} = (1+\theta_1 L + \theta_2 L^2)\epsilon_t = (1+\beta_1 L)(1+\beta_2 L)\epsilon_t.
\]

The AR($\infty$) representation of the MA(2) model is given by:

\[
\epsilon_t = (1+\beta_1 L)^{-1}(1+\beta_2 L)^{-1} y_t
= \left( \frac{\beta_1/(\beta_1-\beta_2)}{1+\beta_1 L} + \frac{\beta_2/(\beta_2-\beta_1)}{1+\beta_2 L} \right) y_t.
\]
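The partial-fraction weights above can be verified numerically: expanding each term $1/(1+\beta_i L)$ as a geometric series in $L$ gives the AR($\infty$) coefficients, and convolving them with $(1, \theta_1, \theta_2)$ must return the identity. A small sketch (our own, assuming $\beta_1 \neq \beta_2$):

import numpy as np

# MA(2) with theta(L) = (1 + b1*L)(1 + b2*L); invertibility: |b1|, |b2| < 1
b1, b2 = 0.6, -0.3
theta1, theta2 = b1 + b2, b1 * b2

# AR(infinity) weights pi_j from the partial-fraction expansion:
# 1/theta(L) = (b1/(b1-b2))/(1+b1*L) + (b2/(b2-b1))/(1+b2*L)
J = 20
j = np.arange(J)
pi = (b1 / (b1 - b2)) * (-b1)**j + (b2 / (b2 - b1)) * (-b2)**j

# Check: convolving pi with (1, theta1, theta2) should give (1, 0, 0, ...)
check = np.convolve(pi, [1.0, theta1, theta2])[:J]
print(np.allclose(check, np.eye(J)[0], atol=1e-12))  # True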

3. Likelihood Function:

\[
f(y_1, y_2, \cdots, y_T) = (2\pi)^{-T/2} |V|^{-1/2} \exp\Big( -\frac{1}{2} Y' V^{-1} Y \Big),
\]

where

\[
Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_T \end{pmatrix}, \qquad
V = \sigma_\epsilon^2
\begin{pmatrix}
1+\theta_1^2+\theta_2^2 & \theta_1+\theta_1\theta_2 & \theta_2 & 0 & \cdots & 0 \\
\theta_1+\theta_1\theta_2 & 1+\theta_1^2+\theta_2^2 & \theta_1+\theta_1\theta_2 & \ddots & \ddots & \vdots \\
\theta_2 & \theta_1+\theta_1\theta_2 & \ddots & \ddots & \ddots & 0 \\
0 & \ddots & \ddots & \ddots & \ddots & \theta_2 \\
\vdots & \ddots & \ddots & \ddots & 1+\theta_1^2+\theta_2^2 & \theta_1+\theta_1\theta_2 \\
0 & \cdots & 0 & \theta_2 & \theta_1+\theta_1\theta_2 & 1+\theta_1^2+\theta_2^2
\end{pmatrix}.
\]

4. MA(2) + drift: $y_t = \mu + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2}$

Mean: $y_t = \mu + \theta(L)\epsilon_t$, where $\theta(L) = 1 + \theta_1 L + \theta_2 L^2$. Therefore, $E(y_t) = \mu + \theta(L)E(\epsilon_t) = \mu$.

Example: MA(q) Model: $y_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \cdots + \theta_q \epsilon_{t-q}$

1. Mean of MA(q) Process: $E(y_t) = E(\epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \cdots + \theta_q \epsilon_{t-q}) = 0$

2. Autocovariance Function of MA(q) Process:

\[
\gamma(\tau) = \begin{cases}
\sigma_\epsilon^2 (\theta_0\theta_\tau + \theta_1\theta_{\tau+1} + \cdots + \theta_{q-\tau}\theta_q)
= \sigma_\epsilon^2 \displaystyle\sum_{i=0}^{q-\tau} \theta_i \theta_{\tau+i}, & \tau = 1, 2, \cdots, q, \\
0, & \tau = q+1, q+2, \cdots,
\end{cases}
\]

where $\theta_0 = 1$. (The same formula with $\tau = 0$ gives $\gamma(0)$; a numerical check appears after item 4 below.)

3. The MA(q) process is stationary.

4. MA(q) + drift: $y_t = \mu + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \cdots + \theta_q \epsilon_{t-q}$

Mean: $y_t = \mu + \theta(L)\epsilon_t$, where $\theta(L) = 1 + \theta_1 L + \theta_2 L^2 + \cdots + \theta_q L^q$. Therefore, we have: $E(y_t) = \mu + \theta(L)E(\epsilon_t) = \mu$.
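As a numerical check of the autocovariance formula in item 2, the following sketch (our own, not from the notes) compares $\gamma(\tau) = \sigma_\epsilon^2 \sum_i \theta_i\theta_{\tau+i}$ with sample autocovariances from a simulated MA(2):

import numpy as np

rng = np.random.default_rng(0)
theta = np.array([1.0, 0.4, 0.25])   # theta_0 = 1, theta_1, theta_2 (q = 2)
sigma2 = 1.0
q = len(theta) - 1

# Theoretical gamma(tau) = sigma2 * sum_i theta_i * theta_{tau+i}, 0 for tau > q
gamma = [sigma2 * np.dot(theta[:q - tau + 1], theta[tau:]) for tau in range(q + 1)]

# Simulate y_t = eps_t + theta_1*eps_{t-1} + theta_2*eps_{t-2}
T = 200_000
eps = rng.normal(0.0, np.sqrt(sigma2), T + q)
y = sum(theta[i] * eps[q - i:T + q - i] for i in range(q + 1))

sample = [np.mean(y[tau:] * y[:T - tau]) for tau in range(q + 2)]
print(gamma)    # [1.2225, 0.5, 0.25]
print(sample)   # close to gamma, with the tau = 3 value near 0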

5.4 ARMA Model

ARMA (Autoregressive Moving Average) Process

1. ARMA(p, q): $y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \cdots + \theta_q \epsilon_{t-q}$,

which is rewritten as: $\phi(L) y_t = \theta(L) \epsilon_t$, where $\phi(L) = 1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p$ and $\theta(L) = 1 + \theta_1 L + \theta_2 L^2 + \cdots + \theta_q L^q$.

2. Likelihood Function: The variance-covariance matrix of $Y$, denoted by $V$, has to be computed.

Example: ARMA(1,1) Process: $y_t = \phi_1 y_{t-1} + \epsilon_t + \theta_1 \epsilon_{t-1}$

Obtain the autocorrelation coefficient.

The mean of $y_t$ is obtained by taking the expectation on both sides:

\[
E(y_t) = \phi_1 E(y_{t-1}) + E(\epsilon_t) + \theta_1 E(\epsilon_{t-1}),
\]

where the second and third terms are zero. Therefore, we obtain: $E(y_t) = 0$.

The autocovariance of $y_t$ is obtained by multiplying both sides by $y_{t-\tau}$ and taking the expectation:

\[
E(y_t y_{t-\tau}) = \phi_1 E(y_{t-1} y_{t-\tau}) + E(\epsilon_t y_{t-\tau}) + \theta_1 E(\epsilon_{t-1} y_{t-\tau}).
\]

Each term is given by: $E(y_t y_{t-\tau}) = \gamma(\tau)$, $E(y_{t-1} y_{t-\tau}) = \gamma(\tau-1)$,

\[
E(\epsilon_t y_{t-\tau}) = \begin{cases}
\sigma_\epsilon^2, & \tau = 0, \\
0, & \tau = 1, 2, \cdots,
\end{cases}
\qquad
E(\epsilon_{t-1} y_{t-\tau}) = \begin{cases}
(\phi_1+\theta_1)\sigma_\epsilon^2, & \tau = 0, \\
\sigma_\epsilon^2, & \tau = 1, \\
0, & \tau = 2, 3, \cdots.
\end{cases}
\]

Therefore, we obtain:

\[
\gamma(0) = \phi_1 \gamma(1) + (1+\phi_1\theta_1+\theta_1^2)\sigma_\epsilon^2, \qquad
\gamma(1) = \phi_1 \gamma(0) + \theta_1 \sigma_\epsilon^2, \qquad
\gamma(\tau) = \phi_1 \gamma(\tau-1), \quad \tau = 2, 3, \cdots.
\]

From the first two equations, $\gamma(0)$ and $\gamma(1)$ are computed by:

\[
\begin{pmatrix} 1 & -\phi_1 \\ -\phi_1 & 1 \end{pmatrix}
\begin{pmatrix} \gamma(0) \\ \gamma(1) \end{pmatrix}
= \sigma_\epsilon^2 \begin{pmatrix} 1+\phi_1\theta_1+\theta_1^2 \\ \theta_1 \end{pmatrix}
\]

\[
\begin{pmatrix} \gamma(0) \\ \gamma(1) \end{pmatrix}
= \sigma_\epsilon^2
\begin{pmatrix} 1 & -\phi_1 \\ -\phi_1 & 1 \end{pmatrix}^{-1}
\begin{pmatrix} 1+\phi_1\theta_1+\theta_1^2 \\ \theta_1 \end{pmatrix}
\]

\[
= \frac{\sigma_\epsilon^2}{1-\phi_1^2}
\begin{pmatrix} 1 & \phi_1 \\ \phi_1 & 1 \end{pmatrix}
\begin{pmatrix} 1+\phi_1\theta_1+\theta_1^2 \\ \theta_1 \end{pmatrix}
= \frac{\sigma_\epsilon^2}{1-\phi_1^2}
\begin{pmatrix} 1+2\phi_1\theta_1+\theta_1^2 \\ (1+\phi_1\theta_1)(\phi_1+\theta_1) \end{pmatrix}.
\]

Thus, the initial value of the autocorrelation coefficient is given by:

\[
\rho(1) = \frac{(1+\phi_1\theta_1)(\phi_1+\theta_1)}{1+2\phi_1\theta_1+\theta_1^2}.
\]

For $\tau = 2, 3, \cdots$, we have: $\rho(\tau) = \phi_1 \rho(\tau-1)$.
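These two relations determine the entire autocorrelation function, which can be confirmed by simulation; the sketch below is our own illustration:

import numpy as np

rng = np.random.default_rng(1)
phi1, theta1, T = 0.7, 0.4, 500_000

# Simulate ARMA(1,1): y_t = phi1*y_{t-1} + eps_t + theta1*eps_{t-1}
eps = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi1 * y[t - 1] + eps[t] + theta1 * eps[t - 1]

# Theoretical rho(1), then rho(tau) = phi1 * rho(tau-1)
rho1 = (1 + phi1 * theta1) * (phi1 + theta1) / (1 + 2 * phi1 * theta1 + theta1**2)
theo = [rho1 * phi1**(tau - 1) for tau in range(1, 5)]

gamma0 = np.var(y)
samp = [np.mean(y[tau:] * y[:T - tau]) / gamma0 for tau in range(1, 5)]
print(theo)   # rho(1) = 0.8186... for these parameter values
print(samp)   # close to the theoretical values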

ARMA(p, q) + drift: $y_t = \mu + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \cdots + \theta_q \epsilon_{t-q}$.

Mean of ARMA(p, q) Process: $\phi(L) y_t = \mu + \theta(L)\epsilon_t$, where $\phi(L) = 1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p$ and $\theta(L) = 1 + \theta_1 L + \theta_2 L^2 + \cdots + \theta_q L^q$.

\[
y_t = \phi(L)^{-1}\mu + \phi(L)^{-1}\theta(L)\epsilon_t.
\]

Therefore,

\[
E(y_t) = \phi(L)^{-1}\mu + \phi(L)^{-1}\theta(L)E(\epsilon_t) = \phi(1)^{-1}\mu
= \frac{\mu}{1-\phi_1-\phi_2-\cdots-\phi_p}.
\]

5.5 ARIMA Model

Autoregressive Integrated Moving Average (ARIMA) Model

ARIMA(p, d, q) Process: $\phi(L)\Delta^d y_t = \theta(L)\epsilon_t$,

where $\Delta^d y_t = \Delta^{d-1}(1-L)y_t = \Delta^{d-1}y_t - \Delta^{d-1}y_{t-1} = (1-L)^d y_t$ for $d = 1, 2, \cdots$, and $\Delta^0 y_t = y_t$.
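In code, $\Delta^d$ is simply $d$ repeated first differences; a small numpy check (our own illustration) that repeated differencing agrees with the binomial expansion of $(1-L)^2$:

import numpy as np

y = np.cumsum(np.cumsum(np.random.default_rng(2).standard_normal(100)))  # an I(2)-like series
d2 = np.diff(y, n=2)                      # Delta^2 y_t = (1-L)^2 y_t
manual = y[2:] - 2 * y[1:-1] + y[:-2]     # y_t - 2 y_{t-1} + y_{t-2}
print(np.allclose(d2, manual))            # True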

5.6 SARIMA Model

Seasonal ARIMA (SARIMA) Process:

1. SARIMA(p, d, q): $\phi(L)\Delta^d\Delta_s y_t = \theta(L)\epsilon_t$,

where $\Delta_s y_t = (1-L^s)y_t = y_t - y_{t-s}$. $s = 4$ when $y_t$ denotes quarterly data and $s = 12$ when $y_t$ represents monthly data.

5.7 Optimal Prediction

1. AR(p) Process: $y_t = \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \epsilon_t$

(a) Define: $E(y_{t+k}|Y_t) = y_{t+k|t}$, where $Y_t$ denotes all the information available at time $t$. Taking the conditional expectation of $y_{t+k} = \phi_1 y_{t+k-1} + \cdots + \phi_p y_{t+k-p} + \epsilon_{t+k}$ on both sides,

\[
y_{t+k|t} = \phi_1 y_{t+k-1|t} + \cdots + \phi_p y_{t+k-p|t},
\]

where $y_{s|t} = y_s$ for $s \le t$.

(b) Optimal prediction is given by solving the above difference equation.

2. MA(q) Process: $y_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}$

(a) Let $\hat\epsilon_T, \hat\epsilon_{T-1}, \cdots, \hat\epsilon_1$ be the estimated errors.

(b) $y_{t+k} = \epsilon_{t+k} + \theta_1 \epsilon_{t+k-1} + \cdots + \theta_q \epsilon_{t+k-q}$

(c) Therefore,

\[
y_{t+k|t} = \epsilon_{t+k|t} + \theta_1 \epsilon_{t+k-1|t} + \cdots + \theta_q \epsilon_{t+k-q|t},
\]

where $\epsilon_{s|t} = 0$ for $s > t$ and $\epsilon_{s|t} = \hat\epsilon_s$ for $s \le t$.

3. ARMA(p, q) Process: $y_t = \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}$

(a) $y_{t+k} = \phi_1 y_{t+k-1} + \cdots + \phi_p y_{t+k-p} + \epsilon_{t+k} + \theta_1 \epsilon_{t+k-1} + \cdots + \theta_q \epsilon_{t+k-q}$

(b) Optimal prediction is:

\[
y_{t+k|t} = \phi_1 y_{t+k-1|t} + \cdots + \phi_p y_{t+k-p|t} + \epsilon_{t+k|t} + \theta_1 \epsilon_{t+k-1|t} + \cdots + \theta_q \epsilon_{t+k-q|t},
\]

where $y_{s|t} = y_s$ and $\epsilon_{s|t} = \hat\epsilon_s$ for $s \le t$, and $\epsilon_{s|t} = 0$ for $s > t$.
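The recursion in (b) translates directly into code. Below is a minimal sketch (our own; arma_forecast is a hypothetical name, and y and eps_hat are assumed to hold at least p observations and q estimated errors):

import numpy as np

def arma_forecast(y, eps_hat, phi, theta, k):
    """k-step ahead ARMA(p,q) predictions y_{t+1|t}, ..., y_{t+k|t} via
    y_{s|t} = sum_i phi_i y_{s-i|t} + sum_j theta_j eps_{s-j|t},
    with eps_{s|t} = 0 for s > t."""
    p, q = len(phi), len(theta)
    ys = list(y)            # observed values: y_{s|t} = y_s for s <= t
    es = list(eps_hat)      # estimated errors: eps_{s|t} = eps_hat_s for s <= t
    for _ in range(k):
        ar = sum(phi[i] * ys[-1 - i] for i in range(p))
        ma = sum(theta[j] * es[-1 - j] for j in range(q))
        ys.append(ar + ma)  # eps_{t+k|t} = 0, so only past errors enter
        es.append(0.0)      # future errors are replaced by zero
    return ys[-k:]

# Example: ARMA(1,1) with phi1 = 0.7, theta1 = 0.4
print(arma_forecast([1.0, 0.5], [0.2, -0.1], [0.7], [0.4], k=3))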

5.8 Identification

1. Based on AIC or SBIC, given $d$ and $s$, we obtain $p$ and $q$. We choose the $p$ and $q$ that minimize AIC or SBIC.

(a) AIC (Akaike's Information Criterion): AIC $= -2\log(\text{likelihood}) + 2k$, where $k = p + q$ is the number of parameters estimated.

(b) SBIC (Schwarz's Bayesian Information Criterion): SBIC $= -2\log(\text{likelihood}) + k\log T$, where $T$ denotes the number of observations.

2. From the sample autocorrelation coefficient function $\hat\rho(k)$ and the sample partial autocorrelation coefficient function $\hat\phi_{k,k}$ for $k = 1, 2, \cdots$, we obtain $p$, $d$, $q$, $s$.

                                  AR(p) Process               MA(q) Process
  Autocorrelation Function        Gradually decreasing        $\rho(k) = 0$, $k = q+1, q+2, \cdots$
  Partial Autocorrelation         $\phi_{k,k} = 0$,           Gradually decreasing
  Function                        $k = p+1, p+2, \cdots$

(a) Compute $\Delta_s y_t$ to remove seasonality. Compute the autocovariance functions of $\Delta_s y_t$. If the autocovariance functions still have period $s$, we apply $(1-L^s)$ again.

(b) Determine the order of difference, $d$. Compute the autocovariance functions after each differencing. If the autocovariance functions decrease as $\tau$ becomes large, go to the next step.

(c) Determine the order of the AR terms (i.e., $p$).

Compute the partial autocovariance functions each time. If the partial autocovariance functions are close to zero after some $\tau$, go to the next step.

(d) Determine the order of the MA terms (i.e., $q$). Compute the autocovariance functions each time. If the autocovariance functions lie randomly around zero, the procedure ends.
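For step 2 and the cutoff patterns in the table above, the sample ACF and PACF can be computed with plain numpy; a sketch (ours, not from the notes), where the PACF is obtained as the last coefficient of the order-k Yule-Walker system:

import numpy as np

def sample_acf(y, K):
    """Sample autocorrelations rho_hat(1..K)."""
    y = y - y.mean()
    g0 = np.mean(y * y)
    return np.array([np.mean(y[k:] * y[:-k]) / g0 for k in range(1, K + 1)])

def sample_pacf(y, K):
    """Sample PACF phi_hat_{k,k}: last coefficient of the order-k
    Yule-Walker regression, solved from the Toeplitz system."""
    rho = np.concatenate(([1.0], sample_acf(y, K)))
    pacf = []
    for k in range(1, K + 1):
        R = np.array([[rho[abs(i - j)] for j in range(k)] for i in range(k)])
        pacf.append(np.linalg.solve(R, rho[1:k + 1])[-1])
    return np.array(pacf)

# An MA(1) series should show the ACF cutting off after lag 1
# and the PACF decaying gradually, as in the table above.
rng = np.random.default_rng(3)
e = rng.standard_normal(50_001)
y = e[1:] + 0.5 * e[:-1]
print(sample_acf(y, 4))
print(sample_pacf(y, 4))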

5.9 Example of SARIMA Using Consumption Data

Construct a SARIMA model using monthly and seasonally unadjusted consumption expenditure data and STATA12.

Estimation Period: Jan. 1970 - Dec. 2012 (T = 516)

. gen time=_n

. tsset time
        time variable:  time, 1 to 516
                delta:  1 unit

. corrgram expend

                                           -1       0       1 -1       0       1
 LAG      AC       PAC       Q      Prob>Q [Autocorrelation]  [Partial Autocor]
--------------------------------------------------------------------------------
  1     0.8488   0.8499    373.88   0.0000        ------             ------
  2     0.823    0.3858    726.8    0.0000        ------             ---
  3     0.876    0.5266     22      0.0000        ------             ----
  4     0.8706   0.4025     57.6    0.0000        ------             ---
  5     0.8498   0.3447    895.3    0.0000        ------             --
  6     0.8085   0.0074   2237.9    0.0000        ------
  7     0.8378   0.528    2606.5    0.0000        ------             -
  8     0.8460   0.467    2983      0.0000        ------             -
  9     0.8342   0.3006   3349.9    0.0000        ------             --
 10     0.7735  -0.58     3666      0.0000        ------             -
 11     0.7852  -0.85     3992.3    0.0000        ------
 12     0.9234   0.9442   4444.5    0.0000        -------            -------
 13     0.7754  -0.5486   4764.1    0.0000        ------             ----
 14     0.7482  -0.3248   5062.1    0.0000        -----              --
 15     0.7963  -0.2392   5400.5    0.0000        ------             -

. gen dexp=expend-l.expend
(1 missing value generated)

. corrgram dexp

                                           -1       0       1 -1       0       1
 LAG      AC       PAC       Q      Prob>Q [Autocorrelation]  [Partial Autocor]
--------------------------------------------------------------------------------
  1    -0.436   -0.4329    96.485   0.0000        ---                ---
  2    -0.2546  -0.544     30.3     0.0000        --                 ----
  3     0.72    -0.409     45.53    0.0000        -                  ---
  4     0.0667  -0.3459    47.85    0.0000                           --
  5     0.075   -0.0036    50.52    0.0000
  6    -0.2428  -0.489      8.36    0.0000        -                  -
  7     0.07    -0.400     84.0     0.0000                           -
  8     0.0668  -0.2900    86.36    0.0000                           --
  9     0.704    0.68      20.64    0.0000        -                  -
 10    -0.2485   0.306    234.2     0.0000        -                  -
 11    -0.4293  -0.9305    33.56    0.0000        ---                -------
 12     0.9773   0.6768    837.2    0.0000        -------            -----
 13    -0.452    0.3778    928.56   0.0000        ---                ---
 14    -0.2583   0.2688    964.03   0.0000        --                 --
 15     0.72     0.0406    979.63   0.0000        -

. gen sdex=dexp-l12.dexp
(13 missing values generated)

. corrgram sdex

                                           -1       0       1 -1       0       1
 LAG      AC       PAC       Q      Prob>Q [Autocorrelation]  [Partial Autocor]
--------------------------------------------------------------------------------
  1    -0.4752  -0.4753     4.28    0.0000        ---                ---
  2    -0.0244  -0.3235     4.58    0.0000                           --
  3     0.63    -0.0759     2.46    0.0000
  4    -0.246   -0.365     29.37    0.0000        -
  5     0.034   -0.06      29.96    0.0000
  6    -0.05    -0.36      30.08    0.0000
  7    -0.0395  -0.43      30.88    0.0000        -
  8     0.23     0.0092    37.35    0.0000
  9    -0.0664  -0.000     39.62    0.0000
 10     0.068    0.0069    39.76    0.0000
 11     0.642    0.2422    53.68    0.0000        -                  -
 12    -0.3888  -0.2469    23.9     0.0000        ---                -
 13     0.2242  -0.205    257.96    0.0000        -
 14    -0.047   -0.094    258.07    0.0000
 15    -0.0708  -0.059    260.68    0.0000

. arima sdex, ar(1,2) ma(1)

(setting optimization to BHHH)
Iteration 0:   log likelihood = -507.4608
Iteration 1:   log likelihood = -502.39
Iteration 2:   log likelihood = -5099.907
Iteration 3:   log likelihood = -5099.426
Iteration 4:   log likelihood = -5099.2463
(switching optimization to BFGS)
Iteration 5:   log likelihood = -5099.236
Iteration 6:   log likelihood = -5099.2346
Iteration 7:   log likelihood = -5099.2346
Iteration 8:   log likelihood = -5099.2346

ARIMA regression

Sample:  14 - 516                               Number of obs      =       503
                                                Wald chi2(3)       =    973.93
Log likelihood = -5099.235                      Prob > chi2        =    0.0000

------------------------------------------------------------------------------
             |               OPG
        sdex |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
sdex         |
       _cons |  -15.64573   59.17574    -0.26   0.791     -131.628    100.3366
-------------+----------------------------------------------------------------
ARMA         |
          ar |
         L1. |    .127774    .058883     2.19   0.029      .013304     .242244
         L2. |   .1009983    .053626     1.88   0.060     -.0041068   .2061034
          ma |
         L1. |  -.8343264   .0419364   -19.90   0.000     -.9165202  -.7521326
-------------+----------------------------------------------------------------
      /sigma |   6111.128   139.0105    43.96   0.000      5838.673   6383.584
------------------------------------------------------------------------------
Note: The test of the variance against zero is one sided, and the two-sided
confidence interval is truncated at zero.

. estat ic

-----------------------------------------------------------------------------
       Model |    Obs    ll(null)   ll(model)     df        AIC         BIC
-------------+---------------------------------------------------------------
           . |    503          .    -5099.235      5    10208.47    10229.57
-----------------------------------------------------------------------------
Note: N=Obs used in calculating BIC; see [R] BIC note
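Roughly the same fit can be reproduced in Python; the sketch below (not from the notes) uses statsmodels' ARIMA, where the file name expend.csv and the series name are hypothetical placeholders:

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# expend: monthly, seasonally unadjusted consumption expenditure
# (hypothetical file path; any monthly series will do for illustration)
expend = pd.read_csv("expend.csv", index_col=0, parse_dates=True)["expend"]

# Apply the first and the 12-month seasonal difference, then fit
# AR lags 1-2 and an MA lag 1 with a constant, as in the STATA session.
sdex = expend.diff(1).diff(12).dropna()
res = ARIMA(sdex, order=(2, 0, 1)).fit()
print(res.summary())
print(res.aic, res.bic)   # analogous to "estat ic"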

6 Unit Root and Cointegration

6.1 Unit Root Test (Dickey-Fuller (DF) Test)

1. Why is the unit root problem important?

(a) Economic variables generally increase over time. One of the assumptions of OLS is stationarity of $y_t$ and $x_t$, which implies that $\frac{1}{T}X'X$ converges to a fixed matrix as $T$ becomes large. When the variables are nonstationary, this convergence fails; that is, asymptotic normality of the OLS estimator does not hold.

(b) Among nonstationary time series, the unit root case is the most important. In the case of a unit root, the OLSE of the first-order autoregressive coefficient is consistent.

The OLSE is $\sqrt{T}$-consistent in the case of a stationary AR(1) process, but it is $T$-consistent in the case of a nonstationary AR(1) process.

(c) A lot of economic variables increase over time. It is important to check whether an economic variable is trend stationary (i.e., $y_t = a_0 + a_1 t + \epsilon_t$) or difference stationary (i.e., $y_t = b_0 + y_{t-1} + \epsilon_t$). Consider the $k$-step ahead prediction for both cases:

(Trend Stationarity) $y_{t+k|t} = a_0 + a_1(t+k)$

(Difference Stationarity) $y_{t+k|t} = b_0 k + y_t$

2. The Case of $|\phi_1| < 1$:

\[
y_t = \phi_1 y_{t-1} + \epsilon_t, \qquad \epsilon_t \sim \text{i.i.d. } N(0, \sigma_\epsilon^2), \qquad y_0 = 0, \qquad t = 1, \cdots, T
\]

Then, the OLSE of $\phi_1$ is:

\[
\hat\phi_1 = \frac{\sum y_t y_{t-1}}{\sum y_{t-1}^2}.
\]

In the case of $|\phi_1| < 1$,

\[
\hat\phi_1 = \phi_1 + \frac{\frac{1}{T}\sum y_{t-1}\epsilon_t}{\frac{1}{T}\sum y_{t-1}^2}
\;\longrightarrow\; \phi_1 + \frac{E(y_{t-1}\epsilon_t)}{E(y_{t-1}^2)} = \phi_1.
\]

Note as follows:

\[
\frac{1}{T}\sum y_{t-1}\epsilon_t \;\longrightarrow\; E(y_{t-1}\epsilon_t) = 0.
\]

By the central limit theorem,

\[
\frac{\overline{y\epsilon} - E(\overline{y\epsilon})}{\sqrt{V(\overline{y\epsilon})}} \;\longrightarrow\; N(0, 1),
\]

where $\overline{y\epsilon} = \dfrac{1}{T}\displaystyle\sum y_{t-1}\epsilon_t$.

Therefore, $E(\overline{y\epsilon}) = 0$, and

\[
V(\overline{y\epsilon}) = V\Big(\frac{1}{T}\sum y_{t-1}\epsilon_t\Big)
= \frac{1}{T^2} E\Big(\big(\sum y_{t-1}\epsilon_t\big)^2\Big)
= \frac{1}{T^2} E\Big(\sum_t \sum_s y_{t-1} y_{s-1} \epsilon_t \epsilon_s\Big)
= \frac{1}{T^2} E\Big(\sum y_{t-1}^2 \epsilon_t^2\Big)
= \frac{1}{T}\sigma_\epsilon^2 \gamma(0).
\]

Then

\[
\frac{\overline{y\epsilon}}{\sqrt{\sigma_\epsilon^2\gamma(0)/T}}
= \frac{1}{\sqrt{T}\,\sigma_\epsilon \sqrt{\gamma(0)}} \sum y_{t-1}\epsilon_t
\;\longrightarrow\; N(0, 1),
\]

which is rewritten as:

\[
\frac{1}{\sqrt{T}} \sum y_{t-1}\epsilon_t \;\longrightarrow\; N(0, \sigma_\epsilon^2 \gamma(0)).
\]

Using $\dfrac{1}{T}\sum y_{t-1}^2 \longrightarrow E(y_t^2) = \gamma(0)$, we have the following asymptotic distribution:

\[
\sqrt{T}(\hat\phi_1 - \phi_1)
= \frac{\frac{1}{\sqrt{T}}\sum y_{t-1}\epsilon_t}{\frac{1}{T}\sum y_{t-1}^2}
\;\longrightarrow\; N\Big(0, \frac{\sigma_\epsilon^2}{\gamma(0)}\Big) = N(0, 1-\phi_1^2).
\]

Note that $\gamma(0) = \dfrac{\sigma_\epsilon^2}{1-\phi_1^2}$.

3. In the case of $\phi_1 = 1$, as expected, we have: $\sqrt{T}(\hat\phi_1 - 1) \longrightarrow 0$, because $1-\phi_1^2 = 0$. That is, $\hat\phi_1$ has a distribution which converges in probability to $\phi_1 = 1$ (i.e., a degenerate distribution).

Is this true?

4. The Case of $\phi_1 = 1$: $\Longrightarrow$ Random Walk Process

$y_t = y_{t-1} + \epsilon_t$ with $y_0 = 0$ is written as: $y_t = \epsilon_t + \epsilon_{t-1} + \epsilon_{t-2} + \cdots + \epsilon_1$.

Therefore, we can obtain: $y_t \sim N(0, \sigma_\epsilon^2 t)$.

The variance of $y_t$ depends on time $t$. $\Longrightarrow$ $y_t$ is nonstationary.

5. Remember that $\hat\phi_1 = \phi_1 + \dfrac{\sum y_{t-1}\epsilon_t}{\sum y_{t-1}^2}$.

(a) First, consider the numerator $\sum y_{t-1}\epsilon_t$. We have:

\[
y_t^2 = (y_{t-1}+\epsilon_t)^2 = y_{t-1}^2 + 2 y_{t-1}\epsilon_t + \epsilon_t^2.
\]

Therefore, we obtain:

\[
y_{t-1}\epsilon_t = \frac{1}{2}\big(y_t^2 - y_{t-1}^2 - \epsilon_t^2\big).
\]

Taking into account $y_0 = 0$, we have:

\[
\sum_{t=1}^T y_{t-1}\epsilon_t = \frac{1}{2} y_T^2 - \frac{1}{2}\sum_{t=1}^T \epsilon_t^2.
\]

Dividing both sides by $\sigma_\epsilon^2 T$, we have the following:

\[
\frac{1}{\sigma_\epsilon^2 T}\sum y_{t-1}\epsilon_t
= \frac{1}{2}\Big(\frac{y_T}{\sigma_\epsilon\sqrt{T}}\Big)^2
- \frac{1}{2\sigma_\epsilon^2}\cdot\frac{1}{T}\sum \epsilon_t^2.
\]

From $y_t \sim N(0, \sigma_\epsilon^2 t)$, we obtain the following result:

\[
\Big(\frac{y_T}{\sigma_\epsilon\sqrt{T}}\Big)^2 \sim \chi^2(1).
\]

Moreover, the second term is derived from:

\[
\frac{1}{T}\sum \epsilon_t^2 \;\longrightarrow\; E(\epsilon_t^2) = \sigma_\epsilon^2.
\]

Therefore,

\[
\frac{1}{\sigma_\epsilon^2 T}\sum y_{t-1}\epsilon_t
= \frac{1}{2}\Big(\frac{y_T}{\sigma_\epsilon\sqrt{T}}\Big)^2
- \frac{1}{2\sigma_\epsilon^2}\cdot\frac{1}{T}\sum \epsilon_t^2
\;\longrightarrow\; \frac{1}{2}\big(\chi^2(1) - 1\big).
\]

(b) Next, consider $\sum y_{t-1}^2$.

\[
E\Big(\sum_{t=1}^T y_{t-1}^2\Big) = \sum_{t=1}^T E(y_{t-1}^2)
= \sum_{t=1}^T \sigma_\epsilon^2 (t-1) = \sigma_\epsilon^2\,\frac{T(T-1)}{2}.
\]

Thus, we obtain the following result:

\[
\frac{1}{T^2} E\Big(\sum y_{t-1}^2\Big) \;\longrightarrow\; \text{a fixed value, i.e., } \frac{\sigma_\epsilon^2}{2}.
\]

Therefore, $\dfrac{1}{T^2}\sum y_{t-1}^2$ converges to a distribution.

6. Summarizing the results up to now, $T(\hat\phi_1 - \phi_1)$, not $\sqrt{T}(\hat\phi_1 - \phi_1)$, has a limiting distribution in the case of $\phi_1 = 1$:

\[
T(\hat\phi_1 - \phi_1) = \frac{(1/T)\sum y_{t-1}\epsilon_t}{(1/T^2)\sum y_{t-1}^2}
\;\longrightarrow\; \text{a distribution.}
\]

7. Basic Concepts of the Random Walk Process:

(a) Model: $y_t = y_{t-1} + \epsilon_t$, $y_0 = 0$, $\epsilon_t \sim N(0, 1)$. Then, $y_t = \epsilon_t + \epsilon_{t-1} + \cdots + \epsilon_1$. Therefore, $y_t \sim N(0, t)$.

$\Longrightarrow$ Nonstationary Process (i.e., the variance depends on time $t$).

The difference between $y_s$ and $y_t$ ($s > t$) is: $y_s - y_t = \epsilon_s + \epsilon_{s-1} + \cdots + \epsilon_{t+2} + \epsilon_{t+1}$.

The distribution of $y_s - y_t$ is: $y_s - y_t \sim N(0, s-t)$.

(b) Rewrite as follows:

\[
y_t = y_{t-1} + \epsilon_t = y_{t-1} + e_{1,t} + e_{2,t} + \cdots + e_{N,t},
\]

where $\epsilon_t = e_{1,t} + e_{2,t} + \cdots + e_{N,t}$, and $e_{1,t}, e_{2,t}, \cdots, e_{N,t}$ are i.i.d. with $e_{i,t} \sim N(0, 1/N)$. That is, suppose that there are $N$ subperiods between time $t$ and time $t+1$.
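The contrast between items 2-6 is easy to see by Monte Carlo: under stationarity, $\sqrt{T}(\hat\phi_1-\phi_1)$ is approximately $N(0, 1-\phi_1^2)$, while under the unit root, $T(\hat\phi_1-1)$ follows the skewed Dickey-Fuller limit rather than a normal one. A sketch (our own illustration, not part of the notes):

import numpy as np

rng = np.random.default_rng(5)
T, R = 500, 5000

def ols_phi(y):
    """OLSE of phi_1 in y_t = phi_1 y_{t-1} + eps_t (y_0 = 0, no constant)."""
    ylag = np.concatenate(([0.0], y[:-1]))
    return np.sum(y * ylag) / np.sum(ylag**2)

# Stationary case, phi_1 = 0.5: sqrt(T)(phi_hat - phi_1) approx N(0, 1 - phi_1^2)
phi1 = 0.5
stat = np.empty(R)
for r in range(R):
    eps = rng.standard_normal(T)
    y = np.zeros(T)
    for t in range(1, T):
        y[t] = phi1 * y[t - 1] + eps[t]
    stat[r] = np.sqrt(T) * (ols_phi(y) - phi1)
print(stat.var())   # close to 1 - phi1**2 = 0.75

# Unit root case, phi_1 = 1: T(phi_hat - 1) has the skewed Dickey-Fuller limit
df = np.empty(R)
for r in range(R):
    y = np.cumsum(rng.standard_normal(T))   # random walk with y_0 = 0
    df[r] = T * (ols_phi(y) - 1.0)
print(np.percentile(df, [1, 5, 50, 95, 99]))  # left-skewed; 5% point near -8.1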