LECTURE 13: TIME SERIES I: AUTOCORRELATION

Consider y = Xβ + u where y is Tx1, X is TxK, β is Kx1 and u is Tx1. We use T rather than N for the sample size to emphasize that this is a time series. The natural ordering of observations in a time series suggests ways to parametrize the covariance matrix parsimoniously.

First order autoregression: AR(1)
This is the case where ut = ρut-1 + εt, where the εt are independent and identically distributed with Eεt = 0 and V(εt) = σ2.

N.M. Kiefer, Cornell University, Economics 620, Lecture 13


First order moving average: MA(1)
This is the case where ut = εt - θεt-1.

Random walk: AR(1) with ρ = 1
This is the case where ut - ut-1 = εt.

Integrated moving average: IMA(1)
This is the case where ut - ut-1 = εt - θεt-1.

Autoregressive moving average (1,1): ARMA(1,1)
This is the case where ut - ρut-1 = εt - θεt-1.
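The error processes above can all be simulated from the single recursion ut - ρut-1 = εt - θεt-1. A minimal sketch, with illustrative parameter values and seed:

```python
# Simulating the error processes defined above; parameters are illustrative.
import numpy as np

def simulate(T, rho=0.0, theta=0.0, sigma=1.0, seed=0):
    """Simulate u_t = rho*u_{t-1} + eps_t - theta*eps_{t-1}, started at u_0 = eps_0."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, T)
    u = np.empty(T)
    u[0] = eps[0]
    for t in range(1, T):
        u[t] = rho * u[t - 1] + eps[t] - theta * eps[t - 1]
    return u

T = 200
ar1 = simulate(T, rho=0.9)                # AR(1): u_t = 0.9 u_{t-1} + eps_t
ma1 = simulate(T, theta=0.5)              # MA(1): u_t = eps_t - 0.5 eps_{t-1}
walk = simulate(T, rho=1.0)               # random walk: u_t - u_{t-1} = eps_t
ima1 = simulate(T, rho=1.0, theta=0.5)    # IMA(1)
arma11 = simulate(T, rho=0.9, theta=0.5)  # ARMA(1,1)
```

Setting ρ = 0 or θ = 0 recovers the pure MA(1) and AR(1) cases, so one routine covers the whole family.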


Autoregressive of order p: AR(p)
ut = ρ1ut-1 + ρ2ut-2 + ... + ρput-p + εt.

Moving average of order p: MA(p)
ut = εt - ∑i=1..p θiεt-i.

Proposition: A first order autoregressive (AR(1)) process is an infinite order moving average (MA(∞)) process.

Proof: Substituting recursively, ut = ρut-1 + εt = ρ(ρut-2 + εt-1) + εt = εt + ρεt-1 + ρ2εt-2 + ... . Thus

ut = ∑r=0..∞ ρrεt-r.
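The proposition can be checked numerically: an AR(1) recursion started at u0 = ε0 and the truncated MA sum produce the same path term by term. A small sketch with illustrative values:

```python
# Numerical check: the AR(1) recursion equals the (truncated) MA(infinity) sum.
import numpy as np

rng = np.random.default_rng(1)
T, rho = 50, 0.8
eps = rng.normal(size=T)

# AR(1) recursion: u_t = rho*u_{t-1} + eps_t, started at u_0 = eps_0
u_ar = np.empty(T)
u_ar[0] = eps[0]
for t in range(1, T):
    u_ar[t] = rho * u_ar[t - 1] + eps[t]

# MA representation: u_t = sum_{r=0}^{t} rho^r eps_{t-r}
u_ma = np.array([sum(rho**r * eps[t - r] for r in range(t + 1)) for t in range(T)])

print(np.max(np.abs(u_ar - u_ma)))  # agreement up to rounding error
```

With the process started at u0 = ε0 the two representations are exactly equal; for a stationary process the MA sum extends back infinitely far, and the truncation error is of order ρ^T.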


AR(1) errors arise frequently in economic time series. Let ut = ρut-1 + εt, an AR(1) process with |ρ| < 1. Note that Eut = 0 and V(ut) = σ2(1 + ρ2 + ρ4 + ...) = σ2/(1 - ρ2). Also note that cov(ut, ut-1) = ρσ2 + ρ3σ2 + ρ5σ2 + ... = ρσ2/(1 - ρ2) = ρV(ut), and similarly cov(ut, ut-s) = ρsV(ut) = ρsσ2/(1 - ρ2). Thus

1   σ 2 ρ Euu ′ = 1 − ρ 2 .  ρ T-1 

ρ 1 . ρ T-2

. . . ρ T-1   ρ . . . ρ T-1   . ... .   ρ T − 3 . . . 1 

ρ2
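A Monte Carlo check of the moment formulas above: simulate a long stationary AR(1) path and compare the sample variance and lag-1 covariance with σ2/(1 - ρ2) and ρσ2/(1 - ρ2). The values of ρ, σ and the series length are illustrative.

```python
# Monte Carlo check: V(u_t) = sigma^2/(1-rho^2), cov(u_t, u_{t-1}) = rho*V(u_t).
import numpy as np

rho, sigma = 0.8, 1.0
rng = np.random.default_rng(0)
N = 300_000
u = np.empty(N)
u[0] = rng.normal(0.0, sigma / np.sqrt(1 - rho**2))  # draw u_0 from the stationary distribution
eps = rng.normal(0.0, sigma, N)
for t in range(1, N):
    u[t] = rho * u[t - 1] + eps[t]

v_hat = u.var()
cov1_hat = np.mean(u[1:] * u[:-1])
v_theory = sigma**2 / (1 - rho**2)
print(v_hat, v_theory)            # both near 2.78
print(cov1_hat, rho * v_theory)   # both near 2.22
```

Starting u0 from the stationary distribution makes the whole path stationary, so no burn-in period is needed.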


This variance-covariance matrix is symmetric and characterized by two parameters (σ2 and ρ), so it fits into the GLS framework. Consider the LS estimator β̂ under the assumption of an AR(1) process for the ut:
1. What are the properties of β̂?
2. What is the associated variance estimate?
In the LS method, V(β̂) is estimated by s2(X′X)-1. Is this correct in the AR case?


Under the assumption of an AR(1) error process, V(β̂) = (σ2/(1 - ρ2))(X′X)-1X′VX(X′X)-1, with V the correlation matrix above (so that Euu′ = (σ2/(1 - ρ2))V). If the X variables are trending up and ρ > 0 (often around 0.8 or 0.9 in practice), then s2 will probably underestimate σ2/(1 - ρ2), and (X′X)-1 is less than (X′X)-1X′VX(X′X)-1 in the matrix sense.

Point: We can seriously understate standard errors if we ignore autocorrelation.
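The understatement can be computed exactly rather than simulated: with a trending regressor and AR(1) errors, compare the true sandwich variance (X′X)-1X′(Euu′)X(X′X)-1 with what LS reports on average, E[s2](X′X)-1, using E[s2] = tr(M Euu′)/(T - K). A sketch with illustrative T, ρ and σ:

```python
# Exact comparison of the true V(beta-hat) with the expected LS report
# under AR(1) errors and a trending regressor; numbers are illustrative.
import numpy as np

T, rho, sigma = 100, 0.9, 1.0
t = np.arange(T)
X = np.column_stack([np.ones(T), t.astype(float)])  # trending regressor

V = rho ** np.abs(np.subtract.outer(t, t))          # correlation matrix, V_{ts} = rho^|t-s|
Omega = (sigma**2 / (1 - rho**2)) * V               # Euu' = (sigma^2/(1-rho^2)) V

XtX_inv = np.linalg.inv(X.T @ X)
true_var = XtX_inv @ X.T @ Omega @ X @ XtX_inv      # correct V(beta-hat)

M = np.eye(T) - X @ XtX_inv @ X.T                   # residual maker
Es2 = np.trace(M @ Omega) / (T - X.shape[1])        # E[s^2] under AR(1) errors
naive_var = Es2 * XtX_inv                           # what LS reports on average

ratio = true_var[1, 1] / naive_var[1, 1]
print(ratio)  # substantially greater than 1: LS understates the slope variance
```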


"SPURIOUS REGRESSIONS IN ECONOMETRICS": (Granger-Newbold) (Journal of Econometrics, 1974) Consider a simple regression model. Let yt = α + βxt + εt. Suppose the true processes with ε and ε* independent are yt = ρyt-1 + εt and xt = ρ*xt-1 +ε.*t The data are really independent AR(1) processes.


Suppose we regress y on x. If T = 20 and ρ = ρ* = 0.9, then ER2 = 0.47 and F ≈ 18. This falsely indicates a significant contribution of x. Sampling experiments for yt = α + βxt + εt with T = 50 and y, x independent random walks were carried out, and t-statistics on β in 100 trials were calculated. If these statistics were actually distributed as t, we would expect |t| < 2 about 95 times. We actually observe |t| < 2 only 23 times, and |t| > 2 77 times. There is spurious significance. The situation only becomes worse with more regressors.

Point: A high R2 does not "balance out" the effects of autocorrelation. Good time-series fits are not to be believed without diagnostic tests.
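The sampling experiment is easy to replicate in the spirit of Granger and Newbold: regress one random walk on an independent one with T = 50 and count how often |t| on β exceeds 2. Exact counts depend on the seed; around three quarters of the trials are spuriously "significant".

```python
# Spurious regression Monte Carlo: y and x are independent random walks.
import numpy as np

rng = np.random.default_rng(0)
T, trials = 50, 100
big_t = 0
for _ in range(trials):
    y = np.cumsum(rng.normal(size=T))   # independent random walks
    x = np.cumsum(rng.normal(size=T))
    X = np.column_stack([np.ones(T), x])
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    s2 = e @ e / (T - 2)
    se_b = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    if abs(b[1] / se_b) > 2:
        big_t += 1

print(big_t, "of", trials, "trials have |t| > 2")  # far above the nominal 5
```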


TESTING FOR AUTOCORRELATION:

The important thing is to look at the residuals.

Definition: The Durbin-Watson statistic ("d" or "DW") is

d = ∑t=2..T (et - et-1)2 / ∑t=1..T et2 = e′Ae/e′e,

where

    |  1  -1   0   .  .   .  |
    | -1   2  -1   .  .   .  |
A = |  0  -1   2   .  .   .  |
    |  .   .   .   .  .  -1  |
    |  .   .   .  . -1    1  |

which is a TxT symmetric matrix.


In other words, d is the sum of squared successive differences divided by the sum of squares. The Durbin-Watson statistic is probably the most commonly used test for autocorrelation, although the Durbin h-statistic is appropriate in wider circumstances and should usually be calculated as well.

Distribution of d: We want to calculate the distribution under the null hypothesis that ρ = 0, i.e. no autocorrelation. Then a value surprisingly far from 2 (small for positive, large for negative autocorrelation) indicates autocorrelation.


Intuition: Under the null, E(εt - εt-1)2 = σ2 + σ2 - 2cov(εt, εt-1) = 2σ2, so d ≈ 2σ2/σ2 = 2. Then why is Ed ≠ 2?
1. There is one less term in the numerator than in the denominator.
2. The use of residuals e rather than errors ε makes the distribution depend on X.
Note: d is a ratio of quadratic forms in normals. Why isn't it distributed as F?


Durbin-Watson test: Durbin and Watson give bounds dL and dU, both less than 2. If d < dL, reject the null hypothesis of no autocorrelation; this indicates positive autocorrelation. If d > dU, do not reject. If dL < d < dU, the result is ambiguous. If the calculated d is greater than 2, the indication is negative autocorrelation; then use the same bounds, checking against 4 - d: if 4 - d < dL, reject the null; if 4 - d > dU, do not reject.


Interpretation of the Durbin-Watson test: 1. This is a test for general autocorrelation, not just for AR(1) processes. 2. This test cannot be used when regressors include lagged values of y, for example, yt = α + β0yt-1 + β1xt + εt. Other tests are available in this case.


Other tests:

1. Wallis test: This is used for quarterly data. The test statistic is

d4 = ∑t=5..T (et - et-4)2 / ∑t=1..T et2.

2. Durbin's h test: This is used when there are lagged y's. Regress et on et-1, xt and as many lagged y's as are included in the regression, then test (with "t") the coefficient on et-1. A significant coefficient on et-1 indicates the presence of autocorrelation. Note that this test is quite easy to do and it "works" when the Durbin-Watson test doesn't. This is a good test to use.
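The auxiliary regression in point 2 can be sketched directly; the data-generating model (a lagged dependent variable with AR(1) errors), the coefficient values, and the seed are all illustrative assumptions, and plain least squares is used throughout.

```python
# Auxiliary-regression test for autocorrelation with a lagged dependent
# variable, where the Durbin-Watson test is not valid.
import numpy as np

rng = np.random.default_rng(0)
T = 300
x = rng.normal(size=T)
y = np.empty(T)
u_prev, y_prev = 0.0, 0.0
for t in range(T):
    u = 0.6 * u_prev + rng.normal()       # AR(1) errors
    y[t] = 1.0 + 0.5 * y_prev + x[t] + u  # lagged y among the regressors
    u_prev, y_prev = u, y[t]

# LS of y_t on a constant, y_{t-1} and x_t, then take residuals
X = np.column_stack([np.ones(T - 1), y[:-1], x[1:]])
b = np.linalg.lstsq(X, y[1:], rcond=None)[0]
e = y[1:] - X @ b

# Auxiliary regression: e_t on e_{t-1} and the original regressors
Z = np.column_stack([e[:-1], X[1:]])
g = np.linalg.lstsq(Z, e[1:], rcond=None)[0]
res = e[1:] - Z @ g
s2 = res @ res / (Z.shape[0] - Z.shape[1])
t_stat = g[0] / np.sqrt(s2 * np.linalg.inv(Z.T @ Z)[0, 0])
print(t_stat)  # a large t indicates autocorrelation
```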


ESTIMATION WITH AN AR(1) ERROR PROCESS:

Consider y = Xβ + u where ut = ρut-1 + εt with E(u) = 0 and

                      | 1      ρ      ρ2    . . .  ρT-1 |
                      | ρ      1      ρ     . . .  ρT-2 |
Euu′ = σ2/(1 - ρ2) ×  | .      .      .     . . .  .    |  = Ω.
                      | ρT-1   ρT-2   ρT-3  . . .  1    |


Thus

              |  1    -ρ     0    . . .   0   |
              | -ρ   1+ρ2   -ρ    . . .   0   |
Ω−1 = (1/σ2)  |  .    .      .    . . .   .   |  = (1/σ2) P′P,
              |  0   . . .  -ρ   1+ρ2    -ρ   |
              |  0   . . .   0    -ρ      1   |

which is a "band" matrix. So

    | √(1-ρ2)   0    0   . . .  0 |
    |   -ρ      1    0   . . .  0 |
P = |    0     -ρ    1   . . .  0 |
    |    .      .    .   . . .  . |
    |    0    . . .     -ρ      1 |


Matrix P will be used to transform the model. The first transformed observation is

√(1 - ρ2) y1 = ∑h=1..K βh √(1 - ρ2) xh,1 + √(1 - ρ2) u1,

and all the others are

yt - ρyt-1 = ∑h=1..K βh (xh,t - ρxh,t-1) + ut - ρut-1,    t = 2, ..., T.

Note that xh,t denotes the tth observation on the hth explanatory variable. The GLS transformation puts the model back in standard form, as expected: each transformed error (ut - ρut-1 = εt, and √(1 - ρ2) u1 for the first observation) has variance σ2 and is serially uncorrelated.
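A minimal sketch of this transformation (often called the Prais-Winsten transformation), assuming ρ is known; the data-generating values and seed are illustrative. GLS is then LS on the transformed data.

```python
# GLS for AR(1) errors via the P transformation, with rho known.
import numpy as np

def pw_transform(y, X, rho):
    """Apply P: scale the first observation by sqrt(1-rho^2), quasi-difference the rest."""
    r = np.sqrt(1 - rho**2)
    y_s = np.empty_like(y)
    X_s = np.empty_like(X)
    y_s[0] = r * y[0]
    X_s[0] = r * X[0]
    y_s[1:] = y[1:] - rho * y[:-1]
    X_s[1:] = X[1:] - rho * X[:-1]
    return y_s, X_s

rng = np.random.default_rng(0)
T, rho = 500, 0.8
X = np.column_stack([np.ones(T), rng.normal(size=T)])
u = np.empty(T)
u[0] = rng.normal() / np.sqrt(1 - rho**2)  # stationary start
for t in range(1, T):
    u[t] = rho * u[t - 1] + rng.normal()
y = X @ np.array([1.0, 2.0]) + u           # true beta = (1, 2)

y_s, X_s = pw_transform(y, X, rho)
beta_gls = np.linalg.lstsq(X_s, y_s, rcond=None)[0]
print(beta_gls)  # close to the true (1, 2)
```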


Notes:
1. Given ρ, estimation is by the LS method on the transformed model. Write the resulting sum of squares as S(ρ); minimization with respect to ρ is then a simple one-dimensional numerical problem.
2. ML can also be reduced to a one-dimensional maximization problem, which is straightforward.
3. Early two-step methods, which often dropped the first observation, are less satisfactory. Never use the Cochrane-Orcutt (CORC) procedure.
4. The extension to higher-order AR or MA processes is straightforward.
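Note 1 as a sketch: for each candidate ρ, transform the data (keeping the first observation), run LS, and record the sum of squares S(ρ); then pick the minimizing ρ by a one-dimensional grid search. The data, coefficients and grid are illustrative assumptions.

```python
# Feasible GLS for AR(1) errors: minimize S(rho) over a grid.
import numpy as np

rng = np.random.default_rng(0)
T, rho_true = 400, 0.7
X = np.column_stack([np.ones(T), rng.normal(size=T)])
u = np.empty(T)
u[0] = rng.normal() / np.sqrt(1 - rho_true**2)  # stationary start
for t in range(1, T):
    u[t] = rho_true * u[t - 1] + rng.normal()
y = X @ np.array([1.0, 2.0]) + u

def S(rho):
    """Sum of squared LS residuals after the AR(1) transformation."""
    r = np.sqrt(1 - rho**2)
    y_s = np.concatenate([[r * y[0]], y[1:] - rho * y[:-1]])
    X_s = np.vstack([r * X[:1], X[1:] - rho * X[:-1]])
    e = y_s - X_s @ np.linalg.lstsq(X_s, y_s, rcond=None)[0]
    return e @ e

grid = np.linspace(-0.99, 0.99, 199)
rho_hat = grid[np.argmin([S(g) for g in grid])]
print(rho_hat)  # near the true 0.7
```

The grid search is crude but reliable here because S(ρ) is smooth in one dimension; a refined search (or a derivative-based minimizer) around the grid minimum would sharpen the estimate.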
