Part I The Classical Multiple Regression Model

Zhou Yahong

SHUFE

Assumption

The model: yi = xi1 β1 + xi2 β2 + · · · + xiK βK + εi, i = 1, 2, ..., n, or in vector form y = x1 β1 + x2 β2 + · · · + xK βK + ε.

Assumption 1: Linearity
y = Xβ + ε, where X is an n × K matrix; for the ith observation, yi = xi′β + εi, with

X = [ x11  x12  ...  x1K ]
    [ x21  x22  ...  x2K ]
    [  .    .          . ]
    [ xn1  xn2  ...  xnK ]

Assumption 2: Full rank
X is an n × K matrix with rank K — the columns of X are linearly independent.

Assumption 3: Regression model (exogeneity)
E(εi | xi) = 0, or E(ε | X) = 0; then, by the law of iterated expectations, Cov(εi, xi) = 0.

Compare E(εi | xi) = 0 with E(εi xi) = 0. We can derive the finite-sample properties under E(εi | xi) = 0; this is much more difficult under the weaker assumption E(εi xi) = 0, in which case we typically rely on large-sample results. Namely, under E(εi | xi) = 0 we have some desirable properties that hold only approximately under E(εi xi) = 0, with the approximation error shrinking as the sample size increases.

Assumption 4: Spherical disturbances
E(εε′ | X) = σ²I — homoscedasticity and no serial correlation.

Assumption 5: Nonstochastic regressors
For a given observed value of X we may observe many possible values of Y.

Assumption 6: Normality
ε | X ∼ N(0, σ²I)

Theory of Least Squares

Model: y = Xβ + ε, where y = (y1, y2, ..., yn)′ and X = (x1, x2, ..., xK).

Definition: the least squares coefficient vector b minimizes

S(b) = (y − Xb)′(y − Xb) = y′y − 2y′Xb + b′X′Xb

Then

∂S(b)/∂b = −2X′y + 2X′Xb = 0

Solution: b = (X′X)⁻¹X′y.

Fitted values: ŷi = xi′b. Disturbances: εi = yi − xi′β. Residuals: e = y − Xb, with ei = yi − xi′b.

y = Xb + e = ŷ + e = Py + My, where P = X(X′X)⁻¹X′ and M = I − P.
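As a quick numerical sketch of the formulas above (simulated data; all variable names here are illustrative, not from the lecture), the solution b, the projection matrix P, and the residual maker M can be computed directly:

```python
import numpy as np

# Simulated data: n = 50 observations, K = 3 regressors including a constant.
rng = np.random.default_rng(0)
n, K = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(size=n)

# Least squares solution b = (X'X)^{-1} X'y (solve is preferred over inv)
b = np.linalg.solve(X.T @ X, X.T @ y)

# Projection matrix P and residual maker M = I - P
P = X @ np.linalg.inv(X.T @ X) @ X.T
M = np.eye(n) - P

y_hat = P @ y   # fitted values, identical to X @ b
e = M @ y       # residuals, identical to y - X @ b
```

The decomposition y = Py + My then holds exactly, and the residuals are orthogonal to every column of X.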

Theory of Least Squares: the two-variable model

Σᵢ ei² = Σᵢ (yi − a − bxi)²

Minimization leads to

∂(Σ ei²)/∂a = −2 Σ (yi − a − bxi) = 0   ⟹   Σ ei = 0
∂(Σ ei²)/∂b = −2 Σ xi (yi − a − bxi) = 0   ⟹   Σ xi ei = 0

Solving two equations in two unknowns:

a = ȳ − b x̄   and   b = Σ (xi − x̄)(yi − ȳ) / Σ (xi − x̄)²

The least squares formulas are much more complicated in the three-variable model — three equations in three unknowns.
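The closed-form two-variable solution agrees with the matrix formula, which can be checked numerically (a minimal sketch with simulated, illustrative data):

```python
import numpy as np

# Two-variable model: intercept a and slope b from the normal equations.
rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)
y = 2.0 + 0.5 * x + rng.normal(size=n)

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()

# The same answer from b = (X'X)^{-1} X'y with a constant column
X = np.column_stack([np.ones(n), x])
coef = np.linalg.solve(X.T @ X, X.T @ y)

e = y - a - b * x  # residuals satisfy both first-order conditions
```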

Algebraic aspects of the least squares solution: the normal equations

X′Xb − X′y = −X′e = 0

— K equations in K unknowns; hence for every column xk of X, xk′e = 0.

Interpretation of the coefficients — the two-variable case, the three-variable case, and the more general case.

Algebraic Properties

If the first column of X is i (a column of ones), then:

The least squares residuals sum to zero: Σ ei = 0. In the two-variable model, the intercept adjusts the level, whereas the slope parameter rotates the line to get the best fit.

The regression hyperplane passes through the point of means of the data: ȳ = x̄′b.

The mean of the fitted values from the regression equals the mean of the actual values, since ŷ = Xb.

Note that none of these results need hold if the regression does not contain a constant term.

Some useful expressions:

e = y − Xb = y − X(X′X)⁻¹X′y = (I − X(X′X)⁻¹X′)y = My
ŷ = y − e = (I − M)y = Py

so y = Xb + e = ŷ + e = Py + My, where M = I − X(X′X)⁻¹X′ and P = X(X′X)⁻¹X′. Also:

MX = 0
P² = P′ = P   and   M² = M′ = M
PM = MP = 0
e′e = y′My = y′e = e′y   and   e′e = y′y − b′X′Xb
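These projection identities are easy to verify numerically (a sketch with simulated, illustrative data):

```python
import numpy as np

# Check idempotency, symmetry, orthogonality, and the sum-of-squares identity.
rng = np.random.default_rng(2)
n, K = 30, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = rng.normal(size=n)

P = X @ np.linalg.inv(X.T @ X) @ X.T
M = np.eye(n) - P
b = np.linalg.solve(X.T @ X, X.T @ y)
e = M @ y
```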

Partitioned regression: let β = (β1′, β2′)′ ∈ R^(K1+K2), b = (b1′, b2′)′, and X = (X1, X2); then rewrite X′Xb = X′y as

X1′X1 b1 + X1′X2 b2 = X1′y
X2′X1 b1 + X2′X2 b2 = X2′y

Then b2 = (X2′X2)⁻¹(X2′y − X2′X1 b1) and

b1 = (X1′M2 X1)⁻¹X1′M2 y

similarly, b2 = (X2′M1 X2)⁻¹X2′M1 y, where M2 = I − X2(X2′X2)⁻¹X2′ and M1 = I − X1(X1′X1)⁻¹X1′.

Define X1* = M2 X1 and y1* = M2 y; thus b1 can be expressed as

b1 = (X1*′X1*)⁻¹X1*′y1*

and this form leads to an intuitive interpretation (partialling X2 out of both y and X1).

Suppose X1′X2 = 0; then b1 = (X1′X1)⁻¹X1′y and b2 = (X2′X2)⁻¹X2′y.
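The partialling-out result can be confirmed numerically — the subvector b1 from the full regression equals the coefficient from regressing M2·y on M2·X1 (a sketch with simulated, illustrative data):

```python
import numpy as np

# Frisch-Waugh check: b1 from the full regression vs. the partialled-out one.
rng = np.random.default_rng(3)
n = 80
X1 = rng.normal(size=(n, 2))
X2 = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X1 @ np.array([1.0, -1.0]) + X2 @ np.array([0.5, 2.0]) + rng.normal(size=n)

# Full regression on X = (X1, X2)
X = np.hstack([X1, X2])
b = np.linalg.solve(X.T @ X, X.T @ y)
b1_full = b[:2]

# Partial X2 out of both y and X1, then regress
M2 = np.eye(n) - X2 @ np.linalg.inv(X2.T @ X2) @ X2.T
X1s, ys = M2 @ X1, M2 @ y
b1_partial = np.linalg.solve(X1s.T @ X1s, X1s.T @ ys)
```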

Goodness of Fit and the analysis of variance — how well does the model fit the data?

Σ ei² is useful, but needs a scale factor.

Uncentered R² — one measure of the variability of the dependent variable is the sum of squares, Σ yi² = y′y. Note that

y′y = (ŷ + e)′(ŷ + e) = ŷ′ŷ + 2b′X′e + e′e = ŷ′ŷ + e′e

(since X′e = 0 by the normal equations), so

R²uc = ŷ′ŷ / y′y = y′Py / y′y = 1 − e′e / y′y = Σ ŷi² / Σ yi²    (R1)

We have 0 ≤ R²uc ≤ 1, which has the interpretation of the fraction of the variation of the dependent variable that is attributable to the variation in the explanatory variables.

(Centered) R² — the coefficient of determination. If the only regressor is a constant (K = 1 and xi = 1), then b = ȳ, ŷ′ŷ = nȳ², and e′e = Σ (yi − ȳ)².

More generally, yi − ȳ = (ŷi − ȳ) + ei, and with the centering matrix M⁰ = I − i(i′i)⁻¹i′,

Σ (yi − ȳ)² = y′M⁰y = y′M⁰Py + y′M⁰My = y′PM⁰Py + y′My = Σ (ŷi − ȳ)² + Σ ei²

namely SST = SSR + SSE when the first column of X is i, using M⁰M = M and M⁰P = PM⁰, since X(X′X)⁻¹X′i = i, i.e. Pi = i and Mi = 0. Then

R² = 1 − Σ ei² / Σ (yi − ȳ)² = Σ (ŷi − ȳ)² / Σ (yi − ȳ)²    (R2)

Thus, this R² is a measure of the explanatory power of the nonconstant regressors.
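With a constant in the regression, the two forms of R² in (R2) coincide and SST = SSR + SSE holds exactly, which a short numerical sketch confirms (simulated, illustrative data):

```python
import numpy as np

# R² computed as 1 - SSE/SST and as SSR/SST; with a constant term they agree.
rng = np.random.default_rng(4)
n = 60
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.7, -0.3]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ b
e = y - y_hat

sst = np.sum((y - y.mean()) ** 2)   # total sum of squares
sse = np.sum(e ** 2)                # residual sum of squares
ssr = np.sum((y_hat - y.mean()) ** 2)  # regression sum of squares
r2_a = 1 - sse / sst
r2_b = ssr / sst
```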

Alternatively,

R² = 1 − e′e / y′M⁰y = y′PM⁰Py / y′M⁰y = ŷ′M⁰ŷ / y′M⁰y
   = (ŷ′M⁰ŷ)² / (y′M⁰y · ŷ′M⁰ŷ) = (ŷ′M⁰y)² / (y′M⁰y · ŷ′M⁰ŷ) = corr²(y, ŷ)    (1)

as ŷ′M⁰ŷ = (y − e)′M⁰ŷ = y′M⁰ŷ. (1) indicates the co-movement of y and ŷ around their means, which can be illustrated graphically.


If the regressors do not include a constant but you nevertheless calculate R 2 using (R2), then R 2 can be negative. This is because, without the benefit of an intercept, the regression could do worse than the sample mean in terms of tracking the dependent variable.

A relationship between R²uc and R²:

1 − R² = (1 + nȳ² / Σ (yi − ȳ)²)(1 − R²uc)

which indicates R²uc ≥ R² if we run two equivalent regressions with or without the intercept term — four seasonal dummies vs. three dummies plus an intercept. STATA switches to R²uc when a constant is not included.

Two extreme cases: R² = 0, 1.
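The relationship between the uncentered and centered measures can be checked directly (a sketch with simulated, illustrative data):

```python
import numpy as np

# Verify 1 - R² = (1 + n·ȳ²/Σ(yi - ȳ)²)(1 - R²_uc), which implies R²_uc >= R².
rng = np.random.default_rng(5)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([3.0, 1.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

r2_uc = 1 - (e @ e) / (y @ y)                       # uncentered
r2 = 1 - (e @ e) / np.sum((y - y.mean()) ** 2)      # centered
rhs = (1 + n * y.mean() ** 2 / np.sum((y - y.mean()) ** 2)) * (1 - r2_uc)
```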

Some comments on R²:

Intuitively, it would seem that some lines are a good fit of the data while other lines are not such a good fit.

We start with the basic idea that our dependent variable varies across observations, and our model purports to explain part of this variation. For example, different people received different grades on their econometrics midterm examination — that is, the grades vary. We explain part of the variation in grades by noting that different people studied for different durations. We might ask: what fraction of the variation in grades can be explained by the hours studied?

Σ (Yi − Ȳ)²   versus   Σ (Ŷi − Ȳ)²

Notice that ordinary least squares maximizes the value of R². One should not place too much importance on obtaining a high value. R² can be influenced by factors such as the nature of the data in a way that indicates one should not use it as the sole judge of the quality of the econometric model.

A high R² or R̄² does not mean that the regressors are a true cause of the dependent variable — imagine regressing test scores against parking lot area per pupil; with a high R², try telling the superintendent that the way to increase test scores is to increase parking space!

In time-series studies, one often obtains high values of R² because any variable that grows over time is likely to do a good job of explaining the variation of any other variable that grows over time.

Problems with R²:

R² will never decrease when another variable is added to a regression equation, so we define the adjusted R²

R̄² = 1 − [e′e / (n − K)] / [y′M⁰y / (n − 1)] = 1 − [(n − 1)/(n − K)](1 − R²)

A constant term is needed in the regression for easy interpretation.
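The contrast between R² and R̄² is visible when an irrelevant regressor is added: R² cannot fall, while R̄² penalizes the lost degree of freedom and may fall (a sketch with simulated, illustrative data):

```python
import numpy as np

# Adding a pure-noise regressor: R² never decreases; adjusted R̄² may.
rng = np.random.default_rng(6)
n = 30
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

def r2_stats(X, y):
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    n, K = X.shape
    r2 = 1 - (e @ e) / np.sum((y - y.mean()) ** 2)
    r2_bar = 1 - (n - 1) / (n - K) * (1 - r2)
    return r2, r2_bar

X_small = np.column_stack([np.ones(n), x])
X_big = np.column_stack([X_small, rng.normal(size=n)])  # irrelevant column
r2_s, r2bar_s = r2_stats(X_small, y)
r2_b, r2bar_b = r2_stats(X_big, y)
```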

Comparison based on R² is not meaningful when different dependent variables are involved (linear vs. log models) — consider the two models

Y = β1 + β2 X2 + β3 X3 + ε   and   Y − X2 = β1′ + β2′ X2 + β3′ X3 + ε

the same model, but different R².

Example. Model I:

Ct = α1 + β1 Yt + ε1t
Ĉt = −2161 + 0.93 Yt        R² = 0.9995
t:    (−5.75)   (562)

Model II:

St = α2 + β2 Yt + ε2t
Ŝt = 21.61 + 0.07 Yt        R² = 0.923
t:    (5.75)   (44.37)

Note that the t-statistics are testing H0 : β = 0.

Model III:

Ct = α3 + β3 Yt + γ3 Ct−1 + ε3t
Ĉt = −0.066 + 0.18 Yt + 0.81 Ct−1        R² = 0.9999
t:    (−0.047)   (8.58)    (35.44)

Ct−1 is included to allow current consumption to depend on recent consumption behavior as well as income, so the coefficient on disposable income must be interpreted differently from model I.

The long-run marginal propensity to consume is 0.18/(1 − 0.81); further, note that a large portion of Yt is permanent income, which would affect Ct−1 — in a sense, we are holding permanent income constant. Since 0.18/(1 − 0.81) = 0.18(1 + 0.81 + 0.81² + · · · ), this captures the impact of the delayed effects.

Notice that the word "significant" has a very specific statistical definition and does not mean "big." It means that the estimate is far enough from zero that we are willing to conclude that the "true" slope coefficient is, in fact, not zero.

The general principle — model simplicity vs. goodness of fit.

Amemiya's prediction criterion

PCj = [ej′ej / (n − Kj)] (1 + Kj/n)

based on

R̃j² = 1 − [(n + Kj)/(n − Kj)](1 − Rj²)

with the notion that the adjusted R-squared does not penalize the loss of degrees of freedom heavily enough.

Statistical Properties

b = (X′X)⁻¹X′y = β + (X′X)⁻¹X′ε, thus

E[b] = β   and   Var(b) = σ²(X′X)⁻¹

b is linear and unbiased. In the two-variable model,

var(β̂) = σ² / Σ (xi − x̄)²

Gauss–Markov Theorem: b is BLUE. This can be viewed as the extension of estimating µ = E[X] by X̄.

If b0 = Cy is also unbiased, then Var(b0) = Var(b) + G, with G being nonnegative definite: since CX = I by unbiasedness, let D = C − (X′X)⁻¹X′, so Dy = b0 − b and DX = 0, and

Var(b0) = σ²CC′ = σ²[(X′X)⁻¹X′ + D][(X′X)⁻¹X′ + D]′ = σ²(X′X)⁻¹ + σ²DD′

Normality and the distribution of b:

b ∼ N(β, σ²(X′X)⁻¹)   and   bk ∼ N(βk, σ²(X′X)⁻¹kk)

b is the best unbiased estimator.

Estimating σ² and the variance of b:

e = My = M(Xβ + ε) = Mε, so e′e = ε′Mε, and

E[e′e] = E[trace(ε′Mε)] = E[trace(Mεε′)] = trace(M E[εε′]) = σ² trace(M)

but trace(M) = tr(In − X(X′X)⁻¹X′) = n − K. Therefore

E[e′e / (n − K)] = σ²

and an unbiased estimator for σ² is

s² = e′e / (n − K)   with   est. Var[b] = s²(X′X)⁻¹
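The degrees-of-freedom correction rests on trace(M) = n − K, which is easy to confirm numerically along with the estimator itself (a sketch with simulated, illustrative data):

```python
import numpy as np

# s² = e'e/(n - K); the key fact trace(M) = n - K checked numerically.
rng = np.random.default_rng(7)
n, K = 25, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, -1.0]) + rng.normal(size=n)

M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T
e = M @ y
s2 = (e @ e) / (n - K)               # unbiased estimator of sigma²
est_var_b = s2 * np.linalg.inv(X.T @ X)  # estimated Var[b]
```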

Hypothesis Testing

We cover some classical tests in this subsection — two equivalent approaches: F- and t-tests.

Wald principle — estimate first, and then check the null hypothesis directly.

Testing a hypothesis about a coefficient: let Sᵏᵏ denote the kth diagonal element of (X′X)⁻¹; then

zk = (bk − βk) / √(σ² Sᵏᵏ) ∼ N(0, 1)

and

(n − K) s²/σ² = e′e/σ² = (ε/σ)′ M (ε/σ) ∼ χ²(n − K)

since n − K = rank(M) = tr(M).

Independence of b and s²:

Cov(b, e) = E[(X′X)⁻¹X′εε′M] = 0

and b, e are normally distributed and independent of each other, as b = β + (X′X)⁻¹X′ε, e = Mε, and MX = 0.

t-test:

tk = [(bk − βk)/√(σ² Sᵏᵏ)] / √([(n − K)s²/σ²]/(n − K)) = (bk − βk) / √(s² Sᵏᵏ) ∼ t(n − K)
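Assembling the t-statistic from these pieces takes only a few lines (a sketch with simulated, illustrative data; here the statistic tests H0 : βk = 0):

```python
import numpy as np

# t-statistics for H0: beta_k = 0, built directly from the formulas above.
rng = np.random.default_rng(8)
n, K = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([0.0, 1.5]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
s2 = (e @ e) / (n - K)
se = np.sqrt(s2 * np.diag(XtX_inv))  # standard errors, sqrt(s² S^kk)
t_stats = b / se                     # one statistic per coefficient
```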

Test a linear restriction r′β = c0, where r is a known vector and c0 is a known scalar. Since r′b ∼ N(r′β, σ² r′(X′X)⁻¹r),

(r′b − c0) / (σ̂ √(r′(X′X)⁻¹r)) ∼ t(n − K)

under the null hypothesis.

Test a set of linear restrictions — null hypothesis Rβ = q, where q is J × 1. Then

Rb ∼ N(Rβ, σ² R(X′X)⁻¹R′)
(Rb − q)′[σ² R(X′X)⁻¹R′]⁻¹(Rb − q) ∼ χ²(J)

and e′e/σ² ∼ χ²(n − K), with b independent of e′e/σ². Thus

{(Rb − q)′[σ² R(X′X)⁻¹R′]⁻¹(Rb − q)/J} / {[e′e/σ²]/(n − K)}
  = (Rb − q)′[σ̂² R(X′X)⁻¹R′]⁻¹(Rb − q)/J ∼ F(J, n − K)

Restricted least squares vs. the unrestricted model

Unrestricted model: y = Xβ + ε, with minβ S(β) = (y − Xβ)′(y − Xβ).

Restricted model: y = Xβ + ε and Rβ = q, with minβ S(β) = (y − Xβ)′(y − Xβ) subject to Rβ = q, which can be solved by the method of Lagrange multipliers through the Lagrangian

(y − Xβ)′(y − Xβ) + λ′(Rβ − q)

The restricted OLS estimator:

b* = b − (X′X)⁻¹R′[R(X′X)⁻¹R′]⁻¹(Rb − q)

with Var(b*) = σ²[(X′X)⁻¹ − (X′X)⁻¹R′[R(X′X)⁻¹R′]⁻¹R(X′X)⁻¹].

Let e* = y − Xb* = e − X(b* − b); then e*′e* = e′e + (b* − b)′X′X(b* − b) and

e*′e* − e′e = (Rb − q)′[R(X′X)⁻¹R′]⁻¹(Rb − q)

Thus

F(J, n − K) = [(e*′e* − e′e)/J] / [e′e/(n − K)] = [(R² − R*²)/J] / [(1 − R²)/(n − K)]

From here, we can also see that R² increases when more variables are added. For testing the overall significance of the regression,

F = [R²/J] / [(1 − R²)/(n − K)]

since R*² = 0.

Examples in estimating a production function:

Labor elasticity equal to 1 (testing a single coefficient).

Constant returns to scale (testing that two coefficients sum to 1).

Translog production function:

ln Y = β1 + β2 ln L + β3 ln K + β4 (½ ln² L) + β5 (½ ln² K) + β6 ln L ln K + ε

which relaxes the unitary elasticity of substitution. The Cobb-Douglas model is obtained by the restriction β4 = β5 = β6 = 0.

Table 6.2: with F[3, 21] = 1.768, comparing against the critical value for the F statistic, we would not reject the hypothesis that a Cobb-Douglas model is appropriate — but the coefficients differ greatly. Why?

SHUFE

The Classical Multiple Regression Model

Hypothesis Testing

The answer lies in the interpretation of the coefficients—–namely, the marginal effects for each unit change of x, in the translog model ∂ ln Y = β3 + β5 ln K + β6 ln L ∂ ln K at the mean of ln K and ln L (7.4459 and 5.7637) ∂ ln Y ∂ ln K = 0.5425.

Tests of Structural Change

Examples — before and after WTO accession (trade figures); company performance before and after being listed; the consumption function during (rationing) and after the war; production (and consumption patterns) before and after the oil shock; coastal vs. interior regions.

Case I: Different coefficients for different subsets of the data (different periods or regions).

Restriction — all the coefficients are the same.
Null hypothesis — the coefficients are the same across subsets.
Alternative — all the coefficients can be different.

The unrestricted model:

[ y1 ]   [ X1   0  ] [ β1 ]   [ ε1 ]
[ y2 ] = [ 0    X2 ] [ β2 ] + [ ε2 ]

with e′e = e1′e1 + e2′e2. The restricted model:

[ y1 ]   [ X1 ]       [ ε1 ]
[ y2 ] = [ X2 ] β  +  [ ε2 ]

with the restriction β1 = β2, where R = [I, −I] and q = 0; J = K.
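This comparison is the classic Chow test, and it is short to sketch numerically (simulated, illustrative data with genuinely different coefficients in the two subsamples):

```python
import numpy as np

# Chow test sketch: pooled (restricted) vs. separate (unrestricted) fits.
rng = np.random.default_rng(10)
n1, n2, K = 30, 30, 2
X1 = np.column_stack([np.ones(n1), rng.normal(size=n1)])
X2 = np.column_stack([np.ones(n2), rng.normal(size=n2)])
y1 = X1 @ np.array([1.0, 1.0]) + rng.normal(size=n1)
y2 = X2 @ np.array([2.0, 0.0]) + rng.normal(size=n2)  # different coefficients

def sse(X, y):
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    return e @ e

sse_u = sse(X1, y1) + sse(X2, y2)  # unrestricted: e1'e1 + e2'e2
sse_r = sse(np.vstack([X1, X2]), np.concatenate([y1, y2]))  # restricted: pooled
n = n1 + n2
F = ((sse_r - sse_u) / K) / (sse_u / (n - 2 * K))  # F(K, n - 2K)
```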

Gasoline consumption equations: K = 6, n = 36, using F(J, n − 2K) = F(6, 24). The null is rejected — so the coefficients have changed. Then what's next? At least some have changed.

The Classical Multiple Regression Model

Case II: Change only in the constant, with the slopes unchanged.
Restriction: the slope coefficients are the same in both periods.
Null hypothesis: the constant may change, but the slopes remain the same.
Alternative: both the constants and the slopes may differ.
The unrestricted design has differential intercepts and differential slopes,
\[
X_U = \begin{pmatrix} i & 0 & W_{pre73} & 0 \\ 0 & i & 0 & W_{post73} \end{pmatrix},
\]
while the restricted design has differential intercepts but common slopes,
\[
X_R = \begin{pmatrix} i & 0 & W_{pre73} \\ 0 & i & W_{post73} \end{pmatrix}.
\]
J = 5. The test rejects: the two periods are systematically different, beyond a simple shift in the constant term.
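These two designs can be assembled with period dummies. A minimal sketch, with simulated regressors standing in for the gasoline variables (5 slope columns, so K = 6 with the constant; the subperiod sizes are assumptions):

```python
# A sketch of the Case II design matrices with differential intercepts.
import numpy as np

rng = np.random.default_rng(1)
n_pre, n_post, k_w = 14, 22, 5         # assumed sizes; 5 slope regressors
W_pre = rng.normal(size=(n_pre, k_w))
W_post = rng.normal(size=(n_post, k_w))

# Period dummies (differential intercepts).
i_pre = np.concatenate([np.ones(n_pre), np.zeros(n_post)])
i_post = np.concatenate([np.zeros(n_pre), np.ones(n_post)])

# Unrestricted: differential intercepts and differential slopes.
X_U = np.column_stack([
    i_pre, i_post,
    np.vstack([W_pre, np.zeros((n_post, k_w))]),
    np.vstack([np.zeros((n_pre, k_w)), W_post]),
])
# Restricted: differential intercepts, common slopes (W stacked).
X_R = np.column_stack([i_pre, i_post, np.vstack([W_pre, W_post])])

J = X_U.shape[1] - X_R.shape[1]
print(J)  # 5 restrictions: the five slope coefficients are equal
```

The count of restrictions falls out of the column dimensions: 12 unrestricted columns versus 7 restricted ones.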

Case III: Change in a subset of coefficients: the constant and the price and income elasticities. J = 3.
Restriction: the remaining slope coefficients are the same in both periods.
\[
X_U = \begin{pmatrix} i & 0 & W_{pre73} & 0 \\ 0 & i & 0 & W_{post73} \end{pmatrix}
\]
(differential intercepts, differential slopes), and
\[
X_R = \begin{pmatrix} i & 0 & Z_{pre73} & 0 & W_{pre73} \\ 0 & i & 0 & Z_{post73} & W_{post73} \end{pmatrix}
\]
(differential intercepts and Z slopes, common slopes for the rest), where Z denotes the gasoline price and income variables, whose coefficients are believed to have changed, and W the remaining regressors.
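The Case III restricted design can be sketched the same way. Assuming Z has 2 columns (price and income) and the remaining block W has 3, so K = 6 with the constant:

```python
# A sketch of the Case III design matrices: Z slopes differ, W slopes common.
import numpy as np

rng = np.random.default_rng(4)
n_pre, n_post = 14, 22                  # assumed subperiod sizes
Z_pre, Z_post = rng.normal(size=(n_pre, 2)), rng.normal(size=(n_post, 2))
W_pre, W_post = rng.normal(size=(n_pre, 3)), rng.normal(size=(n_post, 3))

i_pre = np.concatenate([np.ones(n_pre), np.zeros(n_post)])
i_post = np.concatenate([np.zeros(n_pre), np.ones(n_post)])

# Unrestricted: constant, Z and W all differ across periods.
X_U = np.column_stack([
    i_pre, i_post,
    np.vstack([np.column_stack([Z_pre, W_pre]), np.zeros((n_post, 5))]),
    np.vstack([np.zeros((n_pre, 5)), np.column_stack([Z_post, W_post])]),
])
# Restricted: differential intercepts and Z slopes, common W slopes.
X_R = np.column_stack([
    i_pre, i_post,
    np.vstack([Z_pre, np.zeros((n_post, 2))]),
    np.vstack([np.zeros((n_pre, 2)), Z_post]),
    np.vstack([W_pre, W_post]),
])

J = X_U.shape[1] - X_R.shape[1]
print(J)  # 3 restrictions: the three W coefficients are common
```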

For all three cases the unrestricted model is the same; the restricted models are:
R1: all the coefficients are the same;
R2: all the slope coefficients are the same;
R3: only a subset of the slope coefficients is the same.

The Chow predictive test. Instead of using all n sample observations for estimation, divide the sample into n_1 observations used for estimation and n_2 = n − n_1 observations reserved for testing. There are no hard and fast rules for choosing the relative sizes of n_1 and n_2; it is not uncommon to reserve 5, 10 or 15 percent of the observations for testing. OLS on the first subsample gives b_1 = (X_1'X_1)^{-1}X_1'y_1, and the prediction for y_2 is ŷ_2 = X_2 b_1.

The prediction error is d = y_2 − ŷ_2 = y_2 − X_2 b_1 = ε_2 − X_2(b_1 − β), with Var(d) = σ²[I_{n_2} + X_2(X_1'X_1)^{-1}X_2']. Thus d'[Var(d)]^{-1}d ∼ χ²(n_2), and further e_1'e_1/σ² ∼ χ²(n_1 − K).

Under the null hypothesis,
\[
F = \frac{d'[I_{n_2} + X_2(X_1'X_1)^{-1}X_2']^{-1}d / n_2}{e_1'e_1/(n_1 - K)} \sim F(n_2, n_1 - K).
\]
Equivalently,
\[
F = \frac{(e_*'e_* - e_1'e_1)/n_2}{e_1'e_1/(n_1 - K)},
\]
where e_* is the residual vector from the restricted model, i.e. the model with no structural change.
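The equivalence of the two forms can be checked numerically. A sketch with simulated data (all dimensions and coefficients below are stand-ins):

```python
# Check that the two forms of the predictive-test F statistic coincide.
import numpy as np

rng = np.random.default_rng(2)
n1, n2, K = 30, 5, 4
X1 = np.column_stack([np.ones(n1), rng.normal(size=(n1, K - 1))])
X2 = np.column_stack([np.ones(n2), rng.normal(size=(n2, K - 1))])
beta = rng.normal(size=K)
y1 = X1 @ beta + rng.normal(size=n1)
y2 = X2 @ beta + rng.normal(size=n2)

# Estimate on the first subsample, predict the second.
b1 = np.linalg.solve(X1.T @ X1, X1.T @ y1)
e1 = y1 - X1 @ b1
d = y2 - X2 @ b1                                  # prediction error

# Direct form: quadratic form in d.
V = np.eye(n2) + X2 @ np.linalg.solve(X1.T @ X1, X2.T)
F_direct = (d @ np.linalg.solve(V, d) / n2) / (e1 @ e1 / (n1 - K))

# Restricted model: pool all observations, no structural change.
X = np.vstack([X1, X2])
y = np.concatenate([y1, y2])
b = np.linalg.solve(X.T @ X, X.T @ y)
e_star = y - X @ b
F_pooled = ((e_star @ e_star - e1 @ e1) / n2) / (e1 @ e1 / (n1 - K))

print(F_direct, F_pooled)   # the two forms agree
```

Note that the predictive form remains usable even when n_2 < K, as in this sketch, where a separate second-period regression could not be run.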

Another case for applying this approach: in the gasoline demand example, one possibility is that consumers took a year or two to adjust to the turmoil of the two oil price shocks of 1973 and 1979, but that the market never fundamentally changed, or changed only temporarily. We might then run the same test as before, but single out only the four years 1974, 1975, 1980 and 1981 for special treatment.

Prediction. To predict the value of y^0 associated with a regressor vector x^0 in the model y^0 = x^0{}'β + ε^0, use ŷ^0 = x^0{}'b. The prediction error is
\[
e^0 = y^0 - \hat{y}^0 = x^0{}'(\beta - b) + \varepsilon^0,
\]
with
\[
\mathrm{Var}[e^0] = \sigma^2 + \sigma^2 x^0{}'(X'X)^{-1}x^0,
\]
or, equivalently,
\[
\mathrm{Var}[e^0] = \sigma^2\Big[1 + \frac{1}{n} + \sum_{j=1}^{K-1}\sum_{k=1}^{K-1}(x_j^0 - \bar{x}_j)(x_k^0 - \bar{x}_k)(Z'M^0Z)^{jk}\Big],
\]
where Z is the K − 1 columns of X not including the constant and (Z'M^0Z)^{jk} denotes the jk-th element of (Z'M^0Z)^{-1}. The forecast interval is ŷ^0 ± t_{λ/2} se(e^0); a plot of these intervals shows that their width depends on x^0.
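The forecast interval computation can be sketched as follows. The data are simulated, and the t critical value is hard-coded for df = 26 at the 5% level rather than looked up programmatically:

```python
# A minimal sketch of an out-of-sample point forecast and its interval.
import numpy as np

rng = np.random.default_rng(3)
n, K = 30, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, 0.5, -0.5, 2.0])
y = X @ beta + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
s2 = e @ e / (n - K)                       # unbiased estimate of sigma^2

x0 = np.array([1.0, 0.2, -0.1, 0.3])      # hypothetical new regressor vector
y_hat0 = x0 @ b                            # point forecast
# Var[e0] = sigma^2 (1 + x0'(X'X)^{-1} x0), estimated with s2
se_e0 = np.sqrt(s2 * (1.0 + x0 @ np.linalg.solve(X.T @ X, x0)))

t_crit = 2.056                             # t_{0.025, 26} from a t table
lo, hi = y_hat0 - t_crit * se_e0, y_hat0 + t_crit * se_e0
print(y_hat0, (lo, hi))
```

Because the quadratic form in x0 is nonnegative, se(e^0) is always at least s, and it grows as x0 moves away from the sample means, which is why the plotted intervals fan out at the edges of the data.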
