The Classical Multiple Regression Model
Zhou Yahong
School of Economics, SHUFE
Part I The Classical Multiple Regression Model
Assumption

The model: yi = xi1β1 + xi2β2 + · · · + xiKβK + εi, i = 1, 2, …, n; stacking the observations, y = x1β1 + x2β2 + · · · + xKβK + ε.
Assumption 1: Linearity
y = Xβ + ε, where X is an n × K matrix; for the ith observation, yi = xi′β + εi, with

X = [ x11 x12 … x1K
      x21 x22 … x2K
       ⋮    ⋮        ⋮
      xn1 xn2 … xnK ]
Assumption 2: Full rank
X is an n × K matrix with rank K: the columns of X are linearly independent.
Assumption 3: Regression model
E(εi|xi) = 0, or E(ε|X) = 0. By the law of iterated expectations, this implies Cov(εi, xi) = 0.
Compare E(εi|xi) = 0 with E(εi xi) = 0. We can derive the finite-sample properties under E(εi|xi) = 0; this is much more difficult under the weaker assumption E(εi xi) = 0, in which case we typically rely on large-sample results. That is, under E(εi|xi) = 0 we obtain exact desirable properties; under E(εi xi) = 0 they hold only approximately, with the approximation error shrinking as the sample size increases.
Assumption 4: Spherical disturbances
E(εε′|X) = σ²I, which assumes homoscedasticity and no serial correlation.
Assumption 5: Nonstochastic regressors
For a given observed value of X we may observe many possible values of y.
Assumption 6: Normality
ε|X ∼ N(0, σ²I)
Theory of Least Squares

Model: y = Xβ + ε, where y = (y1, y2, …, yn)′ and X = (x1, x2, …, xK).
Definition: the least squares coefficient vector b minimizes
S(b) = (y − Xb)′(y − Xb) = y′y − 2y′Xb + b′X′Xb
Differentiating,
∂S(b)/∂b = −2X′y + 2X′Xb = 0
Solution: b = (X′X)⁻¹X′y, with fitted values ŷi = xi′b.
Distinguish the disturbance εi = yi − xi′β from the residual ei = yi − xi′b; in vector form, e = y − Xb.
Then y = Xb + e = ŷ + e = Py + My, where P = X(X′X)⁻¹X′ and M = I − P.
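The solution b = (X′X)⁻¹X′y can be sketched numerically. A minimal numpy illustration on simulated data (all numbers and variable names here are illustrative, not from the slides): solve the normal equations and verify that the residuals are orthogonal to the columns of X.

```python
import numpy as np

# Illustrative sketch: OLS via the normal equations X'Xb = X'y on simulated data.
rng = np.random.default_rng(0)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)   # b = (X'X)^{-1} X'y
y_hat = X @ b                            # fitted values, Py
e = y - y_hat                            # residuals, My

# The normal equations imply X'e = 0
print(np.allclose(X.T @ e, 0, atol=1e-8))
```

Using `np.linalg.solve` rather than forming the inverse explicitly is the standard numerically stable choice.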
For the two-variable model,
Σᵢ ei² = Σᵢ (yi − a − bxi)²
Minimization leads to
∂(Σᵢ ei²)/∂a = −2 Σᵢ (yi − a − bxi) = 0
∂(Σᵢ ei²)/∂b = −2 Σᵢ xi(yi − a − bxi) = 0
which imply Σᵢ ei = 0 and Σᵢ xi ei = 0.
Solving the two equations in two unknowns:
a = ȳ − b x̄ and b = Σᵢ (xi − x̄)(yi − ȳ) / Σᵢ (xi − x̄)²
The least squares formulas are much more complicated in the three-variable model: three equations in three unknowns.
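The two closed-form solutions above can be checked against the matrix formula. A short sketch on simulated data (the data are illustrative):

```python
import numpy as np

# a = ybar - b*xbar, b = sum((x-xbar)(y-ybar)) / sum((x-xbar)^2)
rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 3.0 + 2.0 * x + rng.normal(size=50)

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()

# Agrees with matrix OLS on the regressor matrix (1, x)
X = np.column_stack([np.ones_like(x), x])
coef = np.linalg.solve(X.T @ X, X.T @ y)
print(np.allclose([a, b], coef))
```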
Algebraic aspects of the least squares solution: the normal equations
X′Xb − X′y = −X′e = 0
are K equations in K unknowns; hence for every column xk of X, xk′e = 0.
Interpretation of the coefficients: the two-variable case, the three-variable case, and the more general case.
Algebraic Properties

If the first column of X is i (a column of ones), then:
The least squares residuals sum to zero: Σ ei = 0. In the two-variable model, the intercept adjusts the level of the fitted line, whereas the slope rotates it to get the best fit.
The regression hyperplane passes through the point of means of the data: ȳ = x̄′b.
The mean of the fitted values from the regression equals the mean of the actual values, since ŷ = Xb.
Note that none of these results need hold if the regression does not contain a constant term.
Some useful expressions:
e = y − Xb = y − X(X′X)⁻¹X′y = (I − X(X′X)⁻¹X′)y = My
ŷ = y − e = (I − M)y = Py
so y = Xb + e = ŷ + e = Py + My, where M = I − X(X′X)⁻¹X′ and P = X(X′X)⁻¹X′.

Moreover,
MX = 0, P² = P′ = P, M² = M′ = M, PM = MP = 0
and
e′e = y′My = y′e = e′y, e′e = y′y − b′X′Xb
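These projection-matrix identities are easy to verify numerically. A sketch with an arbitrary simulated X (illustrative only):

```python
import numpy as np

# Check symmetry, idempotency, and orthogonality of P and M = I - P.
rng = np.random.default_rng(2)
X = rng.normal(size=(30, 4))
P = X @ np.linalg.inv(X.T @ X) @ X.T
M = np.eye(30) - P

checks = [
    np.allclose(P @ P, P), np.allclose(P, P.T),   # P^2 = P' = P
    np.allclose(M @ M, M), np.allclose(M, M.T),   # M^2 = M' = M
    np.allclose(P @ M, 0), np.allclose(M @ X, 0), # PM = 0, MX = 0
]
print(all(checks))
```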
Partitioned regression: let β = (β1′, β2′)′ ∈ R^(K1+K2), b = (b1′, b2′)′, and X = (X1, X2). Rewrite X′Xb = X′y as
X1′X1b1 + X1′X2b2 = X1′y and X2′X1b1 + X2′X2b2 = X2′y

Then b2 = (X2′X2)⁻¹(X2′y − X2′X1b1), and
b1 = (X1′M2X1)⁻¹X1′M2y; similarly, b2 = (X2′M1X2)⁻¹X2′M1y
where M2 = I − X2(X2′X2)⁻¹X2′ and M1 = I − X1(X1′X1)⁻¹X1′.
Define X1* = M2X1 and y1* = M2y. Then b1 can be expressed as
b1 = (X1*′X1*)⁻¹X1*′y1*
and this form leads to an intuitive interpretation: b1 is obtained by regressing y on X1 after both have been purged of the influence of X2.
Suppose X1′X2 = 0. Then b1 = (X1′X1)⁻¹X1′y and b2 = (X2′X2)⁻¹X2′y.
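The partialling-out result (the Frisch-Waugh-Lovell theorem) can be demonstrated directly: the coefficients on X1 from the full regression equal those from the residualized regression. A sketch on simulated data (names and numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
X1 = rng.normal(size=(n, 2))
X2 = np.column_stack([np.ones(n), rng.normal(size=n)])
X = np.hstack([X1, X2])
y = X @ np.array([1.0, -1.0, 0.5, 2.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)            # full regression
M2 = np.eye(n) - X2 @ np.linalg.inv(X2.T @ X2) @ X2.T
X1s, ys = M2 @ X1, M2 @ y                        # purge X2 from X1 and y
b1 = np.linalg.solve(X1s.T @ X1s, X1s.T @ ys)    # partial regression
print(np.allclose(b[:2], b1))
```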
Goodness of fit and the analysis of variance: how well does the model fit the data?
Σ ei² is useful, but it needs a scale factor.
Uncentered R². One measure of the variability of the dependent variable is the sum of squares Σ yi² = y′y. Note that
y′y = (ŷ + e)′(ŷ + e) = ŷ′ŷ + 2b′X′e + e′e = ŷ′ŷ + e′e
(since X′e = 0 by the normal equations). Hence

R²uc = ŷ′ŷ/y′y = y′Py/y′y = 1 − e′e/y′y = Σᵢ ŷi² / Σᵢ yi²    (R1)

We have 0 ≤ R²uc ≤ 1, which has the interpretation of the fraction of the variation of the dependent variable that is attributable to the variation in the explanatory variables.
(Centered) R², the coefficient of determination. If the only regressor is a constant (K = 1 and xi = 1), then b = ȳ, ŷ′ŷ = nȳ², and e′e = Σᵢ (yi − ȳ)².
More generally, yi − ȳ = (ŷi − ȳ) + ei, and letting M⁰ = I − (1/n)ii′ denote the centering matrix,
Σᵢ (yi − ȳ)² = y′M⁰y = y′M⁰Py + y′M⁰My = y′PM⁰Py + y′My = Σᵢ (ŷi − ȳ)² + Σᵢ ei²
Namely, SST = SSR + SSE. When the first column of X is i, M⁰M = M and M⁰P = PM⁰, since X(X′X)⁻¹X′i = i, i.e. Pi = i and Mi = 0. Then

R² = Σᵢ (ŷi − ȳ)² / Σᵢ (yi − ȳ)² = 1 − Σᵢ ei² / Σᵢ (yi − ȳ)²    (R2)

Thus, this R² is a measure of the explanatory power of the nonconstant regressors.
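Both R² measures can be computed from the residuals. A sketch on simulated data (illustrative), which also confirms that with an intercept 0 ≤ R² ≤ R²uc ≤ 1:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 80
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.8, -0.3]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
r2_uncentered = 1 - e @ e / (y @ y)                # (R1): scale is y'y
r2 = 1 - e @ e / np.sum((y - y.mean()) ** 2)       # (R2): scale is y'M0y
print(0 <= r2 <= r2_uncentered <= 1)
```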
Alternatively,

R² = 1 − e′e/y′M⁰y = 1 − y′My/y′M⁰y = ŷ′M⁰ŷ/y′M⁰y = (ŷ′M⁰ŷ)²/[(y′M⁰y)(ŷ′M⁰ŷ)] = (ŷ′M⁰y)²/[(y′M⁰y)(ŷ′M⁰ŷ)] = corr²(y, ŷ)    (1)

since ŷ′M⁰ŷ = (y − e)′M⁰ŷ = y′M⁰ŷ. Thus (1) indicates the co-movement of y and ŷ around their means, which can be illustrated graphically.
If the regressors do not include a constant but you nevertheless calculate R² using (R2), then R² can be negative. This is because, without the benefit of an intercept, the regression can do worse than the sample mean in tracking the dependent variable.
A relationship between R²uc and R²:
1 − R² = (1 + nȳ²/Σᵢ (yi − ȳ)²)(1 − R²uc)
which indicates R²uc ≥ R². If we run two equivalent regressions with or without the intercept term (four seasonal dummies vs. three dummies plus an intercept), the two measures differ; STATA switches to R²uc when a constant is not included.
Two extreme cases: R² = 0 and R² = 1.
Some comments on R²:
Intuitively, it would seem that some lines are a good fit of the data while other lines are not such a good fit.
We start with the basic idea that our dependent variable varies across observations, and our model purports to explain part of this variation. For example, different people received different grades on their econometrics midterm examination; that is, the grades vary. We explain part of the variation in grades by noting that different people studied for different durations. We might ask: what fraction of the variation in grades can be explained by the hours studied?
Σᵢ (Yi − Ȳ)² versus Σᵢ (Ŷi − Ȳ)²
Notice that ordinary least squares maximizes the value of R². One should not place too much importance on obtaining a high value: R² can be influenced by factors such as the nature of the data in a way that indicates one should not use it as the sole judge of the quality of the econometric model.
A high R² or R̄² does not mean that the regressors are a true cause of the dependent variable. Imagine regressing test scores against parking lot area per pupil: with a high R², try telling the superintendent that the way to increase test scores is to increase parking space!
In time-series studies, one often obtains high values of R² because any variable that grows over time is likely to do a good job of explaining the variation of any other variable that grows over time.
Problems with R²:
R² will never decrease when another variable is added to a regression equation, so we define the adjusted R-squared
R̄² = 1 − [e′e/(n − K)] / [y′M⁰y/(n − 1)] = 1 − ((n − 1)/(n − K))(1 − R²)
A constant term is needed in the regression for easy interpretation.
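The two forms of R̄² above are algebraically identical, which a quick numerical sketch confirms (simulated, illustrative data):

```python
import numpy as np

rng = np.random.default_rng(5)
n, K = 60, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, 0.0, -0.2]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
sst = np.sum((y - y.mean()) ** 2)                     # y'M0y
r2 = 1 - e @ e / sst
r2_bar = 1 - (e @ e / (n - K)) / (sst / (n - 1))      # ratio-of-mean-squares form
print(np.isclose(r2_bar, 1 - (n - 1) / (n - K) * (1 - r2)))
```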
Comparison based on R² is not meaningful when different dependent variables are involved (linear vs. log models). Consider the two models
Y = β1 + β2X2 + β3X3 + ε and Y − X2 = β1′ + β2′X2 + β3′X3 + ε
the same model, but different R².
Example. Model I:
Ct = α1 + β1Yt + ε1t
Ĉt = −2161 + 0.93Yt, t-statistics: (−5.75) (562), R² = 0.9995
Model II:
St = α2 + β2Yt + ε2t
Ŝt = 2161 + 0.07Yt, t-statistics: (5.75) (44.37), R² = 0.923
Note that the t-statistics are testing H0: β = 0.
Model III:
Ct = α3 + β3Yt + γ3Ct−1 + ε3t
Ĉt = −0.066 + 0.18Yt + 0.81Ct−1, t-statistics: (−0.047) (8.58) (35.44), R² = 0.9999
Ct−1 is included to allow current consumption to depend on recent consumption behavior as well as income, so the coefficient on disposable income must be interpreted differently from model I.
The long-run marginal propensity to consume is 0.18/(1 − 0.81) = 0.18(1 + 0.81 + 0.81² + ⋯), the cumulative impact of the delayed effects. Note further that a large portion of Yt is permanent income, which would also affect Ct−1; in a sense, we are holding permanent income constant.
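The long-run multiplier arithmetic is just a geometric series, which can be checked directly (using the slide's coefficients 0.18 and 0.81):

```python
# Long-run MPC for model III: short-run effect propagated through the lag.
short_run = 0.18
lag = 0.81
long_run = short_run / (1 - lag)                         # closed form
series = sum(short_run * lag ** j for j in range(200))   # 0.18(1 + 0.81 + ...)
print(round(long_run, 4))
print(abs(long_run - series) < 1e-6)
```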
Notice that the word “significant” has a very specific statistical definition and does not mean “big.” It means that the estimate is far enough from zero that we are willing to conclude that the “true” slope coefficient is, in fact, not zero.
The general principle: model simplicity vs. goodness of fit.
Amemiya’s prediction criterion
PCj = (ej′ej/(n − Kj))(1 + Kj/n)
is based on
R̃j² = 1 − ((n + Kj)/(n − Kj))(1 − Rj²)
with the notion that the adjusted R-squared does not penalize the loss of degrees of freedom heavily enough.
Statistical Properties

b = (X′X)⁻¹X′y = β + (X′X)⁻¹X′ε, thus E[b] = β and
Var(b) = σ²(X′X)⁻¹
so b is linear and unbiased.
In the two-variable model,
var(β̂) = σ² / Σᵢ (xi − x̄)²
Gauss-Markov Theorem: b is BLUE.
This can be viewed as the extension of estimating μ = E[X] by X̄.
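Unbiasedness, E[b] = β, can be illustrated by Monte Carlo: redraw the disturbances many times and average the resulting estimates (a sketch with illustrative simulated data):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # fixed regressors
beta = np.array([1.0, 2.0])

draws = np.empty((2000, 2))
for r in range(2000):
    y = X @ beta + rng.normal(size=n)                  # new disturbances each rep
    draws[r] = np.linalg.solve(X.T @ X, X.T @ y)

# Average of the OLS estimates is close to the true beta
print(np.allclose(draws.mean(axis=0), beta, atol=0.05))
```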
If b0 = Cy is also unbiased, then Var(b0) = Var(b) + G, with G nonnegative definite. Since CX = I by unbiasedness, let D = C − (X′X)⁻¹X′, so Dy = b0 − b and DX = 0. Then
Var(b0) = σ²CC′ = σ²[(X′X)⁻¹X′ + D][(X′X)⁻¹X′ + D]′ = σ²(X′X)⁻¹ + σ²DD′
Normality and the distribution of b:
b ∼ N(β, σ²(X′X)⁻¹) and bk ∼ N(βk, σ²[(X′X)⁻¹]kk)
Under normality, b is the best unbiased estimator.
Estimating σ² and the variance of b:
e = My = M(Xβ + ε) = Mε, so e′e = ε′Mε and
E[e′e] = E[tr(ε′Mε)] = E[tr(Mεε′)] = tr(σ²M)
but tr(M) = tr(In − X(X′X)⁻¹X′) = n − K. Therefore
E[e′e/(n − K)] = σ²
and an unbiased estimator for σ² is
s² = e′e/(n − K), with est. Var[b] = s²(X′X)⁻¹
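The degrees-of-freedom correction in s² = e′e/(n − K) is what makes the estimator unbiased; a Monte Carlo sketch (illustrative data, true σ² = 4) shows the average of s² landing on σ²:

```python
import numpy as np

rng = np.random.default_rng(7)
n, K, sigma2 = 40, 3, 4.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])

s2_draws = []
for _ in range(5000):
    y = X @ np.ones(K) + rng.normal(scale=np.sqrt(sigma2), size=n)
    e = y - X @ np.linalg.solve(X.T @ X, X.T @ y)
    s2_draws.append(e @ e / (n - K))      # divide by n - K, not n

print(abs(np.mean(s2_draws) - sigma2) < 0.2)
```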
Hypothesis Testing

We cover some classical tests in this subsection via two equivalent approaches: F- and t-tests.
Wald principle: estimate first, then check the null hypothesis directly.
Testing a hypothesis about a coefficient: let S^kk denote the kth diagonal element of (X′X)⁻¹. Then
zk = (bk − βk)/√(σ²S^kk) ∼ N(0, 1)
and
(n − K)s²/σ² = e′e/σ² = (ε/σ)′M(ε/σ) ∼ χ²(n − K)
since n − K = rank(M) = tr(M).
Independence of b and s²:
Cov(b, e) = E[(X′X)⁻¹X′εε′M] = 0
and b and e are normally distributed and independent of each other, as b = β + (X′X)⁻¹X′ε, e = Mε, and MX = 0.
t-test:
tk = [(bk − βk)/√(σ²S^kk)] / √{[(n − K)s²/σ²]/(n − K)} = (bk − βk)/√(s²S^kk) ∼ t(n − K)
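The t-statistic tk = bk/√(s²S^kk) for H0: βk = 0 is straightforward to compute. A sketch on simulated data (the true coefficients here are illustrative; the third one is set to 2 so its t-statistic should be large):

```python
import numpy as np

rng = np.random.default_rng(8)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.0, 2.0]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
s2 = e @ e / (n - K)
se = np.sqrt(s2 * np.diag(XtX_inv))   # standard errors sqrt(s^2 S^kk)
t = b / se                            # t-statistics for H0: beta_k = 0
print(t.shape == (K,))
```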
Test a linear restriction r′β = c0, where r is a known vector and c0 is a known scalar. Since r′b ∼ N(r′β, σ²r′(X′X)⁻¹r), under the null hypothesis
(r′b − c0)/√(s²r′(X′X)⁻¹r) ∼ t(n − K)
Test a set of linear restrictions, null hypothesis Rβ = q, where q is J × 1. Then Rb ∼ N(Rβ, σ²R(X′X)⁻¹R′), so
(Rb − q)′[σ²R(X′X)⁻¹R′]⁻¹(Rb − q) ∼ χ²(J)
and e′e/σ² ∼ χ²(n − K), with b independent of e′e/σ².
Thus
F = {(Rb − q)′[σ²R(X′X)⁻¹R′]⁻¹(Rb − q)/J} / {(e′e/σ²)/(n − K)} = (Rb − q)′[s²R(X′X)⁻¹R′]⁻¹(Rb − q)/J ∼ F(J, n − K)
Hypothesis Testing Restricted least squares vs. the unrestricted model
Zhou Yahong
SHUFE
The Classical Multiple Regression Model
Hypothesis Testing Restricted least squares vs. the unrestricted model Unrestricted model y = Xβ + ε min S(β) = (y − X β)0 (y − X β) β
Zhou Yahong
SHUFE
The Classical Multiple Regression Model
Hypothesis Testing Restricted least squares vs. the unrestricted model Unrestricted model y = Xβ + ε min S(β) = (y − X β)0 (y − X β) β
Restricted model y = Xβ + ε
and
min S(β) = (y − X β)0 (y − X β) β
Rβ = q
subject to Rβ = q
can be solved by the method of Lagrangian multiplier through the Lagrangian (y − X β)0 (y − X β) + λ0 (Rβ − q)
Zhou Yahong
SHUFE
The restricted OLS estimator is
b* = b − (X′X)⁻¹R′[R(X′X)⁻¹R′]⁻¹(Rb − q)
with
Var(b*) = σ²[(X′X)⁻¹ − (X′X)⁻¹R′[R(X′X)⁻¹R′]⁻¹R(X′X)⁻¹]
Let e* = y − Xb* = e − X(b* − b). Then
e*′e* = e′e + (b* − b)′X′X(b* − b)
e*′e* − e′e = (Rb − q)′[R(X′X)⁻¹R′]⁻¹(Rb − q)
Thus
F(J, n − K) = [(e*′e* − e′e)/J] / [e′e/(n − K)] = [(R² − R*²)/J] / [(1 − R²)/(n − K)]
From here, we can also see that R² increases when more variables are added.
For testing the significance of the regression,
F = (R²/J) / ((1 − R²)/(n − K))
since R*² = 0 (here J is the number of slope restrictions, K − 1).
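The R²-based form of the overall significance test can be sketched on simulated data (illustrative coefficients chosen large enough that the F statistic should be big):

```python
import numpy as np

rng = np.random.default_rng(10)
n, K = 90, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 1.0, -1.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
r2 = 1 - e @ e / np.sum((y - y.mean()) ** 2)
J = K - 1                                   # restrictions: all slopes zero
F = (r2 / J) / ((1 - r2) / (n - K))         # compare with F(J, n-K)
print(F > 10)
```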
Examples in estimating production functions:
Labor elasticity equal to 1 (testing a single coefficient).
Constant returns to scale (testing that two coefficients sum to 1).
Translog production function
ln Y = β1 + β2 ln L + β3 ln K + β4(½ ln²L) + β5(½ ln²K) + β6 ln L ln K + ε
which relaxes the unitary elasticity of substitution. The Cobb-Douglas model is obtained by the restriction β4 = β5 = β6 = 0.
Table 6.2: F[3, 21] = 1.768; compared with the critical value for the F statistic, we would not reject the hypothesis that a Cobb-Douglas model is appropriate. But the coefficients differ greatly between the two specifications, why?
The answer lies in the interpretation of the coefficients, namely the marginal effects of each unit change of x. In the translog model,
∂ln Y/∂ln K = β3 + β5 ln K + β6 ln L
At the means of ln K and ln L (7.4459 and 5.7637), ∂ln Y/∂ln K = 0.5425.
Tests of Structural Change

Examples: trade figures before and after WTO accession, company performance before and after being listed, the consumption function during (rationing) and after the war, production and consumption patterns before and after the oil shock, coastal vs. interior regions.
Case I: Different coefficients for different subsets of the data (different periods or regions).
Restriction: all the coefficients are the same.
Null hypothesis: all the coefficients are the same.
Alternative: all the coefficients can be different.
The unrestricted model (coefficients free to differ across regimes):
[y1]   [X1  0 ] [β1]   [ε1]
[y2] = [ 0  X2] [β2] + [ε2]
with e′e = e1′e1 + e2′e2.
The restricted model (same coefficients):
[y1]   [X1]      [ε1]
[y2] = [X2] β +  [ε2]
with the restriction β1 = β2, i.e. R = [I, −I] and q = 0, so J = K.
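This restricted-vs-unrestricted comparison is the Chow test: fit each subsample separately, fit the pooled sample, and compare the sums of squared residuals. A sketch on simulated data with a deliberately large structural break (all numbers illustrative):

```python
import numpy as np

def ssr(X, y):
    """Sum of squared OLS residuals."""
    e = y - X @ np.linalg.solve(X.T @ X, X.T @ y)
    return e @ e

rng = np.random.default_rng(11)
n1, n2, K = 40, 40, 2
X1 = np.column_stack([np.ones(n1), rng.normal(size=n1)])
X2 = np.column_stack([np.ones(n2), rng.normal(size=n2)])
y1 = X1 @ np.array([1.0, 1.0]) + rng.normal(size=n1)
y2 = X2 @ np.array([3.0, -1.0]) + rng.normal(size=n2)   # coefficients change

e_u = ssr(X1, y1) + ssr(X2, y2)                          # unrestricted: separate fits
e_r = ssr(np.vstack([X1, X2]), np.hstack([y1, y2]))      # restricted: pooled fit
J = K
F = ((e_r - e_u) / J) / (e_u / (n1 + n2 - 2 * K))        # compare with F(K, n-2K)
print(F > 10)
```

With a genuine break this large, the F statistic should far exceed any conventional critical value.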
Gasoline consumption equations: K = 6, n = 36, using F(J, n − 2K) = F(6, 24).
Rejected, so the coefficients have changed. Then what is next? At least some of them have changed.
The Classical Multiple Regression Model
Tests of Structural Change Case II: Change only in the constant—but the slopes remained the same
Zhou Yahong
SHUFE
The Classical Multiple Regression Model
Tests of Structural Change Case II: Change only in the constant—but the slopes remained the same Restriction—the slope coefficients are the same
Zhou Yahong
SHUFE
The Classical Multiple Regression Model
Tests of Structural Change Case II: Change only in the constant—but the slopes remained the same Restriction—the slope coefficients are the same Null hypothesis—change in the constant, but not the slopes, or the slopes remained the same
Zhou Yahong
SHUFE
The Classical Multiple Regression Model
Tests of Structural Change Case II: Change only in the constant—but the slopes remained the same Restriction—the slope coefficients are the same Null hypothesis—change in the constant, but not the slopes, or the slopes remained the same Alternatives—both the constant or slopes can be different
Zhou Yahong
SHUFE
The Classical Multiple Regression Model
Tests of Structural Change Case II: Change only in the constant—but the slopes remained the same Restriction—the slope coefficients are the same Null hypothesis—change in the constant, but not the slopes, or the slopes remained the same Alternatives—both the constant or slopes can be different constant terms are different,but the slopes are same i Wpre73 Differential intercepts XU = i Wpost73 Differential slopes and XR =
i i
Wpre73 Wpost73
Differential intercepts Common slopes
J=5
Zhou Yahong
SHUFE
The Classical Multiple Regression Model
Tests of Structural Change
Case II: Change only in the constant, but the slopes remain the same.
Restriction: the slope coefficients are the same.
Null hypothesis: the constant changes, but the slopes remain the same.
Alternative: both the constants and the slopes can differ.
The constant terms differ, but the slopes are the same:
$$X_U = \begin{bmatrix} i & 0 & W_{pre73} & 0 \\ 0 & i & 0 & W_{post73} \end{bmatrix} \quad \text{(differential intercepts, differential slopes)}$$
and
$$X_R = \begin{bmatrix} i & 0 & W_{pre73} \\ 0 & i & W_{post73} \end{bmatrix} \quad \text{(differential intercepts, common slopes)}$$
$J = 5$.
Rejected: the two periods are systematically different, beyond a simple shift in the constant term.
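The restricted/unrestricted comparison above can be carried out mechanically. A minimal numpy sketch with simulated data (two slope variables rather than the five in the gasoline example; all names and numbers are illustrative, not the actual data):

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2 = 60, 40                                  # pre- and post-break sample sizes
W1 = rng.normal(size=(n1, 2))                    # slope variables, pre period
W2 = rng.normal(size=(n2, 2))                    # slope variables, post period
beta = np.array([1.0, -0.5])                     # common slopes under the null
y = np.concatenate([2.0 + W1 @ beta + rng.normal(size=n1),   # intercept 2.0 pre,
                    3.0 + W2 @ beta + rng.normal(size=n2)])  # 3.0 post: shift in constant only

o1, o2 = np.ones((n1, 1)), np.ones((n2, 1))
# Unrestricted: differential intercepts and differential slopes
XU = np.block([[o1, np.zeros((n1, 1)), W1, np.zeros((n1, 2))],
               [np.zeros((n2, 1)), o2, np.zeros((n2, 2)), W2]])
# Restricted: differential intercepts, common slopes
XR = np.block([[o1, np.zeros((n1, 1)), W1],
               [np.zeros((n2, 1)), o2, W2]])

def ssr(X, y):
    """Sum of squared OLS residuals."""
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return e @ e

J = 2                                            # slope restrictions (J = 5 in the slides)
n, KU = len(y), XU.shape[1]
F = ((ssr(XR, y) - ssr(XU, y)) / J) / (ssr(XU, y) / (n - KU))
print(F)                                         # compare with an F(J, n - KU) critical value
```

Because the restricted column space is nested inside the unrestricted one, the numerator is guaranteed nonnegative.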
Tests of Structural Change
Case III: Change in a subset of coefficients (the constant and the price and income elasticities), $J = 3$.
Restriction: some coefficients remain the same.
$$X_U = \begin{bmatrix} i & 0 & W_{pre73} & 0 \\ 0 & i & 0 & W_{post73} \end{bmatrix} \quad \text{(differential intercepts, differential slopes)}$$
and
$$X_R = \begin{bmatrix} i & 0 & Z_{pre73} & 0 & W_{pre73} \\ 0 & i & 0 & Z_{post73} & W_{post73} \end{bmatrix} \quad \text{(differential intercepts and $Z$ slopes, some common slopes)}$$
where $Z$ denotes the gasoline price and income variables, whose coefficients are believed to have changed; the stacked block of $W_{pre73}$ over $W_{post73}$ carries the common slopes of the remaining variables.
Tests of Structural Change
For all three cases, the unrestricted model is the same; the restricted models are:
R1: all the coefficients are the same;
R2: all the slope coefficients are the same;
R3: only a subset of the slope coefficients are the same.
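All three restricted designs can be built from one unrestricted matrix. A hedged numpy sketch with simulated data (one variable plays the role of $Z$, whose slope changes across periods, and one the role of a common-slope variable; the setup is illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2 = 50, 50
X1 = rng.normal(size=(n1, 2))         # columns: [Z, W] pre-break
X2 = rng.normal(size=(n2, 2))         # columns: [Z, W] post-break
y = np.concatenate([1.0 + X1 @ np.array([0.8, -0.3]) + rng.normal(size=n1),
                    1.5 + X2 @ np.array([0.4, -0.3]) + rng.normal(size=n2)])

z1, o1 = np.zeros((n1, 1)), np.ones((n1, 1))
z2, o2 = np.zeros((n2, 1)), np.ones((n2, 1))
# Unrestricted: all coefficients differ across periods
XU = np.block([[o1, z1, X1, np.zeros((n1, 2))],
               [z2, o2, np.zeros((n2, 2)), X2]])
# R1: all coefficients the same (single pooled regression)
XR1 = np.block([[o1, X1], [o2, X2]])
# R2: slopes the same, intercepts differ
XR2 = np.block([[o1, z1, X1], [z2, o2, X2]])
# R3: only the W slope is common; intercept and Z slope differ
XR3 = np.block([[o1, z1, X1[:, :1], np.zeros((n1, 1)), X1[:, 1:]],
                [z2, o2, np.zeros((n2, 1)), X2[:, :1], X2[:, 1:]]])

def ssr(X):
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return e @ e

n, KU, sU = len(y), XU.shape[1], ssr(XU)
for XR in (XR1, XR2, XR3):
    J = KU - XR.shape[1]              # number of restrictions
    print(J, ((ssr(XR) - sU) / J) / (sU / (n - KU)))
```

The designs are nested, R1 ⊂ R2 ⊂ R3 ⊂ unrestricted, so the residual sums of squares fall monotonically as restrictions are relaxed.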
Tests of Structural Change
Chow's Predictive Test
Instead of using all the sample observations for estimation, the suggested procedure is to divide the data set of $n$ observations into $n_1$ observations used for estimation and $n_2 = n - n_1$ observations reserved for testing. There are no hard and fast rules for determining the relative sizes of $n_1$ and $n_2$; it is not uncommon to reserve 5, 10, or 15 percent of the observations for testing.
OLS on the first subsample gives
$$b_1 = (X_1'X_1)^{-1}X_1'y_1$$
and a prediction for $y_2$:
$$\hat{y}_2 = X_2 b_1.$$
Tests of Structural Change
The prediction error is
$$d = y_2 - \hat{y}_2 = y_2 - X_2 b_1 = \varepsilon_2 - X_2(b_1 - \beta)$$
with
$$\operatorname{Var}(d) = \sigma^2\left[I_{n_2} + X_2(X_1'X_1)^{-1}X_2'\right].$$
Thus
$$d'\left[\operatorname{Var}(d)\right]^{-1}d \sim \chi^2(n_2).$$
Further, $e_1'e_1/\sigma^2 \sim \chi^2(n_1 - K)$, where $e_1$ is the residual vector from the first-subsample regression.
Tests of Structural Change
Under the null hypothesis,
$$F = \frac{d'\left[I_{n_2} + X_2(X_1'X_1)^{-1}X_2'\right]^{-1}d / n_2}{e_1'e_1/(n_1 - K)} \sim F(n_2,\, n_1 - K).$$
Equivalently,
$$F = \frac{(e_*'e_* - e_1'e_1)/n_2}{e_1'e_1/(n_1 - K)}$$
where $e_*$ is the residual vector for the restricted model, i.e. when there is no structural change.
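The equivalence of the two forms of $F$ can be verified numerically. A sketch with simulated data (names and dimensions are illustrative); it checks the identity $e_*'e_* - e_1'e_1 = d'[I_{n_2} + X_2(X_1'X_1)^{-1}X_2']^{-1}d$:

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2, K = 40, 8, 3
X1 = np.column_stack([np.ones(n1), rng.normal(size=(n1, K - 1))])
X2 = np.column_stack([np.ones(n2), rng.normal(size=(n2, K - 1))])
beta = np.array([1.0, 0.5, -0.2])
y1 = X1 @ beta + rng.normal(size=n1)
y2 = X2 @ beta + rng.normal(size=n2)

b1 = np.linalg.solve(X1.T @ X1, X1.T @ y1)    # OLS on the first subsample
e1 = y1 - X1 @ b1
d = y2 - X2 @ b1                              # prediction error

# Form 1: based on d and its covariance structure
V = np.eye(n2) + X2 @ np.linalg.solve(X1.T @ X1, X2.T)
F1 = (d @ np.linalg.solve(V, d) / n2) / (e1 @ e1 / (n1 - K))

# Form 2: restricted (pooled) residuals vs first-subsample residuals
X = np.vstack([X1, X2])
y = np.concatenate([y1, y2])
estar = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
F2 = ((estar @ estar - e1 @ e1) / n2) / (e1 @ e1 / (n1 - K))
print(F1, F2)                                  # the two forms coincide
```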
Tests of Structural Change
Another case for applying this approach: in the gasoline demand example, one possibility is that consumers took a year or two to adjust to the turmoil of the two oil price shocks of 1973 and 1979, but that the market never fundamentally changed, or changed only temporarily. We might consider the same test as before, but now single out only the four years 1974, 1975, 1980, and 1981 for special treatment.
Tests of Structural Change
Prediction
Given the model $y^0 = x^{0\prime}\beta + \varepsilon^0$, we predict the value of $y^0$ associated with a regressor vector $x^0$ by
$$\hat{y}^0 = x^{0\prime}b.$$
The prediction error is
$$e^0 = y^0 - \hat{y}^0 = x^{0\prime}(\beta - b) + \varepsilon^0$$
with
$$\operatorname{Var}[e^0] = \sigma^2 + \sigma^2\, x^{0\prime}(X'X)^{-1}x^0.$$
When the regression contains a constant term, this can equivalently be written
$$\operatorname{Var}[e^0] = \sigma^2\left[1 + \frac{1}{n} + \sum_{j=1}^{K-1}\sum_{k=1}^{K-1}(x_j^0 - \bar{x}_j)(x_k^0 - \bar{x}_k)\,(Z'M^0Z)^{jk}\right]$$
where $Z$ is the $K-1$ columns of $X$ not including the constant and $(Z'M^0Z)^{jk}$ is the $jk$th element of $(Z'M^0Z)^{-1}$.
The forecast interval is $\hat{y}^0 \pm t_{\lambda/2}\,\operatorname{se}(e^0)$; its width depends on $x^0$, narrowest at the sample means and widening as $x^0$ moves away from them.
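The two expressions for $\operatorname{Var}[e^0]$ agree exactly when the model contains a constant term. A numpy sketch checking this with simulated regressors ($\sigma^2$ and all dimensions are set arbitrarily for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n, K = 30, 3
Z = rng.normal(size=(n, K - 1))               # the non-constant columns of X
X = np.column_stack([np.ones(n), Z])
x0 = np.array([1.0, 0.4, -1.2])               # regressor vector for the forecast
sigma2 = 2.0

# Form 1: sigma^2 * (1 + x0' (X'X)^{-1} x0)
v1 = sigma2 * (1 + x0 @ np.linalg.solve(X.T @ X, x0))

# Form 2: via deviations from means; (Z'M0Z)^{jk} is the jk-th element of the inverse
zbar = Z.mean(axis=0)
S = (Z - zbar).T @ (Z - zbar)                 # Z'M0Z
dev = x0[1:] - zbar
v2 = sigma2 * (1 + 1 / n + dev @ np.linalg.solve(S, dev))
print(v1, v2)                                  # identical up to rounding
```

This also makes the geometry visible: the `dev @ solve(S, dev)` term, and hence the interval width, grows as $x^0$ moves away from the sample means.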