ECON 482 / WH Hong

Multiple Regression Analysis: Inference

1. Sampling Distribution of the OLS Estimators

Statistical inference in the regression model
¾ Hypothesis tests about population parameters
¾ Construction of confidence intervals

Sampling distribution of the OLS estimators
¾ The OLS estimators are random variables
¾ We already know their expected values and their variances
¾ However, for hypothesis tests we need to know their full distribution
¾ To derive their distribution, we need an additional assumption
¾ Assumption about the distribution of the errors: normal distribution => Assumption MLR.6


Additional Assumption MLR.6 (Normality of error terms)

The population error u_i is independent of the explanatory variables x_i1, x_i2, ..., x_ik and is normally distributed with zero mean and variance σ². That is,

  u_i ~ Normal(0, σ²), independently of x_i1, x_i2, ..., x_ik

Terminology
MLR.1 – MLR.5: Gauss-Markov assumptions
MLR.1 – MLR.6: Gauss-Markov assumptions + Normality = Classical Linear Model (CLM) assumptions

Under the CLM assumptions, we can summarize the population model as:

  y | x ~ Normal( β0 + β1 x1 + ... + βk xk , σ² )

Graphically, the distribution of y is normal around the population regression line E(y|x), with the same variance σ² at every value of x (figure not reproduced).

Note that Assumption MLR.6 is much stronger than any of our previous assumptions. In fact, if we make Assumption MLR.6, we are necessarily assuming MLR.4 (zero conditional mean) and MLR.5 (homoskedasticity).


Discussion of the normality assumption
• The error term is the sum of "many" different unobserved factors
• Sums of independent factors are approximately normally distributed by the Central Limit Theorem (CLT)
• Problems with this argument:
¾ The CLT argument only works when all unobserved factors affect y in a separate, additive fashion. Nothing guarantees that this is so => the true functional form may be complicated
¾ The distributions of the individual factors may be very heterogeneous
¾ How independent are the different factors?
• The normality of the error term is therefore an empirical question. In many cases, normality is questionable or impossible by definition
• Examples where normality cannot hold exactly:
¾ Wage (nonnegative); unemployment status (indicator variable that takes on only 0 or 1), etc.
• In some cases, approximate normality can be achieved through a transformation of the dependent variable (e.g. use log(wage) instead of wage)
• Under the CLM assumptions, OLS is not only BLUE but the minimum variance unbiased estimator
• Important: For the purpose of statistical inference, the assumption of normality can be replaced by a large sample size
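The effect of the log transformation mentioned above can be illustrated with a small simulation (a sketch, not part of the lecture; the lognormal "wage" data here are synthetic, chosen so that the log is exactly normal):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic right-skewed "wages": lognormal, so log(wage) is normal by construction
wage = rng.lognormal(mean=2.5, sigma=0.5, size=5000)

# A normal distribution has skewness 0; values near 0 suggest symmetry
print(stats.skew(wage))          # strongly positive: wage is right-skewed
print(stats.skew(np.log(wage)))  # close to 0 after the log transformation
```

In real data the log of wage is only approximately normal, but the transformation typically removes most of the right skew.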


Normal sampling distributions
• Normality of the error term translates into normal sampling distributions of the OLS estimators.

Theorem 4.1 (Normal sampling distributions)
Under the CLM assumptions MLR.1 through MLR.6,

  β̂_j ~ Normal( β_j , Var(β̂_j) ),  where  Var(β̂_j) = σ² / [ SST_j (1 − R²_j) ],

  SST_j = Σ_{i=1}^{n} (x_ij − x̄_j)²,

and R²_j is the R-squared from the regression of x_j on all other independent variables.

Therefore, the standardized estimator follows:

  ( β̂_j − β_j ) / s.d.(β̂_j) ~ Normal(0, 1)


Sketch of the proof
• The proof uses a property of the normal distribution: any linear combination of normally distributed random variables is also normally distributed.
• Note that each OLS estimator is a linear combination of the errors:

  β̂_j = β_j + Σ_{i=1}^{n} w_ij u_i,  where w_ij = r̂_ij / SSR_j,

  r̂_ij is the i-th residual from the regression of x_j on all other independent variables, and SSR_j = Σ_{i=1}^{n} r̂²_ij.
• Therefore, β̂_j is also normally distributed.

Moreover, any linear combination of β̂_1, ..., β̂_k is also normally distributed, since each β̂_j is normally distributed.
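Theorem 4.1 can be checked by Monte Carlo simulation (an illustrative sketch with made-up parameter values, not from the lecture): drawing many samples from a model that satisfies MLR.1–MLR.6 and re-estimating the slope each time should give an approximately normal sampling distribution centered at the true β1.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 200, 2000
beta0, beta1 = 1.0, 0.5          # assumed "true" parameters for the simulation
x = rng.uniform(0, 10, size=n)   # regressors held fixed across replications

estimates = []
for _ in range(reps):
    u = rng.normal(0, 2, size=n)      # normal errors, as in MLR.6
    y = beta0 + beta1 * x + u
    b1 = np.polyfit(x, y, 1)[0]       # OLS slope of the simple regression
    estimates.append(b1)

estimates = np.array(estimates)
# The sampling distribution should be centered at the true beta1 ...
print(estimates.mean())
# ... with standard deviation close to sigma / sqrt(SST_x), as in Theorem 4.1
print(estimates.std(), 2 / np.sqrt(((x - x.mean()) ** 2).sum()))
```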


2. Testing Hypotheses about a Single Population Parameter: The t Test

(1) Test statistic

Theorem 4.2 (t distribution for the standardized estimator)
Under the CLM assumptions MLR.1 through MLR.6,

  ( β̂_j − β_j ) / s.e.(β̂_j) ~ t_{n−k−1} = t_df

where k + 1 is the number of unknown parameters, and n − k − 1 is the degrees of freedom (df).

Note: replacing the unknown σ in s.d.(β̂_j) by its estimate σ̂ is what turns the standard normal distribution of Theorem 4.1 into the t distribution. The t distribution is close to the standard normal distribution when n − k − 1 is large.


Constructing the test statistic and hypothesis testing
• Null hypothesis: H0: β_j = 0
¾ The population parameter is equal to zero, i.e., after controlling for the other independent variables, x_j has no effect on the expected value of y.
• Under this null hypothesis, we construct the t-statistic (or t-ratio):

  t_{β̂_j} = β̂_j / s.e.(β̂_j)

¾ The farther the estimated coefficient is from zero, the less likely it is that the null hypothesis holds.
• Distribution of the t-statistic if the null hypothesis is true:

  t_{β̂_j} = β̂_j / s.e.(β̂_j) = ( β̂_j − β_j ) / s.e.(β̂_j) ~ t_{n−k−1}   (since β_j = 0 under H0)

• Goal: define a rejection rule so that, if H0 is true, it is rejected only with a small probability (= significance level, e.g. 5%).

(2) Testing against one-sided alternatives
• Hypothesis (greater than zero): H0: β_j = 0 against H1: β_j > 0
• Construct the t-statistic under the null hypothesis.
• Decide on a rejection rule, i.e., choose the significance level (10%, 5%, or 1%) and find the corresponding critical value c.
• The null is rejected if t_{β̂_j} > c0.05 = t_{n−k−1, 0.05}; e.g. c0.05 = t28,0.05 = 1.701

• Graphically, the rejection region is the area in the right tail of the t_{n−k−1} density beyond the critical value c (figure not reproduced).

(Example) Wage equation
• Estimated equation:

  log(wage)^ = 0.284 + 0.092 educ + 0.0041 exper + 0.022 tenure
               (0.104)  (0.007)     (0.0017)       (0.003)

  n = 526, R² = 0.316; standard errors are in parentheses
• Test whether, after controlling for education and tenure, more work experience leads to higher hourly wages
• Hypothesis: H0: β_exper = 0 vs. H1: β_exper > 0
• t-statistic: t_{β̂_exper} = 0.0041 / 0.0017 ≈ 2.41
• Critical values: df = n − k − 1 = 526 − 3 − 1 = 522, which is large, so standard normal critical values apply:
  at the 5% significance level, c0.05 = t522,0.05 = 1.645; at the 1% level, c0.01 = t522,0.01 = 2.326
• Since t_{β̂_exper} = 2.41 > 2.326, the effect of experience on hourly wages is statistically greater than zero at the 5% (and even at the 1%) significance level.
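The one-sided test above can be reproduced from the reported numbers alone (a sketch using scipy; the coefficient and standard error are taken from the estimated wage equation):

```python
from scipy import stats

b, se, df = 0.0041, 0.0017, 522   # coefficient, standard error, degrees of freedom
t = b / se                        # t-statistic under H0: beta_exper = 0
p_one_sided = stats.t.sf(t, df)   # P(T > t): one-sided p-value for H1: beta > 0

print(round(t, 2))                # about 2.41
print(round(p_one_sided, 4))      # well below 0.05
print(stats.t.ppf(0.95, df))      # 5% one-sided critical value, about 1.645
```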


Similarly, we can test the following hypothesis (less than zero):
H0: β_j = 0 against H1: β_j < 0
• Construct the t-statistic under the null hypothesis.
• Decide on a rejection rule, i.e., choose the significance level (10%, 5%, or 1%) and find the critical value c, which is now negative.
• The null is rejected if t_{β̂_j} < c0.05 = −t_{n−k−1, 0.05}; e.g. c0.05 = −t18,0.05 = −1.734

• Graphically, the rejection region is now the left tail of the t distribution, below the negative critical value (figure not reproduced).

(3) Testing against two-sided alternatives

• Hypothesis: H0: β_j = 0 against H1: β_j ≠ 0
• Construct the t-statistic under the null hypothesis.
• Decide on a rejection rule, i.e., choose the significance level (10%, 5%, or 1%) and find the critical value c.
• The null is rejected if | t_{β̂_j} | > c0.05 = t_{n−k−1, 0.025}; e.g. c0.05 = t25,0.025 = 2.06
• Graphically, the rejection region splits equally between the two tails of the t distribution (figure not reproduced).


(Example) Determinants of college GPA
• Estimated equation:

  collGPA^ = 1.39 + 0.412 hsGPA + 0.015 ACT − 0.083 skipped
             (0.33) (0.094)      (0.011)     (0.026)

  skipped: lectures missed per week; n = 141, R² = 0.234
• t_hsGPA = 4.38 > c0.01 = 2.58
  | t_ACT | = 1.36 < c0.10 = 1.645
  | t_skipped | = 3.19 > c0.01 = 2.58
• The effects of hsGPA and skipped are significantly different from zero at the 1% significance level. The effect of ACT is not significantly different from zero, not even at the 10% significance level.


‹ "Statistically significant" variables in a regression
¾ If a regression coefficient is significantly different from zero in a two-sided test, the corresponding variable is said to be "statistically significant"
¾ If the number of degrees of freedom is large enough that the normal approximation applies, the following rules of thumb apply:

  | t-ratio | > 1.645  => "statistically significant at the 10% level"
  | t-ratio | > 1.96   => "statistically significant at the 5% level"
  | t-ratio | > 2.576  => "statistically significant at the 1% level"

(4) Testing other hypotheses: the general case

• Hypothesis: H0: β_j = α_j against H1: β_j ≠ α_j, where α_j is a hypothesized value of the coefficient
• t-statistic:

  t = (estimate − hypothesized value) / standard error = ( β̂_j − α_j ) / s.e.(β̂_j)

• Critical value: c0.05 = t_{n−k−1, 0.025}
• Reject the null if | t | > c0.05
• The test works exactly as before, except that the hypothesized value is subtracted from the estimate when forming the statistic.

(Example) Campus crime and enrollment

• An interesting hypothesis is whether crime increases by one percent when enrollment increases by one percent
• Estimated equation:

  log(crime)^ = −6.63 + 1.27 log(enroll)
                (1.03)  (0.11)

  n = 97, R² = 0.585
• Hypothesis: H0: β_log(enroll) = 1 vs. H1: β_log(enroll) ≠ 1
• t = (1.27 − 1) / 0.11 ≈ 2.45 > 1.96 = c0.05, so the null is rejected at the 5% significance level.
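The same calculation, with the hypothesized value 1 subtracted in the numerator, can be sketched as follows (coefficient and standard error from the estimated equation above):

```python
from scipy import stats

b, se, df = 1.27, 0.11, 95             # df = n - k - 1 = 97 - 1 - 1
t = (b - 1) / se                       # test H0: beta = 1, not beta = 0
p_two_sided = 2 * stats.t.sf(abs(t), df)

print(round(t, 2))                     # about 2.45
print(p_two_sided < 0.05)              # True: reject at the 5% level
```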

(5) Computing p-values for t tests

• If the significance level is made smaller and smaller, there will be a point at which the null hypothesis can no longer be rejected
• The smallest significance level at which the null hypothesis is still rejected is called the p-value of the hypothesis test
• For a two-sided test, p-value = P( | T | > | t-ratio | ), where T is a t-distributed random variable with n − k − 1 degrees of freedom
• A small p-value is evidence against the null hypothesis
• For example, when the t-statistic is 1.85 with df = 40, the p-value is

  P( | T | > 1.85 ) = 2 P( T > 1.85 ) = 2 (0.0359) = 0.0718

  (computing this requires a statistics program or table)
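The p-value in the example can be computed directly (a sketch using scipy's t distribution):

```python
from scipy import stats

t_ratio, df = 1.85, 40
p = 2 * stats.t.sf(abs(t_ratio), df)  # two-sided p-value: P(|T| > 1.85)
print(round(p, 4))                    # about 0.0718
```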

(6) Confidence intervals

• Simple manipulation of the result in Theorem 4.2 implies that

  P( β̂_j − c0.05 · s.e.(β̂_j) ≤ β_j ≤ β̂_j + c0.05 · s.e.(β̂_j) ) = 0.95

  where c0.05 = t_{n−k−1, 0.025} is the critical value of the two-sided test
• Interpretation of the confidence interval
¾ The bounds of the interval are random
¾ In repeated samples, an interval constructed in this way covers the population regression coefficient in 95% of the cases
• Confidence intervals for typical confidence levels:

  P( β̂_j − c0.01 · s.e.(β̂_j) ≤ β_j ≤ β̂_j + c0.01 · s.e.(β̂_j) ) = 0.99
  P( β̂_j − c0.05 · s.e.(β̂_j) ≤ β_j ≤ β̂_j + c0.05 · s.e.(β̂_j) ) = 0.95
  P( β̂_j − c0.10 · s.e.(β̂_j) ≤ β_j ≤ β̂_j + c0.10 · s.e.(β̂_j) ) = 0.90

  As a rule of thumb (large df), we can use c0.01 = 2.576, c0.05 = 1.96, and c0.10 = 1.645

(Example) Model of firms' R&D expenditures

• Estimated equation:

  log(rd)^ = −4.38 + 1.084 log(sales) + 0.0217 profmarg
             (0.47)  (0.060)            (0.0128)

  n = 32, R² = 0.918; profmarg: profits as a percentage of sales
• Critical value: c0.05 = t29,0.025 = 2.045
• C.I. for log(sales): 1.084 ± 2.045 (0.060) = [0.961, 1.21]
¾ The effect of sales on R&D is relatively precisely estimated, as the interval is narrow
¾ Moreover, the effect is significantly different from zero, because zero lies outside the interval
• C.I. for profmarg: 0.0217 ± 2.045 (0.0128) = [−0.0045, 0.0479]
¾ The effect is imprecisely estimated, as the interval is very wide
¾ It is not even statistically significant, because zero lies inside the interval
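These intervals can be reproduced numerically (a sketch; the point estimates and standard errors are taken from the estimated equation above):

```python
from scipy import stats

def conf_int(b, se, df, level=0.95):
    # Two-sided critical value t_{df, alpha/2} for the given confidence level
    c = stats.t.ppf(1 - (1 - level) / 2, df)
    return b - c * se, b + c * se

print(conf_int(1.084, 0.060, 29))    # log(sales): roughly (0.961, 1.207)
print(conf_int(0.0217, 0.0128, 29))  # profmarg: roughly (-0.0045, 0.0479)
```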


3. Testing Hypotheses about a Single Linear Combination of the Parameters

(Example) Return to education at two-year vs. four-year colleges
• log(wage) = β0 + β1 jc + β2 univ + β3 exper + u
  jc: years of education at two-year colleges
  univ: years of education at four-year colleges
  exper: work experience
• Test hypothesis: H0: β1 = β2 (equivalently, β1 − β2 = 0) vs. H1: β1 − β2 < 0

(i) A possible test statistic
• t-statistic:

  t = ( β̂1 − β̂2 ) / s.e.( β̂1 − β̂2 );  hence, reject the null if t < −t_{n−k−1, 0.05}

• However, this is impossible to compute from standard regression output, because

  s.e.( β̂1 − β̂2 ) = sqrt( Var̂(β̂1) + Var̂(β̂2) − 2 Cov̂(β̂1, β̂2) )

  and Cov̂(β̂1, β̂2) is not reported.


(ii) Alternative method
• Define θ1 = β1 − β2 and test H0: θ1 = 0 against H1: θ1 < 0
• Substituting β1 = θ1 + β2 into the model:

  log(wage) = β0 + (θ1 + β2) jc + β2 univ + β3 exper + u
            = β0 + θ1 jc + β2 (jc + univ) + β3 exper + u

• Estimated equation, with totcoll = jc + univ:

  log(wage)^ = 1.472 − 0.0102 jc + 0.0769 totcoll + 0.0049 exper
               (0.021)  (0.0069)    (0.0023)        (0.0002)

  n = 6,763, R² = 0.222
• t = −0.0102 / 0.0069 ≈ −1.48;  p-value = P(T < −1.48) = 0.070
  C.I.: −0.0102 ± 1.96 (0.0069) = [−0.0237, 0.0033]
  => The null hypothesis is rejected at the 10% level but not at the 5% level
• This method always works for a single linear hypothesis
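The covariance term that standard printed output omits is easy to compute from the data, so the direct test of β1 − β2 is also feasible by hand (a sketch on synthetic data, not the lecture's dataset; variable names mirror the example above):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
jc = rng.poisson(1, n).astype(float)      # synthetic years at two-year colleges
univ = rng.poisson(2, n).astype(float)    # synthetic years at four-year colleges
exper = rng.uniform(0, 30, n)
logwage = 1.5 + 0.05 * jc + 0.08 * univ + 0.005 * exper + rng.normal(0, 0.4, n)

X = np.column_stack([np.ones(n), jc, univ, exper])
b, *_ = np.linalg.lstsq(X, logwage, rcond=None)   # OLS estimates
resid = logwage - X @ b
sigma2 = resid @ resid / (n - X.shape[1])         # sigma^2-hat, df = n - k - 1
V = sigma2 * np.linalg.inv(X.T @ X)               # Var-hat of the coefficient vector

# s.e. of (beta_jc - beta_univ) uses the covariance term V[1, 2]
diff = b[1] - b[2]
se_diff = np.sqrt(V[1, 1] + V[2, 2] - 2 * V[1, 2])
print(diff, se_diff, diff / se_diff)
```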


4. Testing Multiple Linear Restrictions: The F Test

(1) Testing exclusion restrictions

(Example) MLB players' salaries
• log(salary) = β0 + β1 years + β2 gamesyr + β3 bavg + β4 hrunsyr + β5 rbisyr + u
  years: years in the league
  gamesyr: average number of games played per year
  bavg: batting average
  hrunsyr: home runs per year
  rbisyr: runs batted in per year
• Hypothesis: H0: β3 = 0, β4 = 0, β5 = 0 vs. H1: H0 is not true
¾ tests whether the performance measures have no effect on salary, i.e., whether they can be excluded from the regression
¾ the alternative hypothesis can be read as "at least one performance measure has an effect on salary"

(Example) MLB players' salaries (cont'd)

• Estimation of the unrestricted model:

  log(salary)^ = 11.19 + 0.0689 years + 0.0126 gamesyr + 0.00098 bavg + 0.0144 hrunsyr + 0.0108 rbisyr
                 (0.29)  (0.0121)       (0.0026)         (0.00110)      (0.0161)         (0.0072)

  n = 353, SSR_ur = 183.186, R²_ur = 0.6278
¾ None of the performance measures is statistically significant when tested individually with a t-statistic
• Estimation of the restricted model (β3 = 0, β4 = 0, β5 = 0):

  log(salary)^ = 11.22 + 0.0713 years + 0.0202 gamesyr
                 (0.11)  (0.0125)       (0.0013)

  n = 353, SSR_r = 198.311, R²_r = 0.5971
¾ The sum of squared residuals necessarily increases, because possibly relevant variables are dropped in the restricted regression => Is the increase statistically significant?

(Example) MLB players' salaries (cont'd)

• Test statistic:

  F = [ (SSR_r − SSR_ur) / q ] / [ SSR_ur / (n − k − 1) ] ~ F_{q, n−k−1}

  where q = numerator df = number of restrictions = df_r − df_ur, and n − k − 1 = denominator df = df_ur
¾ Without derivation: an F-distributed random variable is the ratio of two independent chi-squared variables, each divided by its degrees of freedom, F = ( χ²_q / q ) / ( χ²_{n−k−1} / (n − k − 1) )
• Rejection rule: reject the null if F > c0.05 = F_{q, n−k−1, 0.05}
¾ An F-distributed variable takes on only positive values

(Example) MLB players' salaries (cont'd)

• Test decision:

  F = [ (198.311 − 183.186) / 3 ] / [ 183.186 / (353 − 5 − 1) ] ≈ 9.55

  Critical values: c0.05 = F3,347,0.05 = 2.60 and c0.01 = F3,347,0.01 = 3.78
  p-value = P(F > 9.55) = 0.000

  Thus, the null hypothesis is overwhelmingly rejected (even at very small significance levels).

• Discussion
¾ The performance measures were not significant when tested individually
¾ But the three variables are "jointly significant"
¾ The likely reason is multicollinearity among them
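The F statistic and its p-value can be verified directly (a sketch using scipy and the SSR values reported above):

```python
from scipy import stats

ssr_r, ssr_ur = 198.311, 183.186
q, df_ur = 3, 353 - 5 - 1

F = ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)
p = stats.f.sf(F, q, df_ur)          # P(F_{3,347} > observed F)

print(round(F, 2))                   # about 9.55
print(p < 1e-4)                      # True: overwhelming rejection
```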


• The R-squared form of the F statistic:

  F = [ (SSR_r − SSR_ur) / q ] / [ SSR_ur / (n − k − 1) ] = [ (R²_ur − R²_r) / q ] / [ (1 − R²_ur) / (n − k − 1) ]

¾ The proof follows easily from SSR_r = SST (1 − R²_r) and SSR_ur = SST (1 − R²_ur)
¾ Note that the R-squared form of the test is only valid for exclusion restrictions
¾ Since R²_r = 0.5971 and R²_ur = 0.6278,

  F = [ (0.6278 − 0.5971) / 3 ] / [ (1 − 0.6278) / 347 ] ≈ 9.54

  – The difference in the last decimal digit is due to rounding error

(2) Test of overall significance of a regression

• Unrestricted model: y_i = β0 + β1 x_i1 + β2 x_i2 + ... + βk x_ik + u_i
• Hypothesis: H0: β1 = ... = βk = 0 vs. H1: H0 is not true
¾ The null hypothesis states that the explanatory variables are not useful at all in explaining the dependent variable
• Restricted model (regression on a constant only): y_i = β0 + u_i
• Test statistic (here q = k restrictions):

  F = [ (SSR_r − SSR_ur) / k ] / [ SSR_ur / (n − k − 1) ] = [ R²_ur / k ] / [ (1 − R²_ur) / (n − k − 1) ] ~ F_{k, n−k−1}

¾ Note that R²_r = 0 in the restricted model, which is why the R-squared form simplifies
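The overall F statistic therefore needs only R²_ur, k, and n (a sketch with scipy, plugging in the wage equation reported earlier, which had n = 526, k = 3, R² = 0.316):

```python
from scipy import stats

def overall_F(r2, k, n):
    # F statistic for H0: beta_1 = ... = beta_k = 0 (R-squared form, R2_r = 0)
    F = (r2 / k) / ((1 - r2) / (n - k - 1))
    p = stats.f.sf(F, k, n - k - 1)
    return F, p

F, p = overall_F(0.316, 3, 526)
print(round(F, 1))    # about 80.4
print(p < 0.01)       # True: the regressors are jointly highly significant
```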

(3) Testing general linear restrictions with the F-test

(Example) Test whether house price assessments are rational
• Estimation model:

  log(price) = β0 + β1 log(assess) + β2 log(lotsize) + β3 log(sqrft) + β4 bdrms + u

  price: actual house price
  assess: the assessed housing value (before the house was sold)
  lotsize: size of the lot (in square feet)
  sqrft: house size (in square feet)
  bdrms: number of bedrooms
• Hypothesis: H0: β1 = 1, β2 = 0, β3 = 0, β4 = 0 vs. H1: H0 is not true
¾ If house price assessments are rational, a 1% change in the assessment should be associated with a 1% change in the price
¾ In addition, the other known factors should not influence the price once the assessed value has been controlled for

(Example) Test whether house price assessments are rational (cont'd)

• Unrestricted regression:

  y = β0 + β1 x1 + β2 x2 + β3 x3 + β4 x4 + u,  SSR_ur = 1.822

• Restricted regression:

  y = β0 + x1 + u  =>  [ y − x1 ] = β0 + u,  SSR_r = 1.880

¾ i.e., a regression of [ y − x1 ] on a constant only
• Test statistic:

  F = [ (SSR_r − SSR_ur) / q ] / [ SSR_ur / (n − k − 1) ] = [ (1.880 − 1.822) / 4 ] / [ 1.822 / (88 − 4 − 1) ] ≈ 0.661

  Critical value: c0.05 = F4,83,0.05 = 2.50
¾ H0 cannot be rejected, since F = 0.661 < c0.05 = 2.50
• Note that the R-squared form of the test statistic CANNOT be applied here, because the restricted and unrestricted regressions have different dependent variables.
