## Multiple Regression Analysis

Author: Alexis Merritt
Multiple Regression Analysis: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$

4. Further Issues


### Redefining Variables

- Suppose we have a model with a variable like income, measured in dollars, on the left-hand side. Now we redefine income to be measured in tens of thousands of dollars. What effect will this have on estimation and inference?
- It will not affect the $R^2$.
- Will such scaling have any effect on t-stats, F-stats and confidence intervals?
  - No, these will also have the same interpretation.
  - Changing the scale of the y variable just leads to a corresponding change in the scale of the coefficients.

### Redefining Variables (cont)

- Suppose we originally obtain $\widehat{hprice} = \hat\beta_0 + \hat\beta_1\, sqrft + \hat\beta_2\, bdrms$, where $\hat\beta_0 = -19300$, $\hat\beta_1 = 128$, $\hat\beta_2 = 15200$, $se(\hat\beta_0) = 31000$, $se(\hat\beta_1) = 14$, $se(\hat\beta_2) = 9480$, and $R^2 = 0.63$.
- In this specification, house price is measured in dollars.
- What happens if we re-estimate this with house price measured in thousands of dollars?

### Redefining Variables (cont)

- If we measure price in thousands of dollars, each new coefficient will be the old coefficient divided by 1000 (same estimated effect!).
- The standard errors will be 1000 times smaller.
- t-stats etc. will be identical.
- The re-estimated equation is $\widehat{hprice} = \hat\beta_0 + \hat\beta_1\, sqrft + \hat\beta_2\, bdrms$, where $\hat\beta_0 = -19.3$, $\hat\beta_1 = 0.128$, $\hat\beta_2 = 15.2$, $se(\hat\beta_0) = 31$, $se(\hat\beta_1) = 0.014$, $se(\hat\beta_2) = 9.48$, and $R^2 = 0.63$.
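The rescaling result can be checked numerically. The sketch below is illustrative only: it simulates house-price data (the sample size, noise level and random seed are assumptions, and the slide coefficients are borrowed only to generate plausible numbers), fits OLS with a small hypothetical helper `ols`, and compares price in dollars against price in thousands of dollars.

```python
import numpy as np

def ols(X, y):
    """OLS for y = X b + u, where X already contains a constant column.

    Returns coefficients, standard errors, t-statistics and R-squared.
    """
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    ssr = resid @ resid
    sst = ((y - y.mean()) ** 2).sum()
    sigma2 = ssr / (n - p)                 # unbiased error-variance estimate
    se = np.sqrt(sigma2 * np.diag(xtx_inv))
    return beta, se, beta / se, 1 - ssr / sst

# Simulated data (assumption: not the data behind the slides).
rng = np.random.default_rng(0)
n = 200
sqrft = rng.uniform(1000, 4000, n)
bdrms = rng.integers(1, 6, n).astype(float)
hprice = -19300 + 128 * sqrft + 15200 * bdrms + rng.normal(0, 60000, n)
X = np.column_stack([np.ones(n), sqrft, bdrms])

b_d, se_d, t_d, r2_d = ols(X, hprice)          # price in dollars
b_k, se_k, t_k, r2_k = ols(X, hprice / 1000)   # price in thousands of dollars

print("coefficient ratio:", b_d / b_k)
print("std. error ratio: ", se_d / se_k)
print("t-stats equal:    ", np.allclose(t_d, t_k))
print("R-squared equal:  ", np.isclose(r2_d, r2_k))
```

In this setup the coefficients and standard errors differ by a factor of 1000 between the two fits, while the t-statistics and $R^2$ agree.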

### Redefining Variables (cont)

- Changing the scale of one of the x variables: what if we redefine square feet as thousands of square feet?
- Now all the $\hat\beta$'s have the same interpretation as before, with the exception of $\hat\beta_1$: $\widehat{hprice} = \hat\beta_0 + \hat\beta_1 \frac{sqrft}{1000} + \hat\beta_2 \dots$
- It will be 1000 times larger.
  - Why? Because now a 1 unit change in square feet is the same as what previously was a 1000 unit change in square feet.
  - The standard error will also be 1000 times larger, and t-stats etc. will have the same interpretation.
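A short worked version of this argument, using the estimates from the dollar specification ($\hat\beta_1 = 128$, $se(\hat\beta_1) = 14$). The tilde notation for the rescaled estimate is introduced here for illustration only:

$$
\hat\beta_1 \cdot sqrft = (1000\,\hat\beta_1) \cdot \frac{sqrft}{1000}
\quad\Rightarrow\quad
\tilde\beta_1 = 1000 \times 128 = 128000, \qquad se(\tilde\beta_1) = 1000 \times 14 = 14000
$$

$$
t = \frac{128000}{14000} = \frac{128}{14} \approx 9.14
$$

The fitted values are unchanged, so the t-statistic is the same in both parameterizations.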

### Functional Form

- OLS can be used for relationships that are not strictly linear in x and y by using nonlinear functions of x and y. The model will still be linear in the parameters.
- Example: $\log(wage) = \beta_0 + \beta_1\, educ + \beta_2\, exper + \beta_3\, exper^2 + u$
- This specification combines a log dependent variable with a quadratic term. Both are examples of nonlinearities that can be introduced into the standard linear regression model.

### Interpretation of Log Models

1. If the model is $\ln(y) = \beta_0 + \beta_1 \ln(x) + u$, then $\beta_1$ is an elasticity. E.g., an estimate of 1.2 would suggest that a 1 percent increase in x causes y to increase by about 1.2 percent.
2. If the model is $\ln(y) = \beta_0 + \beta_1 x + u$, then $100\beta_1$ is the approximate percent change in y resulting from a unit change in x. E.g., an estimate of 0.05 would suggest that a 1 unit increase in x causes roughly a 5% increase in y.
3. If the model is $y = \beta_0 + \beta_1 \ln(x) + u$, then $\beta_1/100$ is the approximate unit change in y resulting from a 1 percent change in x. E.g., an estimate of 20 would suggest that a 1 percent increase in x causes a 0.2 unit increase in y.
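These interpretations are first-order approximations that work well for small changes. As a minimal sketch (the function names are hypothetical, chosen for this example), the exact changes implied by each functional form can be computed and set beside the rule-of-thumb values:

```python
import math

def pct_change_log_log(beta1, pct_dx=1.0):
    """Exact % change in y when x rises by pct_dx percent in ln(y) = b0 + b1*ln(x)."""
    return 100 * ((1 + pct_dx / 100) ** beta1 - 1)

def pct_change_log_level(beta1, dx=1.0):
    """Exact % change in y when x rises by dx units in ln(y) = b0 + b1*x."""
    return 100 * (math.exp(beta1 * dx) - 1)

def unit_change_level_log(beta1, pct_dx=1.0):
    """Exact unit change in y when x rises by pct_dx percent in y = b0 + b1*ln(x)."""
    return beta1 * math.log(1 + pct_dx / 100)

# Rule-of-thumb values from the three cases: 1.2%, 5%, and 0.2 units.
print(pct_change_log_log(1.2))      # close to 1.2
print(pct_change_log_level(0.05))   # a little above 5
print(unit_change_level_log(20))    # close to 0.2

# The approximation deteriorates for large coefficients:
print(pct_change_log_level(0.5))    # well above the rule-of-thumb 50
```

For the slide examples the exact and approximate figures nearly coincide; the gap only becomes material when the coefficient (or the change in x) is large.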

### Why use log models?

- Log-log models are invariant to the scale of the variables, since we're measuring percent changes.
- They can give a direct estimate of elasticity.
- For models with y > 0, the conditional distribution of y is often heteroskedastic or skewed, while that of ln(y) is much less so.
- The distribution of ln(y) is narrower, limiting the effect of outliers.

### Some Rules of Thumb

What types of variables are often used in log form?

- Variables in positive dollar amounts
- Variables measuring numbers of people: school enrollments, population, number of employees
- Variables subject to extreme outliers

What types of variables are often used in level form?

- Anything that takes on a negative or zero value
- Variables measured in years

### Quadratic Models

- Quadratic terms capture increasing or decreasing marginal effects.
- For a model of the form $y = \beta_0 + \beta_1 x + \beta_2 x^2 + u$, we can't interpret $\beta_1$ alone as measuring the change in y with respect to x: $\frac{dy}{dx} = \beta_1 + 2\beta_2 x$
- Now the effect of an extra unit of x on y depends in part on the value of x.
- Suppose $\beta_1$ is positive. Then if $\beta_2$ is positive, an extra unit of x has a larger impact on y when x is big than when x is small.
- If $\beta_2$ is negative, an extra unit of x has a smaller impact on y (or a more negative impact on y) when x is big than when x is small.

### More on Quadratic Models

- Suppose that the coefficient on x is positive and the coefficient on $x^2$ is negative.
- Then y is increasing in x at first, but will eventually turn around and be decreasing in x.
- We may want to know the turning point. Example: $\widehat{wage} = 3.73 + 0.298\, exper - 0.0061\, exper^2$
- The turning point will be at $x^* = 0.298 / (2 \times 0.0061) \approx 24.4$, where $dw/dx = 0$.

### More on Quadratic Models

- Suppose that the coefficient on x is negative and the coefficient on $x^2$ is positive.
- Then y is decreasing in x at first, but will eventually turn around and be increasing in x.
- Example: $\widehat{\log(price)} = 13.39 - 0.902 \log(nox) - 0.087 \log(dist) - 0.545\, rooms + 0.062\, rooms^2 - 0.048\, stratio$
- The turning point will be at $r^* \approx 4.4$, where $\partial y / \partial r = 0$.
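Both turning points follow from setting the derivative $\beta_1 + 2\beta_2 x$ to zero, which gives $x^* = -\beta_1 / (2\beta_2)$. A minimal sketch using the coefficients from the wage and housing examples (the function names are hypothetical):

```python
def marginal_effect(b1, b2, x):
    """dy/dx for y = b0 + b1*x + b2*x**2, evaluated at x."""
    return b1 + 2 * b2 * x

def turning_point(b1, b2):
    """Value of x at which the marginal effect b1 + 2*b2*x is zero."""
    return -b1 / (2 * b2)

# Wage example: positive linear term, negative quadratic term.
exper_star = turning_point(0.298, -0.0061)
# Housing example: negative linear term, positive quadratic term.
rooms_star = turning_point(-0.545, 0.062)

print(round(exper_star, 1), round(rooms_star, 1))

# The sign of the marginal effect flips on either side of the turning point.
print(marginal_effect(0.298, -0.0061, 10))   # positive: below the turning point
print(marginal_effect(0.298, -0.0061, 30))   # negative: above the turning point
```

It is worth checking how many observations actually lie beyond the turning point before reading much into the sign change.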

### Interaction Terms

- We might think that the marginal effect of one RHS variable depends on another RHS variable.
- Example: suppose the model can be written $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + u$, where y is house price, $x_1$ is the number of square feet and $x_2$ is the number of bedrooms.
- So the effect of an extra bedroom on price is $\frac{\partial y}{\partial x_2} = \beta_2 + \beta_3 x_1$

### Interaction Terms

- If $\beta_3 > 0$, this tells us that an extra bedroom boosts the price of a house more if the square footage of the house is higher.
  - This shouldn't be surprising. After all, an extra bedroom in a small house is likely to be small compared with an extra bedroom in a large house. So we would expect an extra bedroom in a big house to be worth more.
- Note that this makes interpretation of $\beta_2$ a bit less straightforward.
  - Technically, $\beta_2$ tells us how much an extra bedroom is worth in a house with zero square feet.
  - It may be useful to report the value of $\beta_2 + \beta_3 x_1$ at the mean value of $x_1$.
  - Or redefine $x_1$ to be deviations of square footage from the mean (so that negative values imply smaller than average houses; positive values imply larger than average houses).
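The two suggestions (evaluating $\beta_2 + \beta_3 x_1$ at the mean, or centering $x_1$ before estimation) give the same number. The sketch below illustrates this on simulated data; the data-generating values, sample size and seed are assumptions made for the example, and `fit` is a hypothetical helper.

```python
import numpy as np

# Simulated house data (assumption: illustrative values only).
rng = np.random.default_rng(1)
n = 500
sqrft = rng.uniform(1000, 4000, n)
bdrms = rng.integers(1, 6, n).astype(float)
price = (20000 + 100 * sqrft + 5000 * bdrms
         + 4 * sqrft * bdrms + rng.normal(0, 50000, n))

def fit(x1, x2, y):
    """OLS of y on a constant, x1, x2 and the interaction x1*x2."""
    X = np.column_stack([np.ones(len(y)), x1, x2, x1 * x2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

b = fit(sqrft, bdrms, price)                    # raw square footage
b_c = fit(sqrft - sqrft.mean(), bdrms, price)   # centered square footage

# Bedroom effect at mean square footage, computed from the raw fit.
effect_at_mean = b[2] + b[3] * sqrft.mean()

print("raw beta2:            ", b[2])
print("beta2 + beta3 * mean: ", effect_at_mean)
print("centered beta2:       ", b_c[2])
```

After centering, the coefficient on bedrooms is directly the estimated value of an extra bedroom in a house of average size, and the interaction coefficient is unchanged.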

### More on Goodness-of-Fit: Adjusted R-Squared

- Recall that $R^2$ will always increase (or at least stay the same) as we add more variables to the model.
- The "adjusted $R^2$" takes into account the number of variables in a model, and may decrease when variables are added.
- The usual $R^2$ can be written $R^2 \equiv 1 - \frac{SSR/n}{SST/n}$, where $SSR/n$ is a biased estimate of $\sigma_u^2$ and $SST/n$ is a biased estimate of $\sigma_y^2$.

### Adjusted R-Squared (cont)

- We can define the "population R-squared" as $\rho^2 = 1 - \frac{\sigma_u^2}{\sigma_y^2}$
- We can use $SSR/(n-k-1)$ as an unbiased estimate of $\sigma_u^2$.
- Similarly, we can use $SST/(n-1)$ as an unbiased estimate of $\sigma_y^2$.
- Therefore, the adjusted $R^2$, or "R-bar squared", is $\bar{R}^2 = 1 - \frac{SSR/(n-k-1)}{SST/(n-1)} = 1 - \frac{\hat\sigma^2}{SST/(n-1)}$

### Adjusted R-Squared (cont)

- Notice that $\bar{R}^2$ can go up or down when a variable is added, unlike the regular R-squared, which never goes down.
- $\bar{R}^2$ is not necessarily "better": the ratio of two unbiased estimators isn't necessarily unbiased.
- Better to treat it as an alternative way of summarizing goodness of fit.
  - If you add a variable to the RHS and $\bar{R}^2$ doesn't rise, this is likely (though not surely) an indication it shouldn't be included in the model.
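The two formulas can be compared directly. In the sketch below the SSR, SST and sample-size values are hypothetical, picked so that an added regressor lowers SSR only slightly:

```python
def r_squared(ssr, sst):
    """Usual R-squared: 1 - SSR/SST."""
    return 1 - ssr / sst

def adjusted_r_squared(ssr, sst, n, k):
    """Adjusted R-squared: 1 - [SSR/(n-k-1)] / [SST/(n-1)].

    n is the sample size and k the number of slope coefficients
    (the intercept accounts for the extra 1 in n-k-1).
    """
    return 1 - (ssr / (n - k - 1)) / (sst / (n - 1))

# Hypothetical numbers: adding a third regressor reduces SSR from 400 to 398.
n, sst = 50, 1000.0
r2_small, r2_big = r_squared(400.0, sst), r_squared(398.0, sst)
adj_small = adjusted_r_squared(400.0, sst, n, k=2)
adj_big = adjusted_r_squared(398.0, sst, n, k=3)

print("R-squared:         ", r2_small, "->", r2_big)
print("adjusted R-squared:", adj_small, "->", adj_big)
```

Here the usual R-squared rises with the extra regressor while the adjusted version falls, because the small drop in SSR does not offset the lost degree of freedom.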

### Comparing Nested Models

- Suppose you wanted to compare the following two models:
  1. $y = \beta_0 + \beta_1 x + u$
  2. $y = \beta_0 + \beta_1 x + \beta_2 x^2 + u$
- We say that (1) is nested in (2); alternatively, (1) is a special case of (2).
- With a t-test on $\beta_2$ we can choose between these two models (if we reject the null of $\beta_2 = 0$, we pick model 2).
- For multiple exclusion restrictions, we can use an F-test.
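For reference, the F statistic for exclusion restrictions can be computed from the restricted and unrestricted sums of squared residuals. The formula below is the standard textbook one and is not spelled out on the slides; the function name and the input numbers are hypothetical.

```python
def f_statistic(ssr_r, ssr_ur, q, n, k):
    """F = [(SSR_r - SSR_ur) / q] / [SSR_ur / (n - k - 1)].

    ssr_r  : SSR of the restricted (smaller) model
    ssr_ur : SSR of the unrestricted (larger) model
    q      : number of exclusion restrictions being tested
    n      : sample size
    k      : number of slope coefficients in the unrestricted model
    """
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))

# Hypothetical example: testing beta2 = 0 in model (2), so q = 1 and k = 2.
F = f_statistic(ssr_r=400.0, ssr_ur=380.0, q=1, n=50, k=2)
print(F)
```

The result is compared with a critical value from the F distribution with (q, n-k-1) degrees of freedom. With a single restriction, the F statistic equals the square of the corresponding t statistic, so the two tests agree.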

### Comparing Non-Nested Models

- Suppose you wanted to compare the following two models:
  1. $y = \beta_0 + \beta_1 \log(x) + u$
  2. $y = \beta_0 + \beta_1 x + \beta_2 x^2 + v$
- One is not nested in the other, so a t-test or F-test cannot be used to compare them.
- Here R-bar-squared can be useful. We can simply choose the model with the higher R-bar-squared.
  - Note that a simple comparison of regular R-squared would tend to lead us to choose the model with more explanatory variables.
  - Note that if the LHS variable takes a different form between (1) and (2), we cannot compare using R-bar-squared (or R-squared).

### iClickers

Imagine you want to compare the following two models:

1. $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + u$
2. $y = \beta_0 + \beta_1 x + v$

Suppose you want to test whether the model should be linear or cubic (i.e., whether model 2 is more appropriate than model 1).

Question: What would be the most helpful to look at, in order to make this judgment?

1) Two t-tests (one on $\beta_2 = 0$ and one on $\beta_3 = 0$)
2) An F-test (on $\beta_2 = 0$ AND $\beta_3 = 0$, jointly)
3) The unadjusted R-squared for both models
4) The adjusted R-squared for both models

### Goodness of Fit

- It is important not to fixate too much on the adjusted $R^2$ and lose sight of theory and common sense.
- If economic theory clearly predicts a variable belongs, generally leave it in.
- We don't want to include a variable that prohibits a sensible interpretation of the variable of interest.
- Remember the ceteris paribus interpretation of multiple regression.

### Residual Analysis

- Sometimes looking at the residuals (i.e. observed y minus predicted y) provides useful information.
- Example: regress the price of cars on characteristics.
  - Engine size, efficiency, luxury amenities, roominess, fuel efficiency, etc.
- Then the residual = actual price - predicted price.
  - By picking the car with the lowest (most negative) residual, you would be choosing the most underpriced car (assuming you're controlling for all relevant characteristics).
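A minimal sketch of this idea on simulated data. The two characteristics, their coefficients, the sample size and the seed are all assumptions made for illustration; a real application would use many more controls.

```python
import numpy as np

# Simulated car data (assumption: illustrative values only).
rng = np.random.default_rng(2)
n = 30
engine = rng.uniform(1.0, 4.0, n)    # engine size in litres
mpg = rng.uniform(20, 45, n)         # fuel efficiency
price = 5000 + 6000 * engine + 150 * mpg + rng.normal(0, 2000, n)

# Regress price on a constant and the characteristics.
X = np.column_stack([np.ones(n), engine, mpg])
beta, *_ = np.linalg.lstsq(X, price, rcond=None)

predicted = X @ beta
residual = price - predicted         # actual price minus predicted price

# The most negative residual marks the car priced furthest below
# what its characteristics predict.
most_underpriced = int(np.argmin(residual))
print("index:", most_underpriced, "residual:", residual[most_underpriced])
```

A large negative residual may also reflect an omitted characteristic rather than a bargain, which is why the caveat about controlling for all relevant characteristics matters.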
