Multiple Regression Analysis: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$
4. Further Issues
Redefining Variables
Suppose we have a model with a variable like income, measured in dollars, on the left-hand side. Now we redefine income to be measured in tens of thousands of dollars. What effect will this have on estimation and inference?
It will not affect the R-squared.
Will such scaling have any effect on t-stats, F-stats and confidence intervals? No: the t-stats and F-stats are unchanged, and the confidence intervals rescale with the coefficients, so they keep the same interpretation.
Changing the scale of the y variable just leads to a corresponding change in the scale of the coefficients and their standard errors.
Redefining Variables (cont)
Suppose we originally obtain
$\widehat{hprice} = \hat\beta_0 + \hat\beta_1\,sqrft + \hat\beta_2\,bdrms$, where
$\hat\beta_0 = -19300$, $\hat\beta_1 = 128$, $\hat\beta_2 = 15200$,
$se(\hat\beta_0) = 31000$, $se(\hat\beta_1) = 14$, $se(\hat\beta_2) = 9480$, and $R^2 = 0.63$.
In this specification, house price is measured in dollars. What happens if we re-estimate this with house price measured in thousands of dollars?
Redefining Variables (cont)
If we measure price in thousands of dollars:
Each new coefficient will be the old coefficient divided by 1000 (same estimated effect!).
The standard errors will be 1000 times smaller.
t-stats etc. will be identical.
$\widehat{hprice} = \hat\beta_0 + \hat\beta_1\,sqrft + \hat\beta_2\,bdrms$, where
$\hat\beta_0 = -19.3$, $\hat\beta_1 = 0.128$, $\hat\beta_2 = 15.2$,
$se(\hat\beta_0) = 31$, $se(\hat\beta_1) = 0.014$, $se(\hat\beta_2) = 9.48$, and $R^2 = 0.63$.
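The rescaling claims can be checked numerically. The following is a minimal sketch on simulated data: the sample size, seed, noise level and the helper name `ols` are assumptions made for illustration, not the data behind the slide's estimates.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients, standard errors and R-squared; X must include a constant column."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ssr = resid @ resid
    # Usual OLS variance estimate: sigma^2-hat times the diagonal of (X'X)^{-1}
    se = np.sqrt(ssr / (n - p) * np.diag(np.linalg.inv(X.T @ X)))
    r2 = 1.0 - ssr / np.sum((y - y.mean()) ** 2)
    return beta, se, r2

rng = np.random.default_rng(0)
n = 88  # hypothetical sample size
sqrft = rng.uniform(1000, 4000, n)
bdrms = rng.integers(2, 6, n).astype(float)
# Simulated house prices in dollars (coefficients borrowed from the slide for flavor)
hprice = -19300 + 128 * sqrft + 15200 * bdrms + rng.normal(0, 60000, n)

X = np.column_stack([np.ones(n), sqrft, bdrms])
b_usd, se_usd, r2_usd = ols(X, hprice)        # price in dollars
b_k, se_k, r2_k = ols(X, hprice / 1000.0)     # price in thousands of dollars

print("coefficient ratio:", b_usd / b_k)      # each entry close to 1000
print("std. error ratio: ", se_usd / se_k)    # each entry close to 1000
print("t-stats (dollars):  ", b_usd / se_usd)
print("t-stats (thousands):", b_k / se_k)
print("R-squared:", r2_usd, r2_k)
```

Dividing y by 1000 divides every coefficient and standard error by 1000, so their ratio (the t-stat) and the R-squared are untouched.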
Redefining Variables (cont)
Changing the scale of one of the x variables: what if we redefine square feet as thousands of square feet?
$\widehat{hprice} = \hat\beta_0 + \hat\beta_1\,(sqrft/1000) + \hat\beta_2\,bdrms$
Now all the $\hat\beta$’s have the same interpretation as before, with the exception of $\hat\beta_1$. It will be 1000 times larger.
Why? Because now a 1 unit change in the square-footage variable is the same as what previously was a 1000 unit change in square feet.
The standard error will also be 1000 times larger, and t-stats etc. will have the same interpretation.
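As a worked check, here is the arithmetic for the square-footage coefficient, using the original estimates with price in dollars (128 with standard error 14). The starred symbols are notation introduced here for the rescaled regression.

```latex
\begin{align*}
sqrft^{*} &= sqrft/1000 \\
\hat\beta_1^{*} &= 1000\,\hat\beta_1 = 1000 \times 128 = 128000 \\
se(\hat\beta_1^{*}) &= 1000\,se(\hat\beta_1) = 1000 \times 14 = 14000 \\
t^{*} &= \frac{128000}{14000} = \frac{128}{14} \approx 9.14
\end{align*}
```

The factor of 1000 cancels in the ratio, which is why the t-stat is the same in both parameterizations.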
Functional Form
OLS can be used for relationships that are not strictly linear in x and y by using nonlinear functions of x and y; the model will still be linear in the parameters.
Example: $\log(wage) = \beta_0 + \beta_1\,educ + \beta_2\,exper + \beta_3\,exper^2$
This specification combines a log dependent variable with a quadratic term. Both are examples of nonlinearities that can be introduced into the standard linear regression model.
Interpretation of Log Models
1. If the model is $\ln(y) = \beta_0 + \beta_1 \ln(x) + u$, then $\beta_1$ is an elasticity. E.g., if we obtained an estimate of 1.2, this would suggest that a 1 percent increase in x causes y to increase by 1.2 percent.
2. If the model is $\ln(y) = \beta_0 + \beta_1 x + u$, then $100\beta_1$ is approximately the percent change in y resulting from a unit change in x. E.g., if we obtained an estimate of 0.05, this would suggest that a 1 unit increase in x causes about a 5% increase in y.
3. If the model is $y = \beta_0 + \beta_1 \ln(x) + u$, then $\beta_1/100$ is approximately the unit change in y resulting from a 1 percent change in x. E.g., if we obtained an estimate of 20, this would suggest that a 1 percent increase in x causes about a 0.2 unit increase in y.
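The three readings above are first-order approximations. The sketch below compares them with the exact changes implied by each functional form, holding u fixed; the function names are hypothetical and introduced only for this illustration.

```python
import math

def pct_change_log_level(beta, dx=1.0):
    """Exact % change in y for a dx-unit change in x when ln(y) = b0 + beta*x + u."""
    return 100.0 * (math.exp(beta * dx) - 1.0)

def pct_change_log_log(beta, pct_dx=1.0):
    """Exact % change in y for a pct_dx% change in x when ln(y) = b0 + beta*ln(x) + u."""
    return 100.0 * ((1.0 + pct_dx / 100.0) ** beta - 1.0)

def unit_change_level_log(beta, pct_dx=1.0):
    """Exact unit change in y for a pct_dx% change in x when y = b0 + beta*ln(x) + u."""
    return beta * math.log(1.0 + pct_dx / 100.0)

# The slide's three examples: approximations are 1.2%, 5% and 0.2 units
print(pct_change_log_log(1.2))       # slightly above 1.2
print(pct_change_log_level(0.05))    # about 5.13 rather than 5
print(unit_change_level_log(20.0))   # slightly below 0.2
```

For small coefficients and small changes the approximation is close; it deteriorates as the implied change gets large (for example, a log-level coefficient of 0.5 implies roughly a 65% change, not 50%).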
Why use log models?
Log-log models are invariant to the scale of the variables, since we’re measuring percent changes.
They can give a direct estimate of elasticity.
For models with y > 0, the conditional distribution of y is often heteroskedastic or skewed, while that of ln(y) is much less so.
The distribution of ln(y) is narrower, limiting the effect of outliers.
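The scale-invariance point can be illustrated with a quick simulation. This is a sketch on made-up data (the seed, sample size and true elasticity of 1.2 are assumptions): rescaling x and y changes only the intercept of the log-log fit.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(1.0, 100.0, 300)
# Simulated data with a true elasticity of 1.2
y = np.exp(0.5 + 1.2 * np.log(x) + rng.normal(0.0, 0.2, 300))

# np.polyfit with degree 1 returns [slope, intercept]
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)

# Change units: x in thousands, y in hundredths
slope_s, intercept_s = np.polyfit(np.log(x / 1000.0), np.log(y * 100.0), 1)

print("slopes:    ", slope, slope_s)          # the same elasticity estimate
print("intercepts:", intercept, intercept_s)  # only the intercept moves
```

Taking logs turns a multiplicative change of units into an additive constant, which the intercept absorbs.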
Some Rules of Thumb
What types of variables are often used in log form?
*Variables in positive dollar amounts
*Variables measuring numbers of people
 -school enrollments, population, # employees
*Variables subject to extreme outliers
What types of variables are often used in level form?
*Anything that takes on a negative or zero value
*Variables measured in years
Quadratic Models
Quadratic terms capture increasing or decreasing marginal effects.
For a model of the form $y = \beta_0 + \beta_1 x + \beta_2 x^2 + u$, we can’t interpret $\beta_1$ alone as measuring the change in y with respect to x:
$dy/dx = \beta_1 + 2\beta_2 x$
Now the effect of an extra unit of x on y depends in part on the value of x. Suppose $\beta_1$ is positive. Then if $\beta_2$ is positive, an extra unit of x has a larger impact on y when x is big than when x is small. If $\beta_2$ is negative, an extra unit of x has a smaller impact on y (or a more negative impact on y) when x is big than when x is small.
More on Quadratic Models
Suppose that the coefficient on x is positive and the coefficient on $x^2$ is negative.
Then y is increasing in x at first, but will eventually turn around and be decreasing in x.
We may want to know the turning point.
$\widehat{wage} = 3.73 + 0.298\,exper - 0.0061\,exper^2$
The turning point will be at $exper^* \approx 24.4$, where $d\widehat{wage}/d\,exper = 0$.
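The turning point follows from setting the marginal effect $\beta_1 + 2\beta_2 x$ to zero, which gives $x^* = -\beta_1/(2\beta_2)$. A short sketch using the wage estimates above (the function name is hypothetical):

```python
# Estimated coefficients on exper and exper^2 from the wage equation above
b1, b2 = 0.298, -0.0061

def marginal_effect(exper):
    """Estimated effect of one more year of experience on wage, at a given exper."""
    return b1 + 2.0 * b2 * exper

# Setting b1 + 2*b2*exper = 0 and solving for exper
turning_point = -b1 / (2.0 * b2)

for years in (1, 10, 20, 30):
    print(years, marginal_effect(years))   # positive early on, negative after the turning point
print("turning point:", turning_point)     # about 24.4 years
```

Before roughly 24.4 years of experience an extra year raises the predicted wage; after it, an extra year lowers it.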
More on Quadratic Models
Suppose that the coefficient on x is negative and the coefficient on $x^2$ is positive.
Then y is decreasing in x at first, but will eventually turn around and be increasing in x.
$\widehat{\log(price)} = 13.39 - 0.902\log(nox) - 0.087\log(dist) - 0.545\,rooms + 0.062\,rooms^2 - 0.048\,stratio$
The turning point will be at $rooms^* \approx 4.4$, where $\partial\widehat{\log(price)}/\partial rooms = 0$.
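The turning point in rooms comes from the same first-order condition, worked out here with the estimates above:

```latex
\begin{align*}
\frac{\partial \widehat{\log(price)}}{\partial rooms} &= -0.545 + 2(0.062)\,rooms = 0 \\
rooms^{*} &= \frac{0.545}{2 \times 0.062} = \frac{0.545}{0.124} \approx 4.4
\end{align*}
```

Below about 4.4 rooms the estimated effect of an extra room is negative; above it, the effect is positive and growing.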
Interaction Terms
We might think that the marginal effect of one RHS variable depends on another RHS variable.
Example: suppose the model can be written
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + u$
where y is house price, $x_1$ is the number of square feet and $x_2$ is the number of bedrooms.
So the effect of an extra bedroom on price is
$\partial y/\partial x_2 = \beta_2 + \beta_3 x_1$
Interaction Terms
If $\beta_3 > 0$, this tells us that an extra bedroom boosts the price of a house more if the square footage of the house is higher.
This shouldn’t be surprising. After all, an extra bedroom in a small house is likely to be small compared with an extra bedroom in a large house. So we would expect an extra bedroom in a big house to be worth more.
Note that this makes interpretation of $\beta_2$ a bit less straightforward.
Technically, $\beta_2$ tells us how much an extra bedroom is worth in a house with zero square feet. It may be useful to report the value of $\beta_2 + \beta_3 x_1$ at the mean value of $x_1$. Or redefine $x_1$ to be deviations of square footage from the mean (so that negative values imply smaller than average houses; positive values imply larger than average houses).
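The two suggestions above give the same number: after centering $x_1$ at its mean, the coefficient on $x_2$ equals $\beta_2 + \beta_3 \bar{x}_1$ from the uncentered fit. A minimal sketch on simulated data (the coefficients, seed and the helper name `fit` are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
sqrft = rng.uniform(1000, 4000, n)
bdrms = rng.integers(1, 6, n).astype(float)
# Simulated prices with a positive interaction between size and bedrooms
price = 20000 + 100 * sqrft + 5000 * bdrms + 4 * sqrft * bdrms + rng.normal(0, 30000, n)

def fit(x1, x2, y):
    """OLS of y on a constant, x1, x2 and the interaction x1*x2."""
    X = np.column_stack([np.ones(len(y)), x1, x2, x1 * x2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

b = fit(sqrft, bdrms, price)                   # raw square footage
bc = fit(sqrft - sqrft.mean(), bdrms, price)   # square footage as deviation from its mean

effect_at_mean = b[2] + b[3] * sqrft.mean()
print("beta2 (raw):                  ", b[2])   # bedroom effect at zero square feet
print("beta2 + beta3*mean(sqrft):    ", effect_at_mean)
print("beta2 (centered):             ", bc[2])  # bedroom effect at average square footage
```

Centering only reparameterizes the model, so the fit and the interaction coefficient are unchanged; what changes is that the bedrooms coefficient now refers to a house of average size.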
More on Goodness-of-Fit: Adjusted R-Squared
Recall that the R-squared will always increase (or at least stay the same) as we add more variables to the model.
The “adjusted R-squared” takes into account the number of variables in a model, and may decrease when variables are added.
The usual R-squared can be written:
$R^2 \equiv 1 - [SSR/n]/[SST/n]$, where $SSR/n$ is a biased estimate of $\sigma_u^2$ and $SST/n$ is a biased estimate of $\sigma_y^2$.
Adjusted R-Squared (cont)
We can define the “population R-squared” as
$\rho^2 = 1 - \sigma_u^2/\sigma_y^2$
We can use $SSR/(n-k-1)$ as an unbiased estimate of $\sigma_u^2$.
Similarly, we can use $SST/(n-1)$ as an unbiased estimate of $\sigma_y^2$.
Therefore, the adjusted R-squared, or “R-bar squared”, is:
$\bar{R}^2 = 1 - [SSR/(n-k-1)]/[SST/(n-1)] = 1 - \hat\sigma^2/[SST/(n-1)]$
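The two formulas translate directly into code. In this sketch the SSR and SST values are chosen so that the R-squared is 0.63, as in the house price example; the sample size n = 88 is an assumption, since the slides do not report it.

```python
def r_squared(ssr, sst):
    """Usual R-squared: 1 - SSR/SST."""
    return 1.0 - ssr / sst

def adjusted_r_squared(ssr, sst, n, k):
    """R-bar-squared: 1 - [SSR/(n-k-1)] / [SST/(n-1)], with k slope coefficients."""
    return 1.0 - (ssr / (n - k - 1)) / (sst / (n - 1))

# Illustrative inputs: SSR/SST = 0.37, k = 2 regressors, hypothetical n = 88
ssr, sst, n, k = 37.0, 100.0, 88, 2
r2 = r_squared(ssr, sst)
adj = adjusted_r_squared(ssr, sst, n, k)
print(r2, adj)   # the adjusted value is slightly below the usual R-squared
```

An equivalent form is $\bar{R}^2 = 1 - (1 - R^2)(n-1)/(n-k-1)$, which shows the penalty grows with k and shrinks with n.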
Adjusted R-Squared (cont)
Notice that $\bar{R}^2$ can go up or down when a variable is added, unlike the regular R-squared, which never goes down.
$\bar{R}^2$ is not necessarily “better”: the ratio of two unbiased estimators isn’t necessarily unbiased.
Better to treat it as an alternative way of summarizing goodness of fit.
If you add a variable to the RHS and the $\bar{R}^2$ doesn’t rise, this is likely (though not surely) an indication it shouldn’t be included in the model.
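A small simulation makes the contrast concrete. This sketch adds a pure-noise regressor to a simple model; the data, seed and helper name `fit` are assumptions for illustration. It also reports the added variable's t-stat, using the standard algebraic result (not stated on the slide) that the adjusted R-squared rises exactly when that t-stat exceeds 1 in absolute value.

```python
import numpy as np

def fit(X, y):
    """Return (R-squared, adjusted R-squared, t-stats); X must include a constant column."""
    n, p = X.shape                      # p = k + 1
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    ssr = float(resid @ resid)
    sst = float(np.sum((y - y.mean()) ** 2))
    se = np.sqrt(ssr / (n - p) * np.diag(xtx_inv))
    r2 = 1.0 - ssr / sst
    adj = 1.0 - (ssr / (n - p)) / (sst / (n - 1))
    return r2, adj, beta / se

rng = np.random.default_rng(7)
n = 100
x = rng.normal(size=n)
junk = rng.normal(size=n)               # unrelated to y by construction
y = 1.0 + 2.0 * x + rng.normal(size=n)

ones = np.ones(n)
r2_small, adj_small, _ = fit(np.column_stack([ones, x]), y)
r2_big, adj_big, t_big = fit(np.column_stack([ones, x, junk]), y)
t_junk = t_big[2]

print("R-squared:    ", r2_small, "->", r2_big)    # never falls
print("adj R-squared:", adj_small, "->", adj_big)  # can fall
print("t-stat on added variable:", t_junk)
```

The regular R-squared cannot fall because the larger model can always reproduce the smaller one by setting the new coefficient to zero.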
Comparing Nested Models
Suppose you wanted to compare the following two models:
1. $y = \beta_0 + \beta_1 x + u$
2. $y = \beta_0 + \beta_1 x + \beta_2 x^2 + u$
We say that (1) is nested in (2); alternatively, (1) is a special case of (2).
With a t-test on $\beta_2$ we can choose between these two models (if we reject the null of $\beta_2 = 0$, we pick model 2).
For multiple exclusion restrictions we can use an F-test.
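The F-test for exclusion restrictions compares the restricted and unrestricted SSR: $F = [(SSR_r - SSR_{ur})/q] / [SSR_{ur}/(n-k-1)]$, where q is the number of restrictions. With a single restriction it reduces to the square of the t-stat. A sketch on simulated data (seed, sample size and the helper name are assumptions):

```python
import numpy as np

def ssr_and_t(X, y):
    """Return (SSR, t-stats) from OLS; X must include a constant column."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    ssr = float(resid @ resid)
    se = np.sqrt(ssr / (n - p) * np.diag(xtx_inv))
    return ssr, beta / se

rng = np.random.default_rng(3)
n = 120
x = rng.uniform(0, 10, n)
y = 1.0 + 0.8 * x - 0.05 * x**2 + rng.normal(0, 1, n)   # simulated; truth is quadratic

ones = np.ones(n)
ssr_r, _ = ssr_and_t(np.column_stack([ones, x]), y)            # model 1 (restricted)
ssr_ur, t_ur = ssr_and_t(np.column_stack([ones, x, x**2]), y)  # model 2 (unrestricted)

q = 1             # one exclusion restriction: beta2 = 0
df_ur = n - 3     # n - k - 1 with k = 2 in the unrestricted model
F = ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)

print("F statistic:       ", F)
print("t-stat on x^2, squared:", t_ur[2] ** 2)   # matches F when q = 1
```

With more than one restriction (for example, testing a cubic against a linear model) the same formula applies with q = 2, and there is no single t-stat that replaces it.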
Comparing Non-Nested Models
Suppose you wanted to compare the following two models:
1. $y = \beta_0 + \beta_1 \log(x) + u$
2. $y = \beta_0 + \beta_1 x + \beta_2 x^2 + v$
One is not nested in the other, so a t-test or F-test cannot be used to compare them.
Here R-bar-squared can be useful: we can simply choose the model with the higher R-bar-squared.
Note that a simple comparison of the regular R-squared would tend to lead us to choose the model with more explanatory variables.
Note that if the LHS variable takes a different form between (1) and (2), we cannot compare using R-bar-squared (or R-squared).
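A sketch of the comparison on simulated data, where y is generated from the log model by assumption (seed, sample size and noise level are made up). Both models have the same LHS variable, so their R-bar-squared values are comparable.

```python
import numpy as np

def adj_r2(X, y):
    """Adjusted R-squared from OLS; X must include a constant column."""
    n, p = X.shape                      # p = k + 1
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ssr = np.sum((y - X @ beta) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - (ssr / (n - p)) / (sst / (n - 1))

rng = np.random.default_rng(4)
n = 500
x = rng.uniform(1, 50, n)
y = 1.0 + 2.0 * np.log(x) + rng.normal(0, 0.1, n)   # simulated: the log model is true

ones = np.ones(n)
adj_log = adj_r2(np.column_stack([ones, np.log(x)]), y)   # model 1
adj_quad = adj_r2(np.column_stack([ones, x, x**2]), y)    # model 2

print("R-bar-squared, log model:      ", adj_log)
print("R-bar-squared, quadratic model:", adj_quad)
```

Here the log model wins despite having fewer regressors, which is the kind of comparison the plain R-squared would bias toward the larger model.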
iClickers
Imagine you want to compare the following two models:
1. $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + u$
2. $y = \beta_0 + \beta_1 x + v$
Suppose you want to test whether the model should be linear or cubic (i.e., whether model 2 is more appropriate than model 1).
Question: What would be the most helpful to look at, in order to make this judgment?
1) Two t-tests (one on $\beta_2 = 0$ and one on $\beta_3 = 0$)
2) An F-test (on $\beta_2 = 0$ AND $\beta_3 = 0$, jointly)
3) The unadjusted R-squared for both models.
4) The adjusted R-squared for both models.
Goodness of Fit
Important not to fixate too much on the adjusted R-squared and lose sight of theory and common sense.
If economic theory clearly predicts a variable belongs, generally leave it in.
Don’t want to exclude a variable if doing so prohibits a sensible interpretation of the variable of interest.
Remember the ceteris paribus interpretation of multiple regression.
Residual Analysis
Sometimes looking at the residuals (i.e., observed y minus predicted y) provides useful information.
Example: Regress price of cars on characteristics:
Engine size, luxury amenities, roominess, fuel efficiency, etc.
Then the residual = actual price - predicted price.
By picking the car with the lowest (most negative) residual, you would be choosing the most underpriced car (assuming you’re controlling for all relevant characteristics).
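The car example can be sketched as follows. The data are simulated and the two characteristics (engine size and miles per gallon) are hypothetical stand-ins for the longer list on the slide.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50
engine = rng.uniform(1.0, 4.0, n)    # hypothetical: engine size in litres
mpg = rng.uniform(20.0, 50.0, n)     # hypothetical: fuel efficiency
price = 5000 + 6000 * engine + 100 * mpg + rng.normal(0, 1500, n)   # simulated prices

X = np.column_stack([np.ones(n), engine, mpg])
beta, *_ = np.linalg.lstsq(X, price, rcond=None)

predicted = X @ beta
resid = price - predicted            # residual = actual price - predicted price

best = int(np.argmin(resid))         # most negative residual
print("most underpriced car: index", best)
print("actual:", price[best], "predicted:", predicted[best], "residual:", resid[best])
```

The conclusion is only as good as the set of controls: a car may look underpriced simply because it lacks a feature the regression leaves out.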