Goldsman — ISyE 6739

Linear Regression

REGRESSION

12.1 Simple Linear Regression Model
12.2 Fitting the Regression Line
12.3 Inferences on the Slope Parameter

12.1 Simple Linear Regression Model

Suppose we have a data set with the following paired observations:

(x1, y1), (x2, y2), . . . , (xn, yn)

Example: xi = height of person i, yi = weight of person i.

Can we make a model expressing yi as a function of xi?



Estimate yi for fixed xi. Let's model this with the simple linear regression equation,

yi = β0 + β1xi + εi,

where β0 and β1 are unknown constants and the error terms are usually assumed to be iid:

ε1, . . . , εn ∼ N(0, σ²) ⇒ yi ∼ N(β0 + β1xi, σ²).


[Figure: data around the line y = β0 + β1x with "high" σ², and around the line y = β0 + β1x with "low" σ²]


Warning! Look at your data before you fit a line to it: this scatterplot doesn't look very linear!


Month   xi Production ($ million)   yi Electric Usage (million kWh)
Jan     4.5                         2.5
Feb     3.6                         2.3
Mar     4.3                         2.5
Apr     5.1                         2.8
May     5.6                         3.0
Jun     5.0                         3.1
Jul     5.3                         3.2
Aug     5.8                         3.5
Sep     4.7                         3.0
Oct     5.6                         3.3
Nov     4.9                         2.7
Dec     4.2                         2.5


[Figure: scatterplot of the electric usage data, yi (2.2 to 3.4) vs. xi (3.5 to 6.0)]

Great... but how do you fit the line?


12.2 Fitting the Regression Line

Fit the regression line y = β0 + β1x to the data (x1, y1), . . . , (xn, yn) by finding the "best" match between the line and the data. The "best" choice of β0, β1 will be chosen to minimize

Q = Σ (yi − (β0 + β1xi))² = Σ εi²,

where the sums run over i = 1, . . . , n.


This is called the least squares fit. Let's solve:

∂Q/∂β0 = −2 Σ (yi − (β0 + β1xi)) = 0
∂Q/∂β1 = −2 Σ xi(yi − (β0 + β1xi)) = 0

These give the normal equations

Σ yi = nβ0 + β1 Σ xi
Σ xiyi = β0 Σ xi + β1 Σ xi²

After a little algebra, we get

β̂1 = [n Σ xiyi − (Σ xi)(Σ yi)] / [n Σ xi² − (Σ xi)²]

β̂0 = ȳ − β̂1x̄,  where ȳ ≡ (1/n) Σ yi and x̄ ≡ (1/n) Σ xi.
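The closed-form formulas above are easy to compute directly. A minimal sketch in Python (the data set and variable names here are made up for illustration):

```python
# Least-squares slope and intercept from the closed-form formulas above,
# computed on a small made-up data set.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(xs)
sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

# beta1_hat = [n*Σxiyi − (Σxi)(Σyi)] / [n*Σxi² − (Σxi)²]
beta1_hat = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
# beta0_hat = ybar − beta1_hat * xbar
beta0_hat = sum_y / n - beta1_hat * sum_x / n

# The first normal equation implies the residuals sum to zero.
residuals = [y - (beta0_hat + beta1_hat * x) for x, y in zip(xs, ys)]
print(beta1_hat, beta0_hat, sum(residuals))
```

The zero residual sum is a handy sanity check: it falls straight out of setting ∂Q/∂β0 = 0.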


Let's introduce some more notation:

Sxx ≡ Σ (xi − x̄)² = Σ xi² − (Σ xi)²/n = Σ xi² − nx̄²

Sxy ≡ Σ (xi − x̄)(yi − ȳ) = Σ xiyi − (Σ xi)(Σ yi)/n = Σ xiyi − nx̄ȳ

These are called "sums of squares."
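These algebraic identities are easy to verify numerically. A quick sketch (toy data, hypothetical variable names):

```python
# Check the equivalent forms of Sxx and Sxy on toy data.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.5, 2.0, 3.5, 4.0]
n = len(xs)
xbar = sum(xs) / n
ybar = sum(ys) / n

Sxx_def = sum((x - xbar) ** 2 for x in xs)            # Σ(xi − x̄)²
Sxx_alt = sum(x * x for x in xs) - sum(xs) ** 2 / n   # Σxi² − (Σxi)²/n
Sxx_alt2 = sum(x * x for x in xs) - n * xbar ** 2     # Σxi² − n·x̄²

Sxy_def = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))        # Σ(xi − x̄)(yi − ȳ)
Sxy_alt = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n  # Σxiyi − (Σxi)(Σyi)/n
Sxy_alt2 = sum(x * y for x, y in zip(xs, ys)) - n * xbar * ybar       # Σxiyi − n·x̄·ȳ

print(Sxx_def, Sxx_alt, Sxx_alt2)  # all three agree
print(Sxy_def, Sxy_alt, Sxy_alt2)  # all three agree
```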


Then, after a little more algebra, we can write

β̂1 = Sxy/Sxx.

Fact: If the εi's are iid N(0, σ²), it can be shown that β̂0 and β̂1 are the MLEs for β0 and β1, respectively. (See text for an easy proof.)

Anyhow, the fitted regression line is ŷ = β̂0 + β̂1x.


Fix a specific value x∗ of the explanatory variable; the equation gives a fitted value ŷ|x∗ = β̂0 + β̂1x∗ for the dependent variable y.

[Figure: the fitted line ŷ = β̂0 + β̂1x, with the fitted value ŷ|x∗ marked at x = x∗]



For actual data points xi, the fitted values are ŷi = β̂0 + β̂1xi.

observed values: yi = β0 + β1xi + εi
fitted values:   ŷi = β̂0 + β̂1xi

Let's estimate the error variation σ² by considering the deviations between yi and ŷi:

SSE = Σ (yi − ŷi)² = Σ (yi − (β̂0 + β̂1xi))² = Σ yi² − β̂0 Σ yi − β̂1 Σ xiyi.



Turns out that σ̂² ≡ SSE/(n − 2) is a good estimator for σ².

Example: Car plant energy usage. With n = 12,

Σ xi = 58.62, Σ yi = 34.15, Σ xiyi = 169.253, Σ xi² = 291.231, Σ yi² = 98.697,

we get β̂1 = 0.49883 and β̂0 = 0.4090 ⇒ the fitted regression line is ŷ = 0.409 + 0.499x.

Then ŷ|5.5 = 3.1535. What about something like ŷ|10.0?
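The example's numbers follow directly from the summary sums. A sketch reproducing them in Python (variable names are ours, not from the slides):

```python
# Reproduce the car plant example from the slide's summary sums.
n = 12
sum_x, sum_y = 58.62, 34.15
sum_xy, sum_x2 = 169.253, 291.231

Sxy = sum_xy - sum_x * sum_y / n       # Σxiyi − (Σxi)(Σyi)/n
Sxx = sum_x2 - sum_x ** 2 / n          # Σxi² − (Σxi)²/n
beta1_hat = Sxy / Sxx                  # ≈ 0.499
beta0_hat = sum_y / n - beta1_hat * sum_x / n   # ≈ 0.409

yhat_55 = beta0_hat + beta1_hat * 5.5  # fitted value at x* = 5.5, ≈ 3.15
print(beta1_hat, beta0_hat, yhat_55)
```

(Tiny differences from the slide's 0.49883 and 3.1535 come from rounding in the printed sums.)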


12.3 Inferences on the Slope Parameter β1

Recall β̂1 = Sxy/Sxx, where Sxx = Σ (xi − x̄)² and

Sxy = Σ (xi − x̄)(yi − ȳ) = Σ (xi − x̄)yi − ȳ Σ (xi − x̄) = Σ (xi − x̄)yi,

since Σ (xi − x̄) = 0.



Since the yi's are independent with yi ∼ N(β0 + β1xi, σ²) (and the xi's are constants), we have

Eβ̂1 = (1/Sxx) ESxy = (1/Sxx) Σ (xi − x̄)Eyi
     = (1/Sxx) Σ (xi − x̄)(β0 + β1xi)
     = (1/Sxx) [β0 Σ (xi − x̄) + β1 Σ (xi − x̄)xi]
     = (β1/Sxx) Σ (xi − x̄)xi      (since Σ (xi − x̄) = 0)
     = (β1/Sxx) (Σ xi² − nx̄²) = (β1/Sxx) Sxx = β1

⇒ β̂1 is an unbiased estimator of β1.



Further, since β̂1 is a linear combination of independent normals, β̂1 is itself normal. We can also derive

Var(β̂1) = (1/Sxx²) Var(Sxy) = (1/Sxx²) Σ (xi − x̄)² Var(yi) = σ²/Sxx.

Thus, β̂1 ∼ N(β1, σ²/Sxx).
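This sampling distribution is easy to check by simulation. A sketch (assuming Python; the parameter values and design points are made up):

```python
import random
import statistics

# Monte Carlo check that beta1_hat has mean beta1 and variance sigma²/Sxx.
random.seed(0)
beta0, beta1, sigma = 2.0, 0.5, 0.3
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
n = len(xs)
xbar = sum(xs) / n
Sxx = sum((x - xbar) ** 2 for x in xs)   # = 17.5 for these xi's

estimates = []
for _ in range(20000):
    # Generate yi = beta0 + beta1*xi + eps_i with eps_i ~ N(0, sigma²).
    ys = [beta0 + beta1 * x + random.gauss(0, sigma) for x in xs]
    ybar = sum(ys) / n
    Sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    estimates.append(Sxy / Sxx)          # beta1_hat for this sample

print(statistics.mean(estimates))        # close to beta1 = 0.5
print(statistics.variance(estimates))    # close to sigma²/Sxx = 0.09/17.5
```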



While we're at it, we can do the same kind of thing with the intercept parameter β0:

β̂0 = ȳ − β̂1x̄.

Thus, Eβ̂0 = Eȳ − x̄Eβ̂1 = β0 + β1x̄ − x̄β1 = β0.

Similar to before, since β̂0 is a linear combination of independent normals, it is also normal. Finally,

Var(β̂0) = (Σ xi²) σ² / (nSxx).



Proof:

Cov(ȳ, β̂1) = (1/Sxx) Cov(ȳ, Σ (xi − x̄)yi)
            = Σ [(xi − x̄)/Sxx] Cov(ȳ, yi)
            = Σ [(xi − x̄)/Sxx] (σ²/n) = 0

⇒ Var(β̂0) = Var(ȳ − β̂1x̄)
           = Var(ȳ) + x̄² Var(β̂1) − 2x̄ Cov(ȳ, β̂1)
           = σ²/n + x̄²σ²/Sxx
           = σ² (Sxx + nx̄²)/(nSxx) = (Σ xi²) σ²/(nSxx).

Thus, β̂0 ∼ N(β0, (Σ xi²) σ²/(nSxx)).



Back to β̂1 ∼ N(β1, σ²/Sxx) . . . ⇒

(β̂1 − β1) / √(σ²/Sxx) ∼ N(0, 1).

Turns out:

(1) σ̂² = SSE/(n − 2) ∼ σ²χ²(n − 2)/(n − 2);
(2) σ̂² is independent of β̂1.


Thus,

[(β̂1 − β1)/(σ/√Sxx)] / (σ̂/σ)  ∼  N(0, 1) / √(χ²(n − 2)/(n − 2))  ∼  t(n − 2)

⇒ (β̂1 − β1)/(σ̂/√Sxx) ∼ t(n − 2).


[Figure: t(n − 2) density, with probability 1 − α between the critical values −tα/2,n−2 and tα/2,n−2]



2-sided Confidence Intervals for β1:

1 − α = Pr(−tα/2,n−2 ≤ (β̂1 − β1)/(σ̂/√Sxx) ≤ tα/2,n−2)
      = Pr(β̂1 − tα/2,n−2 σ̂/√Sxx ≤ β1 ≤ β̂1 + tα/2,n−2 σ̂/√Sxx)

1-sided CIs for β1:

β1 ∈ (−∞, β̂1 + tα,n−2 σ̂/√Sxx)
β1 ∈ (β̂1 − tα,n−2 σ̂/√Sxx, ∞)
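Putting the pieces together for the car plant example: a sketch in Python of the 95% two-sided CI for β1, where the critical value t_{0.025,10} ≈ 2.228 is taken from a t table rather than computed.

```python
import math

# 95% two-sided confidence interval for beta1, car plant data (n = 12).
n = 12
sum_x, sum_y = 58.62, 34.15
sum_xy, sum_x2, sum_y2 = 169.253, 291.231, 98.697

Sxx = sum_x2 - sum_x ** 2 / n
Sxy = sum_xy - sum_x * sum_y / n
beta1_hat = Sxy / Sxx
beta0_hat = sum_y / n - beta1_hat * sum_x / n

# sigma_hat² = SSE/(n − 2), with SSE = Σyi² − β̂0·Σyi − β̂1·Σxiyi
sse = sum_y2 - beta0_hat * sum_y - beta1_hat * sum_xy
sigma_hat = math.sqrt(sse / (n - 2))

t_crit = 2.228                       # t_{0.025, 10}, from a t table
half_width = t_crit * sigma_hat / math.sqrt(Sxx)
lo, hi = beta1_hat - half_width, beta1_hat + half_width
print(lo, hi)                        # roughly (0.32, 0.67)
```

The interval excludes 0, so at the 5% level the data are consistent with a genuinely nonzero slope.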