Goldsman — ISyE 6739
Linear Regression
REGRESSION
12.1 Simple Linear Regression Model
12.2 Fitting the Regression Line
12.3 Inferences on the Slope Parameter
12.1 Simple Linear Regression Model
Suppose we have a data set with the following paired observations:

(x1, y1), (x2, y2), . . . , (xn, yn)

Example: xi = height of person i, yi = weight of person i.

Can we make a model expressing yi as a function of xi?
We want to estimate yi for a fixed xi. Let's model this with the simple linear regression equation

yi = β0 + β1 xi + εi,

where β0 and β1 are unknown constants and the error terms are usually assumed to be iid

ε1, . . . , εn ∼ N(0, σ²)  ⇒  yi ∼ N(β0 + β1 xi, σ²).
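To see what the model describes, here is a minimal simulation sketch in Python; the true parameters β0 = 2, β1 = 0.5, σ = 0.3 and the grid of xi values are illustrative choices, not taken from these slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up "true" parameters, purely for illustration.
beta0, beta1, sigma = 2.0, 0.5, 0.3
n = 12

x = np.linspace(3.5, 6.0, n)                     # fixed explanatory values x_i
eps = rng.normal(loc=0.0, scale=sigma, size=n)   # iid N(0, sigma^2) error terms
y = beta0 + beta1 * x + eps                      # y_i = beta0 + beta1*x_i + eps_i

print(np.column_stack([x, y]))                   # the simulated (x_i, y_i) pairs
```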
[Figure: data scattered about the line y = β0 + β1x, one panel with "high" σ² and one with "low" σ².]
Warning! Look at the data before you fit a line to it.

[Figure: a scatter plot of data that doesn't look very linear, so a straight-line fit would be a poor choice.]
Example (car plant energy usage): monthly production and electric usage.

Month   xi = Production ($ million)   yi = Electric Usage (million kWh)
Jan     4.5                           2.5
Feb     3.6                           2.3
Mar     4.3                           2.5
Apr     5.1                           2.8
May     5.6                           3.0
Jun     5.0                           3.1
Jul     5.3                           3.2
Aug     5.8                           3.5
Sep     4.7                           3.0
Oct     5.6                           3.3
Nov     4.9                           2.7
Dec     4.2                           2.5
[Figure: scatter plot of yi (electric usage) vs. xi (production) for the 12 monthly observations.]

Great... but how do you fit the line?
12.2 Fitting the Regression Line
Fit the regression line y = β0 + β1x to the data (x1, y1), . . . , (xn, yn) by finding the "best" match between the line and the data. The "best" choice of β0, β1 will be chosen to minimize

Q = Σ (yi − (β0 + β1 xi))² = Σ εi²,

where the sums run over i = 1, . . . , n.
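Before deriving the closed-form solution, here is a small sanity-check sketch that minimizes Q numerically over (β0, β1), assuming scipy is available and using the car plant data from the table (the printed table values are rounded, so the answer will only approximately match the example quoted later).

```python
import numpy as np
from scipy.optimize import minimize

# Car plant data from the table (x = production, y = electric usage).
x = np.array([4.5, 3.6, 4.3, 5.1, 5.6, 5.0, 5.3, 5.8, 4.7, 5.6, 4.9, 4.2])
y = np.array([2.5, 2.3, 2.5, 2.8, 3.0, 3.1, 3.2, 3.5, 3.0, 3.3, 2.7, 2.5])

def Q(b):
    """Sum of squared errors for a candidate (beta0, beta1)."""
    b0, b1 = b
    return np.sum((y - (b0 + b1 * x)) ** 2)

res = minimize(Q, x0=[0.0, 0.0])   # numerical least squares
print(res.x)                       # should agree with the closed-form estimates below
```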
This is called the least squares fit. Let's solve:

∂Q/∂β0 = −2 Σ (yi − (β0 + β1 xi)) = 0
∂Q/∂β1 = −2 Σ xi (yi − (β0 + β1 xi)) = 0

⇔  Σ yi = n β0 + β1 Σ xi
    Σ xi yi = β0 Σ xi + β1 Σ xi²

After a little algebra, we get

β̂1 = [n Σ xi yi − (Σ xi)(Σ yi)] / [n Σ xi² − (Σ xi)²]

β̂0 = ȳ − β̂1 x̄,  where ȳ ≡ (1/n) Σ yi and x̄ ≡ (1/n) Σ xi.
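As a concrete check, here is a minimal sketch applying these closed-form formulas to the car plant data from the table (numpy only; because the table values shown above are rounded, the estimates will be close to, but not exactly, the 0.409 and 0.499 quoted in the example later).

```python
import numpy as np

x = np.array([4.5, 3.6, 4.3, 5.1, 5.6, 5.0, 5.3, 5.8, 4.7, 5.6, 4.9, 4.2])
y = np.array([2.5, 2.3, 2.5, 2.8, 3.0, 3.1, 3.2, 3.5, 3.0, 3.3, 2.7, 2.5])
n = len(x)

# beta1_hat = [n*sum(x*y) - sum(x)*sum(y)] / [n*sum(x^2) - (sum(x))^2]
beta1_hat = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x ** 2) - np.sum(x) ** 2)

# beta0_hat = ybar - beta1_hat * xbar
beta0_hat = np.mean(y) - beta1_hat * np.mean(x)

print(beta0_hat, beta1_hat)
```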
Let's introduce some more notation:

Sxx = Σ (xi − x̄)² = Σ xi² − (Σ xi)²/n = Σ xi² − n x̄²

Sxy = Σ (xi − x̄)(yi − ȳ) = Σ xi yi − (Σ xi)(Σ yi)/n = Σ xi yi − n x̄ ȳ

These are called "sums of squares."
Then, after a little more algebra, we can write

β̂1 = Sxy / Sxx.

Fact: If the εi's are iid N(0, σ²), it can be shown that β̂0 and β̂1 are the MLEs for β0 and β1, respectively. (See the text for an easy proof.)

Anyhow, the fitted regression line is ŷ = β̂0 + β̂1 x.
If we fix a specific value x* of the explanatory variable, the equation gives a fitted value ŷ|x* = β̂0 + β̂1 x* for the dependent variable y.

[Figure: the fitted line ŷ = β̂0 + β̂1x, with the fitted value ŷ|x* marked at x = x*.]
For the actual data points xi, the fitted values are ŷi = β̂0 + β̂1 xi:

observed values:  yi = β0 + β1 xi + εi
fitted values:    ŷi = β̂0 + β̂1 xi

Let's estimate the error variation σ² by considering the deviations between the yi and the ŷi:

SSE = Σ (yi − ŷi)² = Σ (yi − (β̂0 + β̂1 xi))² = Σ yi² − β̂0 Σ yi − β̂1 Σ xi yi.
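The last equality is a handy shortcut; here is a small sketch verifying it numerically on the table data (again, the table values are rounded, so the numbers differ slightly from the example on the next slide).

```python
import numpy as np

x = np.array([4.5, 3.6, 4.3, 5.1, 5.6, 5.0, 5.3, 5.8, 4.7, 5.6, 4.9, 4.2])
y = np.array([2.5, 2.3, 2.5, 2.8, 3.0, 3.1, 3.2, 3.5, 3.0, 3.3, 2.7, 2.5])
n = len(x)

Sxx = np.sum(x ** 2) - n * np.mean(x) ** 2
Sxy = np.sum(x * y) - n * np.mean(x) * np.mean(y)
b1 = Sxy / Sxx
b0 = np.mean(y) - b1 * np.mean(x)

y_hat = b0 + b1 * x
sse_direct = np.sum((y - y_hat) ** 2)                                # sum of squared deviations
sse_shortcut = np.sum(y ** 2) - b0 * np.sum(y) - b1 * np.sum(x * y)  # shortcut formula
print(sse_direct, sse_shortcut)                                      # the two agree
```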
It turns out that σ̂² ≡ SSE/(n − 2) is a good estimator for σ².

Example (car plant energy usage): n = 12, Σ xi = 58.62, Σ yi = 34.15, Σ xi yi = 169.253, Σ xi² = 291.231, Σ yi² = 98.697.

These give β̂1 = 0.49883 and β̂0 = 0.4090, so the fitted regression line is

ŷ = 0.409 + 0.499x.

For instance, ŷ|5.5 = 3.1535. What about something like ŷ|10.0? (Be careful: x = 10.0 is far outside the range of the observed xi's, so that would be an extrapolation.)
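A minimal sketch reproducing the example directly from the quoted summary statistics, including σ̂² = SSE/(n − 2):

```python
# Summary statistics quoted in the example.
n = 12
sum_x, sum_y = 58.62, 34.15
sum_xy, sum_x2, sum_y2 = 169.253, 291.231, 98.697

Sxx = sum_x2 - sum_x ** 2 / n
Sxy = sum_xy - sum_x * sum_y / n
b1 = Sxy / Sxx                    # about 0.499
b0 = sum_y / n - b1 * sum_x / n   # about 0.409

sse = sum_y2 - b0 * sum_y - b1 * sum_xy
sigma2_hat = sse / (n - 2)        # estimated error variance (roughly 0.03 here)

print(b0, b1, b0 + b1 * 5.5, sigma2_hat)   # fitted line and y_hat | x* = 5.5
```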
12.3 Inferences on Slope Parameter β1

Recall that β̂1 = Sxy / Sxx, where Sxx = Σ (xi − x̄)² and

Sxy = Σ (xi − x̄)(yi − ȳ) = Σ (xi − x̄)yi − ȳ Σ (xi − x̄) = Σ (xi − x̄)yi,

since Σ (xi − x̄) = 0.
Since the yi's are independent with yi ∼ N(β0 + β1 xi, σ²) (and the xi's are constants), we have

E[β̂1] = (1/Sxx) E[Sxy] = (1/Sxx) Σ (xi − x̄) E[yi]
      = (1/Sxx) Σ (xi − x̄)(β0 + β1 xi)
      = (1/Sxx) [β0 Σ (xi − x̄) + β1 Σ (xi − x̄) xi]     (the first sum is 0)
      = (β1/Sxx) Σ (xi − x̄) xi
      = (β1/Sxx) (Σ xi² − n x̄²)
      = (β1/Sxx) Sxx = β1

⇒ β̂1 is an unbiased estimator of β1.
Further, since β̂1 is a linear combination of independent normals, β̂1 is itself normal. We can also derive

Var(β̂1) = (1/Sxx²) Var(Sxy) = (1/Sxx²) Σ (xi − x̄)² Var(yi) = σ²/Sxx.

Thus, β̂1 ∼ N(β1, σ²/Sxx).
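These facts are easy to check by simulation. Here is a small Monte Carlo sketch (the true values β0 = 0.4, β1 = 0.5, σ = 0.2 are made up, and the production figures from the table are reused as the fixed xi's):

```python
import numpy as np

rng = np.random.default_rng(1)

x = np.array([4.5, 3.6, 4.3, 5.1, 5.6, 5.0, 5.3, 5.8, 4.7, 5.6, 4.9, 4.2])
beta0, beta1, sigma = 0.4, 0.5, 0.2                 # made-up true parameters
Sxx = np.sum((x - x.mean()) ** 2)

reps = 100_000
b1_hats = np.empty(reps)
for r in range(reps):
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, size=x.size)
    b1_hats[r] = np.sum((x - x.mean()) * y) / Sxx   # beta1_hat = Sxy / Sxx

print(b1_hats.mean(), beta1)             # sample mean of beta1_hat should be near beta1
print(b1_hats.var(), sigma ** 2 / Sxx)   # sample variance should be near sigma^2 / Sxx
```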
While we're at it, we can do the same kind of thing with the intercept parameter β0:

β̂0 = ȳ − β̂1 x̄.

Thus, E[β̂0] = E[ȳ] − x̄ E[β̂1] = β0 + β1 x̄ − x̄ β1 = β0.

Similar to before, since β̂0 is a linear combination of independent normals, it is also normal. Finally,

Var(β̂0) = (Σ xi² / (n Sxx)) σ².
Proof: First,

Cov(ȳ, β̂1) = (1/Sxx) Cov(ȳ, Σ (xi − x̄) yi) = Σ [(xi − x̄)/Sxx] Cov(ȳ, yi)
           = [Σ (xi − x̄)/Sxx] (σ²/n) = 0

⇒ Var(β̂0) = Var(ȳ − β̂1 x̄) = Var(ȳ) + x̄² Var(β̂1) − 2 x̄ Cov(ȳ, β̂1)
          = σ²/n + x̄² σ²/Sxx
          = σ² (Sxx + n x̄²)/(n Sxx)
          = (Σ xi² / (n Sxx)) σ².

Thus, β̂0 ∼ N(β0, (Σ xi² / (n Sxx)) σ²).
Back to β̂1 ∼ N(β1, σ²/Sxx) . . .

⇒  (β̂1 − β1) / √(σ²/Sxx) ∼ N(0, 1).

It also turns out that:

(1) σ̂² = SSE/(n − 2) ∼ σ² χ²(n − 2)/(n − 2);
(2) σ̂² is independent of β̂1.
⇒  [(β̂1 − β1)/(σ/√Sxx)] / (σ̂/σ)  ∼  N(0, 1) / √(χ²(n − 2)/(n − 2))  ∼  t(n − 2)

⇒  (β̂1 − β1) / (σ̂/√Sxx) ∼ t(n − 2).
[Figure: the t(n − 2) density, with probability 1 − α between −tα/2,n−2 and tα/2,n−2.]
2-sided confidence intervals for β1:

1 − α = Pr(−tα/2,n−2 ≤ (β̂1 − β1)/(σ̂/√Sxx) ≤ tα/2,n−2)
     = Pr(β̂1 − tα/2,n−2 σ̂/√Sxx ≤ β1 ≤ β̂1 + tα/2,n−2 σ̂/√Sxx)

1-sided CIs for β1:

β1 ∈ (−∞, β̂1 + tα,n−2 σ̂/√Sxx)
β1 ∈ (β̂1 − tα,n−2 σ̂/√Sxx, ∞)
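For the car plant example, here is a minimal sketch of the two-sided 95% interval for β1, computed from the quoted summary statistics (assuming scipy is available for the t quantile tα/2,n−2):

```python
import numpy as np
from scipy import stats

# Summary statistics from the earlier example.
n = 12
sum_x, sum_y = 58.62, 34.15
sum_xy, sum_x2, sum_y2 = 169.253, 291.231, 98.697

Sxx = sum_x2 - sum_x ** 2 / n
Sxy = sum_xy - sum_x * sum_y / n
b1 = Sxy / Sxx
b0 = sum_y / n - b1 * sum_x / n

sse = sum_y2 - b0 * sum_y - b1 * sum_xy
sigma_hat = np.sqrt(sse / (n - 2))              # sqrt of SSE/(n-2)

alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)   # t_{alpha/2, n-2}
half_width = t_crit * sigma_hat / np.sqrt(Sxx)

print(b1 - half_width, b1 + half_width)         # 95% two-sided CI for beta1
```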