ST 430/514 Introduction to Regression Analysis/Statistics for Management and the Social Sciences II
Residual Analysis
Inferences about a regression model are valid only under assumptions about the random errors in the observations. Objectives: show how residuals reveal departures from those assumptions, and suggest procedures for coping with such departures.
Regression Residuals
The random errors satisfy $Y = E(Y) + \epsilon$, or $\epsilon = Y - E(Y)$. We observe $Y$, but we do not know $E(Y)$, so we cannot calculate $\epsilon$. We estimate $E(Y)$ by $\hat{y}$, the predicted (or fitted) value. We approximate the random errors by the regression residuals:
$$\hat{\epsilon}_i = y_i - \hat{y}_i, \quad i = 1, 2, \ldots, n.$$
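As a minimal sketch of the definition above, the following Python snippet fits a straight line by least squares and computes fitted values and residuals. The data values and the helper name `fit_line` are hypothetical, invented only for illustration; they do not come from the course.

```python
# Hypothetical data, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_line(x, y):
    """Least-squares intercept and slope for the model y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_line(x, y)

# Fitted values estimate E(Y) at each observed x.
fitted = [b0 + b1 * xi for xi in x]

# Residuals approximate the unobservable random errors: y_i minus fitted_i.
residuals = [yi - fi for yi, fi in zip(y, fitted)]

for xi, yi, fi, ri in zip(x, y, fitted, residuals):
    print(f"x={xi:.1f}  y={yi:.1f}  fitted={fi:.3f}  residual={ri:+.3f}")
```

The residuals are computable because they use the fitted values in place of the unknown $E(Y)$; the true errors remain unknown.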
Properties of residuals
If the model contains an intercept, the sum of the residuals, and also their mean, is zero:
$$\sum_{i=1}^{n} \hat{\epsilon}_i = 0, \quad \text{and so} \quad \bar{\hat{\epsilon}} = 0.$$
The covariance of the residuals and any term in the regression model is zero:
$$\sum_{i=1}^{n} \hat{\epsilon}_i x_{i,j} = 0, \quad j = 1, 2, \ldots, k.$$
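These two properties can be checked numerically. The sketch below uses a single predictor ($k = 1$) and hypothetical data invented for illustration. It also fits a line through the origin to show that, without an intercept, the residuals need not sum to zero, although they are still orthogonal to the predictor.

```python
# Hypothetical data, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# Model with an intercept: y = b0 + b1 * x.
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
res = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

sum_res = sum(res)                                  # zero up to rounding
sum_res_x = sum(r * xi for r, xi in zip(res, x))    # zero up to rounding

# Model without an intercept (line through the origin): y = b * x.
b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
res0 = [yi - b * xi for xi, yi in zip(x, y)]

sum_res0 = sum(res0)                                # generally not zero
sum_res0_x = sum(r * xi for r, xi in zip(res0, x))  # still zero up to rounding

print(f"with intercept:    sum(res) = {sum_res:.2e}, sum(res * x) = {sum_res_x:.2e}")
print(f"without intercept: sum(res) = {sum_res0:.4f}, sum(res * x) = {sum_res0_x:.2e}")
```

Both identities follow from the least-squares normal equations: one equation per term in the model, with the intercept contributing the equation that forces the residuals to sum to zero.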
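Detecting Lack of Fit
A misspecified model is one that leaves out a relevant predictor. The residuals from a misspecified model do not have mean zero at every value of the predictors: with an intercept in the model they still average to zero overall, but they deviate systematically from zero over parts of the range of x. Example: serum cholesterol (y) and dietary fat (x) in Olympic athletes.
ath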
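To illustrate how residuals expose lack of fit, the sketch below fits a straight line to hypothetical data that follow a curve. These are synthetic values invented for illustration, not the cholesterol data from the example. The omitted quadratic term shows up as residuals that are positive at both ends of the range of x and negative in the middle, even though they sum to zero overall.

```python
# Synthetic data with curvature: y = 1 + 0.5 * x^2 (no noise, for clarity).
x = [float(v) for v in range(9)]           # 0, 1, ..., 8
y = [1.0 + 0.5 * xi ** 2 for xi in x]
n = len(x)

# Misspecified model: straight line y = b0 + b1 * x, omitting the x^2 term.
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
res = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Average the residuals within the low, middle, and high thirds of x.
low_mean = sum(res[0:3]) / 3
mid_mean = sum(res[3:6]) / 3
high_mean = sum(res[6:9]) / 3

print(f"overall sum of residuals: {sum(res):.2e}")
print(f"mean residual, low x:  {low_mean:+.3f}")
print(f"mean residual, mid x:  {mid_mean:+.3f}")
print(f"mean residual, high x: {high_mean:+.3f}")
```

In practice the same pattern is usually read from a plot of residuals against x or against the fitted values; the group means here are only a plot-free stand-in for that visual check.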