Lecture 4: ANOVA Table

STAT 512 Spring 2011

Background Reading KNNL: 2.6-2.7


Topic Overview
- Working-Hotelling Confidence Band
- Inference Example using SAS
- ANOVA Table


Working-Hotelling Confidence Band (1)
- This gives a confidence limit for the whole line at once, in contrast to the confidence interval for just one $\hat{Y}_h$ at a time.
- The regression line $b_0 + b_1 X_h$ estimates $E(Y_h)$ for a given $X_h$.
- We have a 95% CI for $E(Y_h)$, centered at $\hat{Y}_h$, pertaining to a specific $X_h$.

Working-Hotelling Confidence Band (2)
- We want a 95% confidence band for all $X_h$: a confidence limit for the whole line at once, in contrast to the confidence interval for just one $\hat{Y}_h$ at a time.
- The confidence limit is given by $\hat{Y}_h \pm W\, s\{\hat{Y}_h\}$, where $W^2 = 2F(1-\alpha;\, 2,\, n-2)$.
- Since we are covering all values of $X_h$ at once, the band is wider at each $X_h$ than the CI for that individual $X_h$.


Working-Hotelling Confidence Band (3)
- We are used to constructing CIs with t's, not W's. Can we fake it?
- We can find a new, smaller alpha for $t_c$ that would give the same results: a kind of “effective alpha” that takes into account that you are estimating the entire line.
- We find $W^2$ for our desired true $\alpha$, and then find the effective $\alpha_t$ to use with $t_c$ that gives $W(\alpha) = t_c(\alpha_t)$.
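The "effective alpha" idea can be sketched in a short SAS data step. This is only an illustration, not the lecture's own program: it assumes n = 60 (the sample size of the muscle mass example that follows) and a true alpha of 0.05, and the data set name wh_demo and all variable names are made up for the sketch. It relies on the SAS functions FINV, PROBT and TINV.

```sas
/* Sketch: Working-Hotelling multiplier W and the matching effective alpha for t. */
/* Assumptions: n = 60, alpha = 0.05; wh_demo and the variable names are illustrative. */
data wh_demo;
  n        = 60;
  alpha    = 0.05;
  dfd      = n - 2;                          /* error degrees of freedom          */
  w2       = 2 * finv(1 - alpha, 2, dfd);    /* W squared = 2 F(1-alpha; 2, n-2)  */
  w        = sqrt(w2);                       /* Working-Hotelling multiplier      */
  alphat   = 2 * (1 - probt(w, dfd));        /* two-sided t tail area beyond W    */
  tstar    = tinv(1 - alphat/2, dfd);        /* t critical value; reproduces w    */
  t_single = tinv(1 - alpha/2, dfd);         /* usual multiplier for a single CI  */
run;

proc print data=wh_demo;
run;
```

Comparing w with t_single in the printed output shows how much wider the band is than a single-point CI, and alphat is the smaller alpha that makes an ordinary t interval match the band.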

SAS Example (musclemass.sas) (Problem 1.27 in KNNL)
- Muscle mass is expected to decrease with age. The study explores this relationship in women (n = 60).
- 15 women were randomly selected from each of four age groups: 40-49, 50-59, 60-69, 70-79.
- We will analyze this data set assuming that the simple linear regression model applies.

Read in the Data
- For textbook files, the easiest way is to open the data as a text file (or through the website) and paste it into SAS using “datalines”.

DATA muscle;
  input mmass age;
  datalines;
106 43
106 41
.....
;

Produce a Scatter Plot

goptions ftitle=centb ftext=swissb htitle=3 htext=1.5 ctitle=blue ctext=black;
symbol1 v=dot c=blue;
axis1 label=('Age (Years)');
axis2 label=(angle=90 'Muscle Mass');
PROC GPLOT data=muscle;
  plot mmass*age / haxis=axis1 vaxis=axis2;
  title 'Muscle Mass vs Age in women';
RUN; QUIT;


Examining Scatter Plots
- Form: linear looks mostly reasonable.
- Direction: muscle mass seems to decrease as age increases.
- Strength: there is quite a bit of scatter, so the relationship is likely weak to moderate.


Regression Model Goals
- Estimate the difference in mean muscle mass for women differing in age by 1 year.
- Produce CIs and PIs for women aged 50, 60, and 70.
- Plot the 95% confidence band for the regression line.


Preliminaries

DATA slime;
  age = 50; mmass = .; output;
  age = 60; mmass = .; output;
  age = 70; mmass = .; output;
DATA muscle;
  set muscle slime;
PROC PRINT;
RUN;

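This appends three observations with missing mmass (ages 50, 60, and 70) to the data set. PROC REG leaves them out of the fit but still computes predicted values and intervals for them, so we can easily predict at those ages.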


PROC REG

PROC REG data=muscle outest=params outseb;
  model mmass=age / clb clm cli;
  output out=mean_resp p=predicted stdp=SE_mean
         lclm=LCL_mean uclm=UCL_mean;
  output out=predict p=predicted stdi=SE_pred
         lcl=LCL_pred ucl=UCL_pred;
  id age;
PROC PRINT data=params;
PROC PRINT data=mean_resp;
  where mmass=.;
PROC PRINT data=predict;
  where mmass=.;
RUN;
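One possible way to get Working-Hotelling band limits from this output is sketched below. It is an assumption-laden illustration rather than the lecture's own code: it reuses the mean_resp data set and its predicted and SE_mean variables from the PROC REG step above, hard-codes alpha = 0.05 and n - 2 = 58, and the names wh_band, w, wh_lower and wh_upper are invented for the sketch.

```sas
/* Sketch: Working-Hotelling limits from the PROC REG output data set.        */
/* Assumptions: alpha = 0.05 and n - 2 = 58; wh_band, w, wh_lower, wh_upper   */
/* are illustrative names that do not appear in the original program.         */
data wh_band;
  set mean_resp;                             /* has predicted and SE_mean      */
  w        = sqrt(2 * finv(0.95, 2, 58));    /* W = sqrt(2 F(0.95; 2, 58))     */
  wh_lower = predicted - w * SE_mean;        /* lower band limit at this age   */
  wh_upper = predicted + w * SE_mean;        /* upper band limit at this age   */
run;

proc sort data=wh_band;
  by age;
run;

proc print data=wh_band;
  var age predicted SE_mean wh_lower wh_upper;
run;
```

Because the same W is used at every age, the limits hold for the whole line at once, and they are wider than LCL_mean and UCL_mean at each age.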

Output (1)

Analysis of Variance

Source   DF   Sum of Squares   Mean Square   F Value   Pr > F
Model     1            11627         11627    174.06
Error    58             3875          66.8
Total    59            15502

Root MSE   8.17318
R-Square

t Value    28.36   -13.19
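The entries of the ANOVA table are linked by simple arithmetic, which can be checked from the displayed values. The figures below are recomputed from the rounded sums of squares shown above, so they agree with the SAS output only up to rounding. The $R^2$ value is derived here from the sums of squares and is not read from the output, and the last line assumes that $-13.19$ is the t value for the slope.

```latex
\begin{align*}
MSE &= \frac{SSE}{n-2} = \frac{3875}{58} \approx 66.8 \\
F &= \frac{MSR}{MSE} = \frac{11627}{66.8} \approx 174.06 \\
\sqrt{MSE} &\approx \sqrt{66.8} \approx 8.173 \quad \text{(Root MSE)} \\
R^2 &= \frac{SSR}{SSTO} = \frac{11627}{15502} \approx 0.75 \\
t^2 &= (-13.19)^2 \approx 173.98 \approx F
\end{align*}
```

The last line reflects the fact that, in simple linear regression, the F statistic equals the square of the t statistic for the slope.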