STAT 408/608   Guided Exercise 7 ANSWERS

Key Topics
• Single Factor ANOVA
• Understanding the ANOVA Table

For On-Line Students, be sure to:
• Submit your answers in a Word file to Sakai at the same place you downloaded the file
• Remember you can paste any Excel or JMP output into a Word File (use Paste Special for best results).
• Put your name and the Assignment # on the file name: e.g. Ilvento Guided7.doc

Answer as completely as you can and show your work. Upload your file via Sakai.

1.  This problem looks at the salary differences of Male and Female Mid-Level Managers at 220 firms. We will compare a Difference of Means Test and the ANOVA approach using this data. We will be looking at an Excel file with data on mid-level managers in 220 firms. The salary is given in $1,000s. We want to look at the female sample (n=75) and compare it to the male mean level (144) to see if it is lower than that of males (or if the male mean is higher). Use an alpha level of .05.

Here is the Excel output of descriptive statistics for females, males, and the total sample, as well as the Difference of Means Test assuming equal variances.

t-Test: Two-Sample Assuming Equal Variances

Descriptives

Statistic                  Females     Males       Salary (total)
Mean                       140.467     144.110     142.868
Standard Error             1.443       1.029       0.844
Median                     139.000     145.000     143.500
Mode                       146.000     145.000     145.000
Standard Deviation         12.496      12.394      12.521
Sample Variance            156.144     153.613     156.763
Kurtosis                   -0.104      -0.275      -0.397
Skewness                   0.350       -0.299      -0.078
Range                      54.000      61.000      62.000
Minimum                    118         110         110.000
Maximum                    172         171         172.000
Sum                        10535       20896       31431.000
Count                      75          145         220
Confidence Level(95.0%)    2.875       2.034       1.664
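As a check on the Difference of Means Test, the pooled-variance t statistic can be recomputed from the descriptive statistics alone. The sketch below is a minimal illustration under that assumption, not the Excel procedure itself; the function name `pooled_t` is hypothetical. It also squares t, because with only two groups the F ratio from a single factor ANOVA equals t squared, which is the link between the two approaches.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t statistic assuming equal variances (hypothetical helper)."""
    df = n1 + n2 - 2
    # Pooled variance: average of the sample variances, weighted by their d.f.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df, pooled_var

# Females vs. males: mean, sample variance, count from the descriptives
t_stat, df, pooled_var = pooled_t(140.467, 156.144, 75, 144.110, 153.613, 145)
f_equiv = t_stat ** 2  # with two groups, the ANOVA F ratio equals t squared

print(f"pooled variance = {pooled_var:.3f}, df = {df}")
print(f"t = {t_stat:.4f}, t squared = {f_equiv:.4f}")
```

A negative t here simply reflects that the female mean was entered first and is the smaller of the two.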

Mean   Variance   Observations   Pooled Variance   Hypothesized Mean Difference   df   t Stat   P(T ...

... > 6.01   Reject Ho: µ1 = µ2 = µ3

   

c.  R-square is a measure of the explanatory ability of the model. It is calculated as a proportion of:

R2 = SST / SSTotal     (SST is the treatment sum of squares)

The interpretation of R-Square is how much of the variability in the dependent variable is "explained" by the independent variable (the car models). It ranges from 0 to 1.0, with 1.0 meaning that all the variability of the dependent variable is explained by the independent variables. Calculate and interpret R-square for this model.

R2 = 14.149/19.826 = .71366     71.4% of the variability in cost is explained by the Car Type
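The R-square arithmetic can be sketched in a few lines. This is only an illustration using the two sums of squares quoted in the answer; the function name `r_square` is hypothetical, and the error sum of squares is derived here by subtraction rather than taken from the output.

```python
def r_square(ss_treatment, ss_total):
    """Proportion of total variability explained by the factor (hypothetical helper)."""
    return ss_treatment / ss_total

ss_treatment = 14.149  # sum of squares for Car Type, from the answer
ss_total = 19.826      # total sum of squares, from the answer
ss_error = ss_total - ss_treatment  # derived: what the factor does not explain

r2 = r_square(ss_treatment, ss_total)
print(f"R-square = {r2:.5f} ({r2:.1%} of the variability explained)")
# Equivalent form: one minus the unexplained share
print(f"1 - SSE/SSTotal = {1 - ss_error / ss_total:.5f}")
```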

3.  This is a study to see the effects of three different pesticides. It is actually a block design, but we will ignore the block effect and save that for later. The researcher hypothesized that the three insecticides would have different impacts on the number of seedlings in a row.

Factor = INSECTICIDE: the levels = 1, 2, 3 (Note: even though these are numbers, this is a nominal level variable. I could have labeled them A, B, C or some other name).

She measured the number of seedlings in a row.

Response Variable = SEEDLINGS.

The following are the descriptive statistics for the Response Variable, along with a box plot and means for each insecticide.

Seedlings

Quantiles

Quantile              Value
100.0%   maximum      94
99.5%                 94
97.5%                 94
90.0%                 93.7
75.0%    quartile     84.5
50.0%    median       79
25.0%    quartile     63
10.0%                 50.4
2.5%                  48
0.5%                  48
0.0%     minimum      48

Summary Statistics
Mean                   75
Std Dev                14.447397
Std Err Mean           4.1706042
Upper 95% Mean         84.179438
Lower 95% Mean         65.820562
N                      12
Sum                    900
Variance               208.72727
Skewness               -0.529182
Kurtosis               -0.593126
CV                     19.263196
N Missing              0
Median                 79
Mode                   83
Range                  46
Interquartile Range    21.5

Stem and Leaf
Stem   Leaf   Count
9      34     2
8      5      1
8      033    3
7      8      1
7      2      1
6      6      1
6      2      1
5      6      1
5
4      8      1
4|8 represents 48

Oneway Analysis of Seedlings By Insecticide
[Plot: Seedlings (axis 50 to 90) by Insecticide (1, 2, 3)]

Means and Std Deviations
Level   Number   Mean      Std Dev   Std Err Mean   Lower 95%   Upper 95%
1       4        58.0000   7.83156   3.9158         45.538      70.462
2       4        87.0000   7.78888   3.8944         74.606      99.394
3       4        80.0000   5.71548   2.8577         70.905      89.095
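The stem-and-leaf display carries enough information to recover all 12 observations, so the summary statistics can be checked directly. The sketch below assumes the leaves are read as 4|8 = 48, as the display's legend states, and uses only Python's standard `statistics` module; it is a cross-check, not the JMP computation.

```python
import statistics

# Observations read off the stem-and-leaf display (4|8 represents 48)
seedlings = [48, 56, 62, 66, 72, 78, 80, 83, 83, 85, 93, 94]

n = len(seedlings)
total = sum(seedlings)
mean = statistics.mean(seedlings)
std_dev = statistics.stdev(seedlings)      # sample standard deviation (n - 1)
variance = statistics.variance(seedlings)  # sample variance (n - 1)
std_err = std_dev / n ** 0.5               # standard error of the mean
median = statistics.median(seedlings)
mode = statistics.mode(seedlings)
value_range = max(seedlings) - min(seedlings)

print(f"N = {n}, Sum = {total}, Mean = {mean}")
print(f"Std Dev = {std_dev:.6f}, Variance = {variance:.5f}, Std Err = {std_err:.7f}")
print(f"Median = {median}, Mode = {mode}, Range = {value_range}")
```

Note that the variance times its 11 d.f. gives the total sum of squares that appears as C. Total in the ANOVA table for this response.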

a.  Look at the data and graphs and briefly summarize the average seedlings for the different insecticides. Note: the Box Plots show the spread of the data around the median.

Insecticide 1 has a much lower average number of seedlings than insecticides 2 and 3 (58.0 compared with 87.0 and 80.0). The spreads of the three groups are similar (standard deviations of 7.83, 7.79, and 5.72).

b.  The following is the output from a JMP ANOVA. Fill in the blanks in the Analysis of Variance Table. There are 5 numbers to calculate: Error SS; Insecticide d.f.; MSE; F*; and R2.

Summary of Fit
Rsquare                       0.798
Adj Rsquare                   0.753
Root Mean Square Error        7.180
Mean of Response              75.000
Observations (or Sum Wgts)    12.000

Analysis of Variance
Source        DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Insecticide   2    1832.00          916.000       17.7672   0.0007*
Error         9    464.00           51.556
C. Total      11   2296.00

Error SS:          SSTotal - SSTreatment = 2296.00 - 1832.00 = 464.00
Insecticide d.f.:  k = 3 groups, so k-1 = 2 d.f.
MSE:               464/9 = 51.556
F Ratio (F*):      MSTreatment/MSE = 916.000/51.556 = 17.7672
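The same table can be rebuilt from the group summaries alone (number, mean, and standard deviation for each insecticide), which makes the relationship between the pieces explicit. This is a sketch under the assumption that the reported standard deviations are the usual n-1 sample values; small rounding differences from the JMP output are expected.

```python
# Group summaries from "Means and Std Deviations": (n, mean, std dev)
groups = {
    1: (4, 58.0, 7.83156),
    2: (4, 87.0, 7.78888),
    3: (4, 80.0, 5.71548),
}

n_total = sum(n for n, _, _ in groups.values())
k = len(groups)
grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total

# Between-group (treatment) and within-group (error) sums of squares
ss_treatment = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
ss_error = sum((n - 1) * s ** 2 for n, _, s in groups.values())
ss_total = ss_treatment + ss_error

df_treatment = k - 1
df_error = n_total - k
ms_treatment = ss_treatment / df_treatment
mse = ss_error / df_error
f_ratio = ms_treatment / mse
r_square = ss_treatment / ss_total

print(f"Insecticide  df={df_treatment}  SS={ss_treatment:.2f}  MS={ms_treatment:.3f}  F={f_ratio:.4f}")
print(f"Error        df={df_error}  SS={ss_error:.2f}  MS={mse:.3f}")
print(f"C. Total     df={n_total - 1}  SS={ss_total:.2f}   R-square={r_square:.3f}")
```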

Also, calculate R2 for this model: SSTreatment/SSTotal = 1832.00/2296.00 = .798

c.  Conduct a Test to see if there is a difference among the means of insecticides 1, 2, and 3. Use an F-test with α = .01. You will need to look up the critical value of F for 2 and 9 d.f. at alpha = .01.

Null Hypothesis:          Ho: µ1 = µ2 = µ3
Alternative Hypothesis:   Ha: at least two of the means differ
Assumptions of Test:      Small samples from normal distributions; equal variances
Test Statistic:           F* = 17.7672
Rejection Region:         F.01, 2, 9 d.f. = 8.02
Comparison of Test Statistic with Rejection Region:
                          F* > F.01, 2, 9 d.f.     17.7672 > 8.02     Reject Ho: µ1 = µ2 = µ3
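The critical value and the Prob > F figure can be reproduced without a table in this particular case. The sketch below relies on a closed form that holds only when the numerator d.f. is exactly 2, where P(F > x) = (1 + 2x/d2)^(-d2/2); it is not a general F routine, and both function names are hypothetical.

```python
def f_upper_tail_num_df_2(f_value, df_error):
    """P(F > f_value) for an F distribution with 2 numerator d.f.

    Closed form valid only for 2 numerator d.f. (hypothetical helper).
    """
    return (1 + 2 * f_value / df_error) ** (-df_error / 2)

def f_critical_num_df_2(alpha, df_error):
    """F value with upper-tail area alpha, again only for 2 numerator d.f."""
    return (df_error / 2) * (alpha ** (-2 / df_error) - 1)

f_star, df_error, alpha = 17.7672, 9, 0.01

critical = f_critical_num_df_2(alpha, df_error)
p_value = f_upper_tail_num_df_2(f_star, df_error)
reject = f_star > critical

print(f"critical F = {critical:.2f}, p-value = {p_value:.4f}")
print("Reject Ho" if reject else "Fail to reject Ho")
```

Comparing F* with the critical value and comparing the p-value with alpha are the same decision rule stated two ways.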

This is the full JMP output for the same ANOVA, including some summary statistics and differences of means. Once we establish there is something going on in the model (at least one mean is different), we should ask which means are different. Based on the results, we can see that Insecticides 2 and 3 are both different from insecticide 1, but they are not significantly different from each other. The last test, using Tukey-Kramer's HSD, is given at the bottom of the output.

Oneway Analysis of Seedlings By Insecticide

[Plot: Seedlings (axis 50 to 90) by Insecticide (1, 2, 3); All Pairs Tukey-Kramer, 0.05]

Oneway Anova

Summary of Fit
Rsquare                       0.798
Adj Rsquare                   0.753
Root Mean Square Error        7.180
Mean of Response              75.000
Observations (or Sum Wgts)    12.000

Analysis of Variance
Source        DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Insecticide   2    1832.00          916.000       17.7672   0.0007*
Error         9    464.00           51.556
C. Total      11   2296.00

Means for Oneway Anova
Level   Number   Mean      Std Error   Lower 95%   Upper 95%
1       4        58.0000   3.5901      49.879      66.121
2       4        87.0000   3.5901      78.879      95.121
3       4        80.0000   3.5901      71.879      88.121
Std Error uses a pooled estimate of error variance

Means and Std Deviations
Level   Number   Mean      Std Dev   Std Err Mean   Lower 95%   Upper 95%
1       4        58.0000   7.83156   3.9158         45.538      70.462
2       4        87.0000   7.78888   3.8944         74.606      99.394
3       4        80.0000   5.71548   2.8577         70.905      89.095
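The two means tables differ only in how the standard error is estimated. A minimal sketch of the difference follows, assuming the tabled t values 2.262 (9 d.f.) and 3.182 (3 d.f.) for 95% intervals; those two constants are inputs supplied here, not part of the JMP output.

```python
import math

mse, n_per_group = 51.556, 4   # from the ANOVA table; 4 observations per level
t_crit_df9 = 2.262             # t(.025, 9 d.f.), assumed table value
t_crit_df3 = 3.182             # t(.025, 3 d.f.), assumed table value

means = {1: 58.0, 2: 87.0, 3: 80.0}
std_devs = {1: 7.83156, 2: 7.78888, 3: 5.71548}

# Pooled: one standard error for every level, based on MSE with 9 d.f.
pooled_se = math.sqrt(mse / n_per_group)
pooled_ci = {
    level: (m - t_crit_df9 * pooled_se, m + t_crit_df9 * pooled_se)
    for level, m in means.items()
}

# Separate: each level uses its own standard deviation with 3 d.f.
separate_se = {level: s / math.sqrt(n_per_group) for level, s in std_devs.items()}
separate_ci = {
    level: (means[level] - t_crit_df3 * se, means[level] + t_crit_df3 * se)
    for level, se in separate_se.items()
}

for level in means:
    print(level, [round(x, 3) for x in pooled_ci[level]],
          [round(x, 3) for x in separate_ci[level]])
```

The pooled intervals are narrower here mainly because the error variance is estimated with 9 d.f. instead of 3.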

Means Comparisons
Comparisons for all pairs using Tukey-Kramer HSD

Confidence Quantile
q*        Alpha
2.79201   0.05

LSD Threshold Matrix
Abs(Dif)-HSD   2         3         1
2              -14.176   -7.176    14.824
3              -7.176    -14.176   7.824
1              14.824    7.824     -14.176
Positive values show pairs of means that are significantly different.

Connecting Letters Report
Level   Letter   Mean
2       A        87.000000
3       A        80.000000
1       B        58.000000
Levels not connected by same letter are significantly different.

Ordered Differences Report
Level   - Level   Difference   Std Err Dif   Lower CL   Upper CL   p-Value
2       1         29.00000     5.077182      14.8245    43.17554   0.0008*
3       1         22.00000     5.077182      7.8245     36.17554   0.0048*
2       3         7.00000      5.077182      -7.1755    21.17554   0.3911

The results of the multiple comparisons (3 comparisons: insecticide 1 to 2, 1 to 3, and 2 to 3) indicate there is a significant difference between insecticides 1 and 2, as well as between 1 and 3. Insecticides 2 and 3 both yield a significantly higher number of seedlings compared with insecticide 1. However, there is no significant difference between insecticides 2 and 3. We can tell this from either of these two aspects of the report:
• Using the connecting letters, levels 2 and 3 both have an A, indicating no difference between these two levels, but level 1 has a single B, indicating it is different from the others.
• If we look at the confidence intervals for the differences, the intervals for 1 versus 2 and for 1 versus 3 do not contain zero, while the interval for 2 versus 3 does contain zero.
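The intervals in the Ordered Differences Report can be rebuilt from MSE, the group size, and the q* quantile that JMP prints. The sketch below takes q* = 2.79201 as given from the output and does not compute it; it illustrates the interval arithmetic only, not JMP's internal procedure.

```python
import math

mse, n_per_group = 51.556, 4   # from the ANOVA table; 4 observations per level
q_star = 2.79201               # "Confidence Quantile q*" from the JMP output
means = {1: 58.0, 2: 87.0, 3: 80.0}

# Standard error of a difference between two means with equal group sizes
se_diff = math.sqrt(mse * (1 / n_per_group + 1 / n_per_group))
margin = q_star * se_diff      # half-width of each Tukey-Kramer interval

results = {}
for first, second in [(2, 1), (3, 1), (2, 3)]:
    diff = means[first] - means[second]
    lower, upper = diff - margin, diff + margin
    # A pair differs significantly when its interval excludes zero
    significant = lower > 0 or upper < 0
    results[(first, second)] = (diff, lower, upper, significant)
    print(f"{first} vs {second}: diff={diff:.1f}, "
          f"CI=({lower:.4f}, {upper:.4f}), significant={significant}")
```

Subtracting the margin from each absolute difference gives the entries of the LSD Threshold Matrix, so a positive entry there and an interval that excludes zero are the same finding.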