ANOVA: Analysis of Variance Marc H. Mehlman
[email protected] University of New Haven
“The analysis of variance is (not a mathematical theorem but) a simple method of arranging arithmetical facts so as to isolate and display the essential features of a body of data with the utmost simplicity.” – Sir Ronald A. Fisher
Marc Mehlman (University of New Haven)
ANOVA: Analysis of Variance
Table of Contents
1. ANOVA: One Way Layout
2. Comparing Means
3. ANOVA: Two Way Layout
4. Chapter #12 R Assignment
ANOVA (analysis of variance) tests whether the means of k different populations are equal when all the populations are independent, normal and have the same unknown variance. An ANOVA test compares the randomness (variance) within groups (populations) to the randomness between groups. To test if the means of all the populations are equal, one uses the ratio
(variance between groups) / (variance within groups)
as a test statistic. A large ratio indicates a difference in means between the groups.
ANOVA: One Way Layout
The Idea of ANOVA
The sample means for the three samples are the same for each set. The variation among sample means for (a) is identical to (b). The variation among the individuals within the three samples is much less for (b).
CONCLUSION: the samples in (b) contain a larger amount of variation among the sample means relative to the amount of variation within the samples, so ANOVA will find more significant differences among the means in (b), assuming equal sample sizes for (a) and (b).
Note: larger samples will find more significant differences.
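The idea can be sketched in R with made-up numbers. The data below are hypothetical (they are not the data behind panels (a) and (b)): both sets share the same three group means, but the second set has one third of the within-group spread, so its F ratio is larger.

```r
# Hypothetical data (not from the slides): three groups of five observations.
centers <- c(10, 12, 14)              # the same group means in both sets
wide    <- c(-3, -1.5, 0, 1.5, 3)     # large within-group spread, like (a)
narrow  <- wide / 3                   # small within-group spread, like (b)

group <- factor(rep(1:3, each = 5))
set_a <- rep(centers, each = 5) + rep(wide, times = 3)
set_b <- rep(centers, each = 5) + rep(narrow, times = 3)

# F statistic = (variance between groups) / (variance within groups)
f_stat <- function(y, g) anova(lm(y ~ g))[1, "F value"]

f_a <- f_stat(set_a, group)
f_b <- f_stat(set_b, group)
print(c(a = f_a, b = f_b))            # the set with less within-group spread has the larger F
```

Because the between-group variation is identical in the two sets, the only thing that changes the F ratio here is the within-group variation.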
Note: When k = 2, one usually uses the two–sample t test. However, ANOVA will give the same result as the two–sided pooled two–sample t test (in that case F = t^2). When k > 2, hypothesis testing two populations at a time does not work well. For instance, if one has four populations and each test is at significance level 0.05, then the overall significance level of all $\binom{4}{2} = 6$ tests would be $1 - (1 - 0.05)^6 = 0.265$ (treating the six tests as independent). The ANOVA procedure is computationally intense; one usually uses a computer program.
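The arithmetic in this note can be checked with a few lines of R. This is only a sketch; it treats the pairwise tests as independent, which is an assumption made to keep the calculation simple.

```r
k     <- 4                       # number of populations
alpha <- 0.05                    # significance level of each pairwise test
m     <- choose(k, 2)            # number of pairwise tests
fwer  <- 1 - (1 - alpha)^m       # chance of at least one false rejection, assuming independence
print(m)
print(round(fwer, 3))
```

Changing `k` shows how quickly the overall error rate grows as the number of populations increases.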
Assumptions for doing ANOVA
1. the populations are normal.
2. the populations have the same (unknown) variance.
The above conditions are robust in the sense that one can use ANOVA if the populations are approximately normal and the population variances are approximately equal. If the populations are not approximately normal, one can use the Kruskal–Wallis Test (a nonparametric test) instead.
Convention: Rule for establishing equal variance
If the largest sample standard deviation is less than twice the smallest sample standard deviation, one can use ANOVA techniques under the assumption the variances are all the same. Some textbooks use four times the smallest sample variance instead of just twice.
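The standard deviation rule is easy to apply in R. The helper name `equal_variance_rule` and the data below are hypothetical, chosen only to illustrate the check.

```r
# Hypothetical helper: TRUE when the largest sample standard deviation
# is less than twice the smallest sample standard deviation.
equal_variance_rule <- function(y, g) {
  s <- tapply(y, g, sd)          # one sample standard deviation per level
  max(s) < 2 * min(s)
}

# Hypothetical data: three levels with four observations each
y <- c(5.1, 6.3, 5.8, 6.0,
       7.2, 8.9, 7.5, 8.1,
       4.4, 5.0, 4.9, 5.6)
g <- factor(rep(c("A", "B", "C"), each = 4))

print(tapply(y, g, sd))
print(equal_variance_rule(y, g))
```

If the rule fails, or the populations look far from normal, `kruskal.test(y ~ g)` runs the Kruskal–Wallis Test on the same data.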
The Treatment or Factor is what differs between populations.
Example
A blood pressure drug is administered to k populations in k different doses. One samples from each of the k populations.
\[
\begin{array}{ll}
\text{dosage \#1} & X_{11}, \cdots, X_{1 n_1} \\
\quad\vdots & \quad\vdots \\
\text{dosage \#}k & X_{k1}, \cdots, X_{k n_k}
\end{array}
\]
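Definition
Let
\begin{align*}
k &\overset{\text{def}}{=} \text{\# of levels (populations)} \\
n_j &\overset{\text{def}}{=} \text{sample size of random sample from $j$th population} \\
N &\overset{\text{def}}{=} n_1 + n_2 + \cdots + n_k = \text{total number of random variables} \\
\bar{x}_j &\overset{\text{def}}{=} \text{sample mean from $j$th population} \\
s_j^2 &\overset{\text{def}}{=} \text{sample variance from $j$th population} \\
\bar{x} &\overset{\text{def}}{=} \frac{1}{N} \sum_{i=1}^{k} \sum_{j=1}^{n_i} x_{ij} = \text{the grand mean}
\end{align*}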
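Definition
\begin{align*}
SSTOT &\overset{\text{def}}{=} \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2 = \text{Sum of Squares Total} \\
SSA &\overset{\text{def}}{=} \text{Sum of Squares between levels} \\
    &= n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 + \cdots + n_k(\bar{x}_k - \bar{x})^2 \\
SSE &\overset{\text{def}}{=} \text{Sum of Squares within the levels} \\
    &= (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2
\end{align*}
Theorem
SSTOT = SSA + SSE.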
Definition
\begin{align*}
MSA &\overset{\text{def}}{=} \text{Mean Squares between levels (groups)} \\
    &= \frac{SSA}{k-1} = \frac{n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 + \cdots + n_k(\bar{x}_k - \bar{x})^2}{k-1}. \\
MSE &\overset{\text{def}}{=} \text{Mean Squares within the levels} = \text{pooled sample variance} = \text{Mean Squared Error} \\
    &= \frac{SSE}{N-k} = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2}{N-k}.
\end{align*}
Theorem
The Mean Square Error, MSE, is an unbiased estimator of $\sigma^2$.
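The sums of squares and mean squares can be computed directly from their definitions and compared with R's own ANOVA table. The data below are hypothetical and only serve to illustrate the formulas and the identity SSTOT = SSA + SSE.

```r
# Hypothetical data: k = 3 levels with unequal sample sizes
y <- c(4, 6, 5,   7, 9, 8, 8,   3, 5, 4, 6, 2)
g <- factor(rep(c("L1", "L2", "L3"), times = c(3, 4, 5)))

n     <- tapply(y, g, length)    # n_j
xbar  <- tapply(y, g, mean)      # sample mean of each level
s2    <- tapply(y, g, var)       # sample variance of each level
k     <- nlevels(g)
N     <- length(y)
grand <- mean(y)                 # the grand mean

SSTOT <- sum((y - grand)^2)
SSA   <- sum(n * (xbar - grand)^2)
SSE   <- sum((n - 1) * s2)
MSA   <- SSA / (k - 1)
MSE   <- SSE / (N - k)

tab <- anova(lm(y ~ g))          # R's table, for comparison
print(c(SSTOT = SSTOT, SSA = SSA, SSE = SSE, MSA = MSA, MSE = MSE))
print(tab)
```

The "Mean Sq" column of `tab` should agree with the hand-computed `MSA` and `MSE`.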
Theorem (ANOVA F Test)
To test $H_0: \mu_1 = \cdots = \mu_k$ vs $H_A$: not $H_0$, use the test statistic
\[
F = \frac{MSA}{MSE} \sim F(k-1, N-k) \quad \text{under } H_0.
\]
Not $H_0$ ⇒ F large, so use a right tail test.
One creates an ANOVA table:
Source    df      SS      MS    F         p
Between   k − 1   SSA     MSA   MSA/MSE   P(F(k − 1, N − k) ≥ f)
Within    N − k   SSE     MSE
Total     N − 1   SSTOT
Example
Judges at the Parisian photography contest, FotoGras, numerically scored photographs submitted by a number of photographers on a scale 0–10. A One–Way ANOVA Test was performed to see whether the brand of camera a photograph was taken with had anything to do with the judges' numerical scores. A summary of the data is given below:
Brand     Sample Size   Sample Mean   Sample Variance
Canon     11            7.6           2.1
Nikon     9             8.0           3.3
Pentax    5             8.7           2.9
Samsung   3             8.3           2.0
Sony      8             8.0           1.9
The scores awarded for each brand were verified as being (mostly) normally distributed and independent of the scores awarded for other brands. Create an ANOVA Table from the scores and decide whether there was no “brand effect” at a 0.05 significance level.
Example (cont.)
Solution: Since the largest sample standard deviation, $\sqrt{3.3}$, is less than twice the smallest sample standard deviation, $\sqrt{1.9}$, we can assume the population variances are all the same.
\begin{align*}
k &= 5 \\
N &= 11 + 9 + 5 + 3 + 8 = 36 \\
\bar{x} &= \frac{11(7.6) + 9(8.0) + 5(8.7) + 3(8.3) + 8(8.0)}{36} = 8.0 \\
SSA &= 11(7.6 - 8.0)^2 + 9(8.0 - 8.0)^2 + 5(8.7 - 8.0)^2 + 3(8.3 - 8.0)^2 + 8(8.0 - 8.0)^2 = 4.48 \\
SSE &= (11 - 1)2.1 + (9 - 1)3.3 + (5 - 1)2.9 + (3 - 1)2.0 + (8 - 1)1.9 = 76.3 \\
SSTOT &= SSA + SSE = 4.48 + 76.3 = 80.78 \\
MSA &= \frac{SSA}{k-1} = \frac{4.48}{5-1} = 1.12 \\
MSE &= \frac{SSE}{N-k} = \frac{76.3}{36-5} = 2.46129 \\
f &= \frac{MSA}{MSE} = \frac{1.12}{2.46129} = 0.4550459 \\
\text{p–value} &= P(F(4, 31) \ge f) = 0.7679706
\end{align*}
Source    df   SS      MS        F         p
Between   4    4.48    1.12      0.45505   0.76797
Within    31   76.3    2.46129
Total     35   80.78
Since the p–value is greater than 0.05, one accepts (fails to reject) the hypothesis that there is no “brand” effect.
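The FotoGras table can be reproduced in R from the summary statistics alone. This is a sketch that simply applies the definitions of SSA, SSE, MSA and MSE; the variable names are illustrative.

```r
# Summary statistics from the FotoGras example
brand <- c("Canon", "Nikon", "Pentax", "Samsung", "Sony")
n     <- c(11, 9, 5, 3, 8)             # sample sizes
xbar  <- c(7.6, 8.0, 8.7, 8.3, 8.0)    # sample means
s2    <- c(2.1, 3.3, 2.9, 2.0, 1.9)    # sample variances

k     <- length(n)
N     <- sum(n)
grand <- sum(n * xbar) / N             # grand mean

SSA <- sum(n * (xbar - grand)^2)       # between levels
SSE <- sum((n - 1) * s2)               # within levels
MSA <- SSA / (k - 1)
MSE <- SSE / (N - k)
f   <- MSA / MSE
p   <- pf(f, k - 1, N - k, lower.tail = FALSE)   # right tail of F(k - 1, N - k)

anova_table <- data.frame(
  Source = c("Between", "Within", "Total"),
  df     = c(k - 1, N - k, N - 1),
  SS     = c(SSA, SSE, SSA + SSE),
  MS     = c(MSA, MSE, NA),
  F      = c(f, NA, NA),
  p      = c(p, NA, NA)
)
print(anova_table)
```

The raw scores are not needed: for a one–way layout the sample sizes, means and variances carry all the information the F test uses.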
Example
Given data on carpet durability
> cdat=read.table("carpet.dat",h=TRUE)
> cdat
Durability Carpet
     18.95      1
     12.62      1
     11.94      1
     14.42      1
     10.06      2
      7.19      2
      7.03      2
     14.66      2
     10.92      3
     13.28      3
     14.52      3
     12.51      3
     10.46      4
     21.40      4
     18.10      4
     22.50      4
Test if durability depends on which carpet type one chooses.
Example (continued)
> cdat=read.table("carpet.dat",h=TRUE)
> Carpet.F = as.factor(cdat$Carpet) # change to a categorical variable
> g.lm=lm(cdat$Durability~Carpet.F)
> anova(g.lm)
Analysis of Variance Table

Response: cdat$Durability
          Df  Sum Sq Mean Sq F value  Pr(>F)
Carpet.F   3 146.374  48.791  3.5815 0.04674 *
Residuals 12 163.477  13.623
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> kruskal.test(cdat$Durability~Carpet.F) # Kruskal–Wallis Test

        Kruskal-Wallis rank sum test

data:  cdat$Durability by Carpet.F
Kruskal-Wallis chi-squared = 5.2059, df = 3, p-value = 0.1573
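For readers without the file carpet.dat, the same analysis can be sketched by typing the sixteen durability values into a data frame. Entering the data inline is an assumption made here for convenience; the slides read it from a file.

```r
# Carpet durability data typed in directly (four carpets of each type)
cdat <- data.frame(
  Durability = c(18.95, 12.62, 11.94, 14.42,
                 10.06,  7.19,  7.03, 14.66,
                 10.92, 13.28, 14.52, 12.51,
                 10.46, 21.40, 18.10, 22.50),
  Carpet = factor(rep(1:4, each = 4))   # categorical variable with 4 levels
)

tab <- anova(lm(Durability ~ Carpet, data = cdat))     # one-way ANOVA table
kw  <- kruskal.test(Durability ~ Carpet, data = cdat)  # nonparametric alternative

f_value <- tab[1, "F value"]
p_anova <- tab[1, "Pr(>F)"]
p_kw    <- kw$p.value
print(tab)
print(kw)
```

At the 0.05 level the ANOVA F test rejects equality of the four means while the Kruskal–Wallis Test does not, so with only four carpets per type the conclusion depends on whether the normality assumption is trusted.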
Comparing Means
If H0 is rejected, i.e. not all means are equal, how do you find how the population means differ from each other?
Answer:
- boxplots (all in one graph).
- multiple comparison methods such as the Bonferroni Multiple Comparison Test.
- contrasts.
Contrasts can only be used when there are clear expectations before starting the experiment. Contrasts are planned comparisons.
Multiple comparisons should be used when there are no justified expectations. These are pair–wise tests of significance.
Continuing with the carpet durability example, using R one can create boxplots:
> boxplot(cdat$Durability[1:4], cdat$Durability[5:8], cdat$Durability[9:12], cdat$Durability[13:16])
[Figure: side-by-side boxplots of durability (vertical axis, roughly 10 to 20) for carpet types 1, 2, 3 and 4]
It seems that type 4 carpet is the most durable and type 2 is the least durable, but both of these types have more variability in durability than types 1 and 3. One should be careful about how strongly one uses the word “seems”, as only four carpets of each type were used.
Definition
A least significant differences (LSD) method is a multiple–comparisons procedure that tests each pair of levels and rejects $H_0: \mu_1 = \cdots = \mu_k$ if any of the $\binom{k}{2}$ tests is significant. The Bonferroni Multiple Comparison Test is an LSD method.
Theorem (Bonferroni Multiple Comparison Test)
To test $H_0$ at the $\alpha$ significance level, for every $1 \le i < j \le k$:
Step #1 calculate the test statistic
\[
t_{ij} = \frac{\bar{x}_j - \bar{x}_i}{\sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}} \sim t(N-k).
\]
Step #2 test whether the means of levels i and j are equal at the $\alpha / \binom{k}{2}$ level using a two–sided test with the test statistic $t_{ij}$.
If any of the $\binom{k}{2}$ tests are significant, reject $H_0$. Otherwise accept $H_0$.
Example
> pairwise.t.test(cdat$Durability, Carpet.F, "bonferroni")

        Pairwise comparisons using t tests with pooled SD

data:  cdat$Durability and Carpet.F

  1     2     3
2 0.564 -     -
3 1.000 1.000 -
4 1.000 0.045 0.388

P value adjustment method: bonferroni
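One entry of this table can be recomputed by hand to see what the Bonferroni adjustment does. The sketch below types the carpet data in directly (an assumption, since the slides read it from carpet.dat) and compares carpet types 2 and 4: the two–sided p–value from the pooled t statistic is multiplied by the number of pairs, $\binom{4}{2} = 6$, and capped at 1.

```r
cdat <- data.frame(
  Durability = c(18.95, 12.62, 11.94, 14.42,
                 10.06,  7.19,  7.03, 14.66,
                 10.92, 13.28, 14.52, 12.51,
                 10.46, 21.40, 18.10, 22.50),
  Carpet = factor(rep(1:4, each = 4))
)

fit   <- lm(Durability ~ Carpet, data = cdat)
df_e  <- df.residual(fit)                      # N - k
MSE   <- sum(residuals(fit)^2) / df_e          # pooled sample variance
groups <- split(cdat$Durability, cdat$Carpet)
xbar  <- sapply(groups, mean)
n     <- sapply(groups, length)
m     <- choose(nlevels(cdat$Carpet), 2)       # number of pairwise tests

i <- "2"; j <- "4"                             # the pair of levels to compare
t_ij   <- (xbar[[j]] - xbar[[i]]) / sqrt(MSE * (1 / n[[i]] + 1 / n[[j]]))
p_raw  <- 2 * pt(abs(t_ij), df_e, lower.tail = FALSE)   # two-sided p-value
p_bonf <- min(1, m * p_raw)                    # Bonferroni-adjusted p-value

pw <- pairwise.t.test(cdat$Durability, cdat$Carpet, p.adjust.method = "bonferroni")
print(c(by_hand = p_bonf, from_R = pw$p.value["4", "2"]))
```

Comparing the adjusted p–value with α is the same as comparing the unadjusted p–value with $\alpha / \binom{k}{2}$, which is how Step #2 of the theorem is phrased.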
Theorem (Simultaneous Confidence Intervals for Differences between Means)
Simultaneous confidence intervals for all differences $\mu_i - \mu_j$ between population means have the form
\[
(\bar{x}_i - \bar{x}_j) \pm t^{**} \sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)},
\]
where $t^{**}$ is the critical value from $t(N-k)$ at the $\alpha / \binom{k}{2}$ level.
Equivalently to the Bonferroni Multiple Comparison Test, one can look at all the simultaneous confidence intervals: if even one of them does not contain 0, reject the hypothesis that all the population means are the same.
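A sketch of these intervals for the carpet data follows. It types the data in directly and assumes that the $\alpha / \binom{k}{2}$ level is split evenly between the two tails, so the critical value is the upper $\alpha / (2\binom{k}{2})$ quantile of $t(N-k)$; that reading matches the two–sided tests of the Bonferroni procedure.

```r
cdat <- data.frame(
  Durability = c(18.95, 12.62, 11.94, 14.42,
                 10.06,  7.19,  7.03, 14.66,
                 10.92, 13.28, 14.52, 12.51,
                 10.46, 21.40, 18.10, 22.50),
  Carpet = factor(rep(1:4, each = 4))
)

fit    <- lm(Durability ~ Carpet, data = cdat)
df_e   <- df.residual(fit)
MSE    <- sum(residuals(fit)^2) / df_e
groups <- split(cdat$Durability, cdat$Carpet)
xbar   <- sapply(groups, mean)
n      <- sapply(groups, length)

alpha <- 0.05
pairs <- combn(levels(cdat$Carpet), 2)     # each column is one pair of levels
m     <- ncol(pairs)                       # choose(k, 2) pairs
tcrit <- qt(1 - alpha / (2 * m), df_e)     # assumption: alpha/m split over two tails

ci <- t(apply(pairs, 2, function(p) {
  d    <- xbar[[p[1]]] - xbar[[p[2]]]
  half <- tcrit * sqrt(MSE * (1 / n[[p[1]]] + 1 / n[[p[2]]]))
  c(diff = d, lower = d - half, upper = d + half)
}))
rownames(ci) <- paste(pairs[1, ], pairs[2, ], sep = "-")

excludes_zero <- ci[, "lower"] > 0 | ci[, "upper"] < 0
print(round(ci, 3))
print(names(which(excludes_zero)))         # pairs whose interval misses 0
```

With α = 0.05 only the interval for types 2 and 4 misses 0, in agreement with the single Bonferroni-adjusted p–value below 0.05 for that pair.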
Definition
A contrast is a linear combination of population means, $\Psi = \sum_{j=1}^{k} a_j \mu_j$, where $\sum_{j=1}^{k} a_j = 0$. The corresponding sample contrast is $C = \sum_{j=1}^{k} a_j \bar{x}_j$. The standard error of C is defined to be
\[
SE_C \overset{\text{def}}{=} s_p \sqrt{\sum_{j=1}^{k} \frac{a_j^2}{n_j}} = \sqrt{MSE \sum_{j=1}^{k} \frac{a_j^2}{n_j}}.
\]
Theorem
To test $H_0: \Psi = 0$, use the test statistic
\[
t = \frac{C}{SE_C} \sim t(N-k).
\]
The alternative hypothesis can be one– or two–sided. A $(1 - \alpha)100\%$ confidence interval for the value of $\Psi$ is $c \pm t^{*}(N-k)\, SE_C$.
Example
Continuing with the carpet data, one tests if the average durability of the first two carpet types differs from the average durability of the last two carpet types. We will choose $a_1 = -0.5$, $a_2 = -0.5$, $a_3 = 0.5$, $a_4 = 0.5$ for our contrast. We already know MSE = 13.62 from the ANOVA table we calculated. Using $T \sim t(16 - 4)$ gives
\begin{align*}
c &= -0.5(14.4825) - 0.5(9.735) + 0.5(12.8075) + 0.5(18.115) = 3.3525 \\
SE_C &= \sqrt{13.62\left((-0.5)^2/4 + (-0.5)^2/4 + 0.5^2/4 + 0.5^2/4\right)} = 1.845264 \\
t &= 3.3525 / 1.845264 = 1.816813 \\
2P(T \ge t) &= 0.09428971.
\end{align*}
Thus the p–value of $H_0: \Psi = 0$ versus $H_A: \Psi \ne 0$ is 0.09428971. The calculations can all be done internally in R:
> install.packages("gmodels")
> library(gmodels)
> cdat=read.table("carpet.dat",h=TRUE)
> Carpet.F = as.factor(cdat$Carpet)
> g.lm=lm(cdat$Durability~Carpet.F)
> fit.contrast(g.lm, "Carpet.F", c(-0.5, -0.5, 0.5, 0.5))
                                 Estimate Std. Error  t value   Pr(>|t|)
Carpet.F c=( -0.5 -0.5 0.5 0.5 )   3.3525   1.845472 1.816609 0.09432261
The small differences from the hand calculation arise because R uses the unrounded MSE rather than 13.62.
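The same contrast can also be evaluated in base R without the gmodels package, straight from the definitions of C and SE_C. This is a sketch with the carpet data typed in directly, which is an assumption made so the block runs on its own.

```r
cdat <- data.frame(
  Durability = c(18.95, 12.62, 11.94, 14.42,
                 10.06,  7.19,  7.03, 14.66,
                 10.92, 13.28, 14.52, 12.51,
                 10.46, 21.40, 18.10, 22.50),
  Carpet = factor(rep(1:4, each = 4))
)

a <- c(-0.5, -0.5, 0.5, 0.5)            # contrast coefficients, summing to 0

fit    <- lm(Durability ~ Carpet, data = cdat)
df_e   <- df.residual(fit)              # N - k = 12
MSE    <- sum(residuals(fit)^2) / df_e  # unrounded mean squared error
groups <- split(cdat$Durability, cdat$Carpet)
xbar   <- sapply(groups, mean)
n      <- sapply(groups, length)

C_hat  <- sum(a * xbar)                 # sample contrast
SE_C   <- sqrt(MSE * sum(a^2 / n))      # standard error of the contrast
t_stat <- C_hat / SE_C
p_two  <- 2 * pt(abs(t_stat), df_e, lower.tail = FALSE)   # two-sided p-value

# 95% confidence interval for the contrast
ci95 <- C_hat + c(-1, 1) * qt(0.975, df_e) * SE_C
print(c(estimate = C_hat, se = SE_C, t = t_stat, p = p_two))
print(ci95)
```

Because the two–sided p–value is above 0.05, the 95% confidence interval for Ψ contains 0.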
ANOVA: Two Way Layout
Same assumptions as before plus
1. Treatment A has I levels.
2. Treatment B has J levels.
3. a balanced design, i.e. all sample sizes = K (the same).
One is interested in:
1. is there an effect for treatment A?
2. is there an effect for treatment B?
3. is there an effect for the interaction of treatments?
One can’t answer 3 if sample size = 1. Two–way ANOVA is more efficient than doing two one–way ANOVAs, plus it gives information about the interaction of the two factors.
Definition
\begin{align*}
SSA &\overset{\text{def}}{=} \text{Sum of Squares for Treatment A} \\
SSB &\overset{\text{def}}{=} \text{Sum of Squares for Treatment B} \\
SSAB &\overset{\text{def}}{=} \text{Sum of Squares of Non–Additive part} \\
SSE &\overset{\text{def}}{=} \text{Sum of Squares within treatments} \\
SSTOT &\overset{\text{def}}{=} \text{Total Sum of Squares}
\end{align*}
Here A and B are the two main effects from each of the two factors, and AB represents the interaction of factors A and B.
Theorem
SSTOT = SSA + SSB + SSAB + SSE.
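Definition
\begin{align*}
MSA &\overset{\text{def}}{=} \frac{SSA}{I-1} = \text{Mean Squares of Treatment A} \\
MSB &\overset{\text{def}}{=} \frac{SSB}{J-1} = \text{Mean Squares of Treatment B} \\
MSAB &\overset{\text{def}}{=} \frac{SSAB}{(I-1)(J-1)} = \text{Mean Squares of Non–Additive part} \\
MSE &\overset{\text{def}}{=} \frac{SSE}{N-IJ} = \text{Mean Squares within treatments}
\end{align*}
Theorem
MSE is an unbiased estimator of the population variance, $\sigma^2$.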
One creates a Two–Way ANOVA Table:
Source        df               SS      MS     F          p
Treatment A   I − 1            SSA     MSA    MSA/MSE    P(F(I − 1, N − IJ) ≥ observed F)
Treatment B   J − 1            SSB     MSB    MSB/MSE    P(F(J − 1, N − IJ) ≥ observed F)
Interaction   (I − 1)(J − 1)   SSAB    MSAB   MSAB/MSE   P(F((I − 1)(J − 1), N − IJ) ≥ observed F)
Error         N − IJ           SSE     MSE
Total         N − 1            SSTOT
Here
- The p–value in the first row is for a test of H0: there is no effect for treatment A versus HA: there is an effect.
- The p–value in the second row is for a test of H0: there is no effect for treatment B versus HA: there is an effect.
- The p–value in the third row is for a test of H0: there is no non–additive interactive effect for treatments A and B versus HA: there is an effect.
Example
Given data on carpet durability
> cdat=read.table("carpet.dat",h=TRUE)
> cdat
Durability Carpet Composition
     18.95      1           A
     12.62      1           B
     11.94      1           A
     14.42      1           B
     10.06      2           A
      7.19      2           B
      7.03      2           A
     14.66      2           B
     10.92      3           A
     13.28      3           B
     14.52      3           A
     12.51      3           B
     10.46      4           A
     21.40      4           B
     18.10      4           A
     22.50      4           B
Test if durability depends on which carpet and which composition one chooses.
Example
> cdat=read.table("carpet.dat",h=TRUE)
> Carpet.F=as.factor(cdat$Carpet)
> Composition.F=as.factor(cdat$Composition)
> gc3=lm(Durability~Carpet.F+Composition.F+Carpet.F:Composition.F,data=cdat)
> anova(gc3)
Analysis of Variance Table

Response: Durability
                       Df  Sum Sq Mean Sq F value  Pr(>F)
Carpet.F                3 146.374  48.791  4.0981 0.04912 *
Composition.F           1  17.222  17.222  1.4466 0.26347
Carpet.F:Composition.F  3  51.007  17.002  1.4281 0.30462
Residuals               8  95.247  11.906
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
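A self-contained sketch of the same two–way analysis is given below, with the data typed in rather than read from carpet.dat (an assumption made for convenience). The formula `Carpet * Composition` is R shorthand for both main effects plus their interaction. The last lines check the decomposition SSTOT = SSA + SSB + SSAB + SSE numerically.

```r
cdat <- data.frame(
  Durability  = c(18.95, 12.62, 11.94, 14.42,
                  10.06,  7.19,  7.03, 14.66,
                  10.92, 13.28, 14.52, 12.51,
                  10.46, 21.40, 18.10, 22.50),
  Carpet      = factor(rep(1:4, each = 4)),             # I = 4 levels
  Composition = factor(rep(c("A", "B"), times = 8))     # J = 2 levels
)

# Balanced design: K = 2 observations in each of the I * J = 8 cells
cell_counts <- table(cdat$Carpet, cdat$Composition)

fit2 <- lm(Durability ~ Carpet * Composition, data = cdat)
tab2 <- anova(fit2)
print(tab2)

# The four sums of squares add up to the total sum of squares
SSTOT <- sum((cdat$Durability - mean(cdat$Durability))^2)
print(c(sum_of_rows = sum(tab2[["Sum Sq"]]), SSTOT = SSTOT))
```

At the 0.05 level only the carpet type shows a significant effect; neither the composition nor the interaction does.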
Chapter #12 R Assignment
Enter the following in R to create the data.frame, “data”, that contains one factor with three levels.
> y1 = c(18.2, 20.1, 17.6, 16.8, 18.8, 19.7, 19.1)
> y2 = c(17.4, 18.7, 19.1, 16.4, 15.9, 18.4, 17.7)
> y3 = c(15.2, 18.8, 17.7, 16.5, 15.9, 17.1, 16.7)
> y = c(y1, y2, y3)
> group = rep(1:3, c(7, 7, 7))
> data = data.frame(y = y, group = factor(group))
1. Do a qqnorm plot for y1, y2 and y3 to check for normality.
2. Check to see if one can assume the population variances are all equal.
3. Make a boxplot showing y1, y2 and y3.
4. Create an ANOVA Table.
5. Give the results of the Kruskal–Wallis test.
The data file “data2way.csv”, found on math.newhaven.edu/marcmehlman/courses/estat/items.html, contains a hypothetical sample of 27 participants who are divided into three stress reduction treatment groups (mental, physical and medical) and three age groups (young, mid, and old). The stress reduction values are represented on a scale that ranges from 0 to 10. Read this data into R using data2way = read.csv("data2way.csv") and use it for the following problems.
6. Consider a test that the treatments have no effect on stress versus there is an effect. What is the p–value of this test?
7. Consider a test that age has no effect on stress versus there is an effect. What is the p–value of this test?
8. What is SSTOT?
9. What are the degrees of freedom for SSTOT?