ANOVA: Analysis of Variance

Marc H. Mehlman
University of New Haven

“The analysis of variance is (not a mathematical theorem but) a simple method of arranging arithmetical facts so as to isolate and display the essential features of a body of data with the utmost simplicity.”
– Sir Ronald A. Fisher


Table of Contents

1. ANOVA: One Way Layout
2. Comparing Means
3. ANOVA: Two Way Layout
4. Chapter #12 R Assignment


ANOVA (analysis of variance) tests whether the means of k different populations are equal when all the populations are independent, normal and have the same unknown variance. An ANOVA test compares the randomness (variance) within groups (populations) to the randomness between groups. To test whether the means of all the populations are equal, one considers the ratio

    (variance between groups) / (variance within groups)

as a test statistic. A large ratio indicates a difference in means between the groups.


ANOVA: One Way Layout


The Idea of ANOVA

[Figure: two sets of three samples, labeled (a) and (b).]

The sample means for the three samples are the same in each set, so the variation among sample means in (a) is identical to that in (b). The variation among the individuals within the three samples is much less in (b).

CONCLUSION: the samples in (b) contain a larger amount of variation among the sample means relative to the amount of variation within the samples, so ANOVA will find more significant differences among the means in (b), assuming equal sample sizes for (a) and (b).

Note: larger samples will find more significant differences.


Note: When $k = 2$, one usually uses the two-sample t test. However, ANOVA gives the same result as the pooled (equal-variance) two-sample t test. When $k > 2$, hypothesis testing two populations at a time does not work well. For instance, if one has four populations and each test has significance level 0.05, then the overall significance level of all $\binom{4}{2} = 6$ tests, treating the tests as independent, would be $1 - (1 - 0.05)^6 \approx 0.265$. The ANOVA procedure is computationally intensive, so one usually uses a computer program.
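Both claims in the note can be checked numerically. The sketch below first recomputes the overall significance level for six tests (assuming, as an approximation, that the tests are independent), then compares a pooled two-sample t test with a one-way ANOVA on two groups. The two small samples `x` and `y` are made up purely for illustration.

```r
# Number of pairwise tests among 4 populations
m <- choose(4, 2)

# Chance of at least one false rejection across m tests at level 0.05,
# ASSUMING the tests are independent (only an approximation for pairwise tests)
alpha <- 0.05
familywise <- 1 - (1 - alpha)^m

# k = 2: pooled two-sample t test versus one-way ANOVA (made-up data)
x <- c(5.1, 4.8, 6.0, 5.5, 5.2)
y <- c(6.3, 5.9, 6.8, 6.1, 7.0)
values <- c(x, y)
grp <- factor(rep(c("x", "y"), each = 5))

t_res <- t.test(x, y, var.equal = TRUE)   # pooled-variance t test
f_res <- anova(lm(values ~ grp))          # one-way ANOVA with two levels

t_squared <- unname(t_res$statistic)^2    # squared t statistic
f_value   <- f_res[["F value"]][1]        # ANOVA F statistic
```

With two groups the F statistic equals the square of the pooled t statistic, so the two p-values coincide.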


Assumptions for doing ANOVA

1. The populations are normal.
2. The populations have the same (unknown) variance.

The above conditions are robust in the sense that one can use ANOVA if the populations are approximately normal and the population variances are approximately equal. If the populations are not approximately normal, use the Kruskal-Wallis Test, a nonparametric test, instead.

Convention (rule for establishing equal variance): If the largest sample standard deviation is less than twice the smallest sample standard deviation, one can use ANOVA techniques under the assumption that the variances are all the same. Some textbooks use four times the smallest sample variance instead of just twice.
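The rule of thumb for equal variances is easy to automate. The following is a minimal sketch; the helper name `equal_variance_ok` and the sample values are hypothetical and not part of the course materials.

```r
# Hypothetical helper: TRUE when the largest sample standard deviation
# is less than twice the smallest sample standard deviation
equal_variance_ok <- function(values, groups) {
  sds <- tapply(values, groups, sd)   # one sample SD per level
  max(sds) < 2 * min(sds)
}

# Made-up illustrative data: three levels, four observations each
values <- c(18.2, 20.1, 17.6, 16.8,
            17.4, 18.7, 19.1, 16.4,
            15.2, 18.8, 17.7, 16.5)
groups <- factor(rep(1:3, each = 4))

rule_holds <- equal_variance_ok(values, groups)
```

If `rule_holds` is FALSE, the convention above suggests not relying on the equal-variance assumption.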


The Treatment or Factor is what differs between populations.

Example
A blood pressure drug is administered to k populations in k different doses. One samples from each of the k populations.

    dosage #1:  $X_{11}, \cdots, X_{1 n_1}$
       ...
    dosage #k:  $X_{k1}, \cdots, X_{k n_k}$


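Definition
Let

\[
\begin{aligned}
k &\overset{\text{def}}{=} \text{number of levels (populations)} \\
n_j &\overset{\text{def}}{=} \text{sample size of the random sample from the } j\text{th population} \\
N &\overset{\text{def}}{=} n_1 + n_2 + \cdots + n_k = \text{total number of random variables} \\
\bar{x}_j &\overset{\text{def}}{=} \text{sample mean from the } j\text{th population} \\
s_j^2 &\overset{\text{def}}{=} \text{sample variance from the } j\text{th population} \\
\bar{x} &\overset{\text{def}}{=} \frac{1}{N} \sum_{i=1}^{k} \sum_{j=1}^{n_i} x_{ij} = \text{the grand mean}
\end{aligned}
\]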


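Definition

\[
\begin{aligned}
SSTOT &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2 = \text{Sum of Squares Total} \\
SSA &\overset{\text{def}}{=} \text{Sum of Squares between levels} \\
    &= n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 + \cdots + n_k(\bar{x}_k - \bar{x})^2 \\
SSE &\overset{\text{def}}{=} \text{Sum of Squares within the levels} \\
    &= (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2
\end{aligned}
\]

Theorem
$SSTOT = SSA + SSE$.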
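The decomposition can be checked numerically. The sketch below uses the carpet durability values that appear in a later example of these slides, typed in directly so that the block does not depend on the file carpet.dat; the variable names are illustrative.

```r
# Carpet durability data (four observations for each of four carpet types)
durability <- c(18.95, 12.62, 11.94, 14.42,
                10.06,  7.19,  7.03, 14.66,
                10.92, 13.28, 14.52, 12.51,
                10.46, 21.40, 18.10, 22.50)
carpet <- factor(rep(1:4, each = 4))

n      <- tapply(durability, carpet, length)  # n_j
xbar_j <- tapply(durability, carpet, mean)    # sample mean of each level
s2_j   <- tapply(durability, carpet, var)     # sample variance of each level
xbar   <- mean(durability)                    # grand mean

ss_tot <- sum((durability - xbar)^2)          # SSTOT
ss_a   <- sum(n * (xbar_j - xbar)^2)          # SSA, between levels
ss_e   <- sum((n - 1) * s2_j)                 # SSE, within levels
```

Here `ss_a` and `ss_e` agree with the Sum Sq column that `anova()` reports for these data, and their sum equals `ss_tot` up to floating-point rounding.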


Definition

\[
\begin{aligned}
MSA &\overset{\text{def}}{=} \text{Mean Squares between levels (groups)} \\
    &= \frac{SSA}{k - 1} = \frac{n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 + \cdots + n_k(\bar{x}_k - \bar{x})^2}{k - 1} \\
MSE &\overset{\text{def}}{=} \text{Mean Squares within the levels} = \text{pooled sample variance} = \text{Mean Squared Error} \\
    &= \frac{SSE}{N - k} = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2}{N - k}
\end{aligned}
\]

Theorem
The Mean Square Error, $MSE$, is an unbiased estimator of $\sigma^2$.
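A short justification of this theorem, sketched under the stated assumption that every population has the same variance $\sigma^2$: each sample variance $s_j^2$ is an unbiased estimator of $\sigma^2$, so linearity of expectation gives

```latex
\[
E[SSE] = \sum_{j=1}^{k} (n_j - 1)\, E[s_j^2]
       = \sigma^2 \sum_{j=1}^{k} (n_j - 1)
       = (N - k)\,\sigma^2
\quad\Longrightarrow\quad
E[MSE] = \frac{E[SSE]}{N - k} = \sigma^2 .
\]
```

Note that this argument does not use the null hypothesis: $MSE$ estimates $\sigma^2$ whether or not the population means are equal.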


Theorem (ANOVA F Test)
To test $H_0 : \mu_1 = \cdots = \mu_k$ vs $H_A$ : not $H_0$, use the test statistic

\[
F = \frac{MSA}{MSE} \sim F(k - 1, N - k) \quad \text{under } H_0 .
\]

Not $H_0$ ⇒ $F$ large, so use a right-tail test.

One creates an ANOVA table:

    Source    df      SS      MS    F         p
    Between   k − 1   SSA     MSA   MSA/MSE   P(F(k − 1, N − k) ≥ f)
    Within    N − k   SSE     MSE
    Total     N − 1   SSTOT


Example
Judges at the Parisian photography contest, FotoGras, numerically scored photographs submitted by a number of photographers on a scale of 0–10. A one-way ANOVA test was performed to see whether the brand of camera a photograph was taken with had anything to do with the judges' numerical scores. A summary of the data is given below:

    Brand     Sample Size   Sample Mean   Sample Variance
    Canon         11            7.6            2.1
    Nikon          9            8.0            3.3
    Pentax         5            8.7            2.9
    Samsung        3            8.3            2.0
    Sony           8            8.0            1.9

The scores awarded to each brand were verified as being (mostly) normally distributed and independent of the scores awarded to other brands. Create an ANOVA table from the scores and decide whether there was a “brand effect” at the 0.05 significance level.


Example (cont.)
Solution: Since the largest sample standard deviation, $\sqrt{3.3}$, is less than twice the smallest sample standard deviation, $\sqrt{1.9}$, we can assume the population variances are all the same.

\[
\begin{aligned}
k &= 5 \\
N &= 11 + 9 + 5 + 3 + 8 = 36 \\
\bar{x} &= \frac{11(7.6) + 9(8.0) + 5(8.7) + 3(8.3) + 8(8.0)}{36} = 8.0 \\
SSA &= 11(7.6 - 8.0)^2 + 9(8.0 - 8.0)^2 + 5(8.7 - 8.0)^2 + 3(8.3 - 8.0)^2 + 8(8.0 - 8.0)^2 = 4.48 \\
SSE &= (11 - 1)2.1 + (9 - 1)3.3 + (5 - 1)2.9 + (3 - 1)2.0 + (8 - 1)1.9 = 76.3 \\
SSTOT &= SSA + SSE = 4.48 + 76.3 = 80.78 \\
MSA &= \frac{SSA}{k - 1} = \frac{4.48}{5 - 1} = 1.12 \\
MSE &= \frac{SSE}{N - k} = \frac{76.3}{36 - 5} = 2.46129 \\
f &= \frac{MSA}{MSE} = \frac{1.12}{2.46129} = 0.4550459 \\
p\text{-value} &= P(F(4, 31) \ge f) = 0.7679706
\end{aligned}
\]

    Source    df   SS      MS        F         p
    Between    4    4.48   1.12      0.45505   0.76797
    Within    31   76.3    2.46129
    Total     35   80.78

Since the p-value 0.768 is larger than 0.05, one accepts the hypothesis that there is no “brand” effect.
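The hand calculation above needs only the summary table, so it can be replayed in a few lines of R. This is a sketch with illustrative variable names; `pf()` with `lower.tail = FALSE` gives the right-tail probability of the F distribution.

```r
# Summary statistics from the FotoGras example
n    <- c(Canon = 11, Nikon = 9, Pentax = 5, Samsung = 3, Sony = 8)
xbar <- c(7.6, 8.0, 8.7, 8.3, 8.0)   # sample means
s2   <- c(2.1, 3.3, 2.9, 2.0, 1.9)   # sample variances

k <- length(n)                       # number of levels
N <- sum(n)                          # total sample size

grand <- sum(n * xbar) / N           # grand mean
ssa   <- sum(n * (xbar - grand)^2)   # between-level sum of squares
sse   <- sum((n - 1) * s2)           # within-level sum of squares

msa <- ssa / (k - 1)
mse <- sse / (N - k)
f   <- msa / mse

# Right-tail p-value from F(k - 1, N - k)
p <- pf(f, df1 = k - 1, df2 = N - k, lower.tail = FALSE)
```

The values of `ssa`, `sse`, `f` and `p` should match the ANOVA table above up to rounding.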


Example
Given data on carpet durability:

    > cdat=read.table("carpet.dat",h=TRUE)
    > cdat
      Durability Carpet
           18.95      1
           12.62      1
           11.94      1
           14.42      1
           10.06      2
            7.19      2
            7.03      2
           14.66      2
           10.92      3
           13.28      3
           14.52      3
           12.51      3
           10.46      4
           21.40      4
           18.10      4
           22.50      4

Test if durability depends on which carpet type one chooses.


Example (continued)

    > cdat=read.table("carpet.dat",h=TRUE)
    > Carpet.F = as.factor(cdat$Carpet)   # change to a categorical variable
    > g.lm=lm(cdat$Durability~Carpet.F)
    > anova(g.lm)
    Analysis of Variance Table

    Response: cdat$Durability
              Df  Sum Sq Mean Sq F value  Pr(>F)
    Carpet.F   3 146.374  48.791  3.5815 0.04674 *
    Residuals 12 163.477  13.623
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    > kruskal.test(cdat$Durability~Carpet.F)   # Kruskal-Wallis Test

            Kruskal-Wallis rank sum test

    data:  cdat$Durability by Carpet.F
    Kruskal-Wallis chi-squared = 5.2059, df = 3, p-value = 0.1573


Comparing Means


If $H_0$ is rejected, i.e. not all means are equal, how do you find how the population means differ from each other? Answer:

- boxplots (all in one graph).
- multiple comparison methods such as the Bonferroni Multiple Comparison Test.
- contrasts.

Contrasts can only be used when there are clear expectations before starting the experiment. Contrasts are planned comparisons. Multiple comparisons should be used when there are no justified expectations. These are pairwise tests of significance.


Continuing with the carpet durability example, using R one can create boxplots:

    > boxplot(cdat$Durability[1:4], cdat$Durability[5:8],
    +         cdat$Durability[9:12], cdat$Durability[13:16])

[Figure: side-by-side boxplots of durability (vertical axis, ticks at 10, 15 and 20) for carpet types 1 to 4.]

It seems that type 4 carpet is the most durable and type 2 is the least durable, but both of these types have more variability in durability than types 1 and 3. One should be careful about how strongly one uses the word “seems”, as only four carpets of each type were used.


Definition
A least significant differences (LSD) method is a multiple-comparisons procedure that tests each pair of levels and rejects $H_0 : \mu_1 = \cdots = \mu_k$ if any of the $\binom{k}{2}$ tests is significant. The Bonferroni Multiple Comparison Test is an LSD method.

Theorem (Bonferroni Multiple Comparison Test)
To test $H_0$ at the $\alpha$ significance level, for every $1 \le i < j \le k$:

Step #1: calculate the test statistic

\[
t_{ij} = \frac{\bar{x}_j - \bar{x}_i}{\sqrt{MSE \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}} \sim t(N - k).
\]

Step #2: test whether the means of levels $i$ and $j$ are equal at the $\alpha / \binom{k}{2}$ level using a two-sided test with the test statistic $t_{ij}$.

If any of the $\binom{k}{2}$ tests are significant, reject $H_0$. Otherwise accept $H_0$.
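The two steps of the theorem can be carried out by hand in base R. The sketch below uses the carpet durability values from the earlier example, typed in directly; the variable names are illustrative. Multiplying each raw p-value by the number of pairs (capped at 1) is equivalent to comparing the raw p-value with $\alpha / \binom{k}{2}$.

```r
durability <- c(18.95, 12.62, 11.94, 14.42,
                10.06,  7.19,  7.03, 14.66,
                10.92, 13.28, 14.52, 12.51,
                10.46, 21.40, 18.10, 22.50)
carpet <- factor(rep(1:4, each = 4))

n    <- tapply(durability, carpet, length)
xbar <- tapply(durability, carpet, mean)
k    <- nlevels(carpet)
N    <- length(durability)
mse  <- sum((n - 1) * tapply(durability, carpet, var)) / (N - k)
m    <- choose(k, 2)                 # number of pairwise tests

pairs <- t(combn(k, 2))              # each row is a pair (i, j) with i < j

# Step 1: test statistic t_ij for every pair
t_ij <- (xbar[pairs[, 2]] - xbar[pairs[, 1]]) /
  sqrt(mse * (1 / n[pairs[, 1]] + 1 / n[pairs[, 2]]))

# Step 2: two-sided p-values, then the Bonferroni adjustment
p_raw  <- 2 * pt(abs(t_ij), df = N - k, lower.tail = FALSE)
p_bonf <- pmin(1, m * p_raw)

result <- data.frame(i = pairs[, 1], j = pairs[, 2],
                     t = as.numeric(t_ij),
                     p_bonferroni = as.numeric(p_bonf))

# Reject H0 if any single pair is significant at level alpha / m
reject_H0 <- any(p_raw < 0.05 / m)
```

The adjusted p-values in `result` should agree with what `pairwise.t.test()` reports with the Bonferroni adjustment.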


Example

    > pairwise.t.test(cdat$Durability, Carpet.F, "bonferroni")

            Pairwise comparisons using t tests with pooled SD

    data:  cdat$Durability and Carpet.F

      1     2     3
    2 0.564 -     -
    3 1.000 1.000 -
    4 1.000 0.045 0.388

    P value adjustment method: bonferroni


Theorem (Simultaneous Confidence Intervals for Differences between Means)
Simultaneous confidence intervals for all differences $\mu_i - \mu_j$ between population means have the form

\[
(\bar{x}_i - \bar{x}_j) \pm t^{**} \sqrt{MSE \left( \frac{1}{n_i} + \frac{1}{n_j} \right)},
\]

where $t^{**}$ is the critical value from $t(N - k)$ at the $\alpha / \binom{k}{2}$ level.

Equivalent to the Bonferroni Multiple Comparison Test is to look at all the simultaneous confidence intervals and, if even one does not contain 0, reject the hypothesis that all the population means are the same.
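A minimal sketch of these intervals for the carpet durability data, typed in directly. It assumes one reading of “at the $\alpha / \binom{k}{2}$ level” for a two-sided interval: the critical value $t^{**}$ is taken as the upper $\alpha / (2\binom{k}{2})$ quantile of $t(N - k)$. Variable names are illustrative.

```r
durability <- c(18.95, 12.62, 11.94, 14.42,
                10.06,  7.19,  7.03, 14.66,
                10.92, 13.28, 14.52, 12.51,
                10.46, 21.40, 18.10, 22.50)
carpet <- factor(rep(1:4, each = 4))

n    <- tapply(durability, carpet, length)
xbar <- tapply(durability, carpet, mean)
k    <- nlevels(carpet)
N    <- length(durability)
mse  <- sum((n - 1) * tapply(durability, carpet, var)) / (N - k)
m    <- choose(k, 2)

alpha  <- 0.05
# ASSUMPTION: two-sided interval, so split alpha / m across both tails
t_crit <- qt(1 - alpha / (2 * m), df = N - k)

pairs <- t(combn(k, 2))                          # rows are (i, j), i < j
d     <- xbar[pairs[, 1]] - xbar[pairs[, 2]]     # xbar_i - xbar_j
half  <- t_crit * sqrt(mse * (1 / n[pairs[, 1]] + 1 / n[pairs[, 2]]))

ci <- data.frame(i = pairs[, 1], j = pairs[, 2],
                 lower = as.numeric(d - half),
                 upper = as.numeric(d + half))

# Intervals that do not contain 0 correspond to significant pairs
excludes_zero <- ci$lower > 0 | ci$upper < 0
```

Under this reading, a pair's interval excludes 0 exactly when its Bonferroni-adjusted p-value is below $\alpha$.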


Definition
A contrast is a linear combination of population means, $\Psi = \sum_{j=1}^{k} a_j \mu_j$, where $\sum_{j=1}^{k} a_j = 0$. The corresponding sample contrast is $C = \sum_{j=1}^{k} a_j \bar{x}_j$. The standard error of $C$ is defined to be

\[
SE_C \overset{\text{def}}{=} s_p \sqrt{\sum_{j=1}^{k} \frac{a_j^2}{n_j}} = \sqrt{MSE \sum_{j=1}^{k} \frac{a_j^2}{n_j}} .
\]

Theorem
To test $H_0 : \Psi = 0$, use the test statistic

\[
t = \frac{C}{SE_C} \sim t(N - k).
\]

The alternative hypothesis can be one- or two-sided. A $(1 - \alpha)100\%$ confidence interval for the value of $\Psi$ is $c \pm t^{*}(N - k)\, SE_C$.


Example
Continuing with the carpet data, one tests whether the average durability of the first two carpet types differs from the average durability of the last two carpet types. We choose $a_1 = -0.5$, $a_2 = -0.5$, $a_3 = 0.5$, $a_4 = 0.5$ for our contrast. We already know $MSE = 13.62$ from the ANOVA table we calculated. Using $T \sim t(16 - 4)$ gives

\[
\begin{aligned}
c &= -0.5(14.4825) - 0.5(9.735) + 0.5(12.8075) + 0.5(18.115) = 3.3525 \\
SE_C &= \sqrt{13.62 \left( (-0.5)^2/4 + (-0.5)^2/4 + 0.5^2/4 + 0.5^2/4 \right)} = 1.845264 \\
t &= 3.3525 / 1.845264 = 1.816813 \\
2P(T \ge t) &= 0.09428971.
\end{aligned}
\]

Thus the p-value of $H_0 : \Psi = 0$ versus $H_A : \Psi \ne 0$ is 0.09428971. The calculations can all be done internally in R (the small differences from the hand calculation come from rounding $MSE$ to 13.62):

    > install.packages("gmodels")
    > library(gmodels)
    > cdat=read.table("carpet.dat",h=TRUE)
    > Carpet.F = as.factor(cdat$Carpet)
    > g.lm=lm(cdat$Durability~Carpet.F)
    > fit.contrast(g.lm, "Carpet.F", c(-0.5, -0.5, 0.5, 0.5))
                                      Estimate Std. Error  t value   Pr(>|t|)
    Carpet.F c=( -0.5 -0.5 0.5 0.5 )    3.3525   1.845472 1.816609 0.09432261
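The same contrast can be computed in base R without the gmodels package, following the definition of $SE_C$ directly. This is a sketch with the carpet data typed in and illustrative variable names; it uses the unrounded $MSE$, so it reproduces the `fit.contrast` output rather than the rounded hand calculation.

```r
durability <- c(18.95, 12.62, 11.94, 14.42,
                10.06,  7.19,  7.03, 14.66,
                10.92, 13.28, 14.52, 12.51,
                10.46, 21.40, 18.10, 22.50)
carpet <- factor(rep(1:4, each = 4))

n    <- tapply(durability, carpet, length)
xbar <- tapply(durability, carpet, mean)
k    <- nlevels(carpet)
N    <- length(durability)
mse  <- sum((n - 1) * tapply(durability, carpet, var)) / (N - k)

a <- c(-0.5, -0.5, 0.5, 0.5)          # contrast coefficients, must sum to 0
stopifnot(abs(sum(a)) < 1e-12)

c_hat  <- sum(a * xbar)               # sample contrast C
se_c   <- sqrt(mse * sum(a^2 / n))    # standard error of C
t_stat <- c_hat / se_c

# Two-sided p-value from t(N - k)
p_two_sided <- 2 * pt(abs(t_stat), df = N - k, lower.tail = FALSE)

# 95% confidence interval for the contrast
ci95 <- c_hat + c(-1, 1) * qt(0.975, df = N - k) * se_c
```

Because the p-value exceeds 0.05, the interval `ci95` contains 0.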


ANOVA: Two Way Layout


Same assumptions as before, plus:

1. Treatment A has I levels.
2. Treatment B has J levels.
3. A balanced design, i.e. all sample sizes are the same, equal to K.

One is interested in:

1. Is there an effect for treatment A?
2. Is there an effect for treatment B?
3. Is there an effect for the interaction of the treatments?

One can't answer question 3 if the sample size K = 1. Two-way ANOVA is more efficient than doing two one-way ANOVAs, and it also gives information about the interaction of the two factors.


Definition
Here

\[
\begin{aligned}
SSA &\overset{\text{def}}{=} \text{Sum of Squares for Treatment A} \\
SSB &\overset{\text{def}}{=} \text{Sum of Squares for Treatment B} \\
SSAB &\overset{\text{def}}{=} \text{Sum of Squares of the Non-Additive part} \\
SSE &\overset{\text{def}}{=} \text{Sum of Squares within treatments} \\
SSTOT &\overset{\text{def}}{=} \text{Total Sum of Squares}
\end{aligned}
\]

A and B are the two main effects from each of the two factors, and AB represents the interaction of factors A and B.

Theorem
$SSTOT = SSA + SSB + SSAB + SSE$.


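Definition

\[
\begin{aligned}
MSA &\overset{\text{def}}{=} \frac{SSA}{I - 1} = \text{Mean Squares of Treatment A} \\
MSB &\overset{\text{def}}{=} \frac{SSB}{J - 1} = \text{Mean Squares of Treatment B} \\
MSAB &\overset{\text{def}}{=} \frac{SSAB}{(I - 1)(J - 1)} = \text{Mean Squares of the Non-Additive part} \\
MSE &\overset{\text{def}}{=} \frac{SSE}{N - IJ} = \text{Mean Squares within treatments}
\end{aligned}
\]

Theorem
$MSE$ is an unbiased estimator of the population variance, $\sigma^2$.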


One creates a Two-Way ANOVA Table:

    Source        df               SS      MS     F          p
    Treatment A   I − 1            SSA     MSA    MSA/MSE    P(F(I − 1, N − IJ) ≥ observed F)
    Treatment B   J − 1            SSB     MSB    MSB/MSE    P(F(J − 1, N − IJ) ≥ observed F)
    Interaction   (I − 1)(J − 1)   SSAB    MSAB   MSAB/MSE   P(F((I − 1)(J − 1), N − IJ) ≥ observed F)
    Error         N − IJ           SSE     MSE
    Total         N − 1            SSTOT

Here:

- The p-value in the first row is for a test of $H_0$ : there is no effect for treatment A versus $H_A$ : there is an effect.
- The p-value in the second row is for a test of $H_0$ : there is no effect for treatment B versus $H_A$ : there is an effect.
- The p-value in the third row is for a test of $H_0$ : there is no non-additive interactive effect for treatments A and B versus $H_A$ : there is an effect.


Example
Given data on carpet durability:

    > cdat=read.table("carpet.dat",h=TRUE)
    > cdat
      Durability Carpet Composition
           18.95      1           A
           12.62      1           B
           11.94      1           A
           14.42      1           B
           10.06      2           A
            7.19      2           B
            7.03      2           A
           14.66      2           B
           10.92      3           A
           13.28      3           B
           14.52      3           A
           12.51      3           B
           10.46      4           A
           21.40      4           B
           18.10      4           A
           22.50      4           B

Test if durability depends on which carpet and which composition one chooses.


Example

    > cdat=read.table("carpet.dat",h=TRUE)
    > Carpet.F=as.factor(cdat$Carpet)
    > Composition.F=as.factor(cdat$Composition)
    > gc3=lm(Durability~Carpet.F+Composition.F+Carpet.F:Composition.F,data=cdat)
    > anova(gc3)
    Analysis of Variance Table

    Response: Durability
                           Df  Sum Sq Mean Sq F value  Pr(>F)
    Carpet.F                3 146.374  48.791  4.0981 0.04912 *
    Composition.F           1  17.222  17.222  1.4466 0.26347
    Carpet.F:Composition.F  3  51.007  17.002  1.4281 0.30462
    Residuals               8  95.247  11.906
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
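The two-way table can be used to check the decomposition SSTOT = SSA + SSB + SSAB + SSE numerically. The sketch below rebuilds the data frame inline (so it does not need the file carpet.dat) and uses the shorthand formula `Carpet * Composition`, which expands to the two main effects plus their interaction. Column and variable names are illustrative.

```r
# Carpet data with the Composition factor, typed in from the listing above
cdat <- data.frame(
  Durability  = c(18.95, 12.62, 11.94, 14.42,
                  10.06,  7.19,  7.03, 14.66,
                  10.92, 13.28, 14.52, 12.51,
                  10.46, 21.40, 18.10, 22.50),
  Carpet      = factor(rep(1:4, each = 4)),
  Composition = factor(rep(c("A", "B"), times = 8))
)

# Main effects plus interaction
fit <- lm(Durability ~ Carpet * Composition, data = cdat)
tab <- anova(fit)

# Total sum of squares computed directly from the responses
ss_tot <- sum((cdat$Durability - mean(cdat$Durability))^2)

# Sum of the SS column: SSA + SSB + SSAB + SSE
ss_sum <- sum(tab[["Sum Sq"]])
```

Here I = 4, J = 2 and K = 2, so the error degrees of freedom are N − IJ = 16 − 8 = 8, matching the Residuals row.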


Chapter #12 R Assignment


Enter the following in R to create the data.frame, “data”, that contains one factor with three levels.

    > y1 = c(18.2, 20.1, 17.6, 16.8, 18.8, 19.7, 19.1)
    > y2 = c(17.4, 18.7, 19.1, 16.4, 15.9, 18.4, 17.7)
    > y3 = c(15.2, 18.8, 17.7, 16.5, 15.9, 17.1, 16.7)
    > y = c(y1, y2, y3)
    > group = rep(1:3, c(7, 7, 7))
    > data = data.frame(y = y, group = factor(group))

1. Do a qqnorm plot for y1, y2 and y3 to check for normality.
2. Check to see if one can assume the population variances are all equal.
3. Make a boxplot showing y1, y2 and y3.
4. Create an ANOVA Table.
5. Give the results of the Kruskal-Wallis test.


The data file “data2way.csv”, found on math.newhaven.edu/marcmehlman/courses/estat/items.html, contains a hypothetical sample of 27 participants who are divided into three stress reduction treatment groups (mental, physical and medical) and three age groups (young, mid, and old). The stress reduction values are represented on a scale that ranges from 0 to 10. Read this data into R using

    > data2way = read.csv("data2way.csv")

and use it for the following problems.

6. Consider a test that the treatments have no effect on stress versus there is an effect. What is the p-value of this test?
7. Consider a test that age has no effect on stress versus there is an effect. What is the p-value of this test?
8. What is SSTOT?
9. What are the degrees of freedom for SSTOT?
