One-way ANOVA
• Data situation: a continuous dependent variable measured on two or more independent groups of cases. This is an extension of the t-test to the multiple-group situation.
• Research question: Are the means of the groups equivalent? E.g., do the experimental, placebo, and control groups perform equally well?
One-way ANOVA Procedure:
1. Calculate the means and standard deviations of the dependent variable for the k groups.
2. Compute the F-ratio.
3. Evaluate the F-ratio against the critical value with the appropriate degrees of freedom.
Formulae:
$H_0: \mu_1 = \dots = \mu_k$, or equivalently $H_0: \mu_j = \mu \;\; \forall j$
$H_1: \mu_j \neq \mu$ for some $j$
Compute F-ratio
• Assume k groups with sample sizes $n_1, n_2, \dots, n_k$.
1. Calculate the group means and variances: $\bar{Y}_j$ and $S_j^2$.
2. Pool the variances to calculate the Error Mean Square:
$$MSE = \frac{\sum_j (n_j - 1)\, S_j^2}{N - k}$$
where $N = \sum_j n_j$ is the total sample size and $k$ is the number of groups.
Compute F-ratio
3. Using the sample means, calculate the Between-Groups Mean Square:
$$MSB = \frac{\sum_j n_j (\bar{Y}_j - \bar{Y})^2}{k - 1}$$
where
$$\bar{Y} = \frac{\sum_j n_j \bar{Y}_j}{N}$$
is the overall mean of all scores (grand mean).
4. Compute the F-ratio:
$$F = \frac{MSB}{MSE}$$
Evaluate F-ratio
Find the tabular critical value of F with numerator degrees of freedom k − 1 and denominator degrees of freedom N − k. Reject H0 if the computed F-ratio exceeds the tabular value; otherwise, fail to reject H0. Note that rejecting the null hypothesis suggests that the group means are not all equal, but it does not tell you which specific means differ from each other.
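The four steps of the procedure can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `one_way_anova` and the toy scores are assumptions made up for the example, and only the standard library is used.

```python
from statistics import mean, variance

def one_way_anova(groups):
    """Return (MSB, MSE, F, df_between, df_error) for a list of score lists."""
    k = len(groups)
    ns = [len(g) for g in groups]
    N = sum(ns)
    # Step 1: group means and sample variances (n - 1 denominator).
    means = [mean(g) for g in groups]
    variances = [variance(g) for g in groups]
    # Step 2: pool the variances into the Error Mean Square.
    mse = sum((n - 1) * v for n, v in zip(ns, variances)) / (N - k)
    # Step 3: Between-Groups Mean Square around the grand mean.
    grand_mean = sum(n * m for n, m in zip(ns, means)) / N
    msb = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means)) / (k - 1)
    # Step 4: the F-ratio.
    return msb, mse, msb / mse, k - 1, N - k

# Hypothetical scores, chosen only so the arithmetic is easy to follow by hand.
toy_groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
msb, mse, f_ratio, df_b, df_e = one_way_anova(toy_groups)
print(f"MSB={msb:.2f}  MSE={mse:.2f}  F({df_b},{df_e})={f_ratio:.2f}")
```

The computed F would then be compared with the tabular critical value for (k − 1, N − k) degrees of freedom.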
One-way ANOVA
• Assumptions (same as the t-test)
– This test assumes homogeneity of variance (i.e., $\sigma_j^2 = \sigma^2 \;\; \forall j$) in order to justify pooling the variances from the k samples. In practice, this may not be a realistic assumption. A robust (and recommended) alternative is the Brown-Forsythe $F^*$ statistic:
$$F^* = \frac{\sum_j n_j (\bar{Y}_j - \bar{Y})^2}{\sum_j (1 - n_j / N)\, S_j^2}$$
with degrees of freedom equal to k − 1 and f, where
$$f = \frac{1}{\sum_j c_j^2 / (n_j - 1)}, \qquad c_j = \frac{(1 - n_j / N)\, S_j^2}{\sum_j (1 - n_j / N)\, S_j^2}$$
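The Brown-Forsythe statistic needs only the group sizes, means, and variances, so it can be sketched directly from summary data. The function name `brown_forsythe` is an assumption for this illustration; the input values are the summary data of the three-group example used on these slides.

```python
def brown_forsythe(ns, means, variances):
    """Return (F*, numerator df, denominator df f) from summary statistics."""
    k = len(ns)
    N = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / N
    numerator = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Each variance is weighted by (1 - n_j / N) instead of being pooled.
    weights = [(1 - n / N) * v for n, v in zip(ns, variances)]
    denominator = sum(weights)
    c = [w / denominator for w in weights]
    f = 1.0 / sum(cj ** 2 / (n - 1) for cj, n in zip(c, ns))
    return numerator / denominator, k - 1, f

# Three groups of nine observations each.
f_star, df1, f = brown_forsythe([9, 9, 9], [8.33, 3.33, 5.33], [10.00, 10.00, 7.75])
print(f"F* = {f_star:.2f} with df = ({df1}, {f:.2f})")
```

With equal group sizes, F* coincides with the ordinary F-ratio and only the denominator degrees of freedom change; the two statistics differ when the group sizes are unequal.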
One-way ANOVA
• Example:
– Consider the following summary data for 3 groups, each containing 9 observations:

Group   n   Mean   Variance
1       9   8.33   10.00
2       9   3.33   10.00
3       9   5.33    7.75
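Step 1: State the hypotheses
H0: µ1 = µ2 = µ3
H1: not all of the µj are equal (at least one group mean differs)
Step 2: Set the criterion
• Nondirectional alternative; the rejection region is the upper tail of the F distribution
• α = 0.05
• To find Fcrit we need the degrees of freedom:
df numerator = k − 1 = 3 − 1 = 2
df denominator = N − k = 27 − 3 = 24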
Step 3: Get sample data

Group   n   Mean   Variance
1       9   8.33   10.00
2       9   3.33   10.00
3       9   5.33    7.75
Step 4: Calculate F statistic
The pooled variance is
$$MSE = \frac{10.00 + 10.00 + 7.75}{3} = 9.25$$
NOTE: since the sample sizes are equal, the variances can simply be averaged.
The between-groups variance is based on the grand mean:
$$\bar{Y} = \frac{\sum_j n_j \bar{Y}_j}{N} = \frac{9(8.33) + 9(3.33) + 9(5.33)}{27} = 5.66$$
$$MSB = \frac{\sum_j n_j (\bar{Y}_j - \bar{Y})^2}{k - 1} = \frac{9(8.33 - 5.66)^2 + 9(3.33 - 5.66)^2 + 9(5.33 - 5.66)^2}{3 - 1} = 57.00$$
One-way ANOVA
• Finally, the F-ratio is
$$F = \frac{MSB}{MSE}, \qquad F_{obt} = \frac{57.00}{9.25} = 6.16$$
• The tabular critical value for F is 3.40
• Do we reject or fail to reject?
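The arithmetic of this example can be checked from the summary table alone. The sketch below assumes only the group sizes, means, and variances given above, and takes the critical value 3.40 from the slide rather than computing it.

```python
ns = [9, 9, 9]
means = [8.33, 3.33, 5.33]
variances = [10.00, 10.00, 7.75]

k, N = len(ns), sum(ns)
grand_mean = sum(n * m for n, m in zip(ns, means)) / N

# Error Mean Square: pooled within-group variance.
mse = sum((n - 1) * v for n, v in zip(ns, variances)) / (N - k)
# Between-Groups Mean Square: weighted spread of group means around the grand mean.
msb = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means)) / (k - 1)

f_obt = msb / mse
f_crit = 3.40  # tabular value quoted on the slide for df = (2, 24), alpha = .05
reject_null = f_obt > f_crit
print(f"MSB={msb:.2f}  MSE={mse:.2f}  F={f_obt:.2f}  reject H0: {reject_null}")
```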
One-way ANOVA
• Estimates variance in two ways
• Compute the ratio of the two estimates, called the F-ratio or F-statistic
• If the null is true and the assumptions are met, the sampling distribution of this statistic is known
• The statistic is called F, and so is the family of distributions that underlies it
One-way ANOVA
• Estimation of variances:
– Variances are estimated from the individual observations and from the group (or cell) means.
– Each of these variances is estimated by a "mean square" (MSE and MSB), which is based on the appropriate sum of squares divided by the appropriate degrees of freedom.
One-way ANOVA
• Degrees of freedom:
– Total: N − 1 (one is lost to estimating the grand mean)
– MSB: k − 1
– MSE: N − k
• Expected value: F is the ratio of MSB to MSE. What is the expected value of F if the null hypothesis is true?
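One way to explore the expected-value question is by simulation. The sketch below draws many data sets in which the null hypothesis is true (all groups sampled from the same normal population) and averages the resulting F-ratios. The design of 3 groups of 9, the seed, and the number of replications are arbitrary choices for illustration. For comparison, the theoretical mean of an F distribution is df_error / (df_error − 2) when df_error > 2, which is close to 1.

```python
import random
from statistics import mean, variance

def f_ratio(groups):
    """F = MSB / MSE for a list of score lists."""
    k = len(groups)
    ns = [len(g) for g in groups]
    N = sum(ns)
    means = [mean(g) for g in groups]
    grand_mean = sum(n * m for n, m in zip(ns, means)) / N
    msb = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means)) / (k - 1)
    mse = sum((n - 1) * variance(g) for n, g in zip(ns, groups)) / (N - k)
    return msb / mse

random.seed(0)  # fixed seed so the run is repeatable
k, n = 3, 9     # illustrative design: 3 groups of 9 cases
simulated = [
    f_ratio([[random.gauss(0, 1) for _ in range(n)] for _ in range(k)])
    for _ in range(4000)
]
average_f = mean(simulated)
df_error = k * n - k
theoretical_mean = df_error / (df_error - 2)
print(f"simulated mean F = {average_f:.2f}, theoretical mean = {theoretical_mean:.2f}")
```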
One-way ANOVA Sampling Distribution:
• The sampling distribution of F under the null hypothesis is well understood
• What would the sampling distribution of F consist of? A probability distribution of F values, which allows inferences about the probability of ranges of F values
• F belongs to a family of distributions. Like the t, the different F distributions are identified by different degrees of freedom. However, each F distribution has two degrees of freedom instead of one: numerator (d.f. for MSB) and denominator (d.f. for MSE).
One-way ANOVA Making inferences:
• If the null is true, the sampling distribution of F is known. Thus, we know the probability of any particular range of values for F.
• We can therefore construct a "region of rejection" for values of F: a region of extreme values that are improbable under the null, for example the top 5% of the distribution.
• If we obtain an F-value in the region of rejection (or a p-value less than alpha), what do we do? Reject the null.
• What can we conclude if we reject the null?
One-way ANOVA Table

Source    SS          df     MS           F
Between   SSB         k−1    SSB/(k−1)    MSB/MSE
Error     SSE         N−k    SSE/(N−k)
Total     SSB+SSE     N−1

From Example

Source    SS     df    MS      F
Between   114     2    57      6.16
Error     222    24    9.25
Total     336    26

SSTOTAL = SSB + SSE = SSB + SSW
dfTOTAL = dfB + dfE = dfB + dfW
One-way ANOVA
• F(2,24) = 6.16 > Fcrit, p < .05
• Reject null hypothesis
• What does this mean?
• Interpret the results!
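The critical value and p-value for this test can also be computed instead of read from a table. When the numerator degrees of freedom equal 2 (that is, k = 3 groups), the upper-tail probability of F has a simple closed form, P(F > x) = (1 + 2x/df_error)^(−df_error/2). The helper names below are assumptions for this sketch, and the shortcut does not apply to other numerator degrees of freedom, where a statistics library or table is needed.

```python
def f_upper_tail_df1_2(f_value, df_error):
    """P(F > f_value) for an F distribution with (2, df_error) degrees of freedom."""
    return (1 + 2 * f_value / df_error) ** (-df_error / 2)

def f_critical_df1_2(alpha, df_error):
    """Value cutting off the upper alpha tail of F(2, df_error); inverts the formula above."""
    return (df_error / 2) * (alpha ** (-2 / df_error) - 1)

f_crit = f_critical_df1_2(0.05, 24)
p_value = f_upper_tail_df1_2(6.16, 24)
print(f"Fcrit(2,24) = {f_crit:.2f}, p = {p_value:.4f}")
```

The critical value agrees with the tabular 3.40, and the p-value falls below .05, consistent with rejecting the null hypothesis.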
One-way ANOVA
• Estimated omega squared (effect size): provides an estimate of the proportion of the variability in the dependent variable (d.v.) accounted for by the independent variable (i.v.). Values of .15-.20 indicate a strong effect. Values of .01-.02 indicate a weak effect.
Formula:
$$\omega^2 = \frac{SS_b - (k - 1)\, MSE}{SS_{tot} + MSE}$$
One-way ANOVA
• Example:
$$\omega^2 = \frac{114 - (2)(9.25)}{336 + 9.25} = \frac{95.5}{345.25} = .28$$
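The omega-squared formula translates directly into a small helper. The function name `omega_squared` and its argument names are assumptions for this sketch; the inputs are the sums of squares and MSE from the example's ANOVA table.

```python
def omega_squared(ss_between, ss_total, mse, k):
    """Estimated omega squared: (SSb - (k - 1) * MSE) / (SStot + MSE)."""
    return (ss_between - (k - 1) * mse) / (ss_total + mse)

# Values from the example: SSB = 114, SStotal = 336, MSE = 9.25, k = 3 groups.
w2 = omega_squared(ss_between=114, ss_total=336, mse=9.25, k=3)
print(f"omega squared = {w2:.2f}")
```

By the rule of thumb on the slide, a value above .15-.20 would be read as a strong effect.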
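Formulae
$$MSB = \frac{\sum_j n_j (\bar{Y}_j - \bar{Y})^2}{k - 1}, \qquad \bar{Y} = \frac{\sum_j n_j \bar{Y}_j}{N}$$
$$MSE = \frac{\sum_j (n_j - 1)\, S_j^2}{N - k}$$
$$SSB = \sum_j n_j (\bar{Y}_j - \bar{Y})^2$$
$$F = \frac{MSB}{MSE}, \qquad df_B = k - 1, \qquad df_E = N - k$$

Book Formulae
$$s_B^2 = \frac{SSB}{df_B}, \qquad SSB = \frac{(\sum X_1)^2}{n_1} + \frac{(\sum X_2)^2}{n_2} + \dots + \frac{(\sum X_k)^2}{n_k} - \frac{\left(\sum_{\text{all scores}} X\right)^2}{N}$$
$$s_W^2 = \frac{SSW}{df_W}, \qquad SSW = \sum_{\text{all scores}} X^2 - \left[ \frac{(\sum X_1)^2}{n_1} + \frac{(\sum X_2)^2}{n_2} + \dots + \frac{(\sum X_k)^2}{n_k} \right]$$
$$F = \frac{s_B^2}{s_W^2}, \qquad df_B = k - 1, \qquad df_W = N - k$$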
Example
A researcher wants to identify the effects of sleep deprivation on test performance in his introductory statistics course. Three groups of 9 students will perform under one of the following conditions:
– High Sleep Deprivation: stay awake for 2 nights and days.
– Low Sleep Deprivation: sleep 4 hours each night for a total of 2 nights and days.
– Control: sleep normally for 2 nights and days.
After the 2-day period, the students are tested. The tests are scored.
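Step 1: State the hypotheses
H0: µ1 = µ2 = µ3
H1: not all of the µj are equal (at least one group mean differs)
Step 2: Set the criterion
• Nondirectional alternative; the rejection region is the upper tail of the F distribution
• α = 0.05
• To find Fcrit we need the degrees of freedom:
df numerator = k − 1 = 3 − 1 = 2
df denominator = N − k = 27 − 3 = 24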
Step 3: Get sample data

        Hi Sleep Deprivation   Lo Sleep Deprivation   Normal Sleep
        Condition 1            Condition 2            Condition 3
        81                     92                     86
        80                     86                     93
        72                     87                     97
        82                     76                     81
        83                     80                     94
        89                     87                     89
        76                     92                     98
        88                     83                     90
        83                     84                     91
n       9                      9                      9
Mean    81.56                  85.22                  91
s       5.32                   5.21                   5.34
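The descriptive statistics under the table can be reproduced with Python's standard `statistics` module. The dictionary keys below are hypothetical labels chosen for this sketch; `stdev` uses the n − 1 denominator, which matches the sample standard deviations reported on the slide.

```python
from statistics import mean, stdev

# Raw scores from the three conditions in the table above.
conditions = {
    "high_deprivation": [81, 80, 72, 82, 83, 89, 76, 88, 83],
    "low_deprivation": [92, 86, 87, 76, 80, 87, 92, 83, 84],
    "normal_sleep": [86, 93, 97, 81, 94, 89, 98, 90, 91],
}

# For each condition: (n, mean, sample standard deviation).
summary = {
    name: (len(scores), mean(scores), stdev(scores))
    for name, scores in conditions.items()
}
for name, (n, m, s) in summary.items():
    print(f"{name}: n={n}  mean={m:.2f}  s={s:.2f}")
```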
Step 4: Calculate F statistic
$n_1 = 9,\ \bar{X}_1 = 81.56; \qquad n_2 = 9,\ \bar{X}_2 = 85.22; \qquad n_3 = 9,\ \bar{X}_3 = 91$
$$\bar{Y} = \frac{\sum_j n_j \bar{Y}_j}{N} = \frac{9(81.56) + 9(85.22) + 9(91)}{27} = 85.93$$
$$MSB = \frac{\sum_j n_j (\bar{Y}_j - \bar{Y})^2}{k - 1} = \frac{9(81.56 - 85.93)^2 + 9(85.22 - 85.93)^2 + 9(91 - 85.93)^2}{3 - 1}$$
$$MSB = \frac{407.75}{2} = 203.88$$
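The numerator 407.75 is computed from means rounded to two decimals, while the raw-score formula used elsewhere in this example gives 408.07. The sketch below compares the two routes on the same data to show that the gap is a rounding effect; the variable names are assumptions for illustration.

```python
from statistics import mean

groups = [
    [81, 80, 72, 82, 83, 89, 76, 88, 83],
    [92, 86, 87, 76, 80, 87, 92, 83, 84],
    [86, 93, 97, 81, 94, 89, 98, 90, 91],
]
ns = [len(g) for g in groups]
N = sum(ns)

# Route 1: full-precision means.
exact_means = [mean(g) for g in groups]
exact_grand = sum(sum(g) for g in groups) / N
ssb_exact = sum(n * (m - exact_grand) ** 2 for n, m in zip(ns, exact_means))

# Route 2: means rounded to two decimals first, as in the hand calculation.
rounded_means = [round(m, 2) for m in exact_means]
rounded_grand = round(exact_grand, 2)
ssb_rounded = sum(n * (m - rounded_grand) ** 2 for n, m in zip(ns, rounded_means))

print(f"SSB from exact means   = {ssb_exact:.2f}")
print(f"SSB from rounded means = {ssb_rounded:.2f}")
```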
Step 4: Calculate F statistic

        X1     X2     X3
        81     92     86
        80     86     93
        72     87     97
        82     76     81
        83     80     94
        89     87     89
        76     92     98
        88     83     90
        83     84     91
SUM     734    767    819

SUM OF ALL SCORES: 2320

$$SSB = \frac{(\sum X_1)^2}{n_1} + \frac{(\sum X_2)^2}{n_2} + \dots + \frac{(\sum X_k)^2}{n_k} - \frac{\left(\sum_{\text{all scores}} X\right)^2}{N}$$
$$SSB = \frac{(734)^2}{9} + \frac{(767)^2}{9} + \frac{(819)^2}{9} - \frac{(2320)^2}{27} = 408.07$$
$$s_B^2 = \frac{SSB}{df_B} = \frac{408.07}{3 - 1} = 204.04$$
Step 4: Calculate F statistic
$n_1 = 9,\ s_1 = 5.32; \qquad n_2 = 9,\ s_2 = 5.21; \qquad n_3 = 9,\ s_3 = 5.34$
$$MSE = \frac{\sum_j (n_j - 1)\, S_j^2}{N - k} = \frac{(8)(5.32)^2 + (8)(5.21)^2 + (8)(5.34)^2}{27 - 3}$$
$$MSE = \frac{671.77}{24} = 27.99$$
Step 4: Calculate F statistic

        X1     X2     X3     X1²     X2²     X3²
        81     92     86     6561    8464    7396
        80     86     93     6400    7396    8649
        72     87     97     5184    7569    9409
        82     76     81     6724    5776    6561
        83     80     94     6889    6400    8836
        89     87     89     7921    7569    7921
        76     92     98     5776    8464    9604
        88     83     90     7744    6889    8100
        83     84     91     6889    7056    8281
SUM     734    767    819    60088   65583   74757

SUM OF ALL SQUARED SCORES: 200428

$$SSW = \sum_{\text{all scores}} X^2 - \left[ \frac{(\sum X_1)^2}{n_1} + \frac{(\sum X_2)^2}{n_2} + \dots + \frac{(\sum X_k)^2}{n_k} \right]$$
$$SSW = 200428 - \left[ \frac{(734)^2}{9} + \frac{(767)^2}{9} + \frac{(819)^2}{9} \right] = 671.78$$
$$s_W^2 = \frac{SSW}{df_W} = \frac{671.78}{27 - 3} = 27.99$$
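The raw-score (book) formulas for SSB and SSW only need sums and sums of squares, which makes them easy to check in code. This is a minimal sketch on the sleep-deprivation scores; the variable names are assumptions for illustration.

```python
groups = [
    [81, 80, 72, 82, 83, 89, 76, 88, 83],  # Condition 1: high sleep deprivation
    [92, 86, 87, 76, 80, 87, 92, 83, 84],  # Condition 2: low sleep deprivation
    [86, 93, 97, 81, 94, 89, 98, 90, 91],  # Condition 3: normal sleep
]
all_scores = [x for g in groups for x in g]
k, N = len(groups), len(all_scores)

# Shared term: sum over groups of (group total squared) / (group size).
group_term = sum(sum(g) ** 2 / len(g) for g in groups)

ssb = group_term - sum(all_scores) ** 2 / N          # between-groups sum of squares
ssw = sum(x * x for x in all_scores) - group_term    # within-groups sum of squares

ms_between = ssb / (k - 1)
ms_within = ssw / (N - k)
f_ratio = ms_between / ms_within
print(f"SSB={ssb:.2f}  SSW={ssw:.2f}  F({k - 1},{N - k})={f_ratio:.2f}")
```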
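Step 4: Calculate F statistic
Formulae (from means and standard deviations):
$$MSB = \frac{407.75}{2} = 203.88, \qquad MSE = \frac{671.77}{24} = 27.99, \qquad F = \frac{203.88}{27.99} = 7.28$$
Book formulae (from raw scores):
$$s_B^2 = \frac{408.07}{3 - 1} = 204.04, \qquad s_W^2 = \frac{671.78}{27 - 3} = 27.99, \qquad F = \frac{204.04}{27.99} = 7.29$$
The two F values differ only because the first route uses means and standard deviations rounded to two decimals.

Source    SS         df    MS       F
Between   408.07      2    204.04   7.29
Error     671.78     24    27.99
Total     1079.85    26

SSTOTAL = SSB + SSE = SSB + SSW
dfTOTAL = dfB + dfE = dfB + dfW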
One-way ANOVA
• Finally, the F-ratio is
$$F = \frac{204.04}{27.99} = 7.29$$
• The tabular critical value for F with α = .05:
df numerator = k − 1 = 3 − 1 = 2
df denominator = N − k = 27 − 3 = 24
• Do we reject or fail to reject?
One-way ANOVA
• F(2,24) = 7.29 > Fcrit, p < .05
• Reject null hypothesis
• What does this mean?
• Interpret the results!
• What is the effect size?
• Is it a weak or strong effect?
Formula:
$$\omega^2 = \frac{SS_b - (k - 1)\, MSE}{SS_{tot} + MSE}$$
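As a worked sketch, plugging the values from the sleep-deprivation ANOVA table (SSB = 408.07, MSE = 27.99, SStotal = 1079.85, k = 3) into the formula would give the following. The result is computed here from those table values and is not a figure stated on the slides.

$$\omega^2 = \frac{408.07 - (2)(27.99)}{1079.85 + 27.99} = \frac{352.09}{1107.84} \approx .32$$

By the rule of thumb given earlier (values of .15-.20 indicate a strong effect), this would be read as a strong effect.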