One-way ANOVA

Author: Basil Perkins
One-way ANOVA
• Data situation: A continuous dependent variable measured for two or more independent groups of cases. This is an extension of the t-test to the multiple-group situation.
• Research question: Are the means of the groups equivalent? E.g., do the experimental, placebo and control groups perform equally well?

One-way ANOVA Procedure:
1. Calculate the means and standard deviations of the dependent variable for the k groups.
2. Compute the F-ratio.
3. Evaluate the F-ratio against the critical value with the appropriate degrees of freedom.

Formulae:

$$H_0: \mu_1 = \dots = \mu_k$$

or, equivalently,

$$H_0: \mu_j = \mu \quad \forall j$$

$$H_1: \mu_j \neq \mu \quad \text{for some } j$$

Compute F-ratio
• Assume k groups with sample sizes $n_1, n_2, \dots, n_k$.
1. Calculate the means and variances, $\bar{Y}_j$ and $S_j^2$.
2. Pool the variances to calculate the Error Mean Square:

$$MSE = \frac{\sum_j (n_j - 1) S_j^2}{N - k}$$

where $N = \sum_j n_j$ is the total sample size and $k$ is the number of groups.



Compute F-ratio
3. Using the sample means, calculate the Between-Groups Mean Square:

$$MSB = \frac{\sum_j n_j (\bar{Y}_j - \bar{Y})^2}{k - 1}$$

where

$$\bar{Y} = \frac{\sum_j n_j \bar{Y}_j}{N}$$

is the overall mean of all scores (grand mean).
4. Compute the F-ratio:

$$F = \frac{MSB}{MSE}$$

Evaluate F-ratio
Find the tabular critical value of F with numerator degrees of freedom k − 1 and denominator degrees of freedom N − k. Reject H0 if the computed F-ratio exceeds the tabular value; otherwise, fail to reject H0. Note that rejecting the null hypothesis suggests that the group means are not all equal, but does not tell you which specific means differ from each other.
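The four computational steps can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `f_ratio_from_summary` and the summary numbers passed to it are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
def f_ratio_from_summary(ns, means, variances):
    """Return (MSB, MSE, F) for k independent groups from summary statistics."""
    k = len(ns)                      # number of groups
    N = sum(ns)                      # total sample size
    # Grand mean: sample-size-weighted average of the group means
    grand_mean = sum(n * m for n, m in zip(ns, means)) / N
    # Between-groups mean square, df = k - 1
    msb = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means)) / (k - 1)
    # Error mean square (pooled variance), df = N - k
    mse = sum((n - 1) * v for n, v in zip(ns, variances)) / (N - k)
    return msb, mse, msb / mse


# Hypothetical summary data (not from the slides): three groups of five
msb, mse, f = f_ratio_from_summary([5, 5, 5], [4.0, 6.0, 8.0], [2.0, 3.0, 4.0])
print(f"MSB = {msb:.2f}, MSE = {mse:.2f}, F = {f:.2f}")
# prints: MSB = 20.00, MSE = 3.00, F = 6.67
```

The computed F would then be compared with the tabular value for k − 1 and N − k degrees of freedom.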

One-way ANOVA
• Assumptions (same as the t-test)
– This test assumes homogeneity of variance (i.e., $\sigma_j^2 = \sigma^2 \ \forall j$) in order to justify the pooling of variances from the k samples. In practice, this may not be a realistic assumption. A robust (and recommended) alternative is the Brown and Forsythe $F^*$ statistic:

$$F^* = \frac{\sum_j n_j (\bar{Y}_j - \bar{Y})^2}{\sum_j (1 - n_j/N) S_j^2}$$

with degrees of freedom equal to k − 1 and f, where

$$f = \frac{1}{\sum_j c_j^2/(n_j - 1)}, \qquad c_j = \frac{(1 - n_j/N) S_j^2}{\sum_j (1 - n_j/N) S_j^2}$$
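As a rough sketch of how the Brown and Forsythe statistic and its denominator degrees of freedom could be computed from summary statistics, consider the following. The function name `brown_forsythe` and the input numbers are assumptions made for illustration only; they do not come from the slides.

```python
def brown_forsythe(ns, means, variances):
    """Return (F*, numerator df, denominator df f) from summary statistics."""
    k = len(ns)
    N = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / N
    # Numerator: between-groups sum of squares
    numerator = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Denominator terms: (1 - n_j / N) * S_j^2, one per group
    weighted = [(1 - n / N) * v for n, v in zip(ns, variances)]
    denominator = sum(weighted)
    f_star = numerator / denominator
    # c_j is each group's share of the denominator
    cs = [w / denominator for w in weighted]
    f_df = 1.0 / sum(c ** 2 / (n - 1) for c, n in zip(cs, ns))
    return f_star, k - 1, f_df


# Hypothetical data with unequal sample sizes and unequal variances
f_star, df1, f_df = brown_forsythe([5, 10, 15], [4.0, 6.0, 8.0], [2.0, 3.0, 9.0])
print(f"F* = {f_star:.3f} on {df1} and {f_df:.2f} degrees of freedom")
```

When all groups have the same sample size, the denominator reduces to (k − 1) times the average variance, so F* coincides with the ordinary F-ratio; only the denominator degrees of freedom differ.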

One-way ANOVA
• Example:
– Consider the following summary data for 3 groups, each containing 9 observations:

Group    n    Mean    Variance
1        9    8.33    10.00
2        9    3.33    10.00
3        9    5.33     7.75

Step 1: State the hypotheses
H0: µ1 = µ2 = µ3
H1: µj ≠ µ for some j (the means are not all equal)

Step 2: Set the criterion
• Non-directional alternative; the F test itself rejects only for large values of F (upper tail)
• α = 0.05
• To find Fcrit we need the degrees of freedom:
df numerator = k − 1 = 3 − 1 = 2
df denominator = N − k = 27 − 3 = 24

Step 3: Get sample data

Group    n    Mean    Variance
1        9    8.33    10.00
2        9    3.33    10.00
3        9    5.33     7.75

Step 4: Calculate F statistic

The pooled variance is

$$MSE = \frac{10.00 + 10.00 + 7.75}{3} = 9.25$$

NOTE: since the sample sizes are equal, the variances can simply be averaged.



The between-groups variance is computed from the grand mean:

$$\bar{Y} = \frac{\sum_j n_j \bar{Y}_j}{N} = \frac{9(8.33) + 9(3.33) + 9(5.33)}{27} = 5.66$$

$$MSB = \frac{\sum_j n_j (\bar{Y}_j - \bar{Y})^2}{k - 1} = \frac{9(8.33 - 5.66)^2 + 9(3.33 - 5.66)^2 + 9(5.33 - 5.66)^2}{3 - 1} = 57.00$$





One-way ANOVA
• Finally, the F-ratio is

$$F_{obt} = \frac{MSB}{MSE} = \frac{57.00}{9.25} = 6.16$$

• The tabular critical value for F is 3.40
• Do we reject or fail to reject?
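The tabular value can be cross-checked without a table in this particular case. The sketch below relies on a special-case identity: when the numerator degrees of freedom equal 2, the upper-tail probability of F has the closed form P(F > f) = (1 + 2f/df2)^(−df2/2). It does not apply to other numerator degrees of freedom, and the function names are made up for this illustration.

```python
def f_upper_tail_df1_equals_2(f, df2):
    """P(F > f) for an F distribution with 2 and df2 degrees of freedom.

    Special case only: this closed form holds when numerator df = 2.
    """
    return (1 + 2 * f / df2) ** (-df2 / 2)


def f_crit_df1_equals_2(alpha, df2):
    """Critical value that cuts off the upper alpha of F(2, df2)."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


f_obt = 57.00 / 9.25                      # F-ratio from the example
f_crit = f_crit_df1_equals_2(0.05, 24)    # df = (2, 24), alpha = .05
print(f"F_obt = {f_obt:.2f}, F_crit = {f_crit:.2f}")
print("Reject H0" if f_obt > f_crit else "Fail to reject H0")
```

For general degrees of freedom a statistics library or a printed F table would be needed instead.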


One-way ANOVA
• Estimates variance in two ways
• Compute the ratio of the two estimates, called the F-ratio or F-statistic
• If the null is true and assumptions are met, the sampling distribution of this statistic is known
• The statistic is called F and so is the family of distributions that underlies this statistic

One-way ANOVA
• Estimation of variances:
– Variances are estimated from the individual observations and from the group (or cell) means.
– Each of these variances is estimated by a "mean square" (MSE and MSB), which is based on the appropriate sum of squares divided by the appropriate degrees of freedom.

One-way ANOVA
• Degrees of freedom:
– Total: N – 1; due to estimation of the grand mean
– MSB: k – 1
– MSE: N – k
• Expected value: F is a ratio of MSB to MSE. What is the expected value of F if the null hypothesis is true?

One-way ANOVA Sampling Distribution:
• The sampling distribution of F under the null hypothesis is well understood.
• What would the sampling distribution of F consist of? A probability distribution of F values, which allows inferences about the probability of ranges of F values.
• F belongs to a family of distributions. Like the t, the different F distributions are identified by different degrees of freedom. However, each F distribution has two degrees of freedom instead of one: numerator (d.f. for MSB) and denominator (d.f. for MSE).

One-way ANOVA Making inferences: • If the null is true, the sampling distribution of F is known. Thus, we know the probability of any particular range of values for F. Can thus construct a “region of rejection” for values of F. That is, a region in which extreme values are improbable. For example, the top 5% of the distribution. If we obtain an F-value in the region of rejection (or, a p-value less than alpha), what do we do? Reject the null. What can we conclude if we reject the null?
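The idea of a region of rejection can be illustrated by simulation. The sketch below is an illustration under stated assumptions, not part of the original slides: it draws three groups of nine scores from the same normal population (so the null is true by construction), computes F each time, and counts how often F exceeds the tabular value of 3.40 for 2 and 24 degrees of freedom. The function names, the seed and the number of replications are arbitrary choices.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F-ratio computed from raw scores."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / N
    group_means = [sum(g) / len(g) for g in groups]
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    return (ssb / (k - 1)) / (ssw / (N - k))


def null_rejection_rate(reps=5000, k=3, n=9, f_crit=3.40, seed=1):
    """Share of simulated F values above f_crit when H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        # All groups come from the same population, so H0 holds
        groups = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
        if f_statistic(groups) > f_crit:
            rejections += 1
    return rejections / reps


rate = null_rejection_rate()
print(f"Proportion of F values above 3.40 under H0: {rate:.3f}")
```

The proportion should land close to .05, which is what it means for 3.40 to mark off the top 5% of the F(2, 24) distribution.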

One-way ANOVA Table

Source     SS           df     MS           F
Between    SSB          k-1    SSB/(k-1)    MSB/MSE
Error      SSE          N-k    SSE/(N-k)
Total      SSB+SSE      N-1

From Example

Source     SS     df    MS      F
Between    114    2     57      6.16
Error      222    24    9.25
Total      336    26

SS_TOTAL = SS_B + SS_E = SS_B + SS_W
df_TOTAL = df_B + df_E = df_B + df_W


One-way ANOVA
• F(2,24) = 6.16 > Fcrit, p < .05
• Reject null hypothesis
• What does this mean?
• Interpret the results!

One-way ANOVA
• Estimated omega squared (effect size). Provides an estimate of the proportion of the variability in the dependent variable (d.v.) accounted for by the independent variable (i.v.). Values of .15-.20 indicate a strong effect. Values of .01-.02 indicate a weak effect.

Formula:

$$\omega^2 = \frac{SS_b - (k - 1)\,MSE}{SS_{tot} + MSE}$$

One-way ANOVA
• Example:

$$\omega^2 = \frac{114 - (2)(9.25)}{336 + 9.25} = \frac{95.5}{345.25} = .28$$
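The effect-size formula translates directly into a small helper. The function name `omega_squared` is hypothetical; the inputs are the values from the ANOVA table of the first example.

```python
def omega_squared(ss_between, ss_total, mse, k):
    """Estimated omega squared for a one-way ANOVA with k groups."""
    return (ss_between - (k - 1) * mse) / (ss_total + mse)


# Values from the first example: SSB = 114, SStot = 336, MSE = 9.25, k = 3
w2 = omega_squared(114, 336, 9.25, k=3)
print(round(w2, 2))  # 0.28
```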

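Formulae

$$MSB = \frac{\sum_j n_j (\bar{Y}_j - \bar{Y})^2}{k - 1}, \qquad \bar{Y} = \frac{\sum_j n_j \bar{Y}_j}{N}$$

$$MSE = \frac{\sum_j (n_j - 1) S_j^2}{N - k}$$

$$F = \frac{MSB}{MSE}, \qquad df_B = k - 1, \qquad df_E = N - k$$

Book Formulae

$$s_B^2 = \frac{SSB}{df_B}, \qquad SSB = \sum_j n_j (\bar{Y}_j - \bar{Y})^2$$

$$SSB = \left[\frac{(\sum X_1)^2}{n_1} + \frac{(\sum X_2)^2}{n_2} + \dots + \frac{(\sum X_k)^2}{n_k}\right] - \frac{\left(\sum_{\text{all scores}} X\right)^2}{N}$$

$$s_W^2 = \frac{SSW}{df_W}, \qquad SSW = \sum_{\text{all scores}} X^2 - \left[\frac{(\sum X_1)^2}{n_1} + \frac{(\sum X_2)^2}{n_2} + \dots + \frac{(\sum X_k)^2}{n_k}\right]$$

$$F = \frac{s_B^2}{s_W^2}, \qquad df_B = k - 1, \qquad df_W = N - k$$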

Example
A researcher wants to identify the effects of sleep deprivation on test performance in his introductory statistics course. Three groups of 9 students will perform under one of the following conditions:
– High Sleep Deprivation: stay awake for 2 nights and days.
– Low Sleep Deprivation: sleep 4 hours each night for a total of 2 nights and days.
– Control: sleep normally for 2 nights and days.

After the 2-day period, the students are tested. The tests are scored.

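Step 1: State the hypotheses
H0: µ1 = µ2 = µ3
H1: µj ≠ µ for some j (the means are not all equal)

Step 2: Set the criterion
• Non-directional alternative; the F test itself rejects only for large values of F (upper tail)
• α = 0.05
• To find Fcrit we need the degrees of freedom:
df numerator = k − 1 = 3 − 1 = 2
df denominator = N − k = 27 − 3 = 24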

Step 3: Get sample data

         High Sleep Deprivation    Low Sleep Deprivation    Normal Sleep
         (Condition 1)             (Condition 2)            (Condition 3)
         81                        92                       86
         80                        86                       93
         72                        87                       97
         82                        76                       81
         83                        80                       94
         89                        87                       89
         76                        92                       98
         88                        83                       90
         83                        84                       91

n        9                         9                        9
Mean     81.56                     85.22                    91
s        5.32                      5.21                     5.34
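The descriptive statistics reported with the data can be reproduced from the raw scores. The snippet below is a small check using Python's standard `statistics` module; the dictionary keys are labels invented for this illustration.

```python
from statistics import mean, stdev

# Raw scores from the three conditions
scores = {
    "high": [81, 80, 72, 82, 83, 89, 76, 88, 83],
    "low": [92, 86, 87, 76, 80, 87, 92, 83, 84],
    "normal": [86, 93, 97, 81, 94, 89, 98, 90, 91],
}

for condition, xs in scores.items():
    # stdev() is the sample standard deviation (divides by n - 1)
    print(f"{condition}: n = {len(xs)}, mean = {mean(xs):.2f}, s = {stdev(xs):.2f}")
```

Using the sample standard deviation (n − 1 in the denominator) matters here, because the pooled error term is built from these same sample variances.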

Step 4: Calculate F statistic

n1 = 9, mean 81.56; n2 = 9, mean 85.22; n3 = 9, mean 91

$$\bar{Y} = \frac{\sum_j n_j \bar{Y}_j}{N} = \frac{9(81.56) + 9(85.22) + 9(91)}{27} = 85.93$$

$$MSB = \frac{\sum_j n_j (\bar{Y}_j - \bar{Y})^2}{k - 1} = \frac{9(81.56 - 85.93)^2 + 9(85.22 - 85.93)^2 + 9(91 - 85.93)^2}{3 - 1} = \frac{407.75}{2} = 203.88$$

Step 4: Calculate F statistic

        X1     X2     X3
        81     92     86
        80     86     93
        72     87     97
        82     76     81
        83     80     94
        89     87     89
        76     92     98
        88     83     90
        83     84     91
SUM     734    767    819

SUM ALL SCORES: 2320

$$SSB = \left[\frac{(\sum X_1)^2}{n_1} + \frac{(\sum X_2)^2}{n_2} + \dots + \frac{(\sum X_k)^2}{n_k}\right] - \frac{\left(\sum_{\text{all scores}} X\right)^2}{N}$$

$$SSB = \left[\frac{(734)^2}{9} + \frac{(767)^2}{9} + \frac{(819)^2}{9}\right] - \frac{(2320)^2}{27} = 408.07$$

$$s_B^2 = \frac{SSB}{df_B} = \frac{408.07}{3 - 1} = 204.04$$

Step 4: Calculate F statistic

n1 = 9, s1 = 5.32; n2 = 9, s2 = 5.21; n3 = 9, s3 = 5.34

$$MSE = \frac{\sum_j (n_j - 1) S_j^2}{N - k} = \frac{(8)(5.32)^2 + (8)(5.21)^2 + (8)(5.34)^2}{27 - 3} = \frac{671.77}{24} = 27.99$$

Step 4: Calculate F statistic

        X1     X2     X3     X1²      X2²      X3²
        81     92     86     6561     8464     7396
        80     86     93     6400     7396     8649
        72     87     97     5184     7569     9409
        82     76     81     6724     5776     6561
        83     80     94     6889     6400     8836
        89     87     89     7921     7569     7921
        76     92     98     5776     8464     9604
        88     83     90     7744     6889     8100
        83     84     91     6889     7056     8281
SUM     734    767    819    60088    65583    74757

SUM ALL SQUARED SCORES: 200428

$$SSW = \sum_{\text{all scores}} X^2 - \left[\frac{(\sum X_1)^2}{n_1} + \frac{(\sum X_2)^2}{n_2} + \dots + \frac{(\sum X_k)^2}{n_k}\right]$$

$$SSW = 200428 - \left[\frac{(734)^2}{9} + \frac{(767)^2}{9} + \frac{(819)^2}{9}\right] = 671.78$$

$$s_W^2 = \frac{SSW}{df_W} = \frac{671.78}{27 - 3} = 27.99$$
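The computational (book) formulas for SSB and SSW can be checked against the raw scores. The following is a minimal sketch; the function name `ss_between_within` is an assumption for illustration.

```python
def ss_between_within(groups):
    """Return (SSB, SSW) using the computational sums-of-scores formulas."""
    all_scores = [x for g in groups for x in g]
    N = len(all_scores)
    # Bracketed term shared by both formulas: sum of (group total)^2 / n_j
    group_term = sum(sum(g) ** 2 / len(g) for g in groups)
    ssb = group_term - sum(all_scores) ** 2 / N
    ssw = sum(x * x for x in all_scores) - group_term
    return ssb, ssw


groups = [
    [81, 80, 72, 82, 83, 89, 76, 88, 83],   # Condition 1
    [92, 86, 87, 76, 80, 87, 92, 83, 84],   # Condition 2
    [86, 93, 97, 81, 94, 89, 98, 90, 91],   # Condition 3
]
ssb, ssw = ss_between_within(groups)
print(f"SSB = {ssb:.2f}, SSW = {ssw:.2f}")
# prints: SSB = 408.07, SSW = 671.78
```

Because the bracketed term appears in both formulas with opposite signs, SSB + SSW always equals the total sum of squares around the grand mean.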





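Step 4: Calculate F statistic

Using the slide formulae (rounded group means and standard deviations):

$$MSB = \frac{407.75}{2} = 203.88, \qquad MSE = \frac{671.77}{24} = 27.99, \qquad F = \frac{203.88}{27.99} = 7.28$$

Using the book formulae (raw scores):

$$s_B^2 = \frac{408.07}{3 - 1} = 204.04, \qquad s_W^2 = \frac{671.78}{27 - 3} = 27.99, \qquad F = \frac{204.04}{27.99} = 7.29$$

The two routes differ slightly only because the first works from rounded means and standard deviations.

Source     SS         df    MS        F
Between    408.07     2     204.04    7.29
Error      671.78     24    27.99
Total      1079.85    26

SS_TOTAL = SS_B + SS_E = SS_B + SS_W
df_TOTAL = df_B + df_E = df_B + df_W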
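Assembling the ANOVA table from the two sums of squares is mechanical, as the sketch below suggests. The function name `anova_table` and the dictionary layout are illustrative choices, not something the slides prescribe.

```python
def anova_table(ssb, ssw, k, N):
    """Build the one-way ANOVA table as a dict of rows."""
    df_b, df_w = k - 1, N - k
    ms_b, ms_w = ssb / df_b, ssw / df_w
    return {
        "Between": {"SS": ssb, "df": df_b, "MS": ms_b, "F": ms_b / ms_w},
        "Error": {"SS": ssw, "df": df_w, "MS": ms_w},
        "Total": {"SS": ssb + ssw, "df": df_b + df_w},
    }


# Sums of squares from the sleep deprivation example
table = anova_table(408.07, 671.78, k=3, N=27)
for source, row in table.items():
    cells = ", ".join(f"{name} = {value:.2f}" for name, value in row.items())
    print(f"{source}: {cells}")
```

Note that the sums of squares and the degrees of freedom add up to their totals, while the mean squares do not.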


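One-way ANOVA
• Finally, the F-ratio is

$$F = \frac{204.04}{27.99} = 7.29$$

• The tabular critical value for F with α = .05 is found using
df numerator = k − 1 = 3 − 1 = 2
df denominator = N − k = 27 − 3 = 24
(the same degrees of freedom as in the first example, where the tabular value was 3.40)
• Do we reject or fail to reject?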

One-way ANOVA
• F(2,24) = 7.29 > Fcrit, p < .05
• Reject null hypothesis
• What does this mean?
• Interpret the results!

• What is the effect size?
• Is it a weak or strong effect?

Formula:

$$\omega^2 = \frac{SS_b - (k - 1)\,MSE}{SS_{tot} + MSE}$$
