Student’s t-Distribution. Sampling Distributions Redux

Student’s t-Distribution The t-Distribution, t-Tests, Measures of Effect Size, & Managing Violations of Assumptions

Sampling Distributions Redux • Chapter 7 opens with a return to the concept of sampling distributions from chapter 4 – Sampling distributions of the mean


Sampling Distribution of the Mean
• Because the sampling distribution of the mean (SDotM) is so important in statistics, you should understand it
• The SDotM is governed by the Central Limit Theorem:
Given a population with a mean μ and a variance σ², the sampling distribution of the mean (the distribution of sample means) will have a mean equal to μ, a variance equal to σ²/n, and a standard deviation equal to σ/√n. The distribution will approach the normal distribution as n, the sample size, increases. (p. 178)

Sampling Distribution of the Mean
Translation:
1. For any population with a given mean and variance, the sampling distribution of the mean will have:
• $\mu_{\bar{x}} = \mu$
• $\sigma_{\bar{x}}^2 = \sigma^2 / n$
• $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
2. As n increases, the sampling distribution of the mean approaches a normal curve

Sampling Distribution of the Mean
• Analysis:
– Although $\mu_{\bar{x}}$ and µ will tend to be similar to one another…
– …the relationships between $\sigma_{\bar{x}}^2$ and σ², and between $\sigma_{\bar{x}}$ and σ, will differ as a function of the sample size
• We saw this in our sampling distribution of the mean example from chapter 4…
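The relationship between σ and the standard deviation of the sample means can be checked with a small simulation. This is only a sketch: the population values (µ = 100, σ = 15), the sample sizes, the number of samples, and the function name `simulate_sample_means` are assumptions chosen for illustration, not part of the original slides.

```python
import random
import statistics


def simulate_sample_means(mu, sigma, n, n_samples, seed=0):
    """Draw n_samples samples of size n from a normal population; return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(n_samples)
    ]


mu, sigma = 100, 15  # assumed population values, for illustration only
for n in (1, 4, 25, 100):
    means = simulate_sample_means(mu, sigma, n, n_samples=5000)
    observed_sd = statistics.stdev(means)   # SD of the simulated sample means
    expected_sd = sigma / n ** 0.5          # sigma / sqrt(n) from the theorem
    print(f"n={n:>3}  mean of means={statistics.fmean(means):7.2f}  "
          f"SD of means={observed_sd:5.2f}  sigma/sqrt(n)={expected_sd:5.2f}")
```

The mean of the sample means should stay near µ for every n, while their standard deviation should shrink roughly in proportion to 1/√n.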

So, you wanna test a hypothesis, do ya? • Our understanding of sampling and sampling distributions now allows us to test hypotheses • How we test a hypothesis depends on the information we have available


Choosing a Test
1. Which variables are available?
• µ?
• σ?
• s?
2. How many data sets are you presented with?
• 1
• 2
3. Do your data sets come from 1 or 2 groups?
• 1
• 2

Testing Hypotheses about Means: The Rare Case of Knowing σ
• So far, to test the probability of finding a particular score, we’ve used the Standard Normal Distribution
– IQ = 122
– µ = 100
– σ = 15

$$z = \frac{x - \mu}{\sigma} = \frac{122 - 100}{15} = \frac{22}{15} = 1.47$$

-1.96 < z < 1.96: Fail to reject H0

How the z-Test Works • How does our test change when we test group means, not just individual scores? – We use the central limit theorem

How the z-Test Works

n = 100:
$$z = \frac{122 - 100}{15 / \sqrt{100}} = \frac{22}{15 / 10} = \frac{22}{1.5} = 14.67$$

n = 2:
$$z = \frac{122 - 100}{15 / \sqrt{2}} = \frac{22}{15 / 1.41} = \frac{22}{10.64} = 2.07$$

n = 1:
$$z = \frac{122 - 100}{15 / \sqrt{1}} = \frac{22}{15 / 1} = \frac{22}{15} = 1.47$$

How the z-Test Works
• Large samples reduce the amount of random variance (sampling error)
– More confidence that the sample mean = population mean
• Larger samples improve our ability to detect differences between samples and populations
• For n = 1, the test for a mean reduces to the test for a single score:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{x - \mu}{\sigma}$$
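The three calculations for the IQ example (x-bar = 122, µ = 100, σ = 15) can be reproduced with a short function. The function name `z_for_sample_mean` is hypothetical and used only for this sketch.

```python
import math


def z_for_sample_mean(sample_mean, mu, sigma, n):
    """z statistic for a sample mean when the population sigma is known."""
    standard_error = sigma / math.sqrt(n)  # sigma / sqrt(n), per the Central Limit Theorem
    return (sample_mean - mu) / standard_error


# Same mean difference (22 points), three different sample sizes.
for n in (1, 2, 100):
    z = z_for_sample_mean(122, 100, 15, n)
    print(f"n = {n:>3}: z = {z:.2f}")
```

The mean difference never changes; only the standard error in the denominator does, which is why z grows with n.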

Testing Hypotheses: When σ Is Unknown • Generally, the population standard deviation, σ, is unknown to us • Occasionally, we will know the population mean, µ, when we don’t know σ • In these situations, the standard normal distribution no longer meets our needs


Testing Hypotheses: When σ Is Unknown • Knowing µ… – We can produce an estimate of σ from s – Using s changes the nature of the test we are conducting, as s is not distributed in the same fashion as σ • Sampling distribution of the sample standard deviation is NOT normally distributed – Strong positive skew

Testing Hypotheses: When σ Is Unknown
[Figure panels: “Sampling distribution of s” and “Sampling distribution of σ”]

So How Does s Estimate σ?
• Given the differences in distribution shape, it is easy to see that a given s will rarely equal σ
– s² is an unbiased estimator of σ² over repeated samplings
– However, because the sampling distribution of s² is positively skewed, a SINGLE value of s² is more likely than not to underestimate σ²
• This is most pronounced in small samples, where s will underestimate σ more often than it overestimates it
– As a result, a statistic calculated with s in place of σ is likely to be larger in absolute value than the comparable value of z
– We cannot use z any longer → t
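A rough simulation can illustrate why a single small-sample estimate tends to fall below the population value even though the estimator is unbiased on average. The population σ = 15, the sample size n = 5, and the name `sample_variances` are assumptions made for this sketch only.

```python
import random
import statistics


def sample_variances(sigma, n, n_samples, seed=0):
    """Sample variances (n - 1 denominator) from repeated normal samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.variance([rng.gauss(0, sigma) for _ in range(n)])
        for _ in range(n_samples)
    ]


sigma, n = 15, 5  # assumed values, for illustration only
variances = sample_variances(sigma, n, n_samples=5000)

mean_s2 = statistics.fmean(variances)  # long-run average of s^2
share_below = sum(v < sigma ** 2 for v in variances) / len(variances)

print(f"population variance     = {sigma ** 2}")
print(f"average sample variance = {mean_s2:.1f}")
print(f"share of samples with s^2 below sigma^2 = {share_below:.2f}")
```

The average of s² should land close to σ², yet noticeably more than half of the individual samples should fall below it, which is the positive skew at work.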

t and the t-Distribution
• Developed by Student while he was working for the Guinness Brewing Co.
1. The shape of the t-distribution is a direct function of the size of the sample we are examining
2. For small samples, the t-distribution is somewhat flatter than the standard normal distribution, with a lower peak and fatter tails

t and the t-Distribution
3. As sample size increases:
• The t-distribution approaches a normal distribution
• Theoretically, the t-distribution converges on the normal distribution as the sample size approaches infinity
• Practically, the two are nearly indistinguishable when n ≈ 100 – 120

t and the t-Distribution


t and the t-Distribution 4. Identifying values of t associated with a given rejection region depends on: – α – the number of tails associated with the test – the degrees of freedom available in the analysis – For this one-sample test, (df = n-1) because we used one degree of freedom calculating s using the sample mean and not the population mean.

One-Sample t-Test
$$t = \frac{\bar{x} - \mu}{s_{\bar{x}}} \quad\text{or}\quad t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \quad\text{or}\quad t = \frac{\bar{x} - \mu}{\sqrt{s^2 / n}}$$

z-Test vs. One-Sample t-Test
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \qquad\qquad t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

Note the similarities between these tests: ONLY the source of “variance” and the distribution you test against have changed!

Using the One-Sample t-Test
• You are on the admissions board for a graduate school of Psychology.
• You are attempting to determine whether the GRE scores of the students applying to your program are competitive with the national average.
– µVerbal = 569
• SPSS output from your data:

Descriptive Statistics
                     N    Range    Mean       Std. Deviation
GRE                  24   310.00   659.7917   86.43267
Valid N (listwise)   24

Using the One-Sample t-Test
• Research Hypothesis:
– The GRE scores from your applicants differ from the population norms
  • H1: µa ≠ µp or ES > 0
• Null Hypothesis:
– The GRE scores from your applicants do not differ from the population norms
  • H0: µa = µp or ES = 0
• Evaluate the students’ GRE-V scores

Using the One-Sample t-Test
• Select:
– Rejection region
  • α = .05
– “Tail” or directionality
  • We don’t know exactly how the students will score: we just expect them to show scores differing from the population values
  • Might predict higher scores…

Using the One-Sample t-Test
• Generate sampling distribution of the mean assuming H0 is true
– One-Sample t-test
• Given our sampling distribution:
– Conduct the statistical test

Using the One-Sample t-Test
µVerbal = 569, x-bar = 659.79, s = 86.43, n = 24

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{659.79 - 569}{86.43 / \sqrt{24}} = \frac{90.79}{86.43 / 4.90} = \frac{90.79}{17.64} = 5.15$$

This numerical value is called tobt: tobt(23) = 5.15
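The hand calculation above can be sketched in a few lines, using the unrounded mean and standard deviation from the SPSS descriptives. The function name `one_sample_t` is hypothetical.

```python
import math


def one_sample_t(sample_mean, mu, s, n):
    """One-sample t statistic and its degrees of freedom (df = n - 1)."""
    standard_error = s / math.sqrt(n)  # estimated standard error of the mean
    t = (sample_mean - mu) / standard_error
    return t, n - 1


# Values taken from the GRE example: mean, national average, SD, and n.
t_obt, df = one_sample_t(659.7917, 569, 86.43267, 24)
print(f"t_obt({df}) = {t_obt:.3f}")
```

Working from the unrounded descriptives avoids the small rounding drift that accumulates in a step-by-step hand calculation.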


Using the One-Sample t-Test
• SPSS Output

One-Sample Test (Test Value = 569, i.e., µ)
       t       df   Sig. (2-tailed)   Mean Difference   95% CI of the Difference
                                                        Lower      Upper
GRE    5.146   23   .000              90.79167          54.2943    127.2890

tobt(23) = 5.15

Evaluating Statistical Significance of the t-Test • First note: – α = .05 – Tail or directionality: two-tailed – t-Value = 5.15 – Degrees of freedom (df) • For the One-Sample t-Test, df = n-1 (24-1 = 23) • Estimating s from x-bar (not σ from µ)


Evaluating Statistical Significance of the t-Test
• In the past you…
– Identified a tabled value of tcrit
– Compared tcrit to your tobt value
– Rejected H0 if tobt fell into the rejection region identified by tcrit
– Failed to reject H0 if tobt did not fall into the rejection region identified by tcrit
• SPSS simplifies matters by calculating p exactly for us

Using the One-Sample t-Test
• SPSS Output: the same One-Sample Test table (t = 5.146, df = 23, Sig. (2-tailed) = .000)

tobt(23) = 5.15, p < .05
Exact probability ≈ .00003
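When a package prints Sig. = .000, the exact probability can be approximated by integrating the t density numerically. The sketch below uses only the standard library and Simpson's rule; the function names `t_pdf` and `two_tailed_p` and the step count are assumptions for illustration, not the routine SPSS uses.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=20000):
    """Two-tailed p: 1 minus twice the area between 0 and |t| (Simpson's rule)."""
    t = abs(t)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return 1 - 2 * area_zero_to_t


print(f"p for t(23) = 2.069: {two_tailed_p(2.069, 23):.4f}")  # tabled critical value
print(f"p for t(23) = 5.146: {two_tailed_p(5.146, 23):.6f}")  # obtained value
```

Checking the function against a tabled critical value first (it should return roughly .05 for t = 2.069 with 23 df) is a useful sanity check before trusting it on the obtained t.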


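Evaluating Statistical Significance of the t-Test
[Figure: t-distribution centered on 0 with rejection regions beyond tcrit = -2.069 and tcrit = 2.069; tobt = 5.15 falls in the upper rejection region]

Because tobt falls within the rejection region identified by tcrit, we reject H0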
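The decision rule, and the 95% confidence interval that SPSS prints alongside it, can be sketched as follows. The critical value 2.069 is the tabled two-tailed value for df = 23 used on the slide; the function names are hypothetical.

```python
import math


def reject_h0_two_tailed(t_obt, t_crit):
    """Two-tailed rule: reject H0 when |t_obt| exceeds the critical value."""
    return abs(t_obt) > t_crit


def ci_mean_difference(sample_mean, mu, s, n, t_crit):
    """Confidence interval for (sample mean - mu) using the tabled t_crit."""
    standard_error = s / math.sqrt(n)
    diff = sample_mean - mu
    return diff - t_crit * standard_error, diff + t_crit * standard_error


t_crit = 2.069  # tabled value: df = 23, alpha = .05, two-tailed
print("Reject H0:", reject_h0_two_tailed(5.146, t_crit))

lower, upper = ci_mean_difference(659.7917, 569, 86.43267, 24, t_crit)
print(f"95% CI of the difference: {lower:.2f} to {upper:.2f}")
```

The interval excludes 0, which is another way of seeing the same decision: a mean difference of 0 is not plausible given these data.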

Testing Hypotheses: Two Matched (Repeated) Samples
• Sometimes, we’re interested in how a single set of scores changes over time
– Psychotherapy tx influences depression
– Patients respond to medication
– Consumer attitudes before and after an advertisement
– Changes in citizen attitudes following the State of the Union address
• When we look at two sets of scores collected from a single sample at different time points, we need to use a matched samples test


Matched Samples
• Matched samples
– Use the same participants at two or more different time points to collect similar data
  • MUST BE THE SAME SAMPLE!
[Diagram: Time 1 BDI-II → Wait 30 Days → Time 2 BDI-II]

Matched Samples Test
• With a matched samples test, you are testing the change in scores between the two administrations of the test
– H0: µ1 = µ2
– H0: µ1 - µ2 = 0 or ES = 0
  • This is truly the null hypothesis for the matched samples test


Matched Samples Test • Essentially, the group means at each time point mean little to us – Change in scores is the key – Conduct this test by obtaining the average difference score between the two time points

Matched Samples Test
$$t = \frac{\bar{D} - 0}{s_D / \sqrt{n}}$$
• D-bar represents the average difference score between time points
• sD is the standard deviation of the difference scores
• The “- 0” may seem redundant, but isn’t: it is the mean difference expected under H0


Calculating the Matched Samples t-Test • You are a researcher examining the impact of a new therapy intervention on the incidence of self-injurious behavior (SIB) • You collect a measure of the frequency of self-injurious acts when clients enter your treatment (time 1) • You collect a measure of the frequency of self-injurious acts two weeks later (time 2)

Calculating the Matched Samples t-Test
• Research Hypothesis:
– The new treatment will change SIB scores
  • H1: µ1 ≠ µ2 or ES > 0
• Null Hypothesis:
– The SIB scores at time 2 will be the same as the scores at time 1 (no change)
  • H0: µ1 = µ2
  • H0: µ1 - µ2 = 0 or ES = 0
• Evaluate SIB at time 1 & time 2


Using the Matched Samples t-Test
• Select:
– Rejection region
  • α = .05
– “Tail” or directionality
  • We don’t know exactly how the treatment will work, so we’d better use a two-tailed test

Using the Matched Samples t-Test
• Generate sampling distribution of the mean assuming H0 is true
– Matched Samples t-test
• Given our sampling distribution:
– Conduct the statistical test


Calculating the Matched Samples t-Test

Time 1   Time 2   D    D²
13       8        5    25
14       10       4    16
8        4        4    16
10       7        3    9
11       10       1    1
13       9        4    16
15       11       4    16
16       9        7    49
19       17       2    4
10       6        4    16
7        2        5    25

∑D = 43     D-bar = 3.91     ∑D² = 193     (∑D)² = 1849

Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
time1                11   7.00      19.00     12.3636   3.58532
time2                11   2.00      17.00     8.4545    3.93354
Valid N (listwise)   11

Calculating the Matched Samples t-Test

$$s_D^2 = \frac{\sum D^2 - \frac{(\sum D)^2}{n}}{n - 1} = \frac{193 - \frac{43^2}{11}}{11 - 1} = \frac{193 - \frac{1849}{11}}{10} = \frac{193 - 168.09}{10} = \frac{24.91}{10} = 2.49$$

$$s_D = \sqrt{2.49} = 1.58$$


Calculating the Matched Samples t-Test

$$t = \frac{\bar{D} - 0}{s_D / \sqrt{n}} = \frac{3.91 - 0}{1.58 / \sqrt{11}} = \frac{3.91}{1.58 / 3.32} = \frac{3.91}{.48} = 8.15$$

tobt = 8.15
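The same calculation can be run directly on the raw Time 1 and Time 2 scores. This is a sketch only; `matched_samples_t` is a hypothetical name, and the result carries more decimal places than the hand calculation, so it will not match the rounded figure exactly.

```python
import math
import statistics

# SIB scores from the worked example (same 11 clients at both time points).
time1 = [13, 14, 8, 10, 11, 13, 15, 16, 19, 10, 7]
time2 = [8, 10, 4, 7, 10, 9, 11, 9, 17, 6, 2]


def matched_samples_t(before, after):
    """Matched samples t on the difference scores; df = number of pairs - 1."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    d_bar = statistics.fmean(diffs)   # average difference score
    s_d = statistics.stdev(diffs)     # SD of the difference scores (n - 1)
    t = (d_bar - 0) / (s_d / math.sqrt(n))
    return t, n - 1, diffs


t_obt, df, diffs = matched_samples_t(time1, time2)
print(f"sum(D) = {sum(diffs)}, sum(D^2) = {sum(d * d for d in diffs)}")
print(f"t_obt({df}) = {t_obt:.3f}")
```

Note that only the difference scores enter the test; the two sets of raw scores are never compared as separate groups.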

Evaluating Statistical Significance of the t-Test • First note: – α = .05 – Tail or directionality: two-tailed – t-Value = 8.15 – Degrees of freedom (df) • For the Matched Samples t-Test: – df = number of PAIRS of scores -1 – df = 11 - 1 = 10

– Again, we can calculate p exactly with SPSS


Calculating the Matched Samples t-Test
• SPSS Output

Paired Samples Correlations
                         N    Correlation   Sig.
Pair 1  time1 & time2    11   .916          .000

Paired Samples Test (Paired Differences)
                         Mean      Std. Deviation   Std. Error Mean   95% CI Lower   95% CI Upper   t       df   Sig. (2-tailed)
Pair 1  time1 - time2    3.90909   1.57826          .47586            2.84880        4.96938        8.215   10   .000

tobt(10) = 8.15, p < .05 (SPSS reports t = 8.215; the hand calculation differs slightly because the standard error was rounded to .48)
p ≈ .000009

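Evaluating Statistical Significance of the t-Test
[Figure: t-distribution centered on 0 with rejection regions beyond tcrit = -2.228 and tcrit = 2.228; tobt = 8.15 falls in the upper rejection region]

Because tobt falls within the rejection region identified by tcrit, we reject H0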


Testing Hypotheses: Two Independent Samples
• Probably the most common use of the t-Test and the t-distribution
• Compare the mean scores of two groups on a single variable
– IV: Groups
– DV: Variable of interest
• Groups must be independent of one another
– Scores in one group cannot influence scores in the other group

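Independent Samples t-Test

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{x}_1 - \bar{x}_2}} \quad\text{or}\quad t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

This test is calculated by dividing the mean difference between the two groups by the standard error of that difference, i.e., the “dispersion” or “variation” expected in the difference between two sample means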


Independent Samples t-Test: Degrees of Freedom • 1 df lost for each σ estimated by s using xbar • Since there are two independent groups in this analysis, we must estimate σ twice • df = (n1 + n2) - 2

Independent Samples t-Test: Example • Let’s return to the example used for the matched samples test • As a competent researcher, you realize that simply showing a change over time is not enough to prove the efficacy of your treatment – People spontaneously change over time

• Show that an untreated control group does not change over the same period of time that your treatment group does change


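Independent Samples t-Test: Example
[Design diagram. Tx Group: Time 1 SIB Scores → Tx → Time 2 SIB Scores. Ctrl Group: Time 1 SIB Scores → Time 2 SIB Scores → Tx → Time 3 SIB Scores. The groups are equal (=) at Time 1; the question (?) is whether they differ at Time 2.]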

Independent Samples t-Test: Example • At time 1, the control and treatment SIB groups have equal SIB scores • Administer the treatment for 2 weeks to Tx group – The Control group receives no intervention during these two weeks

• Compare SIB scores of Tx and Control group after 2 weeks • Provide Control group w/ intervention if desired


Independent Samples t-Test: Example
• Research Hypothesis:
– Your treatment for SIB will reduce SIB scores in the Tx group after 2 weeks
  • H1: µt < µc
• Null Hypothesis:
– Your treatment for SIB will have no effect
  • H0: µt = µc
• Evaluate the efficacy of your treatment

Independent Samples t-Test: Example
Time 2 Data

Control:  12 13 10 9 11 8 16 13 15 16 12
Tx:       8 10 4 7 10 9 11 9 17 6 2

          Ctrl Group   Tx Group
∑x        135          93
∑x²       1729         941
(∑x)²     18225        8649
x-bar     12.27        8.45
s²        7.22         15.47
s         2.69         3.93
n         11           11

Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
ctrl                 11   8.00      16.00     12.2727   2.68667
tx                   11   2.00      17.00     8.4545    3.93354
Valid N (listwise)   11


Independent Samples t-Test: Example
• Select:
– Rejection region
  • α = .05
– “Tail” or directionality
  • We have evidence that the treatment probably works, so we make a one-tailed hypothesis here (scores for the Tx group will be lower than the Control group at time 2)

Independent Samples t-Test: Example
• Generate sampling distribution of the mean assuming H0 is true
– Independent Samples t-Test
• Given our sampling distribution:
– Conduct the statistical test


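Independent Samples t-Test: Example

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{8.45 - 12.27}{\sqrt{\frac{15.47}{11} + \frac{7.22}{11}}} = \frac{-3.82}{\sqrt{1.41 + .66}} = \frac{-3.82}{\sqrt{2.07}} = \frac{-3.82}{1.44} = -2.65$$

tobt(20) = -2.65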
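A minimal sketch of the same calculation from the raw Time 2 scores, with the Tx group as group 1 and the Control group as group 2 (the order used in the hand calculation). The function name `independent_t_equal_n` is hypothetical.

```python
import math
import statistics

# Time 2 SIB scores from the worked example.
tx = [8, 10, 4, 7, 10, 9, 11, 9, 17, 6, 2]
control = [12, 13, 10, 9, 11, 8, 16, 13, 15, 16, 12]


def independent_t_equal_n(group1, group2):
    """Independent samples t using separate variances; df = n1 + n2 - 2."""
    n1, n2 = len(group1), len(group2)
    mean_diff = statistics.fmean(group1) - statistics.fmean(group2)
    # Standard error of the difference between the two sample means.
    se_diff = math.sqrt(statistics.variance(group1) / n1
                        + statistics.variance(group2) / n2)
    return mean_diff / se_diff, n1 + n2 - 2


t_obt, df = independent_t_equal_n(tx, control)
print(f"t_obt({df}) = {t_obt:.3f}")
```

Swapping the two arguments flips the sign of t but not its magnitude, which is why a package may print a positive value for the same comparison.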

Evaluating Statistical Significance of the t-Test • First note: – α = .05 – Tail or directionality: one-tailed – t-Value = -2.65 – Degrees of freedom (df) • For the Independent Samples t-Test – (n1 + n2) - 2 – (11+11)-2 – 22 - 2 = 20


Evaluating Statistical Significance of the t-Test
• SPSS Output

Independent Samples Test (Self-Injurious Behavior)
                              Levene's Test for        t-test for Equality of Means
                              Equality of Variances
                              F      Sig.              t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference
Equal variances assumed       .518   .480              2.658   20       .015              3.81818           1.43625
Equal variances not assumed                            2.658   17.663   .016              3.81818           1.43625

tobt(20) = -2.65, p < .05
p ≈ .015 (two-tailed, as reported by SPSS; for our one-tailed hypothesis the probability is roughly half this value. SPSS shows t as positive only because it subtracts the groups in the opposite order.)

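Evaluating Statistical Significance of the t-Test
[Figure: t-distribution centered on 0 with a single lower-tail rejection region beyond tcrit = -1.725; tobt = -2.65 falls in the rejection region]

Because tobt falls within the rejection region identified by tcrit, we reject H0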


Independent Samples t-Test: One Complication
• There is a slight problem with the form of the equation we used…
– It can ONLY be applied to groups with equal sample sizes
– A major limitation in real-world research

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

Pooled Variance Estimate • This equation permits tests with different sample sizes • Generates an estimate of the total variance between groups weighted by the size of each group – Therefore, larger samples have a greater impact on the variance – Vice-versa for small samples


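Pooled Variance Estimate

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Using the Pooled Variance Estimate

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \;\rightarrow\; t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}} \;\rightarrow\; t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$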

Using the Pooled Variance Estimate: Example
Time 2 Data

Control:  11 16 13 15 16 12 (remaining cases: No Data)
Tx:       8 10 4 7 10 9 11 9 17 6 2

          Ctrl Group   Tx Group
∑x        83           93
∑x²       1171         941
(∑x)²     6889         8649
x-bar     13.83        8.45
s²        4.57         15.47
s         2.14         3.93
n         6            11

Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
ctrl                 6    11.00     16.00     13.8333   2.13698
tx                   11   2.00      17.00     8.4545    3.93354
Valid N (listwise)   6

Using the Pooled Variance Estimate: Example

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(11 - 1)15.47 + (6 - 1)4.57}{11 + 6 - 2} = \frac{(10)15.47 + (5)4.57}{15} = \frac{154.7 + 22.85}{15} = \frac{177.55}{15} = 11.84$$


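Using the Pooled Variance Estimate: Example

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{8.45 - 13.83}{\sqrt{11.84\left(\frac{1}{11} + \frac{1}{6}\right)}} = \frac{-5.38}{\sqrt{11.84(.0909 + .1667)}} = \frac{-5.38}{\sqrt{11.84(.2576)}} = \frac{-5.38}{\sqrt{3.05}} = \frac{-5.38}{1.75} = -3.07$$

tobt(15) = -3.07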
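The pooled-variance version of the test can be sketched from the raw scores as follows, again with the Tx group as group 1. The function name `pooled_t` is hypothetical; the result is carried to more decimal places than the hand calculation.

```python
import math
import statistics

# Time 2 SIB scores; only six control participants provided data.
tx = [8, 10, 4, 7, 10, 9, 11, 9, 17, 6, 2]
control = [11, 16, 13, 15, 16, 12]


def pooled_t(group1, group2):
    """Independent samples t with a pooled variance estimate (unequal n allowed)."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Each group's variance is weighted by its own degrees of freedom.
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.fmean(group1) - statistics.fmean(group2)) / se_diff
    return t, df, pooled_var


t_obt, df, sp2 = pooled_t(tx, control)
print(f"pooled variance = {sp2:.2f}")
print(f"t_obt({df}) = {t_obt:.3f}")
```

Because the weights are n - 1 for each group, the larger Tx group contributes twice as much to the pooled estimate as the smaller Control group.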

Evaluating Statistical Significance of the t-Test • First note: – α = .05 – Tail or directionality: one-tailed – t-Value = -3.07 – Degrees of freedom (df) • For the Independent Samples t-Test – (n1 + n2) - 2 – (11+6)-2 – 17 - 2 = 15


Evaluating Statistical Significance of the t-Test
• SPSS Output

Independent Samples Test (Self-Injurious Behavior)
                              Levene's Test for        t-test for Equality of Means
                              Equality of Variances
                              F      Sig.              t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference
Equal variances assumed       .714   .411              3.080   15       .008              5.37879           1.74614
Equal variances not assumed                            3.653   14.979   .002              5.37879           1.47232

tobt(15) = -3.07, p < .05
p ≈ .0076 (two-tailed, as reported by SPSS; for our one-tailed hypothesis the probability is roughly half this value)

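Evaluating Statistical Significance of the t-Test
[Figure: t-distribution centered on 0 with a single lower-tail rejection region beyond tcrit = -1.753; tobt = -3.07 falls in the rejection region]

Because tobt falls within the rejection region identified by tcrit, we reject H0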


Effect Size of The Independent Samples t-Test

$$d = \frac{\mu_1 - \mu_2}{\sigma} \quad\text{or}\quad d = \frac{\bar{X}_1 - \bar{X}_2}{s_p}$$

We use the same effect size conventions we identified for the Matched Samples test

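Effect Size of The Independent Samples t-Test

$$d = \frac{\bar{X}_1 - \bar{X}_2}{s_p} = \frac{8.45 - 13.83}{\sqrt{11.84}} = \frac{-5.38}{3.44} = -1.56$$

Note that the denominator is sp = √11.84 = 3.44, the pooled standard deviation, not the pooled variance sp² = 11.84.

An effect size well beyond the convention for a large effect (d = .80)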
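As a check on the effect size, d can be computed from the raw scores, taking care to divide by the pooled standard deviation (the square root of the pooled variance, 11.84 in this example) rather than by the pooled variance itself. The function name `cohens_d` is hypothetical.

```python
import math
import statistics

tx = [8, 10, 4, 7, 10, 9, 11, 9, 17, 6, 2]
control = [11, 16, 13, 15, 16, 12]


def cohens_d(group1, group2):
    """Mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    pooled_sd = math.sqrt(pooled_var)  # s_p, not s_p squared
    return (statistics.fmean(group1) - statistics.fmean(group2)) / pooled_sd


d = cohens_d(tx, control)
print(f"d = {d:.2f}")
```

Expressed this way, d says how many pooled standard deviations separate the two group means, which is what makes it comparable across studies.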


t-test Assumptions • Although the t-test is generally a robust test, it can be affected by violations of underlying test assumptions – Normality – sampling distribution is normally distributed – Sample size – samples for each group should be of roughly equal size – Homogeneity of variance – σ1 = σ2

t-test Assumptions
• One sample t-test
– Normality: ✓
– Sample size: ✗
– Homogeneity of variance: ✗
• Matched & Independent samples t-test(s)
– Normality: ✓
– Sample size: ✓
– Homogeneity of variance: ✓


Impact of Violated Assumptions • For equal sample sizes… – …violating homogeneity of variance… • Minimal impact (α = .05 ± .02)

– …with minor normality violations… • Similar results as above

– …with major normality violations… • Severe skew (particularly in opposite directions) can lead to significant problems unless variances are fairly equal

Impact of Violated Assumptions • Unequal sample sizes… – Much more difficult to interpret – Unequal sample sizes + heterogeneity of variance = distortions in p • Possibly increased risk of Type I error • Risk of error increases as more assumptions are violated


Coping with Violated Assumptions
• What can we do to prevent or cope with violated assumptions?
1. Maintain equal sample sizes
2. Use trimmed samples…
3. Use a distribution-free (i.e., non-parametric) test
4. Apply a statistical correction to t

Coping with Violated Assumptions
• SPSS Output

Independent Samples Test (Self-Injurious Behavior)
                              Levene's Test for        t-test for Equality of Means
                              Equality of Variances
                              F      Sig.              t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference
Equal variances assumed       .714   .411              3.080   15       .008              5.37879           1.74614
Equal variances not assumed                            3.653   14.979   .002              5.37879           1.47232

If pF < .05 (the Sig. value for Levene's F), use the “Equal variances not assumed” row
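The values in the “Equal variances not assumed” row are consistent with the Welch-Satterthwaite correction, which keeps the two variances separate and adjusts the degrees of freedom downward. The sketch below assumes that this is the correction in play; the function name `welch_t` is hypothetical, and the groups are subtracted in the order control minus Tx to match the sign in the output.

```python
import math
import statistics

control = [11, 16, 13, 15, 16, 12]
tx = [8, 10, 4, 7, 10, 9, 11, 9, 17, 6, 2]


def welch_t(group1, group2):
    """Unpooled t statistic with Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    v1 = statistics.variance(group1) / n1  # squared standard error, group 1
    v2 = statistics.variance(group2) / n2  # squared standard error, group 2
    t = (statistics.fmean(group1) - statistics.fmean(group2)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


t_welch, df_welch = welch_t(control, tx)
print(f"t = {t_welch:.3f}, df = {df_welch:.3f}")
```

The adjusted df is generally not a whole number and never exceeds n1 + n2 - 2, so the corrected test is evaluated against a slightly more conservative t-distribution.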


Statistical Tests We Have Learned
1. z-Test
• 1 group
• 1 set of data
• µ & σ known
2. One-Sample t-Test
• 1 group
• 1 set of data
• µ known
• Estimate σ with s using x-bar
3. Matched Samples t-Test
• 1 group
• 2 sets of data
• µ & σ unknown
• Estimate σD with sD using D-bar
4. Independent Samples t-Test
• 2 groups
• 2 sets of data
• µ & σ unknown
• Estimate σ twice with s using x-bar
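The summary above can be read as a small decision procedure. The sketch below is only an illustration of that reading, not the flow-chart from the course website; the function name `choose_test` and its parameters are hypothetical.

```python
def choose_test(n_groups, n_datasets, mu_known=False, sigma_known=False):
    """Pick one of the four tests from the summary, given what is available."""
    if n_groups == 1 and n_datasets == 1:
        if mu_known and sigma_known:
            return "z-Test"
        if mu_known:
            return "One-Sample t-Test"  # sigma is estimated with s
    if n_groups == 1 and n_datasets == 2:
        return "Matched Samples t-Test"
    if n_groups == 2 and n_datasets == 2:
        return "Independent Samples t-Test"
    raise ValueError("This combination is not covered by the four tests above")


print(choose_test(1, 1, mu_known=True, sigma_known=True))
print(choose_test(1, 1, mu_known=True))
print(choose_test(1, 2))
print(choose_test(2, 2))
```

Raising an error for combinations outside the four cases is a deliberate choice here: it is safer than silently returning a test that does not fit the design.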


Choosing the Best Test • Flow-chart available on the website: – http://www.personal.kent.edu/~marmey

• Also refer to the diagram on p. 11 of your Howell text • Try the review problems on the website for an example of the types of questions I might ask on an exam!
