Student’s t-Distribution The t-Distribution, t-Tests, Measures of Effect Size, & Managing Violations of Assumptions
Sampling Distributions Redux • Chapter 7 opens with a return to the concept of sampling distributions from chapter 4 – Sampling distributions of the mean
Sampling Distribution of the Mean • Because the SDotM is so important in statistics, you should understand it • The SDotM is governed by the Central Limit Theorem: Given a population with a mean μ and a variance σ², the sampling distribution of the mean (the distribution of sample means) will have a mean equal to μ, a variance equal to σ²/n, and a standard deviation equal to σ/√n. The distribution will approach the normal distribution as n, the sample size, increases. (p. 178)
Sampling Distribution of the Mean
Translation:
1. For any population with a given mean and variance, the sampling distribution of the mean will have:
• $\mu_{\bar{x}} = \mu$
• $\sigma_{\bar{x}}^2 = \sigma^2 / n$
• $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
2. As n increases, the sampling distribution of the mean approaches a normal curve
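The first part of the translation can be checked exactly on a tiny example. The sketch below is only an illustration: the four-score population and the sample size n = 2 are hypothetical values chosen so that every possible sample (drawn with replacement) can be enumerated.

```python
from itertools import product
from statistics import mean, pvariance

# Hypothetical four-score population, chosen only for illustration.
population = [2, 4, 6, 8]
n = 2  # sample size

mu = mean(population)             # population mean
sigma_sq = pvariance(population)  # population variance (divides by N)

# Every possible sample of size n, drawn with replacement.
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mean_of_means = mean(sample_means)      # mean of the sampling distribution
var_of_means = pvariance(sample_means)  # variance of the sampling distribution

print("mean of sample means:", mean_of_means, "| population mean:", mu)
print("variance of sample means:", var_of_means, "| sigma^2 / n:", sigma_sq / n)
```

Raising n in this sketch shrinks the variance of the sample means in proportion to 1/n, which is the relationship the later z and t formulas rely on.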
Sampling Distribution of the Mean
• Analysis:
– Although $\mu_{\bar{x}}$ and $\mu$ will tend to be similar to one another…
– The relationships between…
• $\sigma_{\bar{x}}^2$ and $\sigma^2$
• $\sigma_{\bar{x}}$ and $\sigma$
– …will differ as a function of the sample size
• We saw this in our sampling distribution of the mean example from chapter 4…
So, you wanna test a hypothesis, do ya? • Our understanding of sampling and sampling distributions now allows us to test hypotheses • How we test a hypothesis depends on the information we have available
Choosing a Test
1. Which variables are available?
• µ?
– σ?
– s?
2. How many data sets are you presented with?
• Number of data sets:
– 1
– 2
3. Do your data sets come from 1 or 2 groups?
• Number of groups:
– 1
– 2
Testing Hypotheses about Means: The Rare Case of Knowing σ • So far, to test the probability of finding a particular score, we’ve used the Standard Normal Distribution – IQ = 122 – µ = 100 – σ = 15
$$z = \frac{x - \mu}{\sigma} = \frac{122 - 100}{15} = \frac{22}{15} = 1.47$$
-1.96 < z < 1.96 Fail to reject H0
How the z-Test Works • How does our test change when we test group means, not just individual scores? – We use the central limit theorem
How the z-Test Works
n = 100:
$$z = \frac{122 - 100}{15 / \sqrt{100}} = \frac{22}{15 / 10} = \frac{22}{1.5} = 14.67$$
n = 2:
$$z = \frac{122 - 100}{15 / \sqrt{2}} = \frac{22}{15 / 1.41} = \frac{22}{10.64} = 2.07$$
n = 1:
$$z = \frac{122 - 100}{15 / \sqrt{1}} = \frac{22}{15 / 1} = \frac{22}{15} = 1.47$$
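The three calculations for n = 100, 2, and 1 follow the same formula, so they can be reproduced with one small function. This is a minimal sketch; the function name `z_for_sample_mean` is made up for illustration.

```python
from math import sqrt

def z_for_sample_mean(x_bar, mu, sigma, n):
    """z for a sample mean: (x_bar - mu) divided by the standard error sigma / sqrt(n)."""
    standard_error = sigma / sqrt(n)
    return (x_bar - mu) / standard_error

# Same mean difference (122 vs. 100) and sigma (15), different sample sizes.
z_by_n = {n: round(z_for_sample_mean(122, 100, 15, n), 2) for n in (1, 2, 100)}
print(z_by_n)
```

The numerator never changes; only the standard error shrinks as n grows, which is why the same 22-point difference goes from non-significant (n = 1) to significant (n = 2 and n = 100) at the ±1.96 cutoffs.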
How the z-Test Works
• Large samples reduce the amount of random variance (sampling error)
– More confidence that the sample mean = population mean
• Larger samples improve our ability to detect differences between samples and populations
• For n = 1:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{x - \mu}{\sigma}$$
Testing Hypotheses: When σ Is Unknown • Generally, the population standard deviation, σ, is unknown to us • Occasionally, we will know the population mean, µ, when we don’t know σ • In these situations, the standard normal distribution no longer meets our needs
Testing Hypotheses: When σ Is Unknown • Knowing µ… – We can produce an estimate of σ from s – Using s changes the nature of the test we are conducting, as s is not distributed in the same fashion as σ • Sampling distribution of the sample standard deviation is NOT normally distributed – Strong positive skew
Testing Hypotheses: When σ Is Unknown
[Figure: two panels, "Sampling distribution of s" and "Sampling distribution of σ"]
So How Does s Estimate σ?
• Given the differences in distribution shape, it is easy to conclude that s ≠ σ
– s² is an unbiased estimator of σ² over repeated samplings
– However, because the sampling distribution is positively skewed, a SINGLE value of s is more likely to underestimate σ than to overestimate it
• Because of this fact, small samples will systematically underestimate σ when it is estimated by s
– An s that is too small shrinks the denominator, so a statistic calculated with s tends to be > the comparable value of z
– We cannot use z any longer → t
t and the t-Distribution
• Developed by "Student" (the pen name of William Sealy Gosset) while he was working for the Guinness Brewing Co.
1. The shape of the t-distribution is a direct function of the size of the sample we are examining
2. For small samples, the t-distribution is somewhat flatter than the standard normal distribution, with a lower peak and fatter tails
t and the t-Distribution
3. As sample size increases:
• The t-distribution approaches a normal distribution
• Theoretically, the closer the sample size comes to infinity, the more the t-distribution looks like a normal distribution
• Practically, the two are nearly indistinguishable when n ≈ 100–120
t and the t-Distribution 4. Identifying values of t associated with a given rejection region depends on: – α – the number of tails associated with the test – the degrees of freedom available in the analysis – For this one-sample test, (df = n-1) because we used one degree of freedom calculating s using the sample mean and not the population mean.
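One-Sample t-Test
$$t = \frac{\bar{x} - \mu}{s_{\bar{x}}} \quad \text{or} \quad t = \frac{\bar{x} - \mu}{s_x / \sqrt{n}} \quad \text{or} \quad t = \frac{\bar{x} - \mu}{\sqrt{s_x^2 / n}}$$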
z-Test vs. One-Sample t-Test
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \qquad t = \frac{\bar{x} - \mu}{s_x / \sqrt{n}}$$
Note the similarities between these tests: ONLY the source of “variance” and the distribution you test against have changed!
Using the One-Sample t-Test
• You are on the admissions board for a graduate school of Psychology.
• You are attempting to determine whether the GRE scores of the students applying to your program are competitive with the national average.
– µVerbal = 569
• SPSS output from your data

Descriptive Statistics
                     N    Range    Mean       Std. Deviation
GRE                  24   310.00   659.7917   86.43267
Valid N (listwise)   24
Using the One-Sample t-Test
• Research Hypothesis:
– The GRE scores from your applicants differ from the population norms
• H1: µa ≠ µp or ES > 0
• Null Hypothesis:
– The GRE scores from your applicants do not differ from the population norms
• H0: µa = µp or ES = 0
• Evaluate the students’ GRE-V scores
Using the One-Sample t-Test
• Select:
– Rejection region
• α = .05
– “Tail” or directionality
• We don’t know exactly how the students will score: we just expect them to show scores differing from the population values
• Might predict higher scores…
Using the One-Sample t-Test
• Generate sampling distribution of the mean assuming H0 is true
– One-Sample t-test
• Given our sampling distribution:
– Conduct the statistical test
Using the One-Sample t-Test
µVerbal = 569, x-bar = 659.79, s = 86.43, n = 24
$$t = \frac{\bar{x} - \mu}{s_x / \sqrt{n}} = \frac{659.79 - 569}{86.43 / \sqrt{24}} = \frac{90.79}{86.43 / 4.90} = \frac{90.79}{17.64} = 5.15$$
This numerical value is called tobt: tobt(23) = 5.15
Using the One-Sample t-Test
• SPSS Output

One-Sample Test (Test Value = 569, i.e., µ)
       t       df   Sig. (2-tailed)   Mean Difference   95% CI Lower   95% CI Upper
GRE    5.146   23   .000              90.79167          54.2943        127.2890

tobt(23) = 5.15
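The SPSS row can be approximated from the summary statistics alone. The following is a rough sketch rather than a reproduction of SPSS: the function name `one_sample_t` is hypothetical, and the confidence interval uses the tabled two-tailed critical value 2.069 for df = 23 instead of an exact quantile, so the bounds agree with SPSS only to about two decimals.

```python
from math import sqrt

def one_sample_t(x_bar, mu, s, n):
    """Return (t, df, standard error) for a one-sample t-test from summary statistics."""
    standard_error = s / sqrt(n)
    t = (x_bar - mu) / standard_error
    return t, n - 1, standard_error

# Summary statistics from the Descriptive Statistics table.
x_bar, mu, s, n = 659.7917, 569, 86.43267, 24
t_obt, df, standard_error = one_sample_t(x_bar, mu, s, n)

# 95% confidence interval of the difference, using the tabled critical value
# for a two-tailed test at alpha = .05 with df = 23.
t_crit = 2.069
mean_difference = x_bar - mu
ci_lower = mean_difference - t_crit * standard_error
ci_upper = mean_difference + t_crit * standard_error

print(f"t({df}) = {t_obt:.3f}")
print(f"95% CI of the difference: {ci_lower:.2f} to {ci_upper:.2f}")
```

Because the interval does not contain 0, it leads to the same decision as comparing tobt with tcrit.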
Evaluating Statistical Significance of the t-Test • First note: – α = .05 – Tail or directionality: two-tailed – t-Value = 5.15 – Degrees of freedom (df) • For the One-Sample t-Test, df = n-1 (24-1 = 23) • Estimating s from x-bar (not σ from µ)
Evaluating Statistical Significance of the t-Test
• In the past you…
– Identified a tabled value of tcrit
– Compared tcrit to your tobt value
– If tobt fell into the rejection region identified by tcrit, you rejected H0
– If tobt did not fall into the rejection region identified by tcrit, you failed to reject H0
• SPSS simplifies matters by calculating the exact p for us
tobt(23) = 5.15, p < .05
Exact probability ≈ .00003
Evaluating Statistical Significance of the t-Test
[Figure: t-distribution centered on 0 with two-tailed rejection regions beyond tcrit = ±2.069; tobt = 5.15]
Because tobt falls within the rejection region identified by tcrit we reject H0
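The decision rule behind this comparison can be written out explicitly. This is a minimal sketch, assuming the caller supplies the tabled critical value; the function name `rejects_h0` and the `tail` labels are illustrative, not part of SPSS or the text.

```python
def rejects_h0(t_obt, t_crit, tail="two"):
    """Return True if t_obt falls in the rejection region marked off by t_crit.

    tail: "two" (either direction), "lower" (predicting a negative t),
    or "upper" (predicting a positive t).
    """
    t_crit = abs(t_crit)
    if tail == "two":
        return abs(t_obt) >= t_crit
    if tail == "lower":
        return t_obt <= -t_crit
    if tail == "upper":
        return t_obt >= t_crit
    raise ValueError("tail must be 'two', 'lower', or 'upper'")

# One-sample GRE example: two-tailed, df = 23.
print(rejects_h0(5.15, 2.069, tail="two"))
# A value inside the cutoffs does not reject H0.
print(rejects_h0(1.47, 2.069, tail="two"))
```

A one-tailed test places the whole rejection region on the predicted side, so a t in the wrong direction never rejects H0, no matter how large it is.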
Testing Hypotheses: Two Matched (Repeated) Samples
• Sometimes, we’re interested in how a single set of scores changes over time
– Psychotherapy tx influences depression
– Patients respond to medication
– Consumer attitudes before and after an advertisement
– Changes in citizen attitudes following the State of the Union address
• When we look at two sets of scores collected from a single sample at different time points, we need to use a matched samples test
Matched Samples
• Matched samples
– Use the same participants at two or more different time points to collect similar data
• MUST BE THE SAME SAMPLE!
Time 1: BDI-II → Wait 30 Days → Time 2: BDI-II
Matched Samples Test
• With a matched samples test, you are testing the change in scores between the two administrations of the test
– H0: µ1 = µ2
– H0: µ1 - µ2 = 0 or ES = 0
• This is truly the null hypothesis for the matched samples test
Matched Samples Test • Essentially, the group means at each time point mean little to us – Change in scores is the key – Conduct this test by obtaining the average difference score between the two time points
Matched Samples Test
$$t = \frac{\bar{D} - 0}{s_D / \sqrt{n}}$$
• D-bar represents the average difference score between time points
• sD is the standard deviation of the difference scores
• The "- 0" may seem redundant, but isn’t!
Calculating the Matched Samples t-Test • You are a researcher examining the impact of a new therapy intervention on the incidence of self-injurious behavior (SIB) • You collect a measure of the frequency of self-injurious acts when clients enter your treatment (time 1) • You collect a measure of the frequency of self-injurious acts two weeks later (time 2)
Calculating the Matched Samples t-Test
• Research Hypothesis:
– The new treatment will change SIB scores
• H1: µ1 ≠ µ2 or ES > 0
• Null Hypothesis:
– The SIB scores at time 2 will be the same as the scores at time 1 (no change)
• H0: µ1 = µ2
• H0: µ1 - µ2 = 0 or ES = 0
• Evaluate SIB at time 1 & time 2
Calculating the Matched Samples t-Test
• Select:
– Rejection region
• α = .05
– “Tail” or directionality
• We don’t know exactly how the treatment will work, so we’d better use a two-tailed test
Calculating the Matched Samples t-Test
• Generate sampling distribution of the mean assuming H0 is true
– Matched Samples t-test
• Given our sampling distribution:
– Conduct the statistical test
Calculating the Matched Samples t-Test

Time 1   Time 2   D    D²
13       8        5    25
14       10       4    16
8        4        4    16
10       7        3    9
11       10       1    1
13       9        4    16
15       11       4    16
16       9        7    49
19       17       2    4
10       6        4    16
7        2        5    25

∑D = 43    D-bar = 3.91    ∑D² = 193    (∑D)² = 1849

Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
time1                11   7.00      19.00     12.3636   3.58532
time2                11   2.00      17.00     8.4545    3.93354
Valid N (listwise)   11
Calculating the Matched Samples t-Test
$$s_D^2 = \frac{\sum D^2 - \frac{(\sum D)^2}{n}}{n - 1} = \frac{193 - \frac{43^2}{11}}{11 - 1} = \frac{193 - \frac{1849}{11}}{10} = \frac{193 - 168.09}{10} = \frac{24.91}{10} = 2.49$$
$$s_D = \sqrt{2.49} = 1.58$$
Calculating the Matched Samples t-Test
$$t = \frac{\bar{D} - 0}{s_D / \sqrt{n}} = \frac{3.91 - 0}{1.58 / \sqrt{11}} = \frac{3.91}{1.58 / 3.32} = \frac{3.91}{.48} = 8.15$$
tobt = 8.15
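The hand calculation can be repeated from the raw Time 1 and Time 2 scores. The sketch below uses only the Python standard library and carries full precision through every step, so it lands on about 8.215 (the value SPSS reports) rather than the 8.15 obtained by rounding the standard error to .48; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# SIB frequencies for the same 11 clients at intake and two weeks later.
time1 = [13, 14, 8, 10, 11, 13, 15, 16, 19, 10, 7]
time2 = [8, 10, 4, 7, 10, 9, 11, 9, 17, 6, 2]

# The matched samples test works on the difference scores only.
diffs = [before - after for before, after in zip(time1, time2)]

n = len(diffs)                   # number of PAIRS of scores
d_bar = mean(diffs)              # average difference score
s_d = stdev(diffs)               # sample SD of the differences (n - 1 denominator)
standard_error = s_d / sqrt(n)
t_obt = (d_bar - 0) / standard_error
df = n - 1

print(f"sum D = {sum(diffs)}, sum D^2 = {sum(d * d for d in diffs)}")
print(f"D-bar = {d_bar:.5f}, s_D = {s_d:.5f}, SE = {standard_error:.5f}")
print(f"t({df}) = {t_obt:.3f}")
```

The gap between 8.15 and 8.215 is purely a rounding effect; the decision to reject H0 is the same either way.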
Evaluating Statistical Significance of the t-Test • First note: – α = .05 – Tail or directionality: two-tailed – t-Value = 8.15 – Degrees of freedom (df) • For the Matched Samples t-Test: – df = number of PAIRS of scores -1 – df = 11 - 1 = 10
– Again, we can calculate p exactly with SPSS
Calculating the Matched Samples t-Test
• SPSS Output

Paired Samples Correlations
                         N    Correlation   Sig.
Pair 1   time1 & time2   11   .916          .000

Paired Samples Test (Paired Differences), Pair 1: time1 - time2
Mean      Std. Deviation   Std. Error Mean   95% CI Lower   95% CI Upper   t       df   Sig. (2-tailed)
3.90909   1.57826          .47586            2.84880        4.96938        8.215   10   .000
tobt(10) = 8.15, p < .05 (p ≈ .00001)
Evaluating Statistical Significance of the t-Test
[Figure: t-distribution centered on 0 with two-tailed rejection regions beyond tcrit = ±2.228; tobt = 8.15]
Because tobt falls within the rejection region identified by tcrit we reject H0
Testing Hypotheses: Two Independent Samples • Probably the most common use of the t-Test and the t-distribution • Compare the mean scores of two groups on a single variable – IV: Groups – DV: Variable of interest
• Groups must be independent of one another – Scores in 1 group cannot influence scores in the other group
Independent Samples t-Test
$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{x}_1 - \bar{x}_2}} \quad \text{or} \quad t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
This test is calculated by dividing the mean difference between two groups by the “dispersion” or “variation” observed between the two groups
Independent Samples t-Test: Degrees of Freedom • 1 df is lost for each σ estimated by s using x-bar • Since there are two independent groups in this analysis, we must estimate σ twice • df = (n1 + n2) - 2
Independent Samples t-Test: Example • Let’s return to the example used for the matched samples test • As a competent researcher, you realize that simply showing a change over time is not enough to prove the efficacy of your treatment – People spontaneously change over time
• Show that an untreated control group does not change over the same period of time that your treatment group does change
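Independent Samples t-Test: Example
[Design diagram]
Tx Group: Time 1 SIB Scores → Tx → Time 2 SIB Scores
Ctrl Group: Time 1 SIB Scores → (no Tx) → Time 2 SIB Scores → Tx → Time 3 SIB Scores
At Time 1 the groups' SIB scores are equal (=); at Time 2, do they differ (?)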
Independent Samples t-Test: Example • At time 1, the control and treatment SIB groups have equal SIB scores • Administer the treatment for 2 weeks to Tx group – The Control group receives no intervention during these two weeks
• Compare SIB scores of Tx and Control group after 2 weeks • Provide Control group w/ intervention if desired
Independent Samples t-Test: Example
• Research Hypothesis:
– Your treatment for SIB will reduce SIB scores in the Tx group after 2 weeks
• H1: µt < µc
• Null Hypothesis:
– Your treatment for SIB will have no effect
• H0: µt = µc
• Evaluate the efficacy of your treatment
Independent Samples t-Test: Example
Time 2 Data

Control   Tx
12        8
13        10
10        4
9         7
11        10
8         9
16        11
13        9
15        17
16        6
12        2

          Ctrl Group   Tx Group
∑x        135          93
∑x²       1729         941
(∑x)²     18225        8649
x-bar     12.27        8.45
s²        7.22         15.47
s         2.69         3.93
n         11           11

Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
ctrl                 11   8.00      16.00     12.2727   2.68667
tx                   11   2.00      17.00     8.4545    3.93354
Valid N (listwise)   11
Independent Samples t-Test: Example
• Select:
– Rejection region
• α = .05
– “Tail” or directionality
• We have evidence that the treatment probably works, so we make a one-tailed hypothesis here (scores for the Tx group will be lower than the Control group at time 2)
Independent Samples t-Test: Example
• Generate sampling distribution of the mean assuming H0 is true
– Independent Samples t-Test
• Given our sampling distribution:
– Conduct the statistical test
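Independent Samples t-Test: Example
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{8.45 - 12.27}{\sqrt{\frac{15.47}{11} + \frac{7.22}{11}}} = \frac{-3.82}{\sqrt{1.41 + .66}} = \frac{-3.82}{\sqrt{2.07}} = \frac{-3.82}{1.44} = -2.65$$
tobt(20) = -2.65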
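The same result can be obtained directly from the Time 2 scores of the two groups. This is a sketch with an illustrative function name (`independent_t`), using the separate-variances form of the denominator shown in the formula; with full precision the control variance comes out near 7.22 and t near -2.658, matching the SPSS magnitude.

```python
from math import sqrt
from statistics import mean, variance

# Time 2 SIB scores for the two independent groups.
control = [12, 13, 10, 9, 11, 8, 16, 13, 15, 16, 12]
treatment = [8, 10, 4, 7, 10, 9, 11, 9, 17, 6, 2]

def independent_t(group1, group2):
    """t = (mean1 - mean2) / sqrt(s1^2/n1 + s2^2/n2), with df = n1 + n2 - 2."""
    n1, n2 = len(group1), len(group2)
    standard_error = sqrt(variance(group1) / n1 + variance(group2) / n2)
    t = (mean(group1) - mean(group2)) / standard_error
    return t, n1 + n2 - 2

# Treatment is entered first so that a reduction in SIB gives a negative t,
# matching the one-tailed hypothesis H1: mu_t < mu_c.
t_obt, df = independent_t(treatment, control)

print(f"s^2 control = {variance(control):.2f}, s^2 treatment = {variance(treatment):.2f}")
print(f"t({df}) = {t_obt:.3f}")
```

SPSS prints the same statistic with a positive sign because it subtracts the groups in the opposite order.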
Evaluating Statistical Significance of the t-Test • First note: – α = .05 – Tail or directionality: one-tailed – t-Value = -2.65 – Degrees of freedom (df) • For the Independent Samples t-Test – (n1 + n2) - 2 – (11+11)-2 – 22 - 2 = 20
Evaluating Statistical Significance of the t-Test
• SPSS Output

Independent Samples Test (Self-Injurious Behavior)
Levene's Test for Equality of Variances: F = .518, Sig. = .480
t-test for Equality of Means:
                              t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference
Equal variances assumed       2.658   20       .015              3.81818           1.43625
Equal variances not assumed   2.658   17.663   .016              3.81818           1.43625

tobt(20) = -2.65, p < .05 (two-tailed p ≈ .015)
Evaluating Statistical Significance of the t-Test
[Figure: t-distribution centered on 0 with a one-tailed rejection region below tcrit = -1.725; tobt = -2.65]
Because tobt falls within the rejection region identified by tcrit we reject H0
Independent Samples t-Test: One Complication • There is a slight problem with the form of the equation we used… – ONLY can be applied to groups with equal sample sizes – A major limitation in real-world research
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
Pooled Variance Estimate • This equation permits tests with different sample sizes • Generates an estimate of the total variance between groups weighted by the size of each group – Therefore, larger samples have a greater impact on the variance – Vice-versa for small samples
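Pooled Variance Estimate
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
Using the Pooled Variance Estimate
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \;\rightarrow\; t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}} \;\rightarrow\; t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$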
Using the Pooled Variance Estimate: Example
Time 2 Data
Control (n = 6): 11, 16, 13, 15, 16, 12 (remaining control cases: No Data)
Tx (n = 11): 8, 10, 4, 7, 10, 9, 11, 9, 17, 6, 2

          Ctrl Group   Tx Group
∑x        83           93
∑x²       1171         941
(∑x)²     6889         8649
x-bar     13.83        8.45
s²        4.57         15.47
s         2.14         3.93
n         6            11

Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
ctrl                 6    11.00     16.00     13.8333   2.13698
tx                   11   2.00      17.00     8.4545    3.93354
Valid N (listwise)   6
Using the Pooled Variance Estimate: Example
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(11 - 1)15.47 + (6 - 1)4.57}{11 + 6 - 2} = \frac{(10)15.47 + (5)4.57}{15} = \frac{154.7 + 22.85}{15} = \frac{177.55}{15} = 11.84$$
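Using the Pooled Variance Estimate: Example
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{8.45 - 13.83}{\sqrt{11.84\left(\frac{1}{11} + \frac{1}{6}\right)}} = \frac{-5.38}{\sqrt{11.84(.0909 + .1667)}} = \frac{-5.38}{\sqrt{11.84(.2576)}} = \frac{-5.38}{\sqrt{3.05}} = \frac{-5.38}{1.75} = -3.07$$
tobt(15) = -3.07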
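The pooled-variance calculation can also be run from the raw scores. This is a minimal sketch with hypothetical helper names (`pooled_variance`, `pooled_t`); because it does not round intermediate values, it gives t ≈ -3.080, the magnitude SPSS reports in the "Equal variances assumed" row, rather than the hand-rounded -3.07.

```python
from math import sqrt
from statistics import mean, variance

# Time 2 SIB scores; only 6 control cases have data in this example.
treatment = [8, 10, 4, 7, 10, 9, 11, 9, 17, 6, 2]
control = [11, 16, 13, 15, 16, 12]

def pooled_variance(group1, group2):
    """Average of the two sample variances, weighted by each group's df."""
    n1, n2 = len(group1), len(group2)
    numerator = (n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)
    return numerator / (n1 + n2 - 2)

def pooled_t(group1, group2):
    """Independent samples t using the pooled variance estimate."""
    n1, n2 = len(group1), len(group2)
    sp2 = pooled_variance(group1, group2)
    standard_error = sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (mean(group1) - mean(group2)) / standard_error
    return t, n1 + n2 - 2, standard_error

sp2 = pooled_variance(treatment, control)
t_obt, df, standard_error = pooled_t(treatment, control)

print(f"pooled variance = {sp2:.2f}")
print(f"standard error = {standard_error:.5f}")
print(f"t({df}) = {t_obt:.3f}")
```

Weighting by n - 1 is what lets the larger treatment group contribute more to the variance estimate than the six-person control group.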
Evaluating Statistical Significance of the t-Test • First note: – α = .05 – Tail or directionality: one-tailed – t-Value = -3.07 – Degrees of freedom (df) • For the Independent Samples t-Test – (n1 + n2) - 2 – (11+6)-2 – 17 - 2 = 15
Evaluating Statistical Significance of the t-Test
• SPSS Output

Independent Samples Test (Self-Injurious Behavior)
Levene's Test for Equality of Variances: F = .714, Sig. = .411
t-test for Equality of Means:
                              t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference
Equal variances assumed       3.080   15       .008              5.37879           1.74614
Equal variances not assumed   3.653   14.979   .002              5.37879           1.47232

tobt(15) = -3.07, p < .05 (two-tailed p ≈ .0076)
Evaluating Statistical Significance of the t-Test
[Figure: t-distribution centered on 0 with a one-tailed rejection region below tcrit = -1.753; tobt = -3.07]
Because tobt falls within the rejection region identified by tcrit we reject H0
Effect Size of The Independent Samples t-Test
$$d = \frac{\mu_1 - \mu_2}{\sigma} \quad \text{or} \quad d = \frac{\bar{X}_1 - \bar{X}_2}{s_p}$$
We use the same effect size conventions we identified for the Matched Samples test
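Effect Size of The Independent Samples t-Test
$$d = \frac{\bar{X}_1 - \bar{X}_2}{s_p} = \frac{8.45 - 13.83}{\sqrt{11.84}} = \frac{-5.38}{3.44} = -1.56$$
Here $s_p$ is the pooled standard deviation, i.e., the square root of the pooled variance $s_p^2 = 11.84$. The result is an effect size well beyond the convention for a large effect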
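The effect size formula divides the mean difference by the pooled standard deviation, which is the square root of the pooled variance (11.84 in this example), not the pooled variance itself. The sketch below computes d from the raw scores under that reading of the formula; the function name `cohens_d` is illustrative.

```python
from math import sqrt
from statistics import mean, variance

treatment = [8, 10, 4, 7, 10, 9, 11, 9, 17, 6, 2]
control = [11, 16, 13, 15, 16, 12]

def cohens_d(group1, group2):
    """d = (mean1 - mean2) / s_p, where s_p is the pooled STANDARD DEVIATION."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    pooled_sd = sqrt(pooled_var)  # s_p = sqrt(s_p^2)
    return (mean(group1) - mean(group2)) / pooled_sd

d = cohens_d(treatment, control)
print(f"d = {d:.2f}")
```

Dividing by 11.84 instead of its square root (about 3.44) would shrink d to roughly -.45 and make a large effect look medium.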
t-test Assumptions • Although the t-test is generally a robust test, it can be affected by violations of underlying test assumptions – Normality – sampling distribution is normally distributed – Sample size – samples for each group should be of roughly equal size – Homogeneity of variance – σ1 = σ2
t-test Assumptions
• One sample t-test
– Normality: ✓
– Sample size: ✗
– Homogeneity of variance: ✗
• Matched & Independent samples t-test(s)
– Normality: ✓
– Sample size: ✓
– Homogeneity of variance: ✓
Impact of Violated Assumptions • For equal sample sizes… – …violating homogeneity of variance… • Minimal impact (α = .05 ± .02)
– …with minor normality violations… • Similar results as above
– …with major normality violations… • Severe skew (particularly in opposite directions) can lead to significant problems unless variances are fairly equal
Impact of Violated Assumptions • Unequal sample sizes… – Much more difficult to interpret – Unequal sample sizes + heterogeneity of variance = distortions in p • Possibly increased risk of Type I error • Risk of error increases as more assumptions are violated
Coping with Violated Assumptions
• What can we do to prevent or cope with violated assumptions?
1. Maintain equal sample sizes
2. Use trimmed samples…
3. Use a distribution-free (i.e., non-parametric) test
4. Apply a statistical correction to t
Coping with Violated Assumptions
• SPSS Output: the Independent Samples Test table for the unequal-n example reports Levene's Test for Equality of Variances (F = .714, Sig. = .411) plus two rows for the t-test
– Equal variances assumed: t = 3.080, df = 15, Sig. (2-tailed) = .008, Std. Error Difference = 1.74614
– Equal variances not assumed: t = 3.653, df = 14.979, Sig. (2-tailed) = .002, Std. Error Difference = 1.47232
• If the p for Levene's F is < .05, use the “Equal variances not assumed” row; here p = .411, so the “Equal variances assumed” row applies
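The “Equal variances not assumed” row is a statistical correction to t: the standard error is built from the two separate variances and the degrees of freedom are adjusted downward (the Welch-Satterthwaite approximation). The sketch below is an illustration of that correction using the unequal-n example data, with an illustrative function name (`welch_t`); it is not a reproduction of SPSS internals.

```python
from math import sqrt
from statistics import mean, variance

treatment = [8, 10, 4, 7, 10, 9, 11, 9, 17, 6, 2]
control = [11, 16, 13, 15, 16, 12]

def welch_t(group1, group2):
    """Separate-variances t with Welch-Satterthwaite adjusted degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    v1 = variance(group1) / n1  # squared standard error contributed by group 1
    v2 = variance(group2) / n2  # squared standard error contributed by group 2
    standard_error = sqrt(v1 + v2)
    t = (mean(group1) - mean(group2)) / standard_error
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, standard_error

t_welch, df_welch, se_welch = welch_t(treatment, control)
print(f"t = {t_welch:.3f}, df = {df_welch:.3f}, SE = {se_welch:.5f}")
```

The adjusted df is generally not a whole number, which is why SPSS shows 14.979 instead of 15 in that row.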
Statistical Tests We Have Learned
1. z-Test
• 1 group
• 1 set of data
• µ & σ known
2. One-Sample t-Test
• 1 group
• 1 set of data
• µ known
• Estimate σ with s using x-bar
3. Matched Samples t-Test
• 1 group
• 2 sets of data
• µ & σ unknown
• Estimate σD with sD using D-bar
4. Independent Samples t-Test
• 2 groups
• 2 sets of data
• µ & σ unknown
• Estimate σ twice with s using x-bar
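The summary above amounts to a small decision rule based on the number of groups, the number of data sets, and whether σ is known. One way to sketch it is shown below; the function name `choose_test` and its argument names are made up for illustration, and the rule covers only the four tests listed here.

```python
def choose_test(groups, data_sets, sigma_known=False):
    """Pick one of the four tests from the number of groups and data sets."""
    if groups == 1 and data_sets == 1:
        # mu is known in both cases; the choice depends on sigma.
        return "z-test" if sigma_known else "one-sample t-test"
    if groups == 1 and data_sets == 2:
        return "matched samples t-test"
    if groups == 2 and data_sets == 2:
        return "independent samples t-test"
    raise ValueError("combination not covered by the four tests in this summary")

print(choose_test(groups=1, data_sets=1, sigma_known=True))
print(choose_test(groups=1, data_sets=1))
print(choose_test(groups=1, data_sets=2))
print(choose_test(groups=2, data_sets=2))
```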
Choosing the Best Test • Flow-chart available on the website: – http://www.personal.kent.edu/~marmey
• Also refer to the diagram on p. 11 of your Howell text • Try the review problems on the website for an example of the types of questions I might ask on an exam!