BIOM5010: Statistics Slides 2G.1
BIOM5010: Statistics #2G: Confidence Intervals, Statistical Testing, Statistical Power
(C) 2011 – 2014. Andy Adler.
Confidence Intervals
• Text: Estimation IX(E)
[Figure: a population and its population mean; repeated samples, each with a sample mean and a confidence interval]
• Typical intervals are 95% or 99%
• If repeated samples were taken and the 95% confidence interval computed for each sample, 95% of the intervals would contain the population mean
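The repeated-sampling statement above can be checked with a small simulation. This is a minimal sketch, not part of the original slides: the population values (mean 10, std 5), the sample size and the function name `coverage` are illustrative assumptions, and it uses the known-std (normal) interval with z = 1.96 to stay within the standard library.

```python
import math
import random

def coverage(n_trials=2000, n=30, mu=10.0, sigma=5.0, z=1.96, seed=0):
    """Fraction of simulated 95% intervals that contain the true mean mu."""
    rng = random.Random(seed)
    # Known-sigma (normal) interval: half width is z * sigma / sqrt(n)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(n_trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        m = sum(sample) / n  # sample mean
        if m - half_width <= mu <= m + half_width:
            hits += 1
    return hits / n_trials

print(coverage())  # should land close to 0.95
```

The printed fraction varies slightly with the seed and the number of trials, but it should stay near 0.95.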
What's wrong here?
[Figure: comic. Source: phdcomics.com]
Confidence Interval Calculation
• t distribution:
  – you estimate mean and std from the data (normal case)
• Normal distribution:
  – you know std; estimate mean from data (unusual case)
• Using t distribution
  – Calc: M (sample mean)
  – Calc: s (sample std)
  – DF: Degrees of Freedom, DF = N – 1
  – SE: Standard error, sM = s/√N
  – Find tCL in t distribution
[Figure: Student's t distribution (Wikipedia)]
Confidence Interval Calculation
– Prev: Calculate: M, s, DF, sM
– Find tCL in t distribution
– Low Lim = M – tCL×sM
– Upp Lim = M + tCL×sM

DF     95%     99%
2      4.303   9.925
3      3.182   5.841
100    1.984   2.626

• Example (95% conf)
  – N=100, M=10, s=5
  – DF = 99
  – sM = 5/sqrt(100) = 0.5
  – tCL = 1.984 (table value for DF = 100)
  – CI: Low lim = 10 – tCL×0.5
  – CI: Upp lim = 10 + tCL×0.5
• CI = [9.008 ... 10.992]
[Figure: Student's t distribution with the central 95% of area marked (Wikipedia)]
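The worked example above can be reproduced in a few lines. This is a minimal sketch: the function name `t_confidence_interval` is an illustrative choice, and tCL is passed in from the table rather than computed.

```python
import math

def t_confidence_interval(mean, s, n, t_cl):
    """Return (low, upp) limits: mean -/+ t_cl * s / sqrt(n)."""
    s_m = s / math.sqrt(n)  # standard error of the mean
    return mean - t_cl * s_m, mean + t_cl * s_m

# Values from the slide: N=100, M=10, s=5, tCL=1.984
low, upp = t_confidence_interval(mean=10.0, s=5.0, n=100, t_cl=1.984)
print(f"CI = [{low:.3f} ... {upp:.3f}]")  # CI = [9.008 ... 10.992]
```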
t Tests
• Example: students asked to rate (on 1 ... 7) whether “you think animal research is wrong?”
• Question: Is there a difference between population Mmale and Mfemale in answer?

Group     n     M      s
Female    17    5.35   1.67
Male      17    3.88   1.73

• Assumptions of the t-test
  – The two populations have the same variance
  – The populations are normally distributed
  – Each value is sampled independently of the others
    • If one subject provides two scores, then the scores are not independent.
t Tests
• H0: There is no difference between the groups
• t = (ΔM – MH0)/sM
  – ΔM = difference of means = MF – MM = 1.470
  – MH0 = hypothesized value = 0
  – sM = estimated standard error
• sM = sqrt(σ1²/n1 + σ2²/n2) = sqrt(1.67²/17 + 1.73²/17) = .581
• t = 1.47/.581 = 2.53
• DF = (n1 – 1) + (n2 – 1) = 32
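The t statistic above can be computed directly from the group summaries. This is a minimal sketch with an illustrative function name, `two_sample_t`. With the rounded table values it gives sM of about 0.583 and t of about 2.52, slightly different from the slide's .581 and 2.53, presumably because the slide's figures come from less heavily rounded inputs.

```python
import math

def two_sample_t(m1, s1, n1, m2, s2, n2, m_h0=0.0):
    """t statistic, standard error and DF for a difference of two means."""
    s_m = math.sqrt(s1**2 / n1 + s2**2 / n2)  # estimated standard error
    t = ((m1 - m2) - m_h0) / s_m
    df = (n1 - 1) + (n2 - 1)
    return t, s_m, df

# Female: n=17, M=5.35, s=1.67; Male: n=17, M=3.88, s=1.73
t, s_m, df = two_sample_t(5.35, 1.67, 17, 3.88, 1.73, 17)
print(f"sM = {s_m:.3f}, t = {t:.2f}, DF = {df}")
```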
One- vs two-tailed
What was the research question?
Two tailed:
  – Q: Is there a difference between population means male/female in answer?
  – H0: There is no difference.
One tailed:
  – Q: Is the female population mean greater than the male population mean?
  – H0: The female population mean is not greater.
Implementation
Slides 2G.9
• Tables can be annoying • Matlab charges $$$ for stats toolbox function p= two_sided(T,DF); xi = 1 / ( 1 + T^2 / DF ); p= 1.0*betainc (xi, DF/2, 0.5); function p= one_sided(T,DF); xi = 1 / ( 1 + T^2 / DF ); p= 0.5*betainc (xi, DF/2, 0.5);
(C) 2011 – 2014. Andy Adler.
DF
95%
99%
2
4.303
9.925
3
3.182
5.841
100
1.984
2.626
>> %Usage two_sided(3.182,3) ans = 0.0500 (=>95%)
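For readers without Matlab or Octave, the same p-values can be approximated using only the Python standard library. This is a sketch under an assumption that differs from the slide: instead of `betainc`, it integrates the Student's t density numerically with Simpson's rule. The names `t_pdf`, `two_sided` and `one_sided` are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided(t, df, steps=2000):
    """Two-tailed p-value: 1 minus the area between -|t| and +|t|."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 1.0 - 2.0 * area_0_to_t

def one_sided(t, df):
    """One-tailed p-value, as on the slide: half the two-tailed value."""
    return 0.5 * two_sided(t, df)

print(f"{two_sided(3.182, 3):.4f}")  # about 0.05, matching the DF=3 row of the table
```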
Example
• Two tailed question:
  – Is there a difference between population means male/female in answer?
  – H0: There is no difference

    >> % two sided
    >> two_sided(2.53, 32)
    ans = 0.0165   (p-value)

• One tailed question:
  – Is population mean for female greater than for male?
  – H0: The population mean for female is not greater.

    >> % one sided
    >> one_sided(2.53, 32)
    ans = 0.0082   (p-value)
Questions
• What are the assumptions of the t test?
• What is the relationship between the confidence interval and the p-value? Can you estimate one from the other?
• Why does the 2-tailed test give different values from the 1-tailed test?
• If an instructor puts a table of t-test values on an exam, are you required to use it in the questions?
Statistical Power
Scenario: You work for ABC pharma, doing stats.
• They have invented a new drug. They think it's great.
• They need evidence it's better than the current drug (XYZ pharma).
• Run a study
  – Randomize patients to get: ABC vs. XYZ.
  – Outcome measure M = measure of health
• If you choose too few subjects,
  – SE will be too large
  – ...
Statistical Power
If you choose too few subjects,
  – SE will be too large
  – t = ΔM / SE will be too small
  – p-value will be too large
  – Study will be non-significant
  – Drug will not get approved
  – You will be fired
• Statistical Power
  – probability of correctly accepting the alternative hypothesis H when it is true
  – ability of a test to detect an effect (if it exists)
  – Formally, the statistical power of a test is
    • p(correctly rejects H0 when H0 is false)
Questions
• How is statistical power different from a t-test result?
• You've made some measurements, what do you apply?
• You have a test plan, what do you apply?
• How about this idea?
  1. Get 10 patients
  2. Do test
  3. Is significant?
    • Yes => Collect bonus for inexpensive test plan
    • No => Add 10 more patients, go to step #1
Statistical Power
• Statistical Power: the probability of correctly rejecting a false null hypothesis. power = 1 – β.
  – β = prob. of failing to reject a false null hypothesis = False Neg Rate = FNR (Type II error)
  – β is not the p value (“prob data occur by chance given H0”); the p value is compared against α, the False Pos Rate = FPR (Type I error)
• It is important to consider power while designing an experiment.
• This will help avoid spending a lot of time and/or money on an experiment that has little chance of finding a significant effect.
Power example
• You think your new drug is 10% better than the competition (i.e. they = 10 and you're at 11). Patient variability gives std of 5. How big of a trial to get power = 95%?
• p = 1 – 0.95 = .05
• Since we assume std, we can use normal distribution, or t distribution with large DF
• t = 1.645
• t = (ΔM – MH0)/sM
• sM = (11 – 10)/1.645 = 0.608
• sM = sqrt(σ1²/n1 + σ2²/n2) = sqrt(2σ²/n) → n = 2σ²/sM² = 2×5²/0.608² = 135.3

One Tailed
DF     95%     99%
3      2.353   4.541
100    1.660   2.364
∞      1.645   2.326
Power example
• You think your new drug is 10% better than the competition (i.e. they = 10 and you're at 11). Patient variability gives std of 5. How big of a trial to get power = 95%?
• sM = (11 – 10)/1.645 = 0.608
• sM = sqrt(σ1²/n1 + σ2²/n2) = sqrt(2σ²/n) → n = 2σ²/sM² = 2×5²/0.608² = 135.3
• Choose 136 (realistically N=200)
  – This is N=136 in each group
  – Recalculate t test at this N
  – Or use online power calculators
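The sample-size arithmetic above can be scripted. The slide's calculation sets the expected t equal to the one-tailed 5% critical value, i.e. n = 2σ²·z²/Δ². A commonly used normal-approximation formula adds a second term for the target power, n = 2σ²(z_α + z_β)²/Δ². The sketch below computes both; the function name `n_per_group` and its `power` parameter are illustrative assumptions, and the second formula is an addition for comparison, not the slide's method.

```python
import math
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, power=None):
    """Subjects per group for a one-tailed, two-group comparison.

    power=None reproduces the slide's calculation (z_beta = 0).
    Otherwise the normal-approximation term for the target power is added.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = 0.0 if power is None else NormalDist().inv_cdf(power)
    return 2 * sigma**2 * (z_alpha + z_beta) ** 2 / delta**2

# Slide values: difference of means = 1, std = 5
print(math.ceil(n_per_group(1.0, 5.0)))              # 136, as on the slide
print(math.ceil(n_per_group(1.0, 5.0, power=0.95)))  # 542 with the power term included
```

The gap between the two results is one reason to recalculate at the chosen N or use a power calculator, as the slide suggests.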
Question: Statistical Power
Example:
– Imagine a researcher wants to claim that Green (G) people are smarter than Purple (P) people.
  • He measures head mass of a sample of volunteers from each group. Assume the true population statistics are μG=4.4kg, σG=0.6kg and μP=4.3kg, σP=0.6kg.
– Can we use the t test here?
– What is the confidence interval on the mean for each group for a sample size N=100 and N=10000?
– What size of study (N) is required to achieve a power of 0.95?
Pairwise comparisons
• Many experiments are designed to compare more than two conditions.
• Example: study Smiles and Leniency.
  – effect of different types of smiles (false, felt, miserable, neutral) on leniency shown to a person
• Obvious way: t test of difference between each group mean and each other group mean.
  – Test 1: false ≠ felt
  – Test 2: false ≠ miserable
  – Test 3: false ≠ neutral
  – ...
  – Number of tests: NT = (NM – 1)×NM/2
    • NM: number of means (groups)
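The count of pairwise tests can be verified by listing the pairs. This is a minimal sketch using the four smile types from the example; the variable names are illustrative.

```python
from itertools import combinations

smiles = ["false", "felt", "miserable", "neutral"]

# Every unordered pair of groups is one t test
pairs = list(combinations(smiles, 2))

n_m = len(smiles)             # NM: number of means (groups)
n_t = (n_m - 1) * n_m // 2    # NT from the formula on the slide

for i, (a, b) in enumerate(pairs, start=1):
    print(f"Test {i}: {a} vs {b}")
print(f"NT = {n_t}")  # NT = 6
```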
Multiple Comparisons
[Figure: comic. Source: xkcd.com/882/]
What to do?
• Bonferroni Correction
  – onlinestatbook.com/chapter10/pairwise_correlated.html
  – Divide the significance level α by the number of comparisons
  – For 20 jelly beans: α/NT = .05/20 = .0025
• ANOVA
  – Analysis of Variance (ANOVA) compares means.
  – Tests for general rather than specific differences
    • H0: All means are equal
• Tukey HSD
  – Honestly Significant Difference test
  – Similar to a t test for each pair of means
  – DF = Nobservations – NM
  – Use “studentized range calculator”
• Comment: understand the issue and the Bonferroni concept. Look up the other tests if you have to.
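The Bonferroni arithmetic, and the reason it is needed, can be sketched as follows. The function names are illustrative, and the family-wise error calculation assumes the tests are independent, which the slides do not state.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests

def familywise_error(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests

# 20 jelly bean colours, each tested at alpha = .05
print(f"{bonferroni_alpha(0.05, 20):.4f}")                        # 0.0025
print(f"{familywise_error(0.05, 20):.3f}")                        # about 0.64 without correction
print(f"{familywise_error(bonferroni_alpha(0.05, 20), 20):.3f}")  # just under 0.05 with correction
```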
Questions
• What is the multiple comparisons problem?
  – What can happen if we forget to correct for it?
• In the “Smiles and Leniency” study, what would the Bonferroni correction calculate (significance = .05)?
  – onlinestatbook.com/2/case_studies/leniency.html
• How is the ANOVA test different from the Tukey HSD test? (Difference in H0)
• Philosophical problem: we look at events until we see one that's unusual. Then we do “science” and see whether the effect is significant. What is NM?