Introduction to Biostatistics, Harvard Extension School
Sample Size and Power
© Scott Evans, Ph.D.
Sample Size Considerations
A pharmaceutical company calls and says, “We believe we have found a cure for the common cold. How many patients do I need to study to get our product approved by the FDA?”
Where to begin?
N = (Total Budget / Cost per patient)? Hopefully not!
Where to begin?
Understand the research question
  Learn about the application and the problem.
  Learn about the disease and the medicine.
Crystal Ball
  Visualize the final analysis and the statistical methods to be used.
Where to begin?
Analysis determines sample size.
Sample size calculations are based upon the planned method of analysis.
If you don’t know how the data will be analyzed (e.g., 2-sample t-test), then you cannot accurately estimate the sample size.
Sample Size Calculation
Formulate a PRIMARY research question.
Identify:
1. A hypothesis to test (write down H0 and HA), or
2. A quantity to estimate (e.g., using confidence intervals)
Sample Size Calculation
Determine the endpoint or outcome measure associated with the hypothesis test or quantity to be estimated.
How do we “measure” or “quantify” the responses?
Is the measure continuous, binary, or a time-to-event?
Sample Size Calculation
Based upon the PRIMARY outcome
Other analyses (i.e., secondary outcomes) may be planned, but the study may not be powered to detect effects for these outcomes.
Sample Size Calculation
Two strategies:
  Hypothesis Testing
  Estimation with Precision
Sample Size Calculation Using Hypothesis Testing
By far, the most common approach.
The idea is to choose a sample size such that both of the following conditions simultaneously hold:
If the null hypothesis is true, then the probability of incorrectly rejecting it is (no more than) α.
If the alternative hypothesis is true, then the probability of correctly rejecting the null hypothesis is (at least) 1-β = power.
                    Reality
Test Result         Ho True             Ho False
Reject Ho           Type I error (α)    Power (1-β)
Do not reject Ho    1-α                 Type II error (β)
Determinants of Sample Size: Hypothesis Testing Approach
α
β
An “effect size” to detect
Estimates of variability
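As an illustration of how these four inputs combine, the usual normal-approximation formula for a two-sided comparison of two means with equal group sizes is sketched below. The symbols δ (the effect size to detect), σ (the common standard deviation), and n (the per-group sample size) are notation introduced here for illustration and do not come from the slides. Software based on the t distribution may return a slightly larger answer.

```latex
\[
n \;=\; \frac{2\,\sigma^{2}\,\bigl(z_{1-\alpha/2} + z_{1-\beta}\bigr)^{2}}{\delta^{2}}
\]
```

Read this way, a smaller α, a smaller β (higher power), a smaller δ, or a larger σ each increase the required n.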
What is Needed to Determine the Sample-Size?
α
  Up to the investigator or FDA regulation (often = 0.05)
  How much type I (false positive) error can you afford?
What is Needed to Determine the Sample-Size?
1-β (power)
  Up to the investigator (often 80%-90%)
  How much type II (false negative) error can you afford?
  Not regulated by FDA
Choosing α and β
Weigh the cost of a Type I error versus a Type II error.
In early phase clinical trials, we often do not want to “miss” a significant result and thus often consider designing a study for higher power (perhaps 90%) and may consider relaxing the α error (perhaps 0.10).
In order to approve a new drug, the FDA requires significance in two Phase III trials strictly designed with α error no greater than 0.05 (Power = 1-β is often set to 80%).
Effect Size
The “minimum difference (between groups) that is clinically relevant or meaningful”.
  Not readily apparent
  Requires clinical input
  Often difficult to agree upon
Note: for noninferiority studies, we identify the “maximum irrelevant or non-meaningful difference”.
Estimates of Variability
Often obtained from prior studies
Explore the literature and data from ongoing studies for estimates needed in calculations
Consider conducting a pilot study to estimate this
May need to validate this estimate later
Other Considerations
1-sample vs. 2-sample
Independent samples or paired
1-sided vs. 2-sided
Example: Cluster Headaches
An experimental drug is being compared with placebo for the treatment of cluster headaches.
The design of the study is to randomize an equal number of participants to the new drug and placebo.
The participants will be administered the drug or matching placebo. One hour later, the participants will score their pain using the visual analog scale (VAS) for pain.
A continuous measure ranging from 0 (no pain) to 10 (severe pain).
Example: Cluster Headaches
The planned analysis is a 2-sample t-test (independent groups) comparing the mean VAS score between groups, one hour after drug (or placebo) initiation.
H0: μ1=μ2 vs. HA: μ1≠μ2
Example: Cluster Headaches
It is desirable to detect differences as small as 2 units (on the VAS).
Using α=0.05 and power 1-β=0.80 (i.e., β=0.20), and an assumed standard deviation (SD) of responses of 4 (in both groups), 63 participants per group (126 total) are required.
STATA Command: sampsi 0 2, sd(4) a(0.05) p(.80)
Note: only the difference of 2 between the first two numbers (the two group means) matters, not their individual values.
http://newton.stat.ubc.ca/~rollin/stats/ssize/n2.html
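The figure of 63 per group can be checked by hand with the standard normal-approximation formula for two independent means. The sketch below is an illustration only: the function name n_per_group_means and its parameter names are hypothetical, and the formula assumes equal group sizes, a common SD, and a two-sided test. It is not necessarily the exact computation performed by any particular software package.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_means(delta, sd, alpha=0.05, power=0.80):
    """Per-group n for a two-sided, two-sample comparison of means
    (equal group sizes, common SD, normal approximation)."""
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # about 1.96 when alpha = 0.05
    z_beta = z.inv_cdf(power)             # about 0.84 when power = 0.80
    n = 2 * (sd * (z_alpha + z_beta) / delta) ** 2
    return ceil(n)                        # round up to whole participants


n = n_per_group_means(delta=2, sd=4)
print(n, "per group,", 2 * n, "total")    # 63 per group, 126 total
```

Because only delta enters the formula, the means 0 and 2 could be replaced by any pair that differs by 2.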
Example: Part 2
Let’s say that instead of measuring pain on a continuous scale using the VAS, we simply measured “response” (i.e., the headache is gone) vs. non-response.
Example: Part 2
The planned analysis is a 2-sample test (independent groups) comparing the proportion of responders, one hour after drug (or placebo) initiation.
H0: p1=p2 vs. HA: p1≠p2
Example: Part 2
It is desirable to detect a difference between response rates of 25% and 50%.
Using α=0.05 and power 1-β=0.80 (i.e., β=0.20):
STATA Command: sampsi 0.25 0.50, a(0.05) p(.80)
66 per group (132 total) w/ continuity correction
http://newton.stat.ubc.ca/~rollin/stats/ssize/b2.html
58 per group (116 total) without continuity correction
Notes for Testing Proportions
One does not need to specify a variability since it is determined from the proportion.
The required sample size for detecting a difference between 0.25 and 0.50 is different from the required sample size for detecting a difference between 0.70 and 0.95 (even though both are 0.25 differences) because the variability is different.
This is not the case for means.
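To make this concrete, the sketch below applies the common normal-approximation formula (without continuity correction) to both pairs of proportions. The function name n_uncorrected is hypothetical, and α=0.05 with 80% power is assumed for both pairs, as in the cluster headache example.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_uncorrected(p1, p2, alpha=0.05, power=0.80):
    """Per-group n for a two-sided test of two proportions
    (equal group sizes, normal approximation, no continuity correction)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    p_bar = (p1 + p2) / 2                        # pooled proportion under H0
    null_sd = sqrt(2 * p_bar * (1 - p_bar))      # variability under H0
    alt_sd = sqrt(p1 * (1 - p1) + p2 * (1 - p2)) # variability under HA
    return ceil(((z_alpha * null_sd + z_beta * alt_sd) / (p2 - p1)) ** 2)


for p1, p2 in [(0.25, 0.50), (0.70, 0.95)]:
    # p(1-p) is largest near 0.5, so the first pair is the more variable one
    print(p1, "vs", p2, "->", n_uncorrected(p1, p2), "per group")
```

Under these assumptions the first pair needs 58 per group and the second only 36, even though both differences are 0.25.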
Caution for Testing Proportions
Some software computes the sample size for testing the null hypothesis of the equality of two proportions using a “continuity correction” while others calculate sample size without this correction.
Answers will differ slightly, although either method is acceptable.
STATA uses a continuity correction.
The website does not.
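The size of the discrepancy can be seen in a short sketch. The correction used here is the widely cited Fleiss-type adjustment applied to the uncorrected normal-approximation sample size; that it matches the internals of any given package is an assumption, and the function name n_per_group_props is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group_props(p1, p2, alpha=0.05, power=0.80, continuity=True):
    """Per-group n for a two-sided test of two proportions,
    with or without a Fleiss-type continuity correction."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    delta = abs(p2 - p1)
    p_bar = (p1 + p2) / 2
    # Uncorrected sample size (kept unrounded until the end)
    n = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / delta ** 2
    if continuity:
        # Inflate n to account for approximating a discrete distribution
        n = (n / 4) * (1 + sqrt(1 + 4 / (n * delta))) ** 2
    return ceil(n)


print(n_per_group_props(0.25, 0.50, continuity=True))    # 66
print(n_per_group_props(0.25, 0.50, continuity=False))   # 58
```

For response rates of 25% and 50% with α=0.05 and 80% power, the two approaches give 66 and 58 per group respectively.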
Sample Size Calculation Using Estimation with Precision
Not nearly as common, but equally valid.
The idea is to estimate a parameter with enough “precision” to be meaningful.
E.g., the width of a confidence interval is narrow enough
Determinants of Sample Size: Estimation Approach
α
Estimates of variability
Precision
  E.g., the (maximum) desired width of a confidence interval
Example: Evaluating a Diagnostic Examination
It is desirable to estimate the sensitivity of an examination by trained site nurses relative to an oral medicine specialist for the diagnosis of Oral Candidiasis (OC) in HIV-infected people.
Precision: It is desirable to estimate the sensitivity such that the width of a 95% confidence interval is 15%.
Example: Evaluating a Diagnostic Examination
Note: sensitivity is a proportion.
The (large sample) CI for a proportion is:
\[
\left[\, \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\;\; \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \,\right]
\]
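Since the total width of this large-sample interval is twice the margin z·sqrt(p(1−p)/n), a target width can be turned into a sample size by solving for n. The sketch below is one way to do that. It needs a planning value for the proportion, which is not given at this point in the example: the default of 0.5 is an assumption chosen because it maximizes p(1−p) and so gives the most conservative n. The function name n_for_ci_width is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_for_ci_width(width, p_planning=0.5, conf_level=0.95):
    """Smallest n for which the large-sample CI for a proportion
    has total width no greater than `width`."""
    alpha = 1 - conf_level
    z = NormalDist().inv_cdf(1 - alpha / 2)      # about 1.96 for a 95% CI
    # width = 2 * z * sqrt(p(1-p)/n)  =>  n = 4 * z^2 * p(1-p) / width^2
    n = 4 * z ** 2 * p_planning * (1 - p_planning) / width ** 2
    return ceil(n)


print(n_for_ci_width(0.15))                      # 171 under the p = 0.5 assumption
```

A planning value further from 0.5 (for example, an anticipated sensitivity of 0.8) would reduce the required n, so the choice of planning value should be justified from prior data where possible.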
Example: Evaluating a Diagnostic Examination
We wish the width of the CI to be