Interim Analysis in Clinical Trials Tatsuki Koyama, Ph.D. Division of Cancer Biostatistics Department of Biostatistics, Vanderbilt University School of Medicine CRC Research Skills Workshop July 31st, 2009


Clinical Trial
- is a clinical investigation of a drug that is administered or dispensed to human subjects.
- is a prospective study comparing the effects and values of intervention(s) against a control in human subjects.

Phase I trials
  Objective: to determine a safe drug dose
  Design: usually dose escalation/de-escalation
  Subjects: healthy volunteers or patients with disease
Phase II trials
  Objective: to assess preliminary efficacy and further evaluate safety
  Design: often single arm
  Subjects: patients with disease
Phase III trials
  Objective: to compare efficacy of the new treatment with the standard regimen
  Design: usually randomized controlled
  Subjects: patients with disease

Monitoring Trial Progress
Not-so-statistical

Data and Safety Monitoring Board (DSMB)
- usually meets 2-3 times per year (or, e.g., after N = 100)
- is an “independent” group with expertise in various disciplines
  - Should the investigator (study statistician) be included?
  - Should data be unblinded?
- reviews baseline comparability
- reviews design assumptions such as accrual rate, drop-out rate, event rate, patient compliance / adherence, budget, ...
- reviews data quality
- reviews safety and toxicity
- reviews efficacy?


A very secretive example
- A clinical trial was planned with the final sample size of 250 per treatment.
- There is going to be one interim analysis with possible termination (efficacy or futility) at N = 150.
- The study was powered to detect a difference of 37.5.
- The (common) standard deviation was assumed to be 128.

$$\bar{X}_t - \bar{X}_c \sim \text{Normal}\left(\mu_t - \mu_c,\ \frac{2(128)^2}{250}\right).$$

- The first DSMB review was conducted after N = 50 per group. (Semi-sort-of-blinded)
- Conditioned on $\bar{X}_{1t} - \bar{X}_{1c} = \bar{x}_{1t} - \bar{x}_{1c}$,

$$\bar{X}_t - \bar{X}_c = \frac{1}{5}(\bar{x}_{1t} - \bar{x}_{1c}) + \frac{4}{5}(\bar{X}_{2t} - \bar{X}_{2c}) \sim \text{Normal}\left(\frac{1}{5}(\bar{x}_{1t} - \bar{x}_{1c}) + \frac{4}{5}(\mu_t - \mu_c),\ \frac{2(128)^2}{250(4/5)}\left(\frac{4}{5}\right)^2\right).$$

A very secretive example
- The estimates of the group means were 84 and 63. Assume the treatment effect is in the right direction.
- Unconditionally, $\bar{X}_t - \bar{X}_c \sim \text{Normal}(\mu_t - \mu_c,\ 131)$.
- Conditionally, $\bar{X}_t - \bar{X}_c \sim \text{Normal}\left(\frac{21}{5} + \frac{4}{5}(\mu_t - \mu_c),\ 105\right)$.
- The estimates of the within-group standard deviations were 151 and 121.
- The pooled standard deviation was $\sqrt{(151^2 + 121^2)/2} = 137$.
- A 95% confidence interval of the difference is (−17, 59).
- Compute the (conditional) power utilizing the information available.
- Should I use the original estimates (guesses) of the treatment effect and variance, or the ones from the first 50 observations per group, or some combination of these?
  - Let’s use the original estimate of the within-group standard deviation: σ = 128 (s = 137).
  - Avoid using a t distribution.
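The conditional power calculation described above can be sketched in a few lines of Python. This is a minimal sketch, not the calculation behind the slides: it assumes a single two-sided 0.05 z-test at the final analysis, ignores the planned interim look at N = 150, and uses the design value σ = 128. The function and constant names are illustrative. Under these assumptions it gives roughly 0.87 at the design alternative (37.5) and roughly 0.44 at the observed difference (21), close to but not identical with the 0.86 and 0.45 reported on the slides.

```python
from statistics import NormalDist

SIGMA = 128.0                  # assumed common standard deviation (from the slides)
N_FINAL = 250                  # planned final sample size per group
N_STAGE1 = 50                  # per-group sample size at the first DSMB review
OBSERVED_DIFF = 84.0 - 63.0    # stage-1 difference in estimated group means

# ASSUMPTION: one final two-sided 0.05 z-test; the planned interim look at
# N = 150 and the negligible lower rejection region are ignored.
Z_CRIT = NormalDist().inv_cdf(0.975)
SD_FINAL = (2 * SIGMA**2 / N_FINAL) ** 0.5     # sqrt(131)
CRITICAL_DIFF = Z_CRIT * SD_FINAL              # final difference needed to reject


def original_power(true_diff):
    """Unconditional power of the final test at a given true difference."""
    return 1 - NormalDist(true_diff, SD_FINAL).cdf(CRITICAL_DIFF)


def conditional_power(true_diff):
    """Power of the final test given the observed stage-1 difference."""
    w1 = N_STAGE1 / N_FINAL            # weight 1/5 on the observed stage-1 data
    w2 = 1 - w1                        # weight 4/5 on the data still to come
    n2 = N_FINAL - N_STAGE1            # 200 per group still to be observed
    mean = w1 * OBSERVED_DIFF + w2 * true_diff
    sd = w2 * (2 * SIGMA**2 / n2) ** 0.5   # sqrt(105)
    return 1 - NormalDist(mean, sd).cdf(CRITICAL_DIFF)


for diff in (0.0, 10.0, 21.0, 30.0, 37.5):
    print(f"true difference {diff:5.1f}: "
          f"original {original_power(diff):.3f}, "
          f"conditional {conditional_power(diff):.3f}")
```

Swapping `SIGMA` for the pooled estimate (137) is one way to explore the "original guess versus observed data" question raised above.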


Conditional power

[Figure: conditional power and original power (vertical axis, 0.0 to 1.0) plotted against the true difference (horizontal axis, 0.0 to 37.5). Marked values: 0.45 and 0.86 on the vertical axis; 10.0, 21.0, 30.0, and 37.5 on the horizontal axis.]

Conclusions
- At the original (alternative) hypothesis, the power is still good.
- If we assume the observed value is the truth, the power is 45%.
- The confidence interval is wide; we don’t know anything.

Safety monitoring and efficacy monitoring are different.


Group sequential design
- It allows early termination for efficacy and/or futility.
- Boundaries are pre-determined.
- Nothing to decide once the trial begins.



[Figure: O'Brien-Fleming and Pocock boundaries, plotted as normal critical value (vertical axis, −5 to 5) against N (horizontal axis; labeled values 41, 50, 164, 200, 288, 350, 411, 499). Type I error rate = 0.05, power = 0.90, N = 400.]

O’Brien-Fleming: E[N] = 405 under H0, E[N] = 304 under H1
Pocock: E[N] = 472 under H0, E[N] = 287 under H1
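To illustrate why group sequential boundaries must be pre-determined, the following sketch estimates by simulation the overall type I error rate of three stopping rules. It is a simplified illustration, not the design plotted on the slide: it assumes four equally spaced looks, two-sided efficacy stopping only, and no futility boundary. The constants 2.361 (Pocock) and 2.024 (O'Brien-Fleming) are standard tabulated values for four looks at α = 0.05 and are not taken from the slides.

```python
import random


def overall_type1_error(critical_values, n_sims=20000, seed=1):
    """Monte Carlo estimate of P(reject at any look | H0), two-sided."""
    rng = random.Random(seed)
    n_looks = len(critical_values)
    rejections = 0
    for _ in range(n_sims):
        partial_sum = 0.0
        for look in range(1, n_looks + 1):
            # Each equally sized stage contributes an independent N(0, 1) increment.
            partial_sum += rng.gauss(0.0, 1.0)
            z = partial_sum / look ** 0.5   # standardized statistic at this look
            if abs(z) >= critical_values[look - 1]:
                rejections += 1
                break                        # the trial stops at the first crossing
    return rejections / n_sims


K = 4  # ASSUMPTION: four equally spaced looks
naive = [1.96] * K                                               # unadjusted repeated testing
pocock = [2.361] * K                                             # constant boundary
obf = [2.024 * (K / look) ** 0.5 for look in range(1, K + 1)]    # strict early, ~2.02 at the end

naive_err = overall_type1_error(naive)
pocock_err = overall_type1_error(pocock)
obf_err = overall_type1_error(obf)
print(f"naive 1.96 at every look: {naive_err:.3f}")
print(f"Pocock:                   {pocock_err:.3f}")
print(f"O'Brien-Fleming:          {obf_err:.3f}")
```

Unadjusted testing at every look inflates the error rate well above 0.05, while both adjusted boundaries keep it near the nominal level.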

Blinded / Unblinded statistical analysis
- Safety monitoring, I think, should be unblinded.
- Compliance / accrual / adherence / budgetary monitoring, I think, should be unblinded.
- Statistical (efficacy) monitoring
  - “If variance is bigger than expected, we may need to increase the sample size.”
  - “If treatment effect is smaller than expected, we may need to increase the sample size.”
  1. blinded analysis (just overall average, $\bar{X}$, and overall variance, $s^2$)
  2. semi-blinded analysis (pooled variance, $s_p^2$)
  3. semi-sort-of-blinded analysis (group averages, $\bar{X}_?$, $\bar{X}_?$; and group variances, $s_?^2$, $s_?^2$)
  4. unblinded analysis (group averages, $\bar{X}_t$, $\bar{X}_c$; and group variances, $s_t^2$, $s_c^2$)


Semi-blinded is almost unblinded.
1. Blinded analysis
   - We can still get a fairly good estimate of the within-group variance from the overall variance.

   $$\frac{s_o^2}{s_p^2} \approx 1 + \frac{\widehat{(\text{effect size})}^2}{4}$$

2. Reveal pooled variance only.
   - From analysis of variance results, we know: overall variability = between-group variability + within-group variability.
   - We know within-group variability (pooled variance), and overall variability is not difficult to obtain because we can get it without unblinding.
   - Thus we can compute the between-group variability. When there are only 2 groups, it is a simple function of the difference of the group means.
   - de facto unblinding
3. Group means and variances without group identity.
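The "de facto unblinding" argument in item 2 can be checked numerically. The sketch below uses made-up data for two equal-sized groups (the values and the helper name `implied_abs_difference` are hypothetical) and recovers the absolute difference of the group means from only the overall variance and the pooled variance, using the sum-of-squares decomposition: total SS = within SS + between SS, where between SS = n d²/2 for two groups of size n whose means differ by d.

```python
from statistics import mean, variance

# Hypothetical data, for illustration only.
treatment = [12.1, 9.4, 15.3, 11.8, 13.6, 10.2]
control = [8.7, 7.9, 11.2, 9.5, 6.8, 10.1]


def implied_abs_difference(overall_var, pooled_var, n_per_group):
    """Absolute difference of two group means implied by the two variances.

    Assumes two groups of equal size n_per_group.
    """
    n = n_per_group
    total_ss = (2 * n - 1) * overall_var     # total sum of squares
    within_ss = (2 * n - 2) * pooled_var     # within-group sum of squares
    between_ss = total_ss - within_ss        # equals n * d**2 / 2
    return (2 * between_ss / n) ** 0.5


n = len(treatment)
overall_var = variance(treatment + control)                   # needs no group labels
pooled_var = (variance(treatment) + variance(control)) / 2    # the "semi-blinded" quantity

recovered = implied_abs_difference(overall_var, pooled_var, n)
actual = abs(mean(treatment) - mean(control))
print(f"recovered |difference| = {recovered:.4f}, actual = {actual:.4f}")
```

Only the sign of the difference stays hidden, which is why revealing the pooled variance is nearly equivalent to unblinding.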

Complete unblinding
4. Unblinded analysis
   - Mathematically nice
     - Type I error rate can be controlled.
     - Conditional power can be specified.
     - Unconditional power can be controlled if design is pre-specified.
   - Leaking information

Example
- The endpoint is tumor size.
- Difference in 2 groups is 10 under the alternative hypothesis.
- σ is known (assumed) to be 16.
- α = 0.05 (one-sided), power = 0.90; then the sample size is 44 per group.
- Flexible design with an interim look
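The sample size in this example can be reproduced with the usual normal-theory formula, n = 2σ²(z_α + z_β)²/δ² per group. The sketch below assumes that 16 is the standard deviation and that α = 0.05 is one-sided, which is the reading under which the formula returns 44; the function name is illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, alpha_one_sided, power):
    """Per-group sample size for a one-sided two-sample z-test with known sigma."""
    z_alpha = NormalDist().inv_cdf(1 - alpha_one_sided)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * sigma**2 * (z_alpha + z_beta) ** 2 / delta**2)


# ASSUMPTION: sigma = 16 (standard deviation) and a one-sided alpha of 0.05.
n = n_per_group(delta=10, sigma=16, alpha_one_sided=0.05, power=0.90)
print(n)
```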


Example of adaptive interim design
- In stage 1, a sample of size 22 is taken for each group.
- If the stage 1 p-value is less than 0.01, then stop to conclude efficacy.
- Stop for futility: P[Stop for futility | H1] = 0.05
- Sample size for stage 2 depends on the results from stage 1.
- The maximum total sample size is 54.
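As a rough illustration of the two stage-1 stopping rules, the sketch below converts them into thresholds on the observed stage-1 difference of means. It rests on assumptions not stated on the slide: a one-sided z-test, a known standard deviation of 16, and a true difference of 10 under H1. The stage-2 sample size rule is not specified here, so it is not modeled.

```python
from statistics import NormalDist

SIGMA = 16.0        # ASSUMPTION: known standard deviation
DELTA_H1 = 10.0     # difference under the alternative hypothesis
N1 = 22             # stage-1 sample size per group

se1 = (2 * SIGMA**2 / N1) ** 0.5    # standard error of the stage-1 difference

# Efficacy: one-sided stage-1 p-value below 0.01.
efficacy_bound = NormalDist().inv_cdf(0.99) * se1

# Futility: threshold chosen so that P[difference < bound | H1] = 0.05.
futility_bound = NormalDist(DELTA_H1, se1).inv_cdf(0.05)

print(f"stop for futility if stage-1 difference < {futility_bound:.2f}")
print(f"stop for efficacy if stage-1 difference > {efficacy_bound:.2f}")
print(f"otherwise continue to stage 2")
```

Under these assumptions the trial continues to stage 2 only when the stage-1 difference falls between roughly 2 and 11.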


Total sample size

[Figure, two panels. Left: total sample size (vertical axis, 0 to 50) against the difference of means in stage 1 (horizontal axis, 0 to 12). Right: expected sample size (vertical axis, 0 to 50) against the true difference (horizontal axis, 0 to 12). Legend: Single-stage Design, Two-stage Design.]

Binomial outcome is a little nicer.
- First consider sort-of-unblinding, i.e., reveal only $p_{1c}$.
- P[bad response] for the control group may be smaller than expected.
  - The control group in a clinical trial tends to be healthier than the one from the observational study on which the $p_c$ estimate was based.
  - Patients in a clinical trial tend to get better care.
- Then maybe using $p_{1c}$ from the study to recompute the sample size may improve the study design.
- But there are problems.
  - Because the overall response rate, $p_1$, can be computed without unblinding, we can easily compute $p_{1t}$, too.
  - If the total sample size is changed based on $p_{1c}$, then the final estimate of P[bad response] in the control group ($p_c$) is biased:

    $$E_{p_c}[\hat{p}_c] = p_c + n_1\,\mathrm{cov}(\hat{p}_{1c},\ 1/n^*),$$

    where $n^*$ is the new total sample size computed with $\hat{p}_{1c}$.

Blinded sample size re-assessment
- Suppose $R = p_t/p_c$. $H_0: R = 1$, $H_1: R < 1$.
- At the interim analysis compute $\hat{p}_1$ without unblinding.
- Solve

  $$\frac{p_t}{p_c} = R, \qquad \frac{p_c + p_t}{2} = \hat{p}_1$$

  to get

  $$\tilde{p}_c = \frac{2\hat{p}_1}{1 + R}, \qquad \tilde{p}_t = \frac{2R\hat{p}_1}{1 + R}$$

  and use these to recompute the sample size.

$$\mathrm{cov}(\hat{p}_1,\ \hat{p}_{1c} - \hat{p}_{1t}) = \frac{p_c(1 - p_c) - p_t(1 - p_t)}{2n_1}.$$

- Under $H_0$, $\hat{p}_1$ and $\hat{p}_{1c} - \hat{p}_{1t}$ are uncorrelated and asymptotically independent.
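The blinded re-assessment steps above can be sketched as follows. The numbers are hypothetical (R = 0.5, a planning value of 0.40 for the control rate, and an observed blinded overall rate of 0.24), and the sample size formula is an assumption: the slides do not say which one is used, so a common unpooled normal-approximation formula with one-sided α = 0.05 and power 0.90 stands in for it.

```python
from math import ceil
from statistics import NormalDist


def blinded_rates(p1_hat, ratio):
    """Solve p_t / p_c = R and (p_c + p_t) / 2 = p1_hat for (p_c, p_t)."""
    p_c = 2 * p1_hat / (1 + ratio)
    p_t = 2 * ratio * p1_hat / (1 + ratio)
    return p_c, p_t


def n_per_group(p_c, p_t, alpha_one_sided=0.05, power=0.90):
    """ASSUMPTION: unpooled normal-approximation formula for two proportions."""
    z = NormalDist().inv_cdf(1 - alpha_one_sided) + NormalDist().inv_cdf(power)
    return ceil(z**2 * (p_c * (1 - p_c) + p_t * (1 - p_t)) / (p_c - p_t) ** 2)


R = 0.5                                   # hypothetical rate ratio under H1
planned_n = n_per_group(0.40, 0.40 * R)   # design stage: assumed p_c = 0.40

p1_hat = 0.24                             # hypothetical blinded overall rate at the interim
p_c_tilde, p_t_tilde = blinded_rates(p1_hat, R)
revised_n = n_per_group(p_c_tilde, p_t_tilde)

print(f"planned n per group: {planned_n}")
print(f"blinded estimates: p_c = {p_c_tilde:.2f}, p_t = {p_t_tilde:.2f}")
print(f"revised n per group: {revised_n}")
```

Because the overall rate came in lower than planned, the implied absolute difference between groups shrinks and the recomputed sample size grows, all without revealing group labels.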

How adaptive is adaptive
- Interim looks do not cause any statistical difficulties if no changes are to be made.
- Issues arise when you allow for unplanned design changes.
  - Type I error control
  - Generalizability of the results
  - Biases
    - Unbiased estimator, even in very simple normal cases, is difficult to compute (maybe impossible without unverifiable assumptions)
  - P-values are not simple. Confidence intervals, too.

Maybe some design changes can be data-dependent, but they should be rigorously pre-determined. Nothing should be “to be determined”.


FDA’s draft guidance on adaptive designs in drug development
Robert T. O’Neill, PhD / OTS, CDER, FDA

..., an adaptive design clinical study is defined as a study that includes a prospectively planned and specified modification or potential for modification of one or more specified aspects of the study design and hypotheses, based on analysis of data from subjects in the study.

Analyses of the accumulating study data are often performed at prospectively planned points within the study, may be performed in a fully blinded manner or in a non-blinded manner, ... .

The term “prospective” here means that the adaptation was planned before data were examined in an unblinded form by any personnel involved in planning for the revision. ...

Revisions made or proposed after an unblinded interim analysis raise major concerns about study integrity, ... possible design changes need to be prospectively defined and carefully implemented to avoid risking irresolvable uncertainty in the interpretation of study results.

