sequential design

Author: Samuel Caldwell
VI. Stopping of trials/sequential design

Terminating based on early results
• Look for early “significant” results
• Blind reporting
• Periodic peeks increase the false-positive rate

Example

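[Figure: sample path of the test statistic Z_j plotted against the analysis number j = 1, 2, ..., n, ..., N, where N = planned sample size, with horizontal boundaries at 1.96 and -1.96. Reject if |Z_j| > 1.96.]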

N is fixed to achieve the desired α and 1 - β. Some paths would return to the region where H0 is accepted by the final analysis, but cross a boundary (reject) earlier. Note: repeated looks also increase power.

Number of interim analyses    Type I error if we reject whenever P < 0.05
  1                           0.05
  2                           0.08
  3                           0.11
  4                           0.13
  5                           0.14
 10                           0.19
100                           0.37
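The inflation in the table can be checked approximately by simulation. The sketch below is illustrative only and assumes equally spaced looks, normally distributed data with known variance, and a two-sided test at |Z| > 1.96 at every look; the function name and its parameters are hypothetical, not taken from the slides.

```python
import numpy as np

def repeated_testing_error(n_looks, n_sims=100_000, crit=1.96, seed=0):
    """Monte Carlo estimate of P(|Z_k| > crit at any look) under H0."""
    rng = np.random.default_rng(seed)
    # Each look adds one equally sized block of data. Under H0 the
    # standardized block sums are independent N(0, 1) draws.
    increments = rng.standard_normal((n_sims, n_looks))
    cumulative = np.cumsum(increments, axis=1)
    # The cumulative sum after k blocks has variance k, so divide by sqrt(k).
    z = cumulative / np.sqrt(np.arange(1, n_looks + 1))
    # A simulated trial is a false positive if any look crosses the boundary.
    return float(np.mean(np.any(np.abs(z) > crit, axis=1)))

if __name__ == "__main__":
    for k in (1, 2, 3, 4, 5, 10):
        print(k, round(repeated_testing_error(k), 3))
```

Because the looks share accumulating data, the Z values are positively correlated, which is why the error grows more slowly than it would for independent tests.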


Data-dependent stopping rules
• Once a trial has begun, it should be continued only if:
  – It remains ethical to randomly assign the study treatments.
  – The study continues to have the potential to achieve its scientific goals.

• To achieve this goal, we monitor the study data as they accumulate.


Outline
• What can we do at an Interim Analysis (IA)
• IA Methods
  – Group sequential methods
  – Futility analyses
• Operational Issues
• Data Monitoring Committees
• Recommendations and references


Reasons for stopping early
• Treatments convincingly different
• Treatments convincingly not different
• Unacceptable side effects or toxicity
• Accrual too slow
• External information makes the trial unnecessary or unethical
• Poor execution compromises study objectives
• Catastrophic fraud or misconduct

What can we do at an IA?
• Assess sample size assumptions (variability, control group response rate) made in the design phase
  – Re-estimate sample size
• Assess impact of important variables
• Use observed information to plan future studies
• Check adherence, compliance, and accrual rates
• Review administrative issues
• Drop treatment arms that are unsafe or non-efficacious
• Make dose adjustments

Tension in the decision to stop
• The Harvard ECMO trial stopped randomization with
  – 4 deaths in 10 conventionally treated infants
  – 0 deaths in 10 ECMO patients

• ISIS-2: Many believe that it continued too long. 17,187 patients were enrolled. A few weeks into the study, the death rates were
  – streptokinase: 9.2%, placebo: 12.0% ($P < 10^{-8}$)
  – Because it was so large, ISIS-2 changed practice

Approaches to data-dependent stopping
• Fully sequential methods
• Group sequential methods
  – Fixed monitoring schedules
  – Rules based on Type I spending functions
• Methods based on projections
  – Stochastic curtailment


Common elements
• Formal stopping rules require a single quantitative endpoint.
• Data must be available in a cumulative fashion.
• Monitoring of safety and toxicity issues can be more open-ended.
• Data-dependent stopping involves tradeoffs.


Types of sequential procedures
• Wald sequential probability ratio test – open design
• Armitage – closed designs
• Loss function – patient horizon trade-off between
  – I) Placing the next patient on a possibly inferior treatment
  – II) Making the wrong conclusion and subjecting the remaining patients to an inferior treatment
• Repeated significance tests – adjustments for the fact that tests are repeated
• Group sequential methods
• Most often we design fixed-sample studies. Applying fixed-sample tests sequentially leads to more false positives.

Compare two hypotensive agents

[Figure: closed sequential design (Armitage) comparing phenactropinium chloride and trimetaphan. Vertical axis: excess preferences (-30 to 30); horizontal axis: number of preferences (0 to 70).]

Fig. 3.8 Trial of Robertson & Armitage (1959) to compare two hypotensive agents, using a restricted plan with 2α = 0.05, 1 - β = 0.95, θ1 = 0.75, N = 62.

Early trimetaphan successes; eventually no preference was observed.


Adjusting for multiple comparisons
• If each independent analysis has a Type I error of $\alpha$, then the chance of getting at least one Type I error in $k$ analyses is $1 - (1 - \alpha)^k = \alpha_0$.
• Bonferroni inequality: $(1 - \alpha)^k > 1 - k\alpha$. Thus, $1 - (1 - \alpha)^k < k\alpha$.
• So, if we want the experiment-wide Type I error to be less than $\alpha_0 = 0.05$, then we set each of the $k$ per-analysis $\alpha$ values equal to $\alpha_0 / k$.
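A small numerical check of the Bonferroni argument may help. This is a minimal sketch under the slide's assumption of independent analyses; the function names are hypothetical.

```python
def familywise_error(alpha, k):
    """Chance of at least one Type I error in k independent analyses."""
    return 1 - (1 - alpha) ** k

def bonferroni_level(alpha0, k):
    """Per-analysis level that keeps the experiment-wide error below alpha0."""
    return alpha0 / k

if __name__ == "__main__":
    alpha0, k = 0.05, 5
    # Testing each analysis at 0.05 inflates the experiment-wide error.
    print("unadjusted:", familywise_error(alpha0, k))
    # Testing each analysis at alpha0 / k keeps it below alpha0.
    print("Bonferroni:", familywise_error(bonferroni_level(alpha0, k), k))
```

The adjusted error falls slightly below the target, which is the sense in which the Bonferroni bound is conservative.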

Adjusting for multiple comparisons
• Holm’s inequality: order the $k$ p-values from smallest to largest: $p_1 \le p_2 \le p_3 \le \dots \le p_k$.
• Starting with $p_1$, call each $p_j$ significant as long as $p_j \le \alpha_0 / (k - j + 1)$. Stop at the first $p_j$ that fails; it and all larger p-values are not significant.
• This guarantees an experiment-wide Type I error of at most $\alpha_0$, but is less conservative than the Bonferroni inequality (i.e., has more power).
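The Holm procedure can be sketched in a few lines. This is an illustrative implementation of the step-down rule described above, not code from the slides; the function name and the example p-values are hypothetical.

```python
def holm_reject(p_values, alpha0=0.05):
    """Return a list of booleans: True where Holm's procedure rejects."""
    k = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(k), key=lambda i: p_values[i])
    reject = [False] * k
    for j, idx in enumerate(order, start=1):
        # Compare the j-th smallest p-value with alpha0 / (k - j + 1).
        if p_values[idx] <= alpha0 / (k - j + 1):
            reject[idx] = True
        else:
            break  # first failure: this and all larger p-values are retained
    return reject

if __name__ == "__main__":
    p = [0.04, 0.012, 0.02, 0.014]  # made-up p-values for illustration
    print("Holm:      ", holm_reject(p))
    print("Bonferroni:", [pv <= 0.05 / len(p) for pv in p])
```

With these made-up values, Bonferroni compares every p-value with 0.0125, while Holm relaxes the threshold at each step, which illustrates the gain in power.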

Group sequential methods
• Suppose that we agree to analyze the data after groups of prespecified size have been collected and compare the test statistic to stopping boundaries at each interim analysis.
• For example, consider the comparison of event rates in two treatment groups with event probabilities π1 and π2.


Group sequential methods
• Assume that we will accumulate up to K groups of subjects. Each group will consist of n subjects enrolled in each of two treatment arms.
• After each group is enrolled, we will calculate the usual Z statistic and compare it to prespecified boundaries.
  – (Note that the outcomes must be available)


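• Let $Z_k$, the test statistic after $k$ groups, be given by

$$Z_k = \frac{\hat{\pi}_{1k} - \hat{\pi}_{2k}}{\sqrt{2\,\bar{\pi}_k (1 - \bar{\pi}_k) / (nk)}}$$

where $\hat{\pi}_{1k}$ and $\hat{\pi}_{2k}$ are the observed event rates in the two arms after $k$ groups, $\bar{\pi}_k$ is the pooled event rate, and $nk$ is the number of subjects per arm.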

• We want to specify boundaries, $B_{Uk}$ and $B_{Lk}$, for each interim analysis $k$.
• If $Z_k > B_{Uk}$, stop and reject $H_0: \pi_1 = \pi_2$ in favor of $H_a: \pi_1 > \pi_2$.
• If $Z_k < B_{Lk}$, stop and reject $H_0$ in favor of $H_a: \pi_1 < \pi_2$.
• If the trial is completed without exceeding the boundaries, do not reject $H_0$.
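The monitoring rule can be sketched as follows. This is a minimal illustration that assumes equal group sizes, cumulative event counts for each arm, and a pooled event rate strictly between 0 and 1; the function names, the event counts, and the boundary values of ±2.5 are hypothetical placeholders, not recommended boundaries.

```python
import math

def z_statistic(events1, events2, n_per_arm):
    """Pooled two-proportion Z statistic with n_per_arm subjects in each arm."""
    p1 = events1 / n_per_arm
    p2 = events2 / n_per_arm
    pooled = (events1 + events2) / (2 * n_per_arm)
    # Assumes 0 < pooled < 1, otherwise the standard error is zero.
    se = math.sqrt(2 * pooled * (1 - pooled) / n_per_arm)
    return (p1 - p2) / se

def monitor(cum_events1, cum_events2, n_per_group, upper, lower):
    """Apply the stopping rule at each look; return (look, Z, decision)."""
    z = float("nan")
    for k, (e1, e2) in enumerate(zip(cum_events1, cum_events2), start=1):
        z = z_statistic(e1, e2, n_per_group * k)
        if z > upper[k - 1]:
            return k, z, "stop: reject H0 in favor of pi1 > pi2"
        if z < lower[k - 1]:
            return k, z, "stop: reject H0 in favor of pi1 < pi2"
    return len(cum_events1), z, "do not reject H0"

if __name__ == "__main__":
    # Made-up cumulative event counts after each of K = 3 groups of n = 50.
    look, z, decision = monitor([12, 30, 50], [8, 14, 22], 50,
                                upper=[2.5] * 3, lower=[-2.5] * 3)
    print(look, round(z, 2), decision)
```

Passing the boundaries in as lists keeps the sketch agnostic about how they are chosen, so constant or look-dependent boundaries can both be tried.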

Determining the stopping boundaries
• Pocock was the first to develop a method for determining critical values that maintained specified Type I and Type II error rates.
• His method used the fact that Z1, ... , ZK have an asymptotic joint normal distribution.
• Thus, we can calculate probabilities such as P(Zk>BUk|B1L