Doubly robust estimates for longitudinal data analysis with missing response and missing covariates


Author: Gabriella Edwards


Motivation Example Complete Longitudinal Data Missing outcome Missing Covariates Missing both Response and Covariates

Doubly robust estimates for longitudinal data analysis with missing response and missing covariates

Xiao-Hua Andrew Zhou, Ph.D.
Co-Investigator and Senior Biostatistician, NACC
Professor, Department of Biostatistics, University of Washington

October, 2009

Xiao-Hua Zhou Doubly robust estimates for longitudinal data analysis with missing response and missing covariates ADC, 2009


1. NACC UDS
2. Analysis of Complete Longitudinal Data
3. Estimating Equations for Missing Outcome
4. Methods for Handling Missing Covariates
5. New Method
   - Model Formulation for Missing Response and Covariates
   - Estimation and Inference
6. Simulations and Applications
   - Simulations
   - Applications
7. Summary


A NACC example

Using the National Alzheimer's Coordinating Center (NACC) Uniform Data Set (UDS), we are interested in assessing the association between patient characteristics and the onset of dementia. The response is the diagnosis of dementia (yes/no). The covariates that may be related to dementia status include sex, congestive heart failure (CVCHF, yes/no), family history of dementia (FHDEM, yes/no), diabetes (yes/no), behavioral assessment (depression or dysphoria, yes/no), hypertension (yes/no), education (years), Mini-Mental State Exam (MMSE) score, and age.


A NACC example, continued

There are 16223 subjects from 29 Alzheimer’s Disease Centers included at the entry of this study. Follow-up visits for subjects are scheduled at approximately one-year intervals, with up to three follow-ups at present.


An example, continued

Some values of the response and of the behavioral assessment covariate are missing. There are 8724 subjects with complete data on scheduled visits. About 11.9% of subjects are missing both the response and the behavioral assessment; about 31.2% are missing the response but have the behavioral assessment observed; about 3.2% are missing the behavioral assessment but have the response observed; and about 53.7% have both the response and the behavioral assessment covariate observed.


GEE Approach with Complete Longitudinal Data

The method of generalized estimating equations (GEE) is a popular method for analyzing longitudinal data. It requires only the specification of a model for the marginal mean and variance of each measurement and of a "working" matrix for the correlation between measurements in a cluster.


Notations

Let $Y_{ij}$ denote the response of individual $i$ at time $j$ ($i = 1, \dots, N$; $j = 1, \dots, M_i$). Let $Y_i = (Y_{i1}, \dots, Y_{iM_i})^T$.

Let $x_{ij}$ denote a vector of covariates for individual $i$ at time $j$, and $x_i = (x_{i1}^T, \dots, x_{iM_i}^T)^T$.

Let $\mu_{ij} = E(Y_{ij} \mid x_{ij})$, $g(\mu_{ij}) = \beta^T x_{ij}$; let $\mu_i = (\mu_{i1}, \dots, \mu_{iM_i})^T$.


GEE for Complete Data Analysis

The GEE for complete data are

$$\sum_{i=1}^{N} U_i(\beta, \rho; Y_i, x_i) = 0,$$

where

$$U_i(\beta, \rho; Y_i, x_i) = \frac{\partial \mu_i^T}{\partial \beta} V_i(\rho)^{-1} (Y_i - \mu_i),$$

and $V_i(\rho)$ is the working covariance matrix of $Y_i$.
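As a rough illustration of how such estimating equations can be solved numerically, the sketch below runs a Fisher-scoring iteration for a logit link with a fixed exchangeable working correlation. The function names (`exchangeable`, `gee_terms`, `gee_score`, `solve_gee`), the simulated data, and the choice to hold ρ fixed rather than estimate it are assumptions made for this example and are not taken from the slides.

```python
import numpy as np

def exchangeable(m, rho):
    """Exchangeable working correlation matrix of size m (assumed structure)."""
    return (1.0 - rho) * np.eye(m) + rho * np.ones((m, m))

def gee_terms(beta, X_i, Y_i, rho):
    """Return U_i and D_i' V_i^{-1} D_i for one subject under a logit link."""
    mu = 1.0 / (1.0 + np.exp(-X_i @ beta))
    a = mu * (1.0 - mu)                      # Bernoulli variance function
    D = a[:, None] * X_i                     # d mu_i / d beta', shape (M_i, p)
    sd = np.sqrt(a)
    V = sd[:, None] * exchangeable(len(Y_i), rho) * sd[None, :]
    Vinv_D = np.linalg.solve(V, D)           # V^{-1} D
    U = Vinv_D.T @ (Y_i - mu)                # D' V^{-1} (Y_i - mu_i)
    return U, D.T @ Vinv_D

def gee_score(beta, X, Y, rho):
    """Left-hand side of the GEE: sum of U_i over subjects."""
    return sum(gee_terms(beta, X_i, Y_i, rho)[0] for X_i, Y_i in zip(X, Y))

def solve_gee(X, Y, rho, n_iter=100, tol=1e-10):
    """Fisher scoring: beta <- beta + (sum D'V^{-1}D)^{-1} sum U_i."""
    p = X[0].shape[1]
    beta = np.zeros(p)
    for _ in range(n_iter):
        U, H = np.zeros(p), np.zeros((p, p))
        for X_i, Y_i in zip(X, Y):
            u, h = gee_terms(beta, X_i, Y_i, rho)
            U += u
            H += h
        step = np.linalg.solve(H, U)
        beta = beta + step
        if np.linalg.norm(step) < tol:
            break
    return beta

# Hypothetical toy data: 500 subjects, 3 visits, intercept plus one covariate.
rng = np.random.default_rng(0)
true_beta = np.array([-0.5, 1.0])
X = [np.column_stack([np.ones(3), rng.normal(size=3)]) for _ in range(500)]
Y = [rng.binomial(1, 1.0 / (1.0 + np.exp(-X_i @ true_beta))).astype(float)
     for X_i in X]
beta_hat = solve_gee(X, Y, rho=0.3)
```

At the returned `beta_hat` the summed score `gee_score(beta_hat, X, Y, 0.3)` should be numerically zero, which is exactly the condition the estimating equations impose.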


Asymptotic results

When $x_i$ contains only time-independent covariates, under some regularity conditions, the GEE yields consistent estimators. If $x_i$ includes some time-dependent covariates, the GEE still yields consistent estimators under one additional assumption: $E(Y_{ij} \mid x_i) = E(Y_{ij} \mid x_{ij})$. If this assumption does not hold, the independence working correlation should be used to ensure consistency.


Time-dependent Covariates

Let $L_{ij}$ denote all the data that should be collected on individual $i$ at time $j$. Let $\bar{L}_{ij}$ denote the data available on individual $i$ by time $j$. Let $\underline{L}_{ij}$ denote the data not yet available by time $j$. Note that $L_{ij}$ includes both $Y_{ij}$ and $x_{ij}$.


Drop-out

Let $R_{ij} = 1$ if measurement $j$ on individual $i$ is observed and $R_{ij} = 0$ otherwise. Assume monotone drop-out: $R_{ij} = 0$ implies $R_{ik} = 0$ for all times $k > j$. Let $C_{ij} = 1$ if subject $i$'s last observed measurement is at time $j$ and 0 otherwise.

We assume that the covariates included in $L_{ij}$ are chosen so that the data can be assumed to be Missing at Random (MAR):

$$P(R_{ij} = 1 \mid \bar{L}_{iM_i}, R_{i,j-1} = 1) = P(R_{ij} = 1 \mid \bar{L}_{i,j-1}, R_{i,j-1} = 1),$$

i.e., the probability of missingness depends only on the data $\bar{L}_{i,j-1}$ observed up to the previous visit.
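The drop-out notation can be made concrete with a small sketch. The helpers `is_monotone` and `last_observed` below are hypothetical names introduced only for illustration; they check the monotone drop-out condition on one subject's indicator vector and build the corresponding $C_{ij}$.

```python
import numpy as np

def is_monotone(r):
    """True if R_ij = 0 implies R_ik = 0 for all later times k > j."""
    r = np.asarray(r)
    return bool(np.all(r[1:] <= r[:-1]))

def last_observed(r):
    """C_ij: 1 at the last observed visit, 0 elsewhere (monotone pattern assumed)."""
    r = np.asarray(r)
    c = np.zeros_like(r)
    n_obs = int(r.sum())
    if n_obs > 0:
        c[n_obs - 1] = 1
    return c

r_dropout = [1, 1, 0, 0]       # observed at visits 1-2, then dropped out
r_intermittent = [1, 0, 1, 1]  # a missed visit followed by a return
c_dropout = last_observed(r_dropout)
```

Here `is_monotone(r_dropout)` holds while `is_monotone(r_intermittent)` does not, so the second pattern falls outside the drop-out setting described above.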


GEE for Complete-Data

$$\sum_{i=1}^{N} U_i(\beta, \rho; Y_i, x_i) = 0,$$

where

$$U_i(\beta, \rho; Y_i, x_i) = \frac{\partial \mu_i^T}{\partial \beta} V_i(\rho)^{-1} (Y_i - \mu_i),$$

and $V_i(\rho)$ is the working covariance matrix of $Y_i$.

These equations yield estimates that are consistent if the data are Missing Completely at Random (MCAR), but not necessarily if they are MAR.


Re-weighting

With missing data, we can base our estimates on the complete cases, but re-weight them according to the probability of being observed. The estimating equations are then

$$\sum_{i=1}^{N} \frac{\partial \mu_i^T}{\partial \beta} V_i(\rho)^{-1} \Delta_i(\alpha) (Y_i - \mu_i) = 0,$$

where $\Delta_i(\alpha) = \mathrm{diag}(R_{i1}/\pi_{i1}, \dots, R_{iM_i}/\pi_{iM_i})$ and $\pi_{ij} = \pi_{ij}(\alpha)$ is the probability, according to a specified dropout model, that measurement $j$ on subject $i$ is observed. Under monotone drop-out,

$$\pi_{ij}(\alpha) = (1 - \lambda_{i1}(\alpha)) \cdots (1 - \lambda_{ij}(\alpha)),$$

where $\lambda_{ij}(\alpha) = P(R_{ij} = 0 \mid \bar{L}_{i,j-1}, R_{i,j-1} = 1)$ is the probability of dropping out at time $j$ given the history observed so far.

The resulting estimates are consistent if the data are MAR, as long as the probability model for the missingness is correctly specified.
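The product formula for $\pi_{ij}$ and the diagonal of $\Delta_i$ can be computed directly. This is a minimal sketch; the function names and the drop-out hazards in `lam` are made-up values for illustration, standing in for fitted $\lambda_{ij}(\hat\alpha)$.

```python
import numpy as np

def observation_probs(lam):
    """pi_ij = (1 - lambda_i1) * ... * (1 - lambda_ij), as a running product."""
    return np.cumprod(1.0 - np.asarray(lam, dtype=float))

def ipw_weights(r, lam):
    """Diagonal of Delta_i: R_ij / pi_ij for each visit j."""
    return np.asarray(r, dtype=float) / observation_probs(lam)

lam = [0.0, 0.2, 0.5]   # hypothetical drop-out hazards at visits 1, 2, 3
r = [1, 1, 0]           # subject observed at visits 1 and 2 only
pi = observation_probs(lam)
w = ipw_weights(r, lam)
```

With these inputs `pi` is (1.0, 0.8, 0.4) and `w` is (1.0, 1.25, 0.0): the observed second visit is up-weighted to stand in for comparable subjects who dropped out, and the unobserved third visit contributes nothing.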


Imputation

Alternatively, we can impute, or "guess", the missing values based on some probability model. The estimates are then based on both the observed data and the imputed data.

The complete case estimating equations are used, but after imputing missing responses with their expected values, $E(Y_{ij} \mid \bar{L}_{ik}, R_{ik} = 1)$ for $j > k$. The imputations are based on specified regression models.

The resulting estimates are consistent if the data are MAR, as long as the probability model for the imputations is correct.


Doubly-robust Estimating Equations

The inverse probability weighting (IPW) estimates make no use of the available data on subjects with missing measurements.

Let $d(\bar{L}_M, \beta) = U(\beta, \rho; Y, x)$ be the contribution of a fully observed subject to the estimating equations. For drop-out missing data, the IPW estimating equations can be augmented by a term $F(C, \bar{L}_C, \beta)$ satisfying $E_C\{F(C, \bar{L}_C, \beta) \mid \bar{L}_M\} = 0$.

The resulting augmented estimating equations are

$$\sum_{i=1}^{N} \left[ \frac{R_{iM_i}}{\pi_{iM_i}} d(\bar{L}_{iM_i}, \beta) + F(C_i, \bar{L}_{iC_i}, \beta) \right] = 0.$$


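Doubly-robust Estimating Equations (2)

The optimal choice of augmentation term is

$$F_{opt}(C, \bar{L}_C, \beta) = \sum_{j=1}^{M-1} \frac{C_j - \lambda_{j+1} R_j}{\pi_{j+1}} H_j(\beta),$$

where $H_j(\beta) = E_{\underline{L}_j}\{d(\bar{L}_M, \beta) \mid \bar{L}_j, R_j = 1\}$.

We specify models for $H_j(\beta)$, $j = 1, \dots, M - 1$, which involve parameters $\gamma$. Let $\hat{\alpha}$ and $\hat{\gamma}$ denote consistent estimators of $\alpha$ and $\gamma$. Then, in the estimating equations, replace $\lambda_j$, $\pi_j$, and $H_j$ with $\lambda_j(\hat{\alpha})$, $\pi_j(\hat{\alpha})$, and $H_j(\beta, \hat{\gamma})$.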


Properties of DR Estimating Equations

If:
- the data are MAR,
- the marginal model is correct, $g(\mu_{ij}) = \beta^T x_{ij}$, and
- either the dropout model $\pi_j$, or the model for $H_j$ (or both) is correctly specified,

then the solution $\hat{\beta}$ to the estimating equations is consistent for $\beta$.

Furthermore, if both the dropout model and the model for $H_j$ are correct, then this solution $\hat{\beta}$ is optimal in the sense that it has the smallest asymptotic variance among estimates from augmented estimating equations. A consistent estimate of this variance exists in closed form.
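The double-robustness property can be seen in a much simpler setting than the longitudinal one above. The sketch below is a deliberately simplified cross-sectional stand-in, not the estimator from the slides: it estimates a mean $E[Y]$ with an augmented IPW estimator, plugging in either the true or a deliberately wrong observation model and outcome model. All names, models, and parameter values are assumptions chosen for the example.

```python
import numpy as np

def expit(t):
    return 1.0 / (1.0 + np.exp(-t))

def aipw_mean(y, r, pi, m):
    """Augmented IPW estimate of E[Y]: IPW term plus augmentation term."""
    return np.mean(r * y / pi + (1.0 - r / pi) * m)

rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)   # true mean E[Y] = 1
pi_true = expit(0.5 + x)                 # observation probability depends on x only (MAR)
r = rng.binomial(1, pi_true)             # y enters the estimator only where r = 1

m_true = 1.0 + 2.0 * x                   # correct outcome model E[Y | x]
m_wrong = np.zeros(n)                    # deliberately wrong outcome model
pi_wrong = np.full(n, 0.5)               # deliberately wrong observation model

est_pi_ok = aipw_mean(y, r, pi_true, m_wrong)        # only pi correct
est_m_ok = aipw_mean(y, r, pi_wrong, m_true)         # only outcome model correct
est_both_wrong = aipw_mean(y, r, pi_wrong, m_wrong)  # neither correct
```

In this toy setup the first two estimates land close to the true mean of 1, while the third is clearly off, mirroring the "either model correct" condition stated above.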


Methods for Handling Missing Covariates

Lipsitz et al. (1999) considered the doubly robust estimate in a cross-sectional study with a missing covariate.

Notations:
- $y_i$: response
- $x_i$: covariate vector that is always observed
- $z_i$: covariate that is subject to missingness
- $r_i$: missing indicator for $z_i$

Joint density of $(r_i, y_i, z_i \mid x_i)$:

$$p(r_i, y_i, z_i \mid x_i) = p(r_i \mid y_i, z_i, x_i, \omega)\, p(y_i \mid z_i, x_i, \beta)\, p(z_i \mid x_i, \alpha)$$

$$= p(r_i \mid y_i, x_i, \omega)\, p(y_i \mid z_i, x_i, \beta)\, p(z_i \mid x_i, \alpha) \quad \text{(MAR)}$$


Score Equation for Complete Data

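The likelihood-based score equation:

$$\sum_{i=1}^{n} \begin{pmatrix} u_{1i}(\beta) \\ u_{2i}(\alpha) \\ u_{3i}(\omega) \end{pmatrix} = 0,$$

where

$$u_{1i}(\beta; y_i, x_i, z_i) = \frac{\partial \log p(y_i \mid x_i, z_i, \beta)}{\partial \beta},$$

$$u_{2i}(\alpha; x_i, z_i) = \frac{\partial \log p(z_i \mid x_i, \alpha)}{\partial \alpha},$$

$$u_{3i}(\omega; r_i, x_i, y_i) = \frac{\partial \log p(r_i \mid x_i, y_i, \omega)}{\partial \omega}.$$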


Methods for Handling Missing Covariates

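With missing data, the maximum likelihood estimate $\hat{\gamma} = (\hat{\beta}', \hat{\alpha}', \hat{\omega}')'$ solves

$$u^*(\hat{\gamma}) = \sum_{i=1}^{n} u_i^*(\hat{\gamma}) = \sum_{i=1}^{n} E\left[ \begin{pmatrix} u_{1i}(\hat{\beta}) \\ u_{2i}(\hat{\alpha}) \\ u_{3i}(\hat{\omega}) \end{pmatrix} \,\middle|\, \text{observed data} \right] = 0.$$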


Methods for Handling Missing Covariates

We can further show that

$$u^*(\gamma) = \sum_{i=1}^{n} \begin{pmatrix} r_i\, u_{1i}(\beta; y_i, x_i, z_i) + (1 - r_i)\, E_{z_i \mid y_i, x_i}[u_{1i}(\beta; y_i, x_i, z_i)] \\ r_i\, u_{2i}(\alpha; z_i, x_i) + (1 - r_i)\, E_{z_i \mid y_i, x_i}[u_{2i}(\alpha; z_i, x_i)] \\ u_{3i}(\omega; y_i, x_i, r_i) \end{pmatrix}$$

- Solving $u^*(\hat{\gamma}) = 0$, we get the MLE.
- The asymptotic properties of $(\hat{\beta}, \hat{\alpha})'$ do not depend on the missing data model.
- If $p(y_i \mid x_i, z_i)$ and $p(z_i \mid x_i)$ are correctly specified, we get a consistent estimate $(\hat{\beta}, \hat{\alpha})'$ by solving $u^*(\hat{\gamma}) = 0$.
- If $p(y_i \mid x_i, z_i)$ or $p(z_i \mid x_i)$ (or both) is misspecified, then $\hat{\beta}$ will not be consistent.


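Methods for Handling Missing Covariates

Weighted GEE:

$$S(\gamma) = \sum_{i=1}^{n} \begin{pmatrix} \frac{r_i}{\pi_i}\, u_{1i}(\beta; y_i, x_i, z_i) + \left(1 - \frac{r_i}{\pi_i}\right) E_{z_i \mid y_i, x_i}[u_{1i}(\beta; y_i, x_i, z_i)] \\ \frac{r_i}{\pi_i}\, u_{2i}(\alpha; z_i, x_i) + \left(1 - \frac{r_i}{\pi_i}\right) E_{z_i \mid y_i, x_i}[u_{2i}(\alpha; z_i, x_i)] \\ u_{3i}(\omega; y_i, x_i, r_i) \end{pmatrix}$$

where $\pi_i = P(r_i = 1 \mid y_i, x_i)$.

- Doubly robust estimate: solving $S(\hat{\gamma}) = 0$ gives an asymptotically unbiased estimate of $\beta$ when either $\pi_i$ or $p(z_i \mid x_i)$ is correctly specified.
- EM algorithm for the estimate
- Asymptotic variance:

$$\mathrm{Var}(\hat{\gamma}) = \left\{ \sum_{i=1}^{n} E\left[ \frac{\partial S_i(\gamma)}{\partial \gamma} \right] \right\}^{-1} \sum_{i=1}^{n} E[S_i(\gamma) S_i'(\gamma)] \left\{ \sum_{i=1}^{n} E\left[ \frac{\partial S_i(\gamma)}{\partial \gamma} \right]' \right\}^{-1}$$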
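A short check of why the weighted score is doubly robust, written for the first block of $S(\gamma)$ only and denoted $S_{1i}$ here (a label introduced for this note). This is the standard argument sketched under the MAR assumption stated earlier, not a derivation taken from the slides; $E_{z \mid y, x}$ denotes the expectation under the fitted covariate model.

```latex
% Regroup the first block of the weighted score:
S_{1i} = E_{z \mid y, x}[u_{1i}]
       + \frac{r_i}{\pi_i}\Bigl( u_{1i} - E_{z \mid y, x}[u_{1i}] \Bigr)

% (a) \pi_i correct, covariate model possibly wrong.
% Under MAR, E[r_i \mid y_i, x_i, z_i] = \pi_i, so the weight r_i/\pi_i has mean 1:
E[S_{1i}] = E\bigl[E_{z \mid y, x}[u_{1i}]\bigr]
          + E\bigl[u_{1i} - E_{z \mid y, x}[u_{1i}]\bigr]
          = E[u_{1i}] = 0

% (b) p(z_i \mid x_i) and p(y_i \mid x_i, z_i) correct, \pi_i possibly wrong.
% Then E_{z \mid y, x} is the true conditional expectation, and under MAR
% r_i is independent of z_i given (y_i, x_i), so
E\Bigl[ \tfrac{r_i}{\pi_i}\bigl(u_{1i} - E_{z \mid y, x}[u_{1i}]\bigr)
        \Bigm| y_i, x_i, r_i \Bigr] = 0,
\qquad
E[S_{1i}] = E\bigl[E_{z \mid y, x}[u_{1i}]\bigr] = E[u_{1i}] = 0
```

In both cases the score has mean zero at the true $\beta$, which is what drives the asymptotic unbiasedness claimed above.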


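Model Formulation

Notations:
- Response: $Y_i = (Y_{i1}, Y_{i2}, \dots, Y_{iJ_i})'$
- Covariate: $X_i = (X_{i1}, X_{i2}, \dots, X_{iJ_i})'$

$$R_{ij} = \begin{cases} 0 & Y_{ij} \text{ and } X_{ij} \text{ are missing} \\ 1 & Y_{ij} \text{ is missing and } X_{ij} \text{ is observed} \\ 2 & Y_{ij} \text{ is observed and } X_{ij} \text{ is missing} \\ 3 & Y_{ij} \text{ and } X_{ij} \text{ are observed} \end{cases}$$

- Covariate: $Z_i$ [always observed]

Response model:

$$\mu_{ij} = E(Y_{ij} \mid X_i, Z_i), \qquad \mathrm{var}(Y_{ij} \mid X_i, Z_i) = \kappa f(\mu_{ij}), \qquad g(\mu_{ij}) = X_{ij} \beta_x + Z_{ij}' \beta_z$$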


Model Formulation (Continued)

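Missing data models:

$$\lambda_{ijk} = P(R_{ij} = k \mid \bar{R}_{ij}, Y_i, X_i, Z_i), \quad k = 0, 1, 2, 3$$

$$\log\left( \frac{\lambda_{ijk}}{\lambda_{ij0}} \right) = u_{ijk}' \alpha_k, \quad k = 1, 2, 3$$

$\bar{R}_{ij}$: missing response indicator history

Covariate model:

$$\omega_{ij} = E(X_{ij} \mid \bar{X}_{ij}, Z_i), \qquad h(\omega_{ij}) = v_{ij}' \gamma$$

$\bar{X}_{ij}$: covariate history

$\theta = (\beta', \alpha', \gamma')'$, where $\beta$ is of interest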
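The baseline-category logit model determines all four probabilities $\lambda_{ij0}, \dots, \lambda_{ij3}$ from the three linear predictors $u_{ijk}'\alpha_k$. The sketch below shows that mapping; the function name and the numeric values in `eta` are hypothetical.

```python
import numpy as np

def missingness_probs(eta):
    """Map linear predictors for k = 1, 2, 3 to (lambda_0, ..., lambda_3).

    Category 0 is the baseline, so its linear predictor is fixed at 0.
    """
    eta_full = np.concatenate([[0.0], np.asarray(eta, dtype=float)])
    eta_full = eta_full - eta_full.max()   # shift for numerical stability
    w = np.exp(eta_full)
    return w / w.sum()

eta = [0.5, -1.0, 2.0]          # hypothetical values of u_ijk' alpha_k, k = 1, 2, 3
lam = missingness_probs(eta)
log_ratios = np.log(lam[1:] / lam[0])
```

The probabilities in `lam` sum to one, and `log_ratios` recovers `eta`, matching the log-ratio form of the model.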


Model Formulation (Continued)

MAR assumption:

$$P(R_{ij} = k \mid \bar{R}_{ij}, Y_i, X_i, Z_i) = P(R_{ij} = k \mid \bar{R}_{ij}, Y_i^{(o)}, X_i^{(o)}, Z_i)$$

where $Y_i = (Y_i^{(o)}, Y_i^{(m)})$ and $X_i = (X_i^{(o)}, X_i^{(m)})$.


Model Formulation (Continued)

Weighted GEE (WGEE) for $\beta$:

$$S_1(\theta) = \sum_{i=1}^{n} \Bigl[ D_i M_i (Y_i - \mu_i) + E_{Y_i^{(m)}, X_i^{(m)} \mid Y_i^{(o)}, X_i^{(o)}, Z_i}\bigl[ D_i N_i (Y_i - \mu_i) \bigr] \Bigr] = 0$$

where
- $M_i = \kappa^{-1} F_i^{-1/2} [C_i^{-1} \bullet \Delta_i] F_i^{-1/2}$
- $N_i = \kappa^{-1} F_i^{-1/2} [C_i^{-1} \bullet (11' - \Delta_i)] F_i^{-1/2}$
- $F_i = \mathrm{diag}(\mathrm{var}(Y_{ij} \mid X_{ij}, Z_{ij}),\ j = 1, \dots, J_i)$
- $C_i$: working correlation matrix
- $\Delta_i = [\delta_{ijk}]$ with $\delta_{ijk} = [I(R_{ij} = 1, R_{ik} = 3) + I(R_{ij} = 3, R_{ik} = 3)] / \pi_{ijk}$ for $j \neq k$ and $\delta_{ijj} = I(R_{ij} = 3) / \pi_{ij}$


Model Formulation (Continued)

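Weighted GEE (WGEE) for $\gamma$:

$$S_2(\theta) = \sum_{i=1}^{n} \Bigl[ v_i \Delta_i^* (X_i - \omega_i) + E_{X_i^{(m)} \mid X_i^{(o)}, Z_i}\bigl[ v_i (I - \Delta_i^*)(X_i - \omega_i) \bigr] \Bigr] = 0$$

where
- $\Delta_i^* = \mathrm{diag}(I(R_{ij} = 1 \text{ or } 3) / \pi_{ij}^x,\ j = 1, \dots, J_i)$
- $\pi_{ij}^x = P(R_{ij} = 1 \text{ or } 3 \mid Y_i, Z_i, X_i)$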


Model Formulation (Continued)

Estimating function for the missing data parameter $\alpha$:

$$S_3(\alpha) = \sum_{i=1}^{n} \sum_{j=1}^{J_i} \sum_{k=0}^{3} \frac{I(R_{ij} = k)}{\lambda_{ijk}} \frac{\partial \lambda_{ijk}}{\partial \alpha} = 0$$


Estimation and Inference

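Solve the estimating equations

$$S(\hat{\theta}) = \sum_{i=1}^{n} S_i(\hat{\theta}) = \begin{pmatrix} S_1(\hat{\theta}) \\ S_2(\hat{\theta}) \\ S_3(\hat{\alpha}) \end{pmatrix} = 0$$

- EM algorithm for the estimation
- Variance estimate:

$$\mathrm{Var}(\hat{\theta}) = \left\{ \sum_{i=1}^{n} E\left[ \frac{\partial S_i(\theta)}{\partial \theta} \right] \right\}^{-1} \sum_{i=1}^{n} E[S_i(\theta) S_i'(\theta)] \left\{ \sum_{i=1}^{n} E\left[ \frac{\partial S_i(\theta)}{\partial \theta} \right]' \right\}^{-1}.$$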
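A sandwich variance of this form can be assembled from per-subject scores and their derivative matrices. The sketch below assumes the expectations are replaced by their empirical counterparts evaluated at the estimate, which is a common plug-in choice rather than something stated on the slide. The function name and the toy example (the sample mean, with $S_i(\theta) = y_i - \theta$) are illustrative.

```python
import numpy as np

def sandwich_variance(scores, jacobians):
    """Plug-in sandwich estimate A^{-1} B (A')^{-1}.

    scores:    array (n, p), row i is S_i evaluated at the estimate
    jacobians: array (n, p, p), slice i is dS_i/dtheta at the estimate
    """
    A = jacobians.sum(axis=0)      # sum of derivative matrices
    B = scores.T @ scores          # sum of S_i S_i'
    A_inv = np.linalg.inv(A)
    return A_inv @ B @ A_inv.T

# Toy check with the sample mean: S_i(theta) = y_i - theta, dS_i/dtheta = -1.
rng = np.random.default_rng(2)
y = rng.normal(loc=3.0, scale=2.0, size=1000)
theta_hat = y.mean()
scores = (y - theta_hat)[:, None]
jacobians = -np.ones((y.size, 1, 1))
var_hat = sandwich_variance(scores, jacobians)
```

For this toy case the result reduces to the usual variance of a sample mean, `np.var(y) / y.size`, which gives a quick sanity check on the bread-meat-bread arrangement.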


Estimation and Inference (Continued)

Doubly robust estimate:
- If the missing data model is correctly specified, we get an asymptotically unbiased estimate of $\beta$ whether or not the covariate model is correctly specified.
- If the covariate model is correctly specified, we get an asymptotically unbiased estimate of $\beta$ whether or not the missing data model is correctly specified.


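Simulations

Response model:

$$\mathrm{logit}(\mu_{ij}) = \beta_0 + \beta_1 X_{ij} + \beta_2 Z_{ij}, \quad j = 1, 2, 3,$$

with exchangeable correlation $\rho$.

Covariate model:

$$\mathrm{logit}(\omega_{ij}) = \gamma_0 + \gamma_1 X_{i,j-1} + \gamma_2 Z_{ij}$$

Missing data model:

$$\log\left( \frac{\lambda_{ijk}}{\lambda_{ij0}} \right) = \alpha_{0k} + \alpha_{1k1} I(R_{i,j-1} = 1) + \alpha_{1k2} I(R_{i,j-1} = 2) + \alpha_{1k3} I(R_{i,j-1} = 3) + \alpha_{2k}\, y_{i,j-1}^{(o)} + \alpha_{3k}\, x_{i,j-1}^{(o)}$$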


Simulations (Continued)

Methods considered:
1. EM(x+): EM with correct covariate model
2. WGEE(x+, r+): WGEE with correct covariate and missing data models
3. WGEE(x−, r+): WGEE with incorrect covariate and correct missing data models
4. WGEE(x+, r−): WGEE with correct covariate and incorrect missing data models
5. WGEE(x−, r−): WGEE with incorrect covariate and incorrect missing data models
6. cc: complete case MLE


Simulations (Continued)

Table: Empirical bias, standard deviation and coverage probabilities for six approaches to estimation and inference with incomplete covariate and response data (ρ = 0.6, α2 = γ2 = −2)

                    β0                      β1                      β2
Method       Bias%    SD     CP%     Bias%    SD     CP%     Bias%    SD     CP%
EM(x+)        -0.3   0.102   94.9     -1.1   0.077   94.3      0.5   0.091   94.8
(x+, r+)       0.7   0.104   95.1      0.8   0.080   94.5     -0.9   0.093   94.9
(x+, r−)      -1.0   0.110   95.2     -1.6   0.088   94.9      1.6   0.102   95.0
(x−, r+)       0.4   0.105   94.4      1.0   0.084   94.8     -0.3   0.096   94.5
(x−, r−)     -20.1   0.094   91.4     12.0   0.081   92.9      3.0   0.096   93.9
cc          -302.0   0.876   53.8     49.9   1.077   96.8      0.4   1.218   94.6


Table: Empirical bias, standard deviation and coverage probabilities for six approaches to estimation and inference with incomplete covariate and response data (ρ = 0.3, α2 = γ2 = −2)

                    β0                      β1                      β2
Method       Bias%    SD     CP%     Bias%    SD     CP%     Bias%    SD     CP%
EM(x+)        -1.6   0.058   94.4     -0.2   0.067   95.3      1.1   0.084   94.4
(x+, r+)       0.1   0.060   95.4      0.1   0.072   95.1      0.3   0.086   94.6
(x+, r−)       0.0   0.066   94.3      0.8   0.071   94.9      0.2   0.091   94.7
(x−, r+)       1.2   0.062   94.7      0.6   0.079   94.8     -0.9   0.087   94.5
(x−, r−)     -12.4   0.076   93.4      8.4   0.077   94.1      2.0   0.087   94.2
cc          -219.6   0.784   78.6    -27.0   1.065   97.2      0.0   0.930   94.9


Simulations (Continued)

Summary of the Simulations:
- The EM algorithm gives a consistent and the most efficient estimate when the covariate model is correctly specified.
- The proposed method yields negligible biases when either the covariate model or the missing data model is correctly specified.
- If both the covariate and missing data models are misspecified, the proposed method yields biased results.
- The complete case analysis gives biased estimates.


Impact of model misspecification

[Figure: Asymptotic percent relative bias of β2 with misspecified covariate model and missing data model. Horizontal axis: γ1, from −4 to 4. Vertical axis: % relative bias for β2, from −5 to 10. One curve for each of α2 = −2, −1, 0, 1, 2.]


Application to the NACCUDS

Table: Frequency table for the responses and covariate for the

          Missingness (X, Y), %
Time    (m, m)   (o, m)   (m, o)   (o, o)
1         6.0     28.8      8.9     56.3
2        10.3     31.7      3.9     54.1
3        12.8     31.1      2.7     53.4
4        14.1     31.3      1.6     52.9


Application to the NACCUDS

Table: Parameter estimate for the NACCUDS: proposed method, n = 16223

Parameter       Est.      SE
(Intercept)   -0.136   0.106
SEX(F)        -0.203   0.025
CVCHF         -0.031   0.063
DEPRESSION     0.679   0.029
MMSE          -0.002   0.001
FHDEM          0.181   0.028
DIABETE       -0.124   0.038
HYPERT        -0.195   0.026
EDUC          -0.002   0.001
AGE            0.006   0.001

p

0.198