Regression analysis for long-term survival rate via empirical likelihood

Nonparametric Statistics Vol. 17, No. 8, December 2005, 995–1007 Regression analysis for long-term survival rate via empirical likelihood YICHUAN ZHA...
Author: Amanda Doyle
0 downloads 0 Views 118KB Size
Nonparametric Statistics Vol. 17, No. 8, December 2005, 995–1007

Regression analysis for long-term survival rate via empirical likelihood YICHUAN ZHAO* Department of Mathematics and Statistics, Georgia State University, Atlanta, GA 30303, USA (Received May 2004; revised October 2005; in final form October 2005) In recent years, regression models have been shown to be useful for predicting the long-term survival probabilities of patients in clinical trials. For inference on the vector of regression parameters, there are semiparametric procedures based on normal approximations. However, the accuracy of such procedures in terms of coverage probability can be quite low when the censoring rate is heavy. In this paper, we apply an empirical likelihood ratio (ELR) method to the regression model and derive the limiting distribution of the ELR. On the basis of the result, we develop a confidence region for the vector of regression parameters. Furthermore, we use a simulation study to compare the proposed method with the normal approximation-based method proposed by Jung [Jung, S., 1996, Regression analysis for long-term survival rate. Biometrika, 83, 227–232.]. Finally, the proposed procedure is illustrated with data from a clinical trial. Keywords: Confidence region; Conditional Kaplan–Meier estimator; Link function; Normal approximation; Right censoring

1.

Introduction

In medical studies, we are interested in the relationship between the probability of observing an event during a certain prespecified time interval and a set of covariates. The covariates such as age, weight, and gender have an effect on the survival rate. The survival rate at a prespecified time is an effective measure of a patient’s survival. A regression model for the survival rate is a valuable model which incorporates information from covariates, and it is simple and easy for practitioners to interpret to analyze survival data. We consider the problem of regressing a monotone transformation of the survival rate on the covariate vector. Let Ti (i = 1, . . . , n) be the failure time to a specific event for patient i. Let Zi = (Z0i , Z1i , . . . , Zpi ) , where Z0i = 1 is the corresponding p + 1 dimensional covariate vector. The censoring variable Ci is assumed to be conditionally independent of Ti given the covariate Zi for 1 ≤ i ≤ n. We observe (Xi , δi ), where Xi = min(Ti , Ci ) and δi = I (Xi = Ti ).

*Email: [email protected]

Nonparametric Statistics ISSN 1048-5252 print/ISSN 1029-0311 online © 2005 Taylor & Francis http://www.tandf.co.uk/journals DOI: 10.1080/10485250500450808

996

Y. Zhao

For a study time τ suppose that, conditionally on the covariate, the survival rate at τ of the ith patient, πi = P (Ti ≥ τ |Zi ), satisfies φ(πi ) = β  Zi ,

(1)

where φ is a known, monotonic, and differentiable function in [0, 1], and β is a (p + 1) × 1 vector of unknown regression parameters. Some choices of the link function φ lead to well-known models in survival analysis. For example, the link function for the logistic regression model is φ(π ) = log(π/(1 − π)). A regression model with link function φ(π ) = − log(− log(π )) for a certain time period τ corresponds to the Cox’s [1] proportional hazard model over the entire time span. Our approach is based on the empirical likelihood (EL) method. The EL method is a powerful non-parametric method. It holds some unique features, such as range respecting, transformation-preserving, asymmetric confidence interval, Bartlett correctability, and better coverage probability for small sample [cf. 2–5]. The use of EL in survival analysis traces back to Thomas and Grunkemeier [6] who derived pointwise confidence intervals for survival function with right censored data [7, 8]. For right-censoring data, this approach has been used in the construction of simultaneous confidence bands under a variety of setting; see refs. [9–12] among others. Owen [13, 14] introduced EL confidence regions for the mean of a random vector based on i.i.d. complete data. Since then, the EL has been widely applied to statistical models to do inference for the parameter of interest. Some related works include linear model [15–17], linear model for missing data [18, 19] generalized linear models [20, 21], partial linear regression model [22, 23], linear regression model with right censored data [24, 25], the Cox proportional hazard model with right censoring [26], the additive risk model with right censoring [27], and partial linear regression with right censoring [28, 29], among others. A comprehensive survey of EL appears in the recent book ‘Empirical Likelihood’ by Owen [30]. Jung [31] studied the above regression model in which the censoring is independent of the covariate or that Z takes only discrete values. However, to the best of our knowledge the confidence region for the regression parameter for predicting long-term survival rate has not yet been developed via the EL method. In this paper, we consider the regression model (1), make full use of the estimating equation of Jung [31], and find one EL confidence region for the unknown regression parameter. Moreover, the proposed confidence region is adapted to the data set and not necessarily symmetric. Thus, it reflects the nature of the underlying data and hence gives a more representative way to make inferences about the parameter of interest. The rest of the paper is organized as follows. The proposed confidence region and main asymptotic result are presented in section 2. The corresponding constrained maximization of the EL can be done reliably by the Newton–Raphson method. In section 3, we compare the proposed approach to the traditional normal approximation (NA) based method. The simulation results suggest that the EL based method is more accurate in terms of coverage probability for small sample sizes and heavy censoring rate. In section 4, we illustrate the proposed method with the real data. Proofs are contained in Appendix A.

2. 2.1

Main results Preliminaries

We consider the regression model (1) for the long-term survival rate. When the censoring variable Ci is assumed to be independent of Ti , it is easy to show E{I (Xi ≥ t)|Zi } = P (Ti ≥ t|Zi )G(t),

(2)

Regression analysis for long-term survival rate

997

where G(t) is the survival function of C. From (2), Jung [31] proposed the following estimation equation: U1 (β) =:

 n   I (Xi ≥ τ ) πφ (β  Zi ) Zi = 0, − π(β  Zi ) ˆ ) π(β  Zi )π¯ (β  Zi ) G(τ

(3)

i=1

ˆ is the Kaplan– where π is the inverse function of φ, π¯ = 1 − π, πφ (φ) = ∂π(φ)/∂φ, and G(t)  Meier estimator of G(t). Note that π(β0 Zi ) = P (Ti ≥ τ |Zi ) = πi . When the censoring variable depends on the covariate vector Z and Z takes only discrete values, the alternative estimating equation is as follows:  n   I (Xi ≥ τ ) πφ (β  Zi )  U2 (β) =: − π(β Zi ) Zi = 0, ˆ π(β  Zi )π¯ (β  Zi ) G(τ, Zi )

(4)

i=1

ˆ z) is an estimator of G(t, z) = P (C > t|Z = z), the conditional survival function where G(t, of C given Z, obtained by assuming a Cox proportional hazard model for the censoring C. See the discussion in ref. [31]. ˆ Under mild conditions, both estimating equations (3) and (4) have a unique solution β. Let n   = lim n−1 ξi ξi , n→∞

i=1

where 

 I (Xi ≥ τ ) πφ (β0 Zi )  − π(β0 Zi ) Zi ξi = G(τ ) π(β0 Zi )π¯ (β0 Zi )  τ + h(β0 ) y(s)−1 (dNic (s) − Yi (s)d c (s)), −∞

with Nic (s) = (1 − δi )I (Xi ≤ s), Yi (s) = I (Xi ≥ s), c = − log G, and   n  Zi I (Xi ≥ τ )πφ (β  Zi ) −1 −1 , h(β) = lim (G(τ )) n n→∞ (φ(β  Zi )π¯ (β  Zi )) i=1 y(s) = lim n−1 n→∞

n 

I (Xi ≥ s).

i=1

In the following, we only consider the case where the censoring variables are i.i.d. which are independent of the covariates. The case where the censoring variable is dependent on a covariate which can take finitely many values can be handled by extending results for the ˆ we need the following independent case [31, 32]. To derive the asymptotic normality of β, conditions [cf. ref. 31]. Suppose β0 is the true value of β. Let D be a bounded convex region. The true value β0 of β is in its interior. In addition, we assume the following conditions hold: C.1. The covariate vector Z is bounded, i.e., Z ≤ M for some positive constant M, where  ·  is Euclidean norm. C.2. There exists a constant τ not dependent on β ∈ D, such that P (X ≥ τ |Z = z) > 0 for z a.s. C.3. The function π is twice differentiable with respect to φ, the second derivative πφφ (·) is bounded, and 0 < π(·) < 1 in a bounded domain.

998

Y. Zhao

C.4. The matrix A is positive definite, where  ZZ  (πφ (β0 Z))2 . A=E {π(β0 Z)π¯ (β0 Z)} 

Under conditions C.1–C.4, it is shown in the Appendix of ref. [31] D n1/2 (βˆ − β0 ) → N (0, A−1 A−1 ).

(5)

Let ˆ = n−1

n 

ξˆi ξˆi ,

i=1

where ξˆi =



 I (Xi ≥ τ ) πφ (βˆ  Zi )  ˆ − π(β Zi ) Zi ˆ ) π(βˆ  Zi )π¯ (βˆ  Zi ) G(τ   n   ˆ Z I (X ≥ τ )π ( β Z ) 1 j i φ j  + ˆ ) ¯ βˆ  Zj ) φ(βˆ  Zj )φ( G(τ  ×

j =1

 n (1 − δi )I (Xi ≤ τ )  (1 − δi )I (Xi ≤ Xi ∧ τ )

n

. − ( nj=1 I (Xj ≥ Xl ))2 j =1 I (Xj ≥ Xi ) l=1

Let Aˆ = n−1



(πφ (βˆ  Zi ))2 Zi Zi . π(βˆ  Zi )π¯ (βˆ  Zi )

ˆ Similarly as Lemma A1, we can show From Lemma A1,  is consistently estimated by . ˆ Thus, an asymptotic 100(1 − α)% confidence region that A is consistently estimated by A. for β based on the above NA is given by 2 ˆ βˆ − β) ≤ χp+1 (α)}, R1 = {β : n(βˆ − β) Aˆ ˆ −1 A( 2 (α) is the upper α-quantile of the chi-square distribution with degrees of freedom where χp+1 p + 1.

2.2

EL confidence region

Now consider an alternative approach based on EL. For 1 ≤ i ≤ n, we define 

 I (Xi ≥ τ ) πφ (β0 Zi )  Wi = − π(β0 Zi ) Zi , G(τ ) π(β0 Zi )π¯ (β0 Zi )   I (Xi ≥ τ ) πφ (β0 Zi ) − π(β0 Zi ) Wni = Zi . ˆ ) π(β0 Zi )π¯ (β0 Zi ) G(τ

Regression analysis for long-term survival rate

999

From equation (2) it is easy to check that EWi = 0. Then, the EL is given by L(β0 ) = sup

 n 

pi :



pi = 1,

i=1

n 

 pi Wi = 0, pi ≥ 0, i = 1, . . . , n .

i=1

However, the Wi ’s depend on G(·) which is unknown; thus we replace them by the Wni ’s. Therefore, using the notation Ln , an estimated EL at the true value β0 is given by Ln (β0 ) = sup

 n 

pi :



pi = 1,

i=1

n 

 pi Wni = 0, pi ≥ 0, i = 1, . . . , n .

i=1

Let p= (p1 , . . . , pn ) be a probability vector, i.e., ni=1 pi = 1 and pi ≥ 0 for 1 ≤ i ≤ n. Note that ni=1 pi attains its maximum at pi = 1/n. Thus, the empirical likelihood ratio (ELR) at the true value β0 is defined by R(β0 ) = sup

 n 

npi :



pi = 1,

i=1

n 

 pi Wni = 0, pi ≥ 0, i = 1, . . . , n .

i=1

By using Lagrange multipliers, we know that R(β0 ) is maximized when pi =

1 {1 + λ Wni }−1 , n

i = 1, . . . , n,

where λ = (λ1 , . . . , λp+1 ) satisfies the equation 1 Wni = 0. n i=1 1 + λ Wni n

(6)

The value of λ may be found by numerical search (e.g., Newton–Raphson method); see the discussion in ref. [30]. Thus combining the above equalities, we have −2 log R(β0 ) = 2

n 

log{1 + λ Wni },

(7)

i=1

where λ = (λ1 , . . . , λp+1 ) satisfies the equation (6). Let 1 = lim n−1 n→∞

 2 n   I (Xi ≥ τ ) πφ (β0 Zi ) Zi Zi . − π(β0 Zi )   G(τ ) π(β Z ) π ¯ (β Z ) 0 i 0 i i=1

Now we state our main result and explain how it can be used to build a confidence region for β. 2 THEOREM 2.1 Under conditions C.1–C.4, −2 log R(β0 ) converges in distribution to r1 χ1,1 + 2 2 2 · · · + rp+1 χp+1,1 , where χ1,1 , . . . , χp+1,1 are independent chi-square random variables with 1 degree of freedom and r1 , . . . , rp+1 are the eigenvalues of 1−1 .

1000

Y. Zhao

Theorem 2.1 will be proved in Appendix A. We note that the limiting distribution of the ELR 2 is a weighted sum of i.i.d. χ12 ’s instead of the standard χp+1 distribution, where the weights can be consistently estimated. Because the Wni ’s are dependent, −2 log R(β0 ) is no longer a sum of standard independent random variables. The similar phenomenon has occurred in various contexts, such as Li and Wang [25], Qin and Jing [24], Wang and Jing [33], and Wang and Wang [34], among others. From Lemma A1 (i), 1 is consistently estimated by ˆ 1 = n

−1

 2  n  I (Xi ≥ τ ) πφ (βˆ  Zi )  − π(βˆ Zi ) Zi Zi . ˆ ) π(βˆ  Zi )π¯ (βˆ  Zi ) G(τ i=1

ˆ Hence, the ri ’s can be estimated by the rˆi ’s which are the eigenvalues of ˆ 1−1 . Thus, an asymptotic 100(1 − α)% EL confidence region for β is given by R2 = {β : −2 log R(β) ≤ c(α)}, 2 2 where c(α) is the upper α-quantile of the distribution of rˆ1 χ1,1 + · · · + rˆp+1 χ1,p+1 and can be obtained by simulation method. The proposed EL method has some limitations. For instance, it is not very practical to compute an EL confidence region when the dimension is high. The proposed EL is also limited to making inference for linear combination of the regression coefficients such as a single coefficient, a subset of coefficients, and linear contrasts. Profiling the nuisance parameters involves constrained optimization which may not be easy in high dimensional cases. Instead of using a profile likelihood, Li and Wang [25] proposed an adjusted EL for linear combinations of β for the linear regression model with right censoring. They replaced the nuisance parameters by their least squares estimators and derived the limiting distribution of the adjusted EL. However, the structure of the regression model (1) is quite different from the linear regression model and the calculation of the corresponding ELR involves non-linear terms. We have been unable to extend the approach to the case for the linear combinations of regression coefficients. Further investigation about this issue will be carried out in the future.

3.

Simulation study

In this section, we conduct a simulation study and report some numerical results. We compare the performance of the proposed EL confidence region with the NA based confidence region in terms of coverage probability. The conditional distribution of T given Z was taken to be L(0, π 2 /3z22 ), where L(0, σ 2 ) denotes a logistic distribution with mean zero and variance σ 2 . This implies that π(t) = 1/ (1 + ez2 t ). Note that φ(π(t)) = −z2 t, with φ(π) = log{π/(1 − π )}, gives a regression model for the long-term survival rate. Thus, the true values of the intercept and the slope are 0, −τ , respectively. In the first model, we use a discrete random variable Z = (1, Z2 ) with Z2 taking values 1.1, 1.3, 1.5, 1.7, and 1.9 each with probability 0.2. The conditional censoring distribution of C given Z is taken to be log(Uniform(0, cz2 )), and the value of c determines the censoring rate (CR). Note that in this case, T and C are independent given Z. Let τ be 0.4. It is easy to check that conditions C.1–C.4 hold for the well-known link function φ(π ) = log{π/(1 − π )}. In the second model, we use Z = Uniform(0, 1). The censoring distribution of C is independent of the covariate Z and is taken to be log(Uniform(0, c)), and the value of c controls

Regression analysis for long-term survival rate Table 1.

Coverage probabilities for the regression parameter in model 1. 1 − α = 0.90

CR(%) 20

30

45

1001

1 − α = 0.95

1 − α = 0.99

n

NA

EL

NA

EL

NA

EL

50 100 150 200 50 100 150 200 50 100 150 200

0.856 0.864 0.881 0.889 0.844 0.851 0.873 0.878 0.765 0.802 0.824 0.823

0.867 0.871 0.887 0.891 0.853 0.869 0.879 0.882 0.789 0.844 0.858 0.861

0.912 0.927 0.938 0.937 0.905 0.913 0.935 0.927 0.833 0.857 0.878 0.901

0.928 0.936 0.954 0.955 0.920 0.928 0.946 0.939 0.857 0.906 0.921 0.928

0.958 0.976 0.982 0.985 0.971 0.967 0.983 0.981 0.894 0.926 0.954 0.964

0.966 0.981 0.985 0.985 0.978 0.978 0.989 0.986 0.917 0.965 0.982 0.987

the CR. We choose τ to be 1. It is easy to check that conditions C.1–C.4 hold for the link function φ(π ) = log{π/(1 − π)}. We take 0.90, 0.95, and 0.99 as the nominal confidence level 1 − α, respectively. Let c be three different values to obtain 20%, 30%, and 45% CRs, respectively, which represent light censoring, medium censoring, and heavy censoring. The sample size n is chosen to be 50, 100, 150, and 200, respectively. The coverage probabilities of the NA based method and the EL method are estimated from 1000 simulated data sets. The simulation results for model 1 and model 2 are reported in table 1 and table 2, respectively. From the tables, we make the following observations. 1. At each nominal level, the coverage accuracies for EL and NA methods in both models decrease as the CRs increase, and increase as the sample size increases. 2. The coverage probabilities for the NA method and EL method are consistently lower than the nominal level for small sample (n = 50). 3. The EL outperforms the NA method in both models. In particular, under heavy CR (CR = 45%), the EL confidence region has more accurate coverage probabilities than the NA based confidence region.

Table 2.

Coverage probabilities for the regression parameter in model 2. 1 − α = 0.90

CR(%) 20

30

45

n 50 100 150 200 50 100 150 200 50 100 150 200

1 − α = 0.95

1 − α = 0.99

NA

EL

NA

EL

NA

EL

0.873 0.877 0.891 0.892 0.862 0.861 0.863 0.867 0.704 0.740 0.747 0.764

0.879 0.886 0.898 0.893 0.871 0.860 0.869 0.865 0.756 0.795 0.809 0.820

0.932 0.939 0.954 0.945 0.923 0.922 0.923 0.929 0.790 0.821 0.836 0.844

0.939 0.938 0.949 0.946 0.927 0.927 0.931 0.933 0.831 0.865 0.885 0.892

0.976 0.980 0.987 0.987 0.971 0.979 0.983 0.982 0.877 0.905 0.926 0.925

0.977 0.981 0.987 0.988 0.978 0.982 0.985 0.984 0.934 0.961 0.968 0.966

1002

Y. Zhao

From tables 1 and 2, we find that the NA based method does not always work well. One reason is that the NA based confidence region needs to estimate  and A (cf. section 2.1). The variance estimates are not very stable and may contain values outside their ranges. In summary, our simulation study shows that the proposed EL method gives competitive coverage probabilities and suggests that the EL improves the coverage in the cases considered.

4.

Example

In this section, we illustrate the proposed EL method and compare it with the NA method using the multiple myeloma data. The data set can be found in SAS/STAT User’s guide (1999, pp. 2608–2617, 2536–2641). The data come from a study on multiple myeloma in which researchers treated 65 patients with alkylating agents. Of those patients, 48 died during the study and 17 survived. The CR is about 26%. For illustration, one covariate Z, namely the logarithm of blood urea nitrogen, log(BUN) is considered for the regression analysis of τ month survival rate. It is important to ensure that the chosen value of τ does not result in an empty at-risk set, i.e., condition C.2 is satisfied. The prespecified value of τ may be compared with some benchmark value such as the median survival time, to ensure that τ is not too small or too large. We choose three different τ values 20, 25, 30 months in the data set. Using the multiple myeloma data, we construct confidence intervals for the regression parameter, which makes the comparison between EL method and NA method more interesting and meaningful. Thus, for πi = P (Ti ≥ τ ), we consider the following logistic regression model without intercept,  log

πi 1 − πi

 = βZ,

and obtain confidence intervals for β with the NA based method and the EL method for 90% and 95% nominal confidence levels. Table 3 provides the point estimate of β, confidence interval for β, and length of confidence interval at different values of τ . From table 3, we see that for the multiple myeloma data the EL confidence intervals are comparable to the NA confidence intervals. As expected, the estimated EL produces slightly wider confidence intervals and is conservative. It is well known that the traditional NA confidence interval has the symmetry property. This is not a desirable property, as the distribution of the parameter estimator may be skewed. However, the EL confidence intervals are asymmetric about the point estimator. They are shifted to the left or right compared with the NA confidence intervals. Thus, one advantage of the EL method is that it is able to pick up possible skewness in the underlying distribution of the parameter estimate. Table 3.

Regression analysis for the multiple myeloma data: point estimate of β, confidence interval, and confidence interval length. 1 − α = 0.90

τ

β

NA

1 − α = 0.95 EL

NA

EL

20 −0.444 (−0.709, −0.179) 0.530 (−0.712, −0.179) 0.533 (−0.760, −0.128) 0.632 (−0.767, −0.125) 0.642 25 −0.494 (−0.765, −0.223) 0.542 (−0.772, −0.222) 0.550 (−0.818, −0.170) 0.648 (−0.826, −0.170) 0.656 30 −0.582 (−0.868, −0.296) 0.572 (−0.876, −0.298) 0.578 (−0.924, −0.240) 0.684 (−0.935, −0.242) 0.693

Regression analysis for long-term survival rate

1003

Acknowledgements I would like to thank the associate editor, and the two referees for careful reading of the manuscript, helpful comments, and valuable remarks. References [1] Cox, D.R., 1972, Regression models and life tables (with discussion). Journal of the Royal Statistical Society B, 34, 187–220. [2] DiCiccio, T.J., Hall, P. and Romano, J., 1991, Empirical likelihood is Bartlett-correctable. Annals of Statistics, 19, 1053–1061. [3] DiCiccio, T.J. and Romano, J.P., 1989, On adjustments based on the signed root of the empirical likelihood ratio statistic. Biometrika, 76, 447–456. [4] DiCiccio, T.J. and Romano, J.P., 1990, Non-parametric confidence-limits by resampling methods and least favorable families. International Statistical Review, 58, 59–76. [5] Hall, P., 1990, Pseudo-likelihood theory for empirical likelihood. Annals of Statistics, 18, 121–140. [6] Thomas, D.R. and Grunkemeier, G.L., 1975, Confidence interval estimation for survival probabilities for censored data. Journal of American Statistical Association, 70, 865–871. [7] Li, G., 1995, On nonparametric likelihood ratio estimation of survival probabilities for censored data. Statistics and Probability Letters, 25, 95–104. [8] Murphy, S., 1995, Likelihood ratio-based confidences in survival analysis. Journal of American Statistical Association, 90, 1399–1405. [9] Hollander, M., McKeague, I.W. and Yang, J., 1997, Likelihood ratio-based confidence bands for survival functions. Journal of American Statistical Association, 92, 215–226. [10] Einmahl, J.H. and McKeague, I.W., 1999, Confidence tubes for multiple quantile plots via empirical likelihood. Annals of Statistics, 27, 1348–1367. [11] Li, G. and Van Keilegom, I., 2002, Likelihood ratio confidence bands in nonparametric regression with censored data. Scandinavian Journal of Statistics, 29, 547–562. [12] McKeague, I.W. and Zhao, Y., 2002, Simultaneous confidence bands for ratios of survival functions via empirical likelihood. Statistics and Probability Letters, 60, 405–415. [13] Owen, A.B., 1988, Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75, 237–249. [14] Owen, A.B., 1990, Empirical likelihood and confidence regions. Annals of Statistics, 18, 90–120. [15] Owen, A.B., 1991, Empirical likelihood for linear models. Annals of Statistics, 19, 1725–1747. [16] Chen, S.X., 1993, On the accuracy of empirical likelihood confidence regions for linear regression model. Annals of the Institute of Statistical Mathematics, 45, 621–637. [17] Chen, S.X., 1994, Empirical likelihood confidence intervals for linear regression coefficients. Journal of Multivariate Analysis, 49, 24–40. [18] Wang, Q.H. and Rao, J.N.K., 2001, Empirical likelihood for linear regression models under imputation for missing responses. Canadian Journal of Statistics, 29, 597–608. [19] Wang, Q.H. and Rao, J.N.K., 2002, Empirical likelihood-based inference in linear models with missing data. Scandinavian Journal of Statistics, 29, 563–576. [20] Kolaczyk, E.D., 1994, Empirical likelihood for generalized linear models. Statistica Sinica, 4, 199–218. [21] Chen, S.X. and Cui, H., 2003, An extended empirical likelihood for generalized linear models. Statistica Sinica, 13, 69–81. [22] Wang, Q.H. and Jing, B.Y., 1999, Empirical likelihood for partial linear models with fixed designs. Statistics and Probability Letters, 41, 425–433. [23] Shi, J. and Lau, T.S., 2000, Empirical likelihood for partially linear models. Journal of Multivariate Analysis, 72, 132–148. [24] Qin, G. and Jing, B.Y., 2001, Empirical likelihood for censored linear regression. Scandinavian Journal of Statistics, 28, 661–673. [25] Li, G. and Wang, Q.H., 2003, Empirical likelihood regression analysis for right censored data. Statistica Sinica, 13, 51–68. [26] Qin, G. and Jing, B.Y., 2001, Empirical likelihood for Cox regression model under random censorship. Communication in Statistics, Simulation and Computation, 30, 79–90. [27] Zhao, Y. and Hsu, Y.S., 2005, Semiparametric analysis for additive risk model via empirical likelihood. Communication in Statistics, Simulation and Computation, 34, 135–143. [28] Qin, G. and Jing, B.Y., 2001, Censored partial linear models and empirical likelihood. Journal of Multivariate Analysis, 78, 37–61. [29] Wang, Q.H. and Li, G., 2002, Empirical likelihood semiparametric regression analysis under random censorship. Journal of Multivariate Analysis, 83, 469–486. [30] Owen, A.B., 2001, Empirical Likelihood (New York: Chapman & Hall/Crc). [31] Jung, S., 1996, Regression analysis for long-term survival rate. Biometrika, 83, 227–232. [32] Subramanian, S., 2001, Parameter estimation in regression for long-term survival rate from censored data. Journal of Statistical Planning and Inference, 99, 211–222.

1004

Y. Zhao

[33] Wang, Q.H. and Jing, B.Y., 2001, Empirical likelihood for a class of functionals of survival distribution with censored data. Annals of the Institute of Statistical Mathematics, 53, 517–527. [34] Wang, Q.H. and Wang, J.L., 2001, Inference for the mean difference in the two-sample random censorship model. Journal of Multivariate Analysis, 79, 295–315.

Appendix A: proof of Theorem 2.1 We need the following lemma in order to prove Theorem 2.1. LEMMA A1 Under the conditions of Theorem 2.1, we have

(i)

n 1 P Wni Wni → 1 , n i=1

P (ii) ˆ 1 → 1 ,

P (iii) ˆ → .

Proof of Lemma A.1 Let ˆ 1n =

1 Wni Wni n i=1 n

1 Wi Wi . n i=1 n

1n =

In order to prove (i), we only need to show ˆ 1n = 1n + OP (1). For any a ∈ R p+1 , the following decomposition holds: a  (ˆ 1n − 1n )a =

1  2  (a (Wni − Wi ))2 + (a Wi )(a  (Wni − Wi )) n i=1 n i=1 n

n

= I1 + 2I2 .

(A1)

First note that a  (Wni − Wi ) = a  =



I (Xi ≥ τ ) I (Xi ≥ τ ) − ˆ ) G(τ ) G(τ



πφ (β0 Zi )Zi π(β0 Zi )π¯ (β0 Zi )

ˆ )) πφ (β0 Zi )a  Zi I (Xi ≥ τ )(G(τ ) − G(τ . ˆ )G(τ ) π(β0 Zi )π¯ (β0 Zi ) G(τ

It follows from conditions C.1, C.3 and consistency of the Kaplan–Meier estimator that  n  ˆ )| 1  πφ (β0 Zi )a  Zi 2 |G(τ ) − G(τ · = OP (1), |I2 | ≤ C ˆ )G2 (τ ) n i=1 π(β0 Zi )π¯ (β0 Zi ) G(τ

(A2)

where C is a constant. Similarly we can show that I1 = OP (1). Thus by equations (A1) and (A2), we prove Lemma A1 (i).

Regression analysis for long-term survival rate

1005

In order to prove Lemma A2 (ii), we only need to show that ˆ 1 = ˆ 1n + OP (1). Let 

Ji =

 I (Xi ≥ τ ) πφ (βˆ  Zi ) − π(βˆ  Zi ) ˆ ) π(βˆ  Zi )π¯ (βˆ  Zi ) G(τ   I (Xi ≥ τ ) πφ (β0 Zi )  . − π(β0 Zi ) − ˆ ) π(β0 Zi )π¯ (β0 Zi ) G(τ

Applying the mean value theorem we obtain the following equalities: η1 =: πφ (βˆ  Zi ) − πφ (β0 Zi ) = πφφ (β0 Zi + ξ1 (βˆ  − β0 )Zi )(βˆ  − β0 )Zi , η2 =: π(βˆ  Zi )π( ¯ βˆ  Zi ) − π(β0 Zi )π(β ¯ 0 Zi ) = πφ (β0 Zi + ξ2 (βˆ  − β0 )Zi )(1 − 2π(β0 Zi + ξ2 (βˆ  − β0 )Zi ))(βˆ  − β0 )Zi , ˆ i ) − π(β ¯ 0 Zi ) η3 =: π¯ (βZ = π¯ φ (β0 Zi + ξ3 (βˆ  − β0 )Zi )(βˆ  − β0 )Zi , where 0 ≤ ξi ≤ 1 for i = 1, 2, 3. ˆ and equation (5) we Combining the above equalities, condition C.1, C.3, consistency of β, have    I (X ≥ τ ) I (Xi ≥ τ ) πφ (βˆ  Zi ) πφ (β0 Zi ) πφ (βˆ  Zi ) πφ (β0 Zi )  i  |Ji | = − + −   G(τ ˆ ) π(βˆ  Zi )π( ˆ ) π(β0 Zi )π¯ (β0 Zi ) π¯ (βˆ  Zi ) π¯ (β0 Zi )  ¯ βˆ  Zi ) G(τ      η π¯ (β  Z )π(β  Z ) − η π (β  Z )   η π¯ (β  Z ) − η π (β  Z )  1 2 φ 0 i 3 φ 0 i     1 0 i 0 i 0 i ≤ +   G(τ  ˆ )π(βˆ  Zi )π( π¯ (βˆ  Zi )π¯ (β0 Zi ) ¯ βˆ  Zi )π(β0 Zi )π(β ¯ 0 Zi )     1 = OP √ , (A3) n for 1 ≤ i ≤ n. Then for any a ∈ R p+1 , by conditions C.1, C.3, consistency of the Kaplan–Meier estimator, and (A3) we have |a  (ˆ 1 − ˆ 1n )a| ≤

1  2 2 (a Zi ) Ji n i=1 n

  n  I (Xi ≥ τ )  2  2 |πφ (β0 Zi )|   − π(β0 Zi ) + (a Zi ) |Ji |  ˆ ) n i=1 π(β0 Zi )π¯ (β0 Zi ) G(τ 1 2 J n i=1 i n

≤ sup (a  Zi )2 i≤i≤n

2 |Ji | n i=1 n

+ sup (a  Zi )2 1≤i≤n



1 ˆ ) G(τ

 +1

|πφ (β0 Zi )| π(β0 Zi )π¯ (β0 Zi )



1006

Y. Zhao

 ≤C

n

n

 



2

sup |Ji |

≤C = OP



1 2 1 J + |Ji | n i=1 i n i=1

1≤i≤n

+ sup |Ji | 1≤i≤n



1 . √ n

Therefore Lemma A1 (ii) follows. Following the same line as above, we can show Lemma A1 (iii).  The following lemma is Lemma 3 of Qin and Jing [24]. It is essential to derive Theorem 2.1. D

LEMMA A.2 Let Y → N (0, Ip+1 ), where Ip+1 is the (p + 1) × (p + 1) identity matrix. Suppose U is a (p + 1) × (p + 1) non-negative definite matrix with eigenvalues r1 , . . . , rp+1 . Then, P

2 2 Y  U Y → r1 χ1,1 + · · · + rp+1 χp+1,1 , 2 where χi,1 is defined in Theorem 2.1, for i = 1, . . . , p + 1.

Proof of Theorem 2.1 By conditions C.1, C.3, and consistency of the Kaplan–Meier estimator      1 πφ (β0 Zi )  max Zi  max Wni  ≤ max  +1   1≤i≤n 1≤i≤n ˆ ) π(β0 Zi )π¯ (β0 Zi )  1≤i≤n G(τ = OP (1).

(A4)

Let λ = ρθ where ρ ≥ 0 and θ  = 1. Recall ˆ 1n = 1 + OP (1) (cf. Lemma A1 (i)). Let σ1 > 0 be the smallest eigenvalue of 1 . Then, θ  ˆ 1n θ ≥ σ1 + OP (1). ByAppendix 2 of ref. [31], the asymptotic normality holds, i.e., n−1/2 Let ej be the unit vector in the j th coordinate direction. Thus,    n   p+1  n 1   1         ≤ (p + 1) e W W  ni  ni  j n   n  j =1 i=1  i=1

(A5)

n

= OP (n−1/2 ).

i=1

D

Wni → N (0, ).

(A6)

Then, it follows from equations (6), (A5), (A6), and the argument used in Owen [14] that λ = OP (n−1/2 ). Applying Taylor’s expansion to equation (7), we have  n   1 −2 log R(β0 ) = 2 λ Wni − (λ Wni )2 + rn , 2 i=1 where |rn | ≤ C

n  i=1

|λ Wni |3 in probability.

(A7)

(A8)

Regression analysis for long-term survival rate

1007

Hence, by equations (A4), (A7)  |rn | ≤ Cnλ3

3 max Wni 

1≤i≤n

= OP (n−1/2 ).

(A9)

Note that   n n 1 Wni 1 (λ Wni )2  0= 1 − λ = W W + ni ni n i=1 1 + λ Wni n i=1 1 + λ Wni  n  n n 1 1  Wni (λ Wni )2 1  Wni − = Wni Wni λ + . n i=1 n i=1 n i=1 1 + λ Wni By equations (6), (A4), (A7), (A10), and Lemma A1 (i), it follows that  n −1 n   λ= Wni Wni Wni + OP (n−1/2 ). i=1

(A10)

(A11)

i=1

By equation (A10), we have 0=

n  i=1

=

n 

λ Wni 1 + λ Wni (λ Wni ) −

i=1

n 

(λ Wni )2 +

i=1

n  (λ Wni )3 . 1 + λ Wni i=1

(A12)

Similarly as before by equations (A4) and (A7), n  (λ Wni )3 = OP (n−1/2 ). W 1 + λ ni i=1

(A13)

Combining equations (A12) and (A13) we have n 

(λ Wni )2 =

i=1

n 

λ Wni + OP (1).

(A14)

i=1

By equations (A8), (A9), (A11), (A14) and Lemma A1 (i), we have −2 log R(β0 ) =

n 

λ Wni + OP (1)

i=1



−1/2

= n 

= 

n 

  Wni

n

i=1 −1/2 −1/2

n

n  i=1

−1

 Wni

n 

−1  Wni Wni

i=1

(

1/2

n

−1/2

n 



Wni

+ OP (1)

i=1

 1−1  1/2 )



−1/2 −1/2

n

n 

 Wni

+ OP (1).

i=1

(A15)

D Recall  −1/2 (n−1/2 ni=1 Wni ) → N (0, Ip+1 ) using the appendix of June [31] and noting that  −1/2 1−1  1/2 and 1−1  have the same eigenvalues. Using Lemma A2 to re-express the limiting distribution of equation (A15) as the weighted sum of independent χ12 random variables, we complete the proof of Theorem 2.1.