REGRESSION ANALYSIS OF CLUSTERED CURRENT STATUS DATA UNDER THE ADDITIVE HAZARDS MODEL

2018-01-15 06:35 LIU Yuhuan, WANG Chengyong

Journal of Mathematics, 2018, Issue 1

LIU Yu-huan, WANG Cheng-yong

(1. School of Mathematics and Statistics, Wuhan University, Wuhan 430072, China)

(2. School of Mathematics and Computer Science, Hubei University of Arts and Science, Xiangyang 441053, China)

1 Introduction

Case I interval-censored failure time data, or current status data, arise in many areas including demographic studies, economics, medical studies, reliability studies and the social sciences; see e.g. [1–4]. By case I interval-censored data, we mean that the failure time of interest is not exactly observed; instead, the observation on it is either left- or right-censored. A typical example is a tumorigenicity study, in which the time to tumor onset is often of interest. However, this time is usually not observable, because the presence or absence of tumors in animals is usually known only at their death or sacrifice. In particular, clustered current status data are commonly encountered in biomedicine.

Many procedures have been developed for regression analysis of interval-censored failure time data under various models. For example, Huang [3] developed the maximum likelihood approach for fitting the proportional hazards model to case I interval-censored data, and Chen and Sun [5] and Sun and Shen [6] discussed the same problem in the presence of clustering and competing risks, respectively. Hu and Xiang [7] considered efficient estimation for semiparametric cure models with case II interval-censored data, and Lin et al. [8] and Chen and Sun [5] discussed fitting the additive hazards model to case I interval-censored data. However, these methods either do not take clustering into account or assume that the cluster size is completely random or noninformative. It is well known that this may not be true, as the outcome of interest among individuals in a cluster may be associated with the size of the cluster; that is, we may have informative cluster sizes. In the following, we present one approach to the regression analysis of clustered current status data under the additive hazards model.

In the presence of informative cluster size, among others, Dunson et al. [9] proposed a Bayesian procedure that models the relationship between the failure times of interest and the cluster size through a latent variable. Williamson et al. [10] and Cong et al. [11] also considered the same problem and investigated a weighted score function (WSF) approach and a within-cluster resampling (WCR) procedure. However, there does not seem to exist an estimation procedure for regression analysis of clustered current status data with informative cluster size under the additive hazards model.

The rest of the article is organized as follows. Section 2 introduces the model and the notation used in this paper. Section 3 presents the WCR method, which uses the inference procedure proposed by Lin et al. [8] under the additive hazards model for case I interval-censored failure time data. Section 4 presents simulation studies that assess the performance of the proposed approach.

2 Notation and Model

Let $i = 1, \dots, n$ index the independent clusters and $j = 1, \dots, n_i$ index the subjects within the $i$-th cluster. For subject $j$ in the $i$-th cluster, let $T_{ij}$ and $C_{ij}$ denote the failure time of interest and the censoring or observation time, respectively, and let $Z_{ij}(t)$ be a $p$-dimensional vector of covariates that may depend on time $t$. The $T_{ij}$ may be dependent for subjects within the same cluster but are assumed to be independent for subjects from different clusters. We also assume that $T_{ij}$ is conditionally independent of $C_{ij}$ given $Z_{ij}(t)$.

We assume that the survival probabilities of individuals in a cluster depend on the size of that cluster. However, as noted in Cong et al. [11], the cause of cluster sizes being informative can be complicated and is usually unknown, and some latent variables may implicitly affect the baseline hazard for each cluster and/or the covariates. If cluster sizes are noninformative to survival, the usual marginal additive hazards model (see [12]) is

where $\beta_0$ is the unknown $p$-dimensional vector of regression coefficients, $\omega_i$ is the cluster-specific random effect that accounts for within-cluster correlation in cluster $i$, and $\lambda_0(t)$ is the unknown baseline hazard function. If cluster sizes are ignorable (noninformative to survival), the usual marginal additive hazards model is applicable, given by

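For each $(i, j)$, we define $N_{ij}(t) = I(C_{ij} \le \min(t, T_{ij}))$, $\delta_{ij} = I(C_{ij} \le T_{ij})$ and $Y_{ij}(t) = I(C_{ij} \ge t)$, and let $\lambda_c(t)$ denote the hazard function of the $C_{ij}$'s. Also define

where $\Lambda_0(t) = \int_0^t \lambda_0(s)\,ds$.

Note that $M_{ij}(t)$ is a local square-integrable martingale with respect to the marginal filtration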
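The displayed formulas of this section did not survive in the text, so the following is only a sketch of the forms they would take under the standard construction of Lin et al. [8], not a transcription of the original equations. It assumes the marginal model without the random effect, assumes that the hazard $\lambda_c(t)$ of $C_{ij}$ does not depend on the covariates, and introduces the notation $Z^*_{ij}(t)$ and $\mathcal{F}_{ij}(t)$, which is not taken from the source.

```latex
% Marginal additive hazards model (Lin and Ying form)
\lambda_{ij}(t \mid Z_{ij}) = \lambda_0(t) + \beta_0' Z_{ij}(t),
\qquad
Z^*_{ij}(t) = \int_0^t Z_{ij}(s)\,ds .

% Survival function implied by the model
P(T_{ij} \ge t \mid Z_{ij}) = \exp\{-\Lambda_0(t) - \beta_0' Z^*_{ij}(t)\} .

% N_{ij} jumps at C_{ij} only if the subject is still failure-free, so
E\{dN_{ij}(t) \mid \mathcal{F}_{ij}(t-)\}
  = Y_{ij}(t)\,\exp\{-\beta_0' Z^*_{ij}(t)\}\,e^{-\Lambda_0(t)}\lambda_c(t)\,dt ,

% which suggests the compensated process
M_{ij}(t) = N_{ij}(t)
  - \int_0^t Y_{ij}(s)\,\exp\{-\beta_0' Z^*_{ij}(s) - \Lambda_0(s)\}\,\lambda_c(s)\,ds ,
\qquad
\mathcal{F}_{ij}(t) = \sigma\{N_{ij}(s), Y_{ij}(s), Z_{ij}(s) : 0 \le s \le t\} .
```

Read this way, $N_{ij}(t)$ behaves like a counting process under a proportional hazards model with covariate $Z^*_{ij}(t)$ and regression coefficient $-\beta_0$, which is what makes partial likelihood methods available.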

3 A Method Based on the Within-Cluster Resampling Technique

When cluster sizes are informative, estimates and inference based on equation (2.2) may be incorrect. To account for informative cluster sizes, this section proposes a method based on the within-cluster resampling (WCR) technique of Hoffman et al. [13]. The basic idea is to randomly sample one subject, with replacement, from each of the $n$ clusters, and to repeat this resampling process $K$ times, where $K$ is a large fixed number. Let $\tau$ denote a known time giving the length of the study period. The $k$-th resampled data set, denoted by $\{C_{i,k}, \delta_{i,k}, Z_{i,k}(t);\ i = 1, \dots, n,\ 0 \le t \le \tau\}$, consists of $n$ independent observations, which can be analyzed using model (2.2) for independent data. Define $Y_{i,k}(t) = I(C_{i,k} \ge t)$ and $N_{i,k}(t) = \delta_{i,k} I(C_{i,k} \le t)$. For the $k$-th resampled data set, the partial likelihood function is
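The resampling step described above can be sketched as follows. This is a minimal illustration, assuming the data are stored one row per subject with a cluster label; the function name `wcr_resample` and the toy labels are hypothetical and not part of the original paper.

```python
import numpy as np

def wcr_resample(cluster_ids, K, seed=0):
    """Return a (K, n) array of row indices: one subject per cluster, K times."""
    rng = np.random.default_rng(seed)
    cluster_ids = np.asarray(cluster_ids)
    # Row indices of the subjects belonging to each distinct cluster.
    members = [np.flatnonzero(cluster_ids == c) for c in np.unique(cluster_ids)]
    draws = np.empty((K, len(members)), dtype=int)
    for k in range(K):
        # Independent uniform pick within each cluster; clusters may repeat
        # the same subject across different resamples.
        draws[k] = [rng.choice(m) for m in members]
    return draws

# Toy example: three clusters of sizes 3, 2 and 4.
cluster_ids = [0, 0, 0, 1, 1, 2, 2, 2, 2]
idx = wcr_resample(cluster_ids, K=5)
```

Each row of `idx` selects one resampled data set of $n$ independent observations. Because every cluster contributes exactly one subject regardless of its size, large clusters are not over-represented in any single resample.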

and the partial likelihood score function and observed information matrix are

where

and $a^{\otimes b} = 1, a, aa'$ for $b = 0, 1$ and $2$. The maximum partial likelihood estimator $\hat\beta_k$ is the solution to $U_k(\beta) = 0$. Furthermore, Lin et al. [8] showed that $n^{1/2}(\hat\beta_k - \beta_0)$ converges in distribution to a zero-mean normal random vector with a covariance matrix that can be consistently estimated by $n I_k^{-1}(\hat\beta_k)$, and so $\hat\beta_k$ is consistent.
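Since the displayed formulas are missing here, the following sketch records the forms that the partial likelihood, score and information would take under the construction of Lin et al. [8] as understood above. The symbols $Z^*_{i,k}$, $S_k^{(b)}$ and $I_k$, and the sign convention for $U_k$, are assumptions made for illustration and may differ from the authors' notation.

```latex
Z^*_{i,k}(t) = \int_0^t Z_{i,k}(s)\,ds, \qquad
S_k^{(b)}(\beta, t) = \sum_{j=1}^{n} Y_{j,k}(t)\,
   \exp\{-\beta' Z^*_{j,k}(t)\}\, Z^*_{j,k}(t)^{\otimes b},
   \quad b = 0, 1, 2,

L_k(\beta) = \prod_{i=1}^{n}
   \left[ \frac{\exp\{-\beta' Z^*_{i,k}(C_{i,k})\}}
               {S_k^{(0)}(\beta, C_{i,k})} \right]^{\delta_{i,k}},

U_k(\beta) = \frac{\partial \log L_k(\beta)}{\partial \beta}
   = -\sum_{i=1}^{n} \int_0^{\tau}
     \left\{ Z^*_{i,k}(t) - \frac{S_k^{(1)}(\beta, t)}{S_k^{(0)}(\beta, t)} \right\}
     dN_{i,k}(t),

I_k(\beta) = -\frac{\partial U_k(\beta)}{\partial \beta'}
   = \sum_{i=1}^{n} \int_0^{\tau}
     \left\{ \frac{S_k^{(2)}(\beta, t)}{S_k^{(0)}(\beta, t)}
           - \left( \frac{S_k^{(1)}(\beta, t)}{S_k^{(0)}(\beta, t)} \right)^{\otimes 2}
     \right\} dN_{i,k}(t).
```

With these definitions $I_k(\beta)$ is positive semidefinite, so $\log L_k$ is concave and the root of $U_k(\beta) = 0$ can be found by Newton-Raphson iteration.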

Because averaging reduces the error introduced by any single resample, after repeating this procedure $K$ times the WCR estimator of $\beta_0$ is constructed as the average of the $K$ resample-based estimators, that is,

$$\hat\beta_{wcr} = \frac{1}{K} \sum_{k=1}^{K} \hat\beta_k .$$

Under some regularity conditions, we can show that $n^{1/2}(\hat\beta_{wcr} - \beta_0)$ converges in distribution to a zero-mean normal random vector, and the covariance matrix can be consistently estimated by
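As a hedged illustration of how the $K$ resample fits could be combined, the sketch below averages the estimates and applies the variance decomposition of Hoffman et al. [13] (average within-resample variance minus between-resample variance). The function name `wcr_combine`, the divisor $K$ in the between-resample term, and the choice to return the variance on the scale of $\hat\beta_{wcr}$ (without the factor $n$) are assumptions, since the displayed estimator is missing from the text.

```python
import numpy as np

def wcr_combine(betas, covs):
    """Combine K resample fits into the WCR estimate and a variance estimate.

    betas: (K, p) array of resample estimates.
    covs:  (K, p, p) array of their estimated covariance matrices,
           e.g. the inverse observed information of each fit.
    """
    betas = np.asarray(betas, dtype=float)
    covs = np.asarray(covs, dtype=float)
    K = betas.shape[0]
    beta_wcr = betas.mean(axis=0)              # average of the K estimates
    centered = betas - beta_wcr
    between = centered.T @ centered / K        # spread across resamples
    var_wcr = covs.mean(axis=0) - between      # Hoffman-type decomposition
    return beta_wcr, var_wcr

# Toy usage with p = 1 and K = 2.
beta_wcr, var_wcr = wcr_combine([[1.0], [3.0]], [[[2.0]], [[4.0]]])
```

With a small $K$ the difference of the two terms need not be positive definite, which is one practical reason for taking $K$ large.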

The proof of this result is sketched in the Appendix. No special software is needed to implement the proposed method: one can simply input each resampled data set (indexed by $i = 1, \dots, n$) into standard software for fitting the proportional hazards model with right-censored data.
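To make concrete what such software would compute on one resampled data set, here is a minimal Newton-Raphson sketch for a single time-independent covariate, so that the integrated covariate is $Z^*(t) = Z t$. It assumes the proportional hazards representation with coefficient $-\beta$ described by Lin et al. [8]; the function names `score_info` and `fit_beta` and the toy data are hypothetical.

```python
import numpy as np

def score_info(beta, C, delta, Z):
    """Score and observed information of the log partial likelihood in beta."""
    C, delta, Z = (np.asarray(a, dtype=float) for a in (C, delta, Z))
    score, info = 0.0, 0.0
    for i in np.flatnonzero(delta == 1):
        t = C[i]
        at_risk = C >= t                 # Y_j(t) = I(C_j >= t)
        x = Z[at_risk] * t               # Z*_j(t) = Z_j * t
        w = np.exp(-beta * x)            # weights exp(-beta * Z*_j(t))
        s0, s1, s2 = w.sum(), (w * x).sum(), (w * x * x).sum()
        score += -Z[i] * t + s1 / s0
        info += s2 / s0 - (s1 / s0) ** 2
    return score, info

def fit_beta(C, delta, Z, n_iter=50, tol=1e-10):
    """Solve score(beta) = 0 by Newton-Raphson, starting from beta = 0."""
    beta = 0.0
    for _ in range(n_iter):
        score, info = score_info(beta, C, delta, Z)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = score_info(beta, C, delta, Z)
    return beta, 1.0 / info              # estimate and its variance estimate

# Toy data: observation times, indicators I(C <= T), binary covariate.
C = [1.0, 2.0, 3.0, 4.0]
delta = [1, 0, 1, 0]
Z = [1, 0, 0, 1]
beta_hat, var_hat = fit_beta(C, delta, Z)
```

The sketch has no step-halving or safeguards against a monotone likelihood, so it is meant for reading rather than production use. In practice the same fit could be obtained from any Cox regression routine that accepts time-dependent covariates.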

4 Simulation Study

In this section, we conduct simulations to assess the finite-sample performance of the method developed in the previous section. The failure times were generated from model (2.1) with $\lambda_0(\cdot) = 2$. The covariate process was assumed to be time-independent for simplicity and was generated from the Bernoulli distribution with success probability $p = 0.5$. The censoring times were generated from the exponential distribution with mean $1/\exp(\beta Z_i)$. The cluster sizes were randomly generated from the discrete uniform distribution $U\{2, 3, 4, 5, 6, 7\}$ regardless of the frailty values. We chose $\beta_0 = \pm 0.5$, $\pm 0.2$ and $0$. The censoring times were generated so as to achieve censoring rates of approximately 30%, 40%, 50% and 60%.
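A data-generating step along these lines might look like the sketch below. Because the exact form of model (2.1) is not recoverable from the text, the additive frailty `omega` and its uniform distribution are purely hypothetical, as are the function name and the `cens_rate` parameter, which stands in for whatever censoring rate the authors tuned. A constant hazard is used so that failure times are exponential.

```python
import numpy as np

def simulate_clustered_current_status(n, beta0, lam0=2.0, cens_rate=1.0, seed=0):
    """Return rows (cluster id, C, delta, Z) of clustered current status data."""
    rng = np.random.default_rng(seed)
    rows = []
    for i in range(n):
        size = rng.integers(2, 8)            # uniform on {2, ..., 7}
        omega = rng.uniform(0.0, 0.5)        # hypothetical additive frailty
        for _ in range(size):
            z = rng.binomial(1, 0.5)         # Bernoulli(0.5) covariate
            hazard = lam0 + beta0 * z + omega    # constant additive hazard
            t = rng.exponential(1.0 / hazard)    # failure time (never observed)
            c = rng.exponential(1.0 / cens_rate) # observation time
            rows.append((i, c, int(c <= t), z))  # delta = I(C <= T)
    return np.array(rows, dtype=float)

data = simulate_clustered_current_status(n=50, beta0=0.5)
```

Only the observation time, the indicator and the covariate are kept in the output, mirroring the fact that the failure time itself is never seen with current status data. With the values of $\beta_0$ used in the paper the hazard in this sketch stays positive.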

The results include the estimated bias (Bias), given by the average of the proposed estimates minus the true value; the sample standard deviation (SSE) of the proposed estimates; the average of the proposed estimates of the standard errors (SEE); and the empirical 95% coverage probabilities (CP). All results listed in the following table are based on 500 replications with the number of clusters $n = 200, 300$ and $K = 500$. It can be seen from Table 1 that the proposed estimates seem to be unbiased, that the proposed variance estimates also seem to be reasonable, and that all estimates improve as the sample size increases.

Table 1: Simulation results for estimates of $\beta_0$
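The four summary measures can be computed from the replication results as in the following sketch for a scalar parameter. The function name `summarize` and the example numbers are illustrative only and are not results from the paper; the coverage uses the normal-approximation interval (estimate plus or minus 1.96 standard errors), which is an assumption about how CP was obtained.

```python
import numpy as np

def summarize(estimates, std_errors, true_beta, z=1.96):
    """Bias, SSE, SEE and empirical coverage over simulation replications."""
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    covered = np.abs(est - true_beta) <= z * se   # Wald interval covers truth
    return {
        "Bias": est.mean() - true_beta,   # mean estimate minus true value
        "SSE": est.std(ddof=1),           # sample standard deviation
        "SEE": se.mean(),                 # average estimated standard error
        "CP": covered.mean(),             # empirical coverage probability
    }

# Illustrative numbers only.
summary = summarize([0.4, 0.6, 0.5, 0.9], [0.1, 0.1, 0.1, 0.1], true_beta=0.5)
```

Agreement between SSE and SEE is what the text refers to when it says the variance estimates seem reasonable.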

References

[1] Andersen P K, Gill R D. Cox's regression model for counting processes: a large sample study [J]. Ann. Stat., 1982, 10: 1100–1120.

[2] Jewell N P, van der Laan M. Generalizations of current status data with applications [J]. Lifetime Data Anal., 1995, 1: 101–110.

[3] Huang J. Efficient estimation for the proportional hazards model with interval censoring [J]. Ann. Stat., 1996, 24: 540–568.

[4] Rossini A J, Tsiatis A A. A semiparametric proportional odds regression model for the analysis of current status data [J]. J. Amer. Stat. Assoc., 1996, 91: 713–721.

[5] Chen L, Sun J. A multiple imputation approach to the analysis of current status data with the additive hazards model [J]. Comm. Stat. The. Meth., 2009, 38: 1009–1018.

[6] Sun J, Shen J. Efficient estimation for the proportional hazards model with competing risks and current status data [J]. Canad. J. Stat., 2009, 37: 592–606.

[7] Hu T, Xiang L. Efficient estimation for semiparametric cure models with interval-censored data [J]. J. Multi. Anal., 2013, 121: 139–151.

[8] Lin D, Oakes D, Ying Z. Additive hazards regression with current status data [J]. Biom., 1998, 85: 289–298.

[9] Dunson D B, Chen Z, Harry J. A Bayesian approach for joint modeling of cluster size and subunit-specific outcomes [J]. Biom., 2003, 59: 521–530.

[10] Williamson J, Kim H Y, Manathuga A, Addiss D G. Modeling survival data with informative cluster size [J]. Stat. Med., 2008, 27: 543–555.

[11] Cong X, Yin G, Shen Y. Marginal analysis of correlated failure time data with informative cluster sizes [J]. Biom., 2007, 63: 663–672.

[12] Lin D, Ying Z. Semiparametric analysis of the additive risk model [J]. Biom., 1994, 81: 61–71.

[13] Hoffman E B, Sen P K, Weinberg C R. Within cluster resampling [J]. Biom., 2001, 88: 1121–1134.

[14] Xiao Z, Zhu Q. Moderate deviation of maximum likelihood estimators for truncated and censored data [J]. J. Math., 2009, 29(3): 273–278.

Appendix: Proof of the asymptotic normality of $\hat\beta_{wcr}$

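We first assume that $n^{-1}\ldots$ converge uniformly to $\kappa(t)$, $\pi(t)$, and $\ldots$, respectively. For $i = 1, \dots, n$; $j = 1, \dots, n_i$ and some constant $\tau$, we assume that $P\{Y_{ij}(t) = 1,\ 0 \le t \le \tau\} > 0$, that $\ldots(t)$ is bounded, and that the cluster sizes are finite.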

where $\beta_\xi$ is on the line segment between $\hat\beta_k$ and $\beta_0$. Rewriting (5.1) yields that

Note that

Averaging over the $k = 1, \dots, K$ resamples yields

It is sufficient to show that $n^{1/2}(\hat\beta_{wcr} - \beta_0)$ converges to a normal distribution as $n \to \infty$. Changing the order of summation yields that

where $U_i(\beta_0)$, $i = 1, \dots, n$, are independent with zero mean and finite variance. By the multivariate central limit theorem, $n^{-1/2} \sum_{i=1}^{n} U_i(\beta_0)$ is asymptotically normal with zero mean and some positive definite covariance matrix. Combining this with Slutsky's theorem, $n^{1/2}(\hat\beta_{wcr} - \beta_0)$ converges in distribution to a zero-mean normal random vector.

Denote the consistent estimator of its covariance matrix by $\hat\Sigma_{wcr}$.

To obtain the consistent estimator of the covariance matrix, following Hoffman et al. [13], we first write

where the expectations on the right-hand side are over the resampling distribution for $\hat\beta_k$ given the data. Since $E(\hat\beta_k \mid \text{data}) = \hat\beta_{wcr}$, it follows that

which can be estimated by the covariance matrix of the $K$ resample estimators $\hat\beta_k$, that is

Thus the estimated variance-covariance matrix of $\hat\beta_{wcr}$ is

To show the consistency of $\hat\Sigma_{wcr}$, it suffices to show that $\ldots - E(\ldots) \to 0$ in probability as $n \to \infty$. By applying the same arguments as those in the proof of Cong et al. [11], it can be shown that $\ldots - E(\ldots) \to 0$ in probability as $n \to \infty$. This completes the proof.
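Since the displayed equations of the Appendix are missing, the outline below sketches the steps that the surrounding sentences describe. It is an assumed reconstruction in the style of Hoffman et al. [13] and Cong et al. [11], not the authors' own display. The limit matrix $A$, the notation $I_k = -\partial U_k / \partial \beta'$, the per-resample variance estimate $\hat\Sigma_k = I_k^{-1}(\hat\beta_k)$ and the divisor $K$ are all introduced here for illustration.

```latex
% Mean-value expansion of the k-th resample score around \beta_0
0 = U_k(\hat\beta_k) = U_k(\beta_0) - I_k(\beta_\xi)\,(\hat\beta_k - \beta_0)
\;\Longrightarrow\;
n^{1/2}(\hat\beta_k - \beta_0)
  = \{n^{-1} I_k(\beta_\xi)\}^{-1}\, n^{-1/2}\, U_k(\beta_0).

% Averaging over resamples, assuming n^{-1} I_k(\beta_\xi) \to A in probability
n^{1/2}(\hat\beta_{wcr} - \beta_0)
  = A^{-1}\, n^{-1/2}\, \frac{1}{K} \sum_{k=1}^{K} U_k(\beta_0) + o_p(1).

% Law of total variance, using E(\hat\beta_k \mid \text{data}) = \hat\beta_{wcr}
\mathrm{Var}(\hat\beta_k)
  = E\{\mathrm{Var}(\hat\beta_k \mid \text{data})\}
  + \mathrm{Var}(\hat\beta_{wcr})
\;\Longrightarrow\;
\mathrm{Var}(\hat\beta_{wcr})
  = \mathrm{Var}(\hat\beta_k) - E\{\mathrm{Var}(\hat\beta_k \mid \text{data})\}.

% Plug-in estimator on the n^{1/2} scale
\hat\Sigma_{wcr}
  = n \left\{ \frac{1}{K} \sum_{k=1}^{K} \hat\Sigma_k
    - \frac{1}{K} \sum_{k=1}^{K}
      (\hat\beta_k - \hat\beta_{wcr})^{\otimes 2} \right\}.
```

The first term averages the model-based variance of a single resample fit, and the second removes the part of that variance that is due only to the random choice of one subject per cluster.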
