Parametric proportional hazards accelerated failure time models Göran Broström October 11, 2017 Abstract A unified implementation of parametric proportional hazards PH) accelerated failure time AFT) models for right-censored or intervalcensored left-truncated data is described. The description here is vaĺid for time-constant covariates, but the necessary modifications for hling time-varying covariates are implemented in eha. Note that only piecewise constant time variation is hled. 1 Introduction There is a need for software for analyzing parametric proportional hazards PH) accelerated failure time AFT) data, that are right or interval censored left truncated. 2 The proportional hazards model We define proportional hazards models in terms of an expansion of a given survivor function S 0, s θ t; z) = {S 0 gt, θ))} expzβ), 1) where θ is a parameter vector used in modeling the baseline distribution, β is the vector of regression parameters, g is a positive function, which helps defining a parametric family of baseline survivor functions through St; θ) = S 0 gt, θ) ), t > 0, θ Θ. 2) 1
With f 0 h 0 defined as the density hazard functions corresponding to S 0, respectively, the density function corresponding to S is ft; θ) = St, θ) t = t S 0gt, θ)) = g t t, θ)f 0 gt, θ)), where g t t, θ) = gt, θ). t Correspondingly, the hazard function is ft; θ) ht; θ) = St; θ) = g t t, θ)h 0 gt, θ)). 3) So, the proportional hazards model is λ θ t; z) = ht; θ) expzβ) = g t t, θ)h 0 gt, θ)) expzβ), 4) corresponding to 1). 2.1 Data the likelihood function Given left truncated right or interval censored data s i, t i, u i, d i, z i ), i = 1,..., n the model 4), the likelihood function becomes L θ, β); s, t, u, d), Z ) = n { hti ; θ) expz i β) ) I {di =1} St i ; θ) expz iβ) ) I {di 2} St i ; θ) expz iβ) Su i ; θ) expz iβ) ) I {di =2} Ss i ; θ) expz iβ) } Here, for i = 1,..., n, s i < t i u i are the left truncation exit intervals, respectively, d i = 0 means that t i = u i right censoring at u i, d i = 1 means that t i = u i an event at u i, d i = 2 means that t i < u i an event occurs in the interval t i, u i ) interval censoring) z i = z i1,..., z ip ) is a vector of explanatory variables with corresponding parameter vector β = β 1,..., β p ), i = 1,..., n. 2 5)
From 5) we now get the log likelihood the score vector in a straightforward manner. l θ, β); s, t, u, d), Z ) = { log hti ; θ) + z i β } e z iβ log Su i ; θ) log { St i ; θ) ez i β Su i ; θ) ez i β } e ziβ log Ss i ; θ) in the following we drop the long argument list to l), for the regression parameters β, l = β j e ziβ log St i ; θ) e z iβ St i; θ) eziβ e ziβ log Ss i ; θ), j = 1,..., p, for the baseline parameters θ, in vector form, θ l = h θ t i, θ) ht i, θ) From 3), e z iβ S θt i ; θ) St i ; θ) log St i ; θ) Su i ; θ) ez i β log Su i ; θ) St i ; θ) ez i β Su i ; θ) ez i β e z iβ St i; θ) eziβ 1 S θ t i ; θ) Su i ; θ) eziβ 1 S θ u i ; θ) St i ; θ) ez i β Su i ; θ) ez i β e z iβ S θs i ; θ) Ss i ; θ). h θ t, θ) = ht, θ) θ = g tθ t, θ)h 0 gt, θ)) + g t t, θ)g θ t, θ)h 0gt, θ)), 3 6) 7) 8) 9)
, from 2), S θ t; θ) = St; θ) = θ θ S ) 0 gt, θ) ) 10) = g θ t, θ)f 0 gt, θ). For estimating stard errors, the observed information the negative of the hessian) is useful. However, instead of the error-prone tedious work of calculating analytic second-order derivatives, we will rely on approximations by numerical differentiation. 3 The shape scale families From 1) we get a shape scale family of distributions by choosing θ = p, λ) ) p t gt, p, λ)) =, t 0; p, λ > 0. λ However, for reasons of efficient numerical optimization normality of parameter estimates, we use the parametrisation p = expγ) λ = expα), thus redefining g to gt, γ, α)) = ) expγ) t, t 0; < γ, α <. 11) expα) For the calculation of the score hessian of the log likelihood function, we need some partial derivatives of g. They are found in an appendix. 3.1 The Weibull family of distributions The Weibull family of distributions is obtained by leading to S 0 t) = expt), t 0, f 0 t) = expt), t 0, h 0 t) = 1, t 0. We need some first second order derivatives of f h, they are particularly simple in this case, for h they are both zero, for f we get f 0t) = expt), t 0. 4
3.2 The EV family of distributions The EV Extreme Value) family of distributions is obtained by setting leading to h 0 t) = expt), t 0, S 0 t) = exp{expt) 1)}, t 0, The rest of the necessary functions are easily derived from this. 3.3 The Gompertz family of distributions The Gompertz family of distributions is given by ht) = τ expt/λ), t 0; τ, λ > 0. This family is not directly possible to generate from the described shape-scale models, so it is treated separately by direct maximum likelihood. 3.4 Other families of distributions Included in the eha package are the lognormal the loglogistic distributions as well. 4 The accelerated failure time model In the description of this family of models, we generate a scape-scale family of distributions as defined by the equations 2) 11). We get ) St; γ, α)) = S 0 gt, γ, α)) = S 0 { t expα) } expγ) ), t > 0, < γ, α <. 12) Given a vector z = z 1,..., z p ) of explanatory variables a vector of corresponding regression coefficients β = β 1,..., β p ), the AFT regression model is defined by ) St; γ, α, β)) = S 0 gt expzβ), γ, α)) { } expγ) ) t expzβ) = S 0 expα) { } expγ) ) 13) t = S 0 expα zβ) = S 0 gt, γ, α zβ)) ), t > 0. 5
So, by defining θ = γ, α zβ), we are back in the framework of Section 2. We get ft; θ) = g t t, θ)f 0 gt, θ)) ht; θ) = g t t, θ)h 0 gt, θ)), 14) defining the AFT model generated by the survivor function S 0 corresponding density f 0 hazard h 0. 4.1 Data the likelihood function Given left truncated right or interval censored data s i, t i, u i, d i, z i ), i = 1,..., n the model 14), the likelihood function becomes L γ, α, β); s, t, d), Z ) n { = hti ; θ i ) I {d i =1} St i ; θ i ) I {i 2} St i ; θ i ) Su i ; θ i ) ) I {di =2} Ss i ; θ i ) 1} 15) Here, for i = 1,..., n, s i < t i u i are the left truncation exit intervals, respectively, d i = 0 means that t i = u i right censoring at t i, d i = 1 means that t i = u i an event at t i, d i = 2 means that t i < u i an event uccurs in the interval t i, u i ) interval censoring), z i = z i1,..., z ip ) is a vector of explanatory variables with corresponding parameter vector β = β 1,..., β p ), i = 1,..., n. From 15) we now get the log likelihood the score vector in a straightforward manner. l γ, α, β); s, t, u, d), Z ) = log ht i ; θ i ) log St i ; θ i ) log St i ; θ i ) Sui ; θ i )) log Ss i ; θ i ) in the following we drop the long argument list to l), for the regression parameters β, 6
l = h j t i, θ i ) β j ht i, θ i ) S j t i ; θ i ) St i ; θ i ) d i =1 d i 2 S j t i ; θ i ) S j u i ; θ i ) St i ; θ i ) Su i ; θ i ) S j s i ; θ i ) Ss d i =2 i ; θ i ) = h α t i, θ i ) ht i, θ i ) S α t i ; θ i ) St i ; θ i ) d i =1 d i 2 d i =2 S α t i ; θ i ) S α u i ; θ i ) St i ; θ i ) Su i ; θ i ) for the baseline parameters γ α, Here, from 3), γ l = α l = h γ t, θ i ) = γ ht, θ i) h γ t i, θ i ) ht i, θ i ) S γ t i ; θ i ) St i ; θ i ) + S γ t i ; θ i ) S γ u i ; θ i ) St i ; θ i ) Su i ; θ i ) h α t i, θ i ) ht i, θ i ) S α t i ; θ i ) St i ; θ i ) S α t i ; θ i ) S α u i ; θ i ) St i ; θ i ) Su i ; θ i ) S α s i ; θ i ) Ss i ; θ i ) S γ s i ; θ i ) Ss i ; θ i ), S α s i ; θ i ) Ss i ; θ i ). = g tγ t, θ i )h 0 gt, θ i )) + g t t, θ i )g γ t, θ i )h 0gt, θ i )), h α t, θ i ) = α ht, θ i) = g tα t, θ i )h 0 gt, θ i )) + g t t, θ i )g α t, θ i )h 0gt, θ i )), h j t, θ i ) = β j ht, θ i ) = α ht, θ i) β j α z i β) = h α t, θ i ), j = 1,..., p. 7
Similarly, from 2) we get S γ t; θ i ) = γ St; θ i) = γ S 0 gt, θi ) ) = g γ t, θ i )f 0 gt, θi ) ), S α t; θ i ) = α St; θ i) = α S 0 gt, θi ) ) = g α t, θ i )f 0 gt, θi ) ). S j t; θ i ) = St; θ i ) = β j α S 0 gt, θi ) ) α z i β) β j = S α t, θ i ), j = 1,..., p. For estimating stard errors, the observed information the negative of the hessian) is useful, so 2 l = { ) 2 } hαα t i, θ i ) hα t i, θ i ) z im β j β m ht i, θ i ) ht i, θ i ) { ) 2 } Sαα t i, θ i ) Sα t i, θ i ) z im St i, θ i ) St i, θ i ) i:i 2 { ) 2 } Sαα t i, θ i ) S αα u i, θ i ) Sα t i, θ i ) S α u i, θ i ) z im St i:i=2 i, θ i ) Su i, θ i ) St i, θ i ) Su i, θ i ) { ) 2 } Sαα s i, θ i ) Sα s i, θ i ) + z im, j, m = 1,..., p, Ss i, θ i ) Ss i, θ i ) 8
2 β j τ l = finally i:i 2 { hατ t i, θ i ) ht i, θ i ) h } αt i, θ i )h τ t i, θ i ) h 2 t i, θ i ) { Sατ t i, θ i ) St i, θ i ) S αt i, θ i )S τ t i, θ i ) S 2 t i, θ i ) { Sατ t i, θ i ) S ατ u i, θ i ) St i:i=2 i, θ i ) Su i, θ i ) Sα t i, θ i ) S α u i, θ i ) ) S τ t i, θ i ) S τ u i, θ i ) ) } Sti, θ i ) Su i, θ i ) ) 2 { Sατ s i, θ i ) Ss i, θ i ) S } αs i, θ i )S τ s i, θ i ) S 2 s i, θ i ) } j = 1,..., p; τ = γ, α, 2 ττ l = { hτ τt i, θ i ) h τ t } i, θ i )h τ t i, θ i ) ht i, θ i ) h 2 t i, θ i ) { Sτ τt i, θ i ) S τ t } i, θ i )S τ t i, θ i ) St i, θ i ) S 2 t i, θ i ) i:i 2 { Sτ τt i, θ i ) S τ τu i, θ i ) St i:i=2 i, θ i ) Su i, θ i ) Sτ t i, θ i ) S τ u i, θ i ) ) S τ t i, θ i ) S τ u i, θ i ) ) } Sti, θ i ) Su i, θ i ) ) 2 + { Sτ τs i, θ i ) Ss i, θ i ) S τ s } i, θ i )S τ s i, θ i ) S 2 s i, θ i ) τ, τ ) = γ, γ), γ, α), α, α). 9
The second order partial derivatives h ττ S ττ are h ττ t, θ) = τ h τt, θ) = g tττ t, θ)h 0 gt, θ)) + g tτ t, θ)g τ t, θ)h 0gt, θ)) + g t t, θ)g θ t, θ)g τ t, θ)h 0gt, θ)) + g t t, θ)g θθ t, θ)h 0gt, θ)) + g tτ t, θ)g θ t, θ)h 0gt, θ)) = h 0 gt, θ))g tττ t, θ) + h 0gt, θ)) { g t t, θ)g θθ t, θ) + g tτ t, θ)g τ t, θ) + g tτ t, θ)g τ t, θ) } + h 0gt, θ))g t t, θ)g θ t, θ)g τ t, θ),, τ, τ ) = γ, γ), γ, λ), λ, λ), 16) from 10), S ττ t, θ) = τ S τt; θ) = { ) )} g ττ t, θ)f 0 gt, θ) + gτ t, θ)g τ t, θ)f 0 gt, θ), τ, τ ) = γ, γ), γ, λ), λ, λ). 17) A Some partial derivatives The function see 11)) gt, γ, α)) = has the following partial derivatives: ) expγ) t, t 0; < γ, α <. 18) expα) ) expγ) g t t, γ, α) = g t, γ, α) ), t > 0 ) t ) { )} g γ t, γ, α) = g t, γ, α) log g t, γ, α) ) ) g α t, γ, α) = expγ)g t, γ, α) 10
) ) expγ) ) g tγ t, γ, α) = gt t, γ, α) + g γ t, γ, α), t > 0 ) t ) g tα t, γ, α) = expγ)gt t, γ, α), t > 0 ) ){ )} t, γ, α) = gγ t, γ, α) 1 + log g t, γ, α) g γ 2 ) ){ )} g γα t, γ, α) = gα t, γ, α) 1 + log g t, γ, α) ) ) t, γ, α) = expγ)gα t, γ, α) g α 2 ) ) g tγ 2 t, γ, α) = gtγ t, γ, α) + expγ) ){ )} g γ t, γ, α) 2 + log g t, γ, α) ) t { ) )} g tγα t, γ, α) = expγ) gt t, γ, α) + gtγ t, γ, α) ) ) t, γ, α) = expγ)gtα t, γ, α) g tα 2 The formulas will be easier to read if we remove all function arguments, i.e., t, γ, α)): g t = expγ) g, t > 0 t g γ = g log g g α = expγ)g g tγ = g t + expγ) g γ, t > 0 t g tα = expγ)g t, t > 0 { } g γ 2 = g γ 1 + log g { } g γα = g α 1 + log g g α 2 = expγ)g α g tγ 2 = g tγ + expγ) { } g γ 2 + log g, t > 0 t g tγα = expγ) { } g t + g tγ, t > 0 g tα 2 = expγ)g tα, t > 0 11