Parametric proportional hazards and accelerated failure time models


Göran Broström
October 11, 2017

Abstract

A unified implementation of parametric proportional hazards (PH) and accelerated failure time (AFT) models for right-censored or interval-censored and left-truncated data is described. The description here is valid for time-constant covariates, but the necessary modifications for handling time-varying covariates are implemented in eha. Note that only piecewise constant time variation is handled.

1 Introduction

There is a need for software for analyzing parametric proportional hazards (PH) and accelerated failure time (AFT) data that are right or interval censored and left truncated.

2 The proportional hazards model

We define proportional hazards models in terms of an expansion of a given survivor function $S_0$,

\[
s_\theta(t; z) = \{S_0(g(t, \theta))\}^{\exp(z\beta)}, \tag{1}
\]

where $\theta$ is a parameter vector used in modeling the baseline distribution, $\beta$ is the vector of regression parameters, and $g$ is a positive function, which helps define a parametric family of baseline survivor functions through

\[
S(t; \theta) = S_0(g(t, \theta)), \quad t > 0, \quad \theta \in \Theta. \tag{2}
\]

With $f_0$ and $h_0$ defined as the density and hazard functions corresponding to $S_0$, respectively, the density function corresponding to $S$ is

\[
f(t; \theta) = -\frac{\partial}{\partial t} S(t; \theta) = -\frac{\partial}{\partial t} S_0(g(t, \theta)) = g_t(t, \theta) f_0(g(t, \theta)),
\]

where

\[
g_t(t, \theta) = \frac{\partial}{\partial t} g(t, \theta).
\]

Correspondingly, the hazard function is

\[
h(t; \theta) = \frac{f(t; \theta)}{S(t; \theta)} = g_t(t, \theta) h_0(g(t, \theta)). \tag{3}
\]

So, the proportional hazards model is

\[
\lambda_\theta(t; z) = h(t; \theta) \exp(z\beta) = g_t(t, \theta) h_0(g(t, \theta)) \exp(z\beta), \tag{4}
\]

corresponding to (1).

2.1 Data and the likelihood function

Given left truncated and right or interval censored data $(s_i, t_i, u_i, d_i, z_i)$, $i = 1, \ldots, n$, and the model (4), the likelihood function becomes

\[
L\big((\theta, \beta); (s, t, u, d), Z\big) = \prod_{i=1}^{n} \Big\{ \big(h(t_i; \theta) \exp(z_i\beta)\big)^{I\{d_i = 1\}} \big(S(t_i; \theta)^{\exp(z_i\beta)}\big)^{I\{d_i \ne 2\}} \big(S(t_i; \theta)^{\exp(z_i\beta)} - S(u_i; \theta)^{\exp(z_i\beta)}\big)^{I\{d_i = 2\}} \big(S(s_i; \theta)^{\exp(z_i\beta)}\big)^{-1} \Big\}. \tag{5}
\]

Here, for $i = 1, \ldots, n$, $s_i < t_i \le u_i$ are the left truncation time and the endpoints of the exit interval, respectively:

$d_i = 0$ means that $t_i = u_i$ and right censoring at $u_i$,
$d_i = 1$ means that $t_i = u_i$ and an event at $u_i$,
$d_i = 2$ means that $t_i < u_i$ and an event occurs in the interval $(t_i, u_i)$ (interval censoring),

and $z_i = (z_{i1}, \ldots, z_{ip})$ is a vector of explanatory variables with corresponding parameter vector $\beta = (\beta_1, \ldots, \beta_p)$, $i = 1, \ldots, n$.
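As a reading aid for the likelihood (5), the following minimal Python sketch computes the log of a single factor of the product. The function name `ph_log_contribution` and its argument layout are hypothetical (they are not taken from eha); `S` and `h` stand for the baseline survivor and hazard functions $t \mapsto S(t; \theta)$ and $t \mapsto h(t; \theta)$ with $\theta$ already fixed, and `zb` is the linear predictor $z_i\beta$.

```python
import math


def ph_log_contribution(s, t, u, d, zb, S, h):
    """Log of one factor of the PH likelihood (5).

    s, t, u : left truncation time and exit interval endpoints
    d       : 0 = right censored, 1 = event, 2 = interval censored
    zb      : linear predictor z_i * beta
    S, h    : baseline survivor and hazard functions of time (hypothetical
              callables; theta is assumed to be baked into them)
    """
    w = math.exp(zb)
    if d == 1:
        # Event at t = u: hazard term plus survivor term (I{d != 2} holds).
        out = math.log(h(t)) + zb + w * math.log(S(t))
    elif d == 0:
        # Right censoring at t = u: survivor term only.
        out = w * math.log(S(t))
    elif d == 2:
        # Event somewhere in (t, u): difference of survivor functions.
        out = math.log(S(t) ** w - S(u) ** w)
    else:
        raise ValueError("d must be 0, 1 or 2")
    # Left truncation at s: divide by S(s)^exp(zb).
    return out - w * math.log(S(s))


if __name__ == "__main__":
    # Illustrative baseline only: unit exponential, S(t) = exp(-t), h(t) = 1.
    S = lambda x: math.exp(-x)
    h = lambda x: 1.0
    print(ph_log_contribution(0.0, 2.0, 2.0, 1, 0.0, S, h))
```

Summing this quantity over all observations gives the log likelihood for a given $(\theta, \beta)$; the unit exponential baseline in the usage lines is only an illustrative choice.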

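From (5) we now get the log likelihood and the score vector in a straightforward manner:

\[
l\big((\theta, \beta); (s, t, u, d), Z\big) = \sum_{i: d_i = 1} \big\{ \log h(t_i; \theta) + z_i\beta \big\} + \sum_{i: d_i \ne 2} e^{z_i\beta} \log S(t_i; \theta) + \sum_{i: d_i = 2} \log \big\{ S(t_i; \theta)^{e^{z_i\beta}} - S(u_i; \theta)^{e^{z_i\beta}} \big\} - \sum_{i=1}^{n} e^{z_i\beta} \log S(s_i; \theta) \tag{6}
\]

(in the following we drop the long argument list to $l$), and for the regression parameters $\beta$,

\[
\frac{\partial}{\partial \beta_j} l = \sum_{i: d_i = 1} z_{ij} + \sum_{i: d_i \ne 2} z_{ij} e^{z_i\beta} \log S(t_i; \theta) + \sum_{i: d_i = 2} z_{ij} e^{z_i\beta} \frac{S(t_i; \theta)^{e^{z_i\beta}} \log S(t_i; \theta) - S(u_i; \theta)^{e^{z_i\beta}} \log S(u_i; \theta)}{S(t_i; \theta)^{e^{z_i\beta}} - S(u_i; \theta)^{e^{z_i\beta}}} - \sum_{i=1}^{n} z_{ij} e^{z_i\beta} \log S(s_i; \theta), \quad j = 1, \ldots, p, \tag{7}
\]

and for the baseline parameters $\theta$, in vector form,

\[
\frac{\partial}{\partial \theta} l = \sum_{i: d_i = 1} \frac{h_\theta(t_i, \theta)}{h(t_i, \theta)} + \sum_{i: d_i \ne 2} e^{z_i\beta} \frac{S_\theta(t_i; \theta)}{S(t_i; \theta)} + \sum_{i: d_i = 2} e^{z_i\beta} \frac{S(t_i; \theta)^{e^{z_i\beta} - 1} S_\theta(t_i; \theta) - S(u_i; \theta)^{e^{z_i\beta} - 1} S_\theta(u_i; \theta)}{S(t_i; \theta)^{e^{z_i\beta}} - S(u_i; \theta)^{e^{z_i\beta}}} - \sum_{i=1}^{n} e^{z_i\beta} \frac{S_\theta(s_i; \theta)}{S(s_i; \theta)}. \tag{8}
\]

From (3),

\[
h_\theta(t, \theta) = \frac{\partial}{\partial \theta} h(t, \theta) = g_{t\theta}(t, \theta) h_0(g(t, \theta)) + g_t(t, \theta) g_\theta(t, \theta) h_0'(g(t, \theta)), \tag{9}
\]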

and, from (2),

\[
S_\theta(t; \theta) = \frac{\partial}{\partial \theta} S(t; \theta) = \frac{\partial}{\partial \theta} S_0(g(t, \theta)) = -g_\theta(t, \theta) f_0(g(t, \theta)). \tag{10}
\]

For estimating standard errors, the observed information (the negative of the hessian) is useful. However, instead of the error-prone and tedious work of calculating analytic second-order derivatives, we will rely on approximations by numerical differentiation.

3 The shape-scale families

From (1) we get a shape-scale family of distributions by choosing $\theta = (p, \lambda)$ and

\[
g(t, (p, \lambda)) = \left(\frac{t}{\lambda}\right)^{p}, \quad t \ge 0; \quad p, \lambda > 0.
\]

However, for reasons of efficient numerical optimization and normality of parameter estimates, we use the parametrisation $p = \exp(\gamma)$ and $\lambda = \exp(\alpha)$, thus redefining $g$ to

\[
g(t, (\gamma, \alpha)) = \left(\frac{t}{\exp(\alpha)}\right)^{\exp(\gamma)}, \quad t \ge 0; \quad -\infty < \gamma, \alpha < \infty. \tag{11}
\]

For the calculation of the score and hessian of the log likelihood function, we need some partial derivatives of $g$. They are found in an appendix.

3.1 The Weibull family of distributions

The Weibull family of distributions is obtained by

\[
S_0(t) = \exp(-t), \quad t \ge 0,
\]

leading to

\[
f_0(t) = \exp(-t), \quad t \ge 0,
\]

and

\[
h_0(t) = 1, \quad t \ge 0.
\]

We need some first and second order derivatives of $f_0$ and $h_0$, and they are particularly simple in this case: for $h_0$ they are both zero, and for $f_0$ we get

\[
f_0'(t) = -\exp(-t), \quad t \ge 0.
\]
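To make the Weibull case and the numerical differentiation mentioned earlier in this section concrete, here is a minimal Python sketch. It is deliberately simplified and is not the eha implementation: it assumes fully observed event times with no covariates, censoring, or truncation, and it uses a plain central-difference scheme. The names `weibull_loglik`, `numerical_hessian`, `observed_information`, and the step size `eps` are hypothetical.

```python
import numpy as np


def g(t, gamma, alpha):
    """Shape-scale transform (t / exp(alpha)) ** exp(gamma), equation (11)."""
    return (t / np.exp(alpha)) ** np.exp(gamma)


def g_t(t, gamma, alpha):
    """Partial derivative of g with respect to t, valid for t > 0."""
    return np.exp(gamma) / t * g(t, gamma, alpha)


def weibull_loglik(params, t):
    """Log likelihood for exact event times only (simplifying assumption).

    With h_0 = 1 and S_0(x) = exp(-x), equations (2) and (3) give
    log h(t) = log g_t(t) and log S(t) = -g(t).
    """
    gamma, alpha = params
    return np.sum(np.log(g_t(t, gamma, alpha)) - g(t, gamma, alpha))


def numerical_hessian(f, x, eps=1e-4):
    """Central-difference approximation to the Hessian of f at x."""
    x = np.asarray(x, dtype=float)
    k = x.size
    hess = np.empty((k, k))
    for i in range(k):
        for j in range(k):
            ei = np.zeros(k)
            ej = np.zeros(k)
            ei[i] = eps
            ej[j] = eps
            hess[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                          - f(x - ei + ej) + f(x - ei - ej)) / (4 * eps ** 2)
    return hess


def observed_information(params, t):
    """Negative of the numerically differentiated Hessian of the log likelihood."""
    return -numerical_hessian(lambda p: weibull_loglik(p, t), params)


if __name__ == "__main__":
    times = np.array([0.5, 1.0, 2.0])  # made-up illustrative data
    print(observed_information([0.0, 0.0], times))
```

In this simplified setting the $(\alpha, \alpha)$ element of the observed information is $\exp(2\gamma) \sum_i g(t_i, (\gamma, \alpha))$, which gives a quick sanity check on the numerical approximation. In practice the information would be evaluated at the maximum likelihood estimates.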

3.2 The EV family of distributions

The EV (Extreme Value) family of distributions is obtained by setting

\[
h_0(t) = \exp(t), \quad t \ge 0,
\]

leading to

\[
S_0(t) = \exp\{-(\exp(t) - 1)\}, \quad t \ge 0.
\]

The rest of the necessary functions are easily derived from this.

3.3 The Gompertz family of distributions

The Gompertz family of distributions is given by

\[
h(t) = \tau \exp(t/\lambda), \quad t \ge 0; \quad \tau, \lambda > 0.
\]

This family cannot be generated directly from the described shape-scale models, so it is treated separately by direct maximum likelihood.

3.4 Other families of distributions

Included in the eha package are the lognormal and the loglogistic distributions as well.

4 The accelerated failure time model

In the description of this family of models, we generate a shape-scale family of distributions as defined by the equations (2) and (11). We get

\[
S(t; (\gamma, \alpha)) = S_0(g(t, (\gamma, \alpha))) = S_0\left(\left\{\frac{t}{\exp(\alpha)}\right\}^{\exp(\gamma)}\right), \quad t > 0, \quad -\infty < \gamma, \alpha < \infty. \tag{12}
\]

Given a vector $z = (z_1, \ldots, z_p)$ of explanatory variables and a vector of corresponding regression coefficients $\beta = (\beta_1, \ldots, \beta_p)$, the AFT regression model is defined by

\[
\begin{aligned}
S(t; (\gamma, \alpha, \beta)) &= S_0(g(t \exp(z\beta), (\gamma, \alpha))) \\
&= S_0\left(\left\{\frac{t \exp(z\beta)}{\exp(\alpha)}\right\}^{\exp(\gamma)}\right) \\
&= S_0\left(\left\{\frac{t}{\exp(\alpha - z\beta)}\right\}^{\exp(\gamma)}\right) \\
&= S_0(g(t, (\gamma, \alpha - z\beta))), \quad t > 0.
\end{aligned} \tag{13}
\]
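The chain of equalities in (13) says that rescaling time by $\exp(z\beta)$ is the same as shifting the log-scale parameter $\alpha$ by $z\beta$. A small Python sketch can confirm this numerically; the function names are hypothetical, and the two baselines used are the Weibull and EV survivor functions from Sections 3.1 and 3.2.

```python
import math


def g(t, gamma, alpha):
    """Shape-scale transform (t / exp(alpha)) ** exp(gamma), equation (11)."""
    return (t / math.exp(alpha)) ** math.exp(gamma)


def aft_survivor(t, gamma, alpha, zb, S0):
    """First line of (13): time is multiplied by exp(z beta)."""
    return S0(g(t * math.exp(zb), gamma, alpha))


def shifted_scale_survivor(t, gamma, alpha, zb, S0):
    """Last line of (13): alpha is replaced by alpha - z beta."""
    return S0(g(t, gamma, alpha - zb))


def weibull_S0(x):
    return math.exp(-x)


def ev_S0(x):
    return math.exp(-(math.exp(x) - 1.0))


if __name__ == "__main__":
    # Arbitrary illustrative values for t, gamma, alpha and z beta.
    for S0 in (weibull_S0, ev_S0):
        a = aft_survivor(1.3, 0.2, -0.4, 0.7, S0)
        b = shifted_scale_survivor(1.3, 0.2, -0.4, 0.7, S0)
        print(a, b)
```

The two printed values in each row should agree up to floating-point rounding, whichever baseline $S_0$ is used.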

So, by defining $\theta = (\gamma, \alpha - z\beta)$, we are back in the framework of Section 2. We get

\[
f(t; \theta) = g_t(t, \theta) f_0(g(t, \theta)), \qquad h(t; \theta) = g_t(t, \theta) h_0(g(t, \theta)), \tag{14}
\]

defining the AFT model generated by the survivor function $S_0$ and the corresponding density $f_0$ and hazard $h_0$.

4.1 Data and the likelihood function

Given left truncated and right or interval censored data $(s_i, t_i, u_i, d_i, z_i)$, $i = 1, \ldots, n$, and the model (14), the likelihood function becomes

\[
L\big((\gamma, \alpha, \beta); (s, t, u, d), Z\big) = \prod_{i=1}^{n} \Big\{ h(t_i; \theta_i)^{I\{d_i = 1\}} S(t_i; \theta_i)^{I\{d_i \ne 2\}} \big(S(t_i; \theta_i) - S(u_i; \theta_i)\big)^{I\{d_i = 2\}} S(s_i; \theta_i)^{-1} \Big\}, \tag{15}
\]

where $\theta_i = (\gamma, \alpha - z_i\beta)$. Here, for $i = 1, \ldots, n$, $s_i < t_i \le u_i$ are the left truncation time and the endpoints of the exit interval, respectively:

$d_i = 0$ means that $t_i = u_i$ and right censoring at $t_i$,
$d_i = 1$ means that $t_i = u_i$ and an event at $t_i$,
$d_i = 2$ means that $t_i < u_i$ and an event occurs in the interval $(t_i, u_i)$ (interval censoring),

and $z_i = (z_{i1}, \ldots, z_{ip})$ is a vector of explanatory variables with corresponding parameter vector $\beta = (\beta_1, \ldots, \beta_p)$, $i = 1, \ldots, n$.

From (15) we now get the log likelihood and the score vector in a straightforward manner:

\[
l\big((\gamma, \alpha, \beta); (s, t, u, d), Z\big) = \sum_{i: d_i = 1} \log h(t_i; \theta_i) + \sum_{i: d_i \ne 2} \log S(t_i; \theta_i) + \sum_{i: d_i = 2} \log \big(S(t_i; \theta_i) - S(u_i; \theta_i)\big) - \sum_{i=1}^{n} \log S(s_i; \theta_i)
\]

(in the following we drop the long argument list to $l$), and for the regression parameters $\beta$,

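\[
\begin{aligned}
\frac{\partial}{\partial \beta_j} l &= \sum_{i: d_i = 1} \frac{h_j(t_i, \theta_i)}{h(t_i, \theta_i)} + \sum_{i: d_i \ne 2} \frac{S_j(t_i; \theta_i)}{S(t_i; \theta_i)} + \sum_{i: d_i = 2} \frac{S_j(t_i; \theta_i) - S_j(u_i; \theta_i)}{S(t_i; \theta_i) - S(u_i; \theta_i)} - \sum_{i=1}^{n} \frac{S_j(s_i; \theta_i)}{S(s_i; \theta_i)} \\
&= -\sum_{i: d_i = 1} z_{ij} \frac{h_\alpha(t_i, \theta_i)}{h(t_i, \theta_i)} - \sum_{i: d_i \ne 2} z_{ij} \frac{S_\alpha(t_i; \theta_i)}{S(t_i; \theta_i)} - \sum_{i: d_i = 2} z_{ij} \frac{S_\alpha(t_i; \theta_i) - S_\alpha(u_i; \theta_i)}{S(t_i; \theta_i) - S(u_i; \theta_i)} + \sum_{i=1}^{n} z_{ij} \frac{S_\alpha(s_i; \theta_i)}{S(s_i; \theta_i)}, \quad j = 1, \ldots, p,
\end{aligned}
\]

and for the baseline parameters $\gamma$ and $\alpha$,

\[
\frac{\partial}{\partial \gamma} l = \sum_{i: d_i = 1} \frac{h_\gamma(t_i, \theta_i)}{h(t_i, \theta_i)} + \sum_{i: d_i \ne 2} \frac{S_\gamma(t_i; \theta_i)}{S(t_i; \theta_i)} + \sum_{i: d_i = 2} \frac{S_\gamma(t_i; \theta_i) - S_\gamma(u_i; \theta_i)}{S(t_i; \theta_i) - S(u_i; \theta_i)} - \sum_{i=1}^{n} \frac{S_\gamma(s_i; \theta_i)}{S(s_i; \theta_i)},
\]

\[
\frac{\partial}{\partial \alpha} l = \sum_{i: d_i = 1} \frac{h_\alpha(t_i, \theta_i)}{h(t_i, \theta_i)} + \sum_{i: d_i \ne 2} \frac{S_\alpha(t_i; \theta_i)}{S(t_i; \theta_i)} + \sum_{i: d_i = 2} \frac{S_\alpha(t_i; \theta_i) - S_\alpha(u_i; \theta_i)}{S(t_i; \theta_i) - S(u_i; \theta_i)} - \sum_{i=1}^{n} \frac{S_\alpha(s_i; \theta_i)}{S(s_i; \theta_i)}.
\]

Here, from (3),

\[
h_\gamma(t, \theta_i) = \frac{\partial}{\partial \gamma} h(t, \theta_i) = g_{t\gamma}(t, \theta_i) h_0(g(t, \theta_i)) + g_t(t, \theta_i) g_\gamma(t, \theta_i) h_0'(g(t, \theta_i)),
\]

\[
h_\alpha(t, \theta_i) = \frac{\partial}{\partial \alpha} h(t, \theta_i) = g_{t\alpha}(t, \theta_i) h_0(g(t, \theta_i)) + g_t(t, \theta_i) g_\alpha(t, \theta_i) h_0'(g(t, \theta_i)),
\]

and

\[
h_j(t, \theta_i) = \frac{\partial}{\partial \beta_j} h(t, \theta_i) = \frac{\partial}{\partial \alpha} h(t, \theta_i) \, \frac{\partial}{\partial \beta_j} (\alpha - z_i\beta) = -z_{ij} h_\alpha(t, \theta_i), \quad j = 1, \ldots, p.
\]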

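Similarly, from (2) we get

\[
S_\gamma(t; \theta_i) = \frac{\partial}{\partial \gamma} S(t; \theta_i) = \frac{\partial}{\partial \gamma} S_0(g(t, \theta_i)) = -g_\gamma(t, \theta_i) f_0(g(t, \theta_i)),
\]

\[
S_\alpha(t; \theta_i) = \frac{\partial}{\partial \alpha} S(t; \theta_i) = \frac{\partial}{\partial \alpha} S_0(g(t, \theta_i)) = -g_\alpha(t, \theta_i) f_0(g(t, \theta_i)),
\]

and

\[
S_j(t; \theta_i) = \frac{\partial}{\partial \beta_j} S(t; \theta_i) = \frac{\partial}{\partial \alpha} S_0(g(t, \theta_i)) \, \frac{\partial}{\partial \beta_j} (\alpha - z_i\beta) = -z_{ij} S_\alpha(t; \theta_i), \quad j = 1, \ldots, p.
\]

For estimating standard errors, the observed information (the negative of the hessian) is useful, so we need the second-order derivatives of $l$:

\[
\begin{aligned}
\frac{\partial^2}{\partial \beta_j \partial \beta_m} l ={}& \sum_{i: d_i = 1} z_{ij} z_{im} \left\{ \frac{h_{\alpha\alpha}(t_i, \theta_i)}{h(t_i, \theta_i)} - \left(\frac{h_\alpha(t_i, \theta_i)}{h(t_i, \theta_i)}\right)^2 \right\} \\
&+ \sum_{i: d_i \ne 2} z_{ij} z_{im} \left\{ \frac{S_{\alpha\alpha}(t_i, \theta_i)}{S(t_i, \theta_i)} - \left(\frac{S_\alpha(t_i, \theta_i)}{S(t_i, \theta_i)}\right)^2 \right\} \\
&+ \sum_{i: d_i = 2} z_{ij} z_{im} \left\{ \frac{S_{\alpha\alpha}(t_i, \theta_i) - S_{\alpha\alpha}(u_i, \theta_i)}{S(t_i, \theta_i) - S(u_i, \theta_i)} - \left(\frac{S_\alpha(t_i, \theta_i) - S_\alpha(u_i, \theta_i)}{S(t_i, \theta_i) - S(u_i, \theta_i)}\right)^2 \right\} \\
&- \sum_{i=1}^{n} z_{ij} z_{im} \left\{ \frac{S_{\alpha\alpha}(s_i, \theta_i)}{S(s_i, \theta_i)} - \left(\frac{S_\alpha(s_i, \theta_i)}{S(s_i, \theta_i)}\right)^2 \right\}, \quad j, m = 1, \ldots, p,
\end{aligned}
\]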

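\[
\begin{aligned}
\frac{\partial^2}{\partial \beta_j \partial \tau} l ={}& -\sum_{i: d_i = 1} z_{ij} \left\{ \frac{h_{\alpha\tau}(t_i, \theta_i)}{h(t_i, \theta_i)} - \frac{h_\alpha(t_i, \theta_i) h_\tau(t_i, \theta_i)}{h^2(t_i, \theta_i)} \right\} \\
&- \sum_{i: d_i \ne 2} z_{ij} \left\{ \frac{S_{\alpha\tau}(t_i, \theta_i)}{S(t_i, \theta_i)} - \frac{S_\alpha(t_i, \theta_i) S_\tau(t_i, \theta_i)}{S^2(t_i, \theta_i)} \right\} \\
&- \sum_{i: d_i = 2} z_{ij} \left\{ \frac{S_{\alpha\tau}(t_i, \theta_i) - S_{\alpha\tau}(u_i, \theta_i)}{S(t_i, \theta_i) - S(u_i, \theta_i)} - \frac{\big(S_\alpha(t_i, \theta_i) - S_\alpha(u_i, \theta_i)\big)\big(S_\tau(t_i, \theta_i) - S_\tau(u_i, \theta_i)\big)}{\big(S(t_i, \theta_i) - S(u_i, \theta_i)\big)^2} \right\} \\
&+ \sum_{i=1}^{n} z_{ij} \left\{ \frac{S_{\alpha\tau}(s_i, \theta_i)}{S(s_i, \theta_i)} - \frac{S_\alpha(s_i, \theta_i) S_\tau(s_i, \theta_i)}{S^2(s_i, \theta_i)} \right\}, \quad j = 1, \ldots, p; \quad \tau = \gamma, \alpha,
\end{aligned}
\]

and finally

\[
\begin{aligned}
\frac{\partial^2}{\partial \tau \partial \tau'} l ={}& \sum_{i: d_i = 1} \left\{ \frac{h_{\tau\tau'}(t_i, \theta_i)}{h(t_i, \theta_i)} - \frac{h_\tau(t_i, \theta_i) h_{\tau'}(t_i, \theta_i)}{h^2(t_i, \theta_i)} \right\} \\
&+ \sum_{i: d_i \ne 2} \left\{ \frac{S_{\tau\tau'}(t_i, \theta_i)}{S(t_i, \theta_i)} - \frac{S_\tau(t_i, \theta_i) S_{\tau'}(t_i, \theta_i)}{S^2(t_i, \theta_i)} \right\} \\
&+ \sum_{i: d_i = 2} \left\{ \frac{S_{\tau\tau'}(t_i, \theta_i) - S_{\tau\tau'}(u_i, \theta_i)}{S(t_i, \theta_i) - S(u_i, \theta_i)} - \frac{\big(S_\tau(t_i, \theta_i) - S_\tau(u_i, \theta_i)\big)\big(S_{\tau'}(t_i, \theta_i) - S_{\tau'}(u_i, \theta_i)\big)}{\big(S(t_i, \theta_i) - S(u_i, \theta_i)\big)^2} \right\} \\
&- \sum_{i=1}^{n} \left\{ \frac{S_{\tau\tau'}(s_i, \theta_i)}{S(s_i, \theta_i)} - \frac{S_\tau(s_i, \theta_i) S_{\tau'}(s_i, \theta_i)}{S^2(s_i, \theta_i)} \right\}, \quad (\tau, \tau') = (\gamma, \gamma), (\gamma, \alpha), (\alpha, \alpha).
\end{aligned}
\]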

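The second order partial derivatives $h_{\tau\tau'}$ and $S_{\tau\tau'}$ are

\[
\begin{aligned}
h_{\tau\tau'}(t, \theta) ={}& \frac{\partial}{\partial \tau'} h_\tau(t, \theta) \\
={}& g_{t\tau\tau'}(t, \theta) h_0(g(t, \theta)) + g_{t\tau}(t, \theta) g_{\tau'}(t, \theta) h_0'(g(t, \theta)) \\
&+ g_t(t, \theta) g_\tau(t, \theta) g_{\tau'}(t, \theta) h_0''(g(t, \theta)) + g_t(t, \theta) g_{\tau\tau'}(t, \theta) h_0'(g(t, \theta)) \\
&+ g_{t\tau'}(t, \theta) g_\tau(t, \theta) h_0'(g(t, \theta)) \\
={}& h_0(g(t, \theta)) g_{t\tau\tau'}(t, \theta) \\
&+ h_0'(g(t, \theta)) \big\{ g_t(t, \theta) g_{\tau\tau'}(t, \theta) + g_{t\tau}(t, \theta) g_{\tau'}(t, \theta) + g_{t\tau'}(t, \theta) g_\tau(t, \theta) \big\} \\
&+ h_0''(g(t, \theta)) g_t(t, \theta) g_\tau(t, \theta) g_{\tau'}(t, \theta), \quad (\tau, \tau') = (\gamma, \gamma), (\gamma, \alpha), (\alpha, \alpha),
\end{aligned} \tag{16}
\]

and, from (10),

\[
S_{\tau\tau'}(t, \theta) = \frac{\partial}{\partial \tau'} S_\tau(t; \theta) = -\big\{ g_{\tau\tau'}(t, \theta) f_0(g(t, \theta)) + g_\tau(t, \theta) g_{\tau'}(t, \theta) f_0'(g(t, \theta)) \big\}, \quad (\tau, \tau') = (\gamma, \gamma), (\gamma, \alpha), (\alpha, \alpha). \tag{17}
\]

A Some partial derivatives

The function (see (11))

\[
g(t, (\gamma, \alpha)) = \left(\frac{t}{\exp(\alpha)}\right)^{\exp(\gamma)}, \quad t \ge 0; \quad -\infty < \gamma, \alpha < \infty, \tag{18}
\]

has the following partial derivatives:

\[
\begin{aligned}
g_t(t, (\gamma, \alpha)) &= \frac{\exp(\gamma)}{t} g(t, (\gamma, \alpha)), \quad t > 0, \\
g_\gamma(t, (\gamma, \alpha)) &= g(t, (\gamma, \alpha)) \log\{g(t, (\gamma, \alpha))\}, \\
g_\alpha(t, (\gamma, \alpha)) &= -\exp(\gamma) g(t, (\gamma, \alpha)),
\end{aligned}
\]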

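\[
\begin{aligned}
g_{t\gamma}(t, (\gamma, \alpha)) &= g_t(t, (\gamma, \alpha)) + \frac{\exp(\gamma)}{t} g_\gamma(t, (\gamma, \alpha)), \quad t > 0, \\
g_{t\alpha}(t, (\gamma, \alpha)) &= -\exp(\gamma) g_t(t, (\gamma, \alpha)), \quad t > 0, \\
g_{\gamma^2}(t, (\gamma, \alpha)) &= g_\gamma(t, (\gamma, \alpha)) \big\{1 + \log g(t, (\gamma, \alpha))\big\}, \\
g_{\gamma\alpha}(t, (\gamma, \alpha)) &= g_\alpha(t, (\gamma, \alpha)) \big\{1 + \log g(t, (\gamma, \alpha))\big\}, \\
g_{\alpha^2}(t, (\gamma, \alpha)) &= -\exp(\gamma) g_\alpha(t, (\gamma, \alpha)), \\
g_{t\gamma^2}(t, (\gamma, \alpha)) &= g_{t\gamma}(t, (\gamma, \alpha)) + \frac{\exp(\gamma)}{t} g_\gamma(t, (\gamma, \alpha)) \big\{2 + \log g(t, (\gamma, \alpha))\big\}, \quad t > 0, \\
g_{t\gamma\alpha}(t, (\gamma, \alpha)) &= -\exp(\gamma) \big\{ g_t(t, (\gamma, \alpha)) + g_{t\gamma}(t, (\gamma, \alpha)) \big\}, \quad t > 0, \\
g_{t\alpha^2}(t, (\gamma, \alpha)) &= -\exp(\gamma) g_{t\alpha}(t, (\gamma, \alpha)), \quad t > 0.
\end{aligned}
\]

The formulas will be easier to read if we remove all function arguments, i.e., $(t, (\gamma, \alpha))$:

\[
\begin{aligned}
g_t &= \frac{\exp(\gamma)}{t} g, \quad t > 0, \\
g_\gamma &= g \log g, \\
g_\alpha &= -\exp(\gamma) g, \\
g_{t\gamma} &= g_t + \frac{\exp(\gamma)}{t} g_\gamma, \quad t > 0, \\
g_{t\alpha} &= -\exp(\gamma) g_t, \quad t > 0, \\
g_{\gamma^2} &= g_\gamma \{1 + \log g\}, \\
g_{\gamma\alpha} &= g_\alpha \{1 + \log g\}, \\
g_{\alpha^2} &= -\exp(\gamma) g_\alpha, \\
g_{t\gamma^2} &= g_{t\gamma} + \frac{\exp(\gamma)}{t} g_\gamma \{2 + \log g\}, \quad t > 0, \\
g_{t\gamma\alpha} &= -\exp(\gamma) \{g_t + g_{t\gamma}\}, \quad t > 0, \\
g_{t\alpha^2} &= -\exp(\gamma) g_{t\alpha}, \quad t > 0.
\end{aligned}
\]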
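Since sign errors are easy to make in formulas of this kind, a finite-difference comparison is a cheap safeguard. The Python sketch below checks the first- and second-order derivatives of $g$ listed in this appendix against nested central differences. The helper names (`analytic`, `partial`, `max_abs_error`), the dictionary keys, and the step size are hypothetical choices made for this illustration.

```python
import math


def g(t, gamma, alpha):
    """Shape-scale transform (t / exp(alpha)) ** exp(gamma), equation (18)."""
    return (t / math.exp(alpha)) ** math.exp(gamma)


def analytic(t, gamma, alpha):
    """Closed-form partial derivatives of g, keyed by differentiation order."""
    p = math.exp(gamma)
    val = g(t, gamma, alpha)
    log_g = math.log(val)
    d = {}
    d["t"] = p / t * val
    d["gamma"] = val * log_g
    d["alpha"] = -p * val
    d["t_gamma"] = d["t"] + p / t * d["gamma"]
    d["t_alpha"] = -p * d["t"]
    d["gamma_gamma"] = d["gamma"] * (1.0 + log_g)
    d["gamma_alpha"] = d["alpha"] * (1.0 + log_g)
    d["alpha_alpha"] = -p * d["alpha"]
    return d


def partial(f, i, eps=1e-4):
    """Return a function approximating the partial derivative of f in argument i."""
    def df(*x):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        return (f(*hi) - f(*lo)) / (2.0 * eps)
    return df


def max_abs_error(t, gamma, alpha):
    """Largest absolute gap between analytic and finite-difference derivatives."""
    index = {"t": 0, "gamma": 1, "alpha": 2}
    worst = 0.0
    for name, value in analytic(t, gamma, alpha).items():
        f = g
        for part in name.split("_"):
            f = partial(f, index[part])  # nest one central difference per order
        worst = max(worst, abs(f(t, gamma, alpha) - value))
    return worst


if __name__ == "__main__":
    print(max_abs_error(1.7, 0.3, -0.2))  # arbitrary evaluation point
```

The third-order derivatives ($g_{t\gamma^2}$, $g_{t\gamma\alpha}$, $g_{t\alpha^2}$) could be checked the same way, at the cost of a larger step size and looser tolerance because rounding error grows with each nested difference.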