Efficient Semiparametric Estimators via Preliminary Estimators for Kullback-Leibler Minimizers

Σχετικά έγγραφα
SOLUTIONS TO MATH38181 EXTREME VALUES AND FINANCIAL RISK EXAM

6. MAXIMUM LIKELIHOOD ESTIMATION

2 Composition. Invertible Mappings

Example Sheet 3 Solutions

SOLUTIONS TO MATH38181 EXTREME VALUES AND FINANCIAL RISK EXAM

Second Order Partial Differential Equations

D Alembert s Solution to the Wave Equation

Other Test Constructions: Likelihood Ratio & Bayes Tests

Estimation for ARMA Processes with Stable Noise. Matt Calder & Richard A. Davis Colorado State University

6.3 Forecasting ARMA processes

Statistical Inference I Locally most powerful tests

Bayesian statistics. DS GA 1002 Probability and Statistics for Data Science.

ST5224: Advanced Statistical Theory II

Lecture 12: Pseudo likelihood approach

( y) Partial Differential Equations

SCHOOL OF MATHEMATICAL SCIENCES G11LMA Linear Mathematics Examination Solutions

derivation of the Laplacian from rectangular to spherical coordinates

Lecture 21: Properties and robustness of LSE

Homework 8 Model Solution Section

Chapter 6: Systems of Linear Differential. be continuous functions on the interval

Exercises to Statistics of Material Fatigue No. 5

Econ 2110: Fall 2008 Suggested Solutions to Problem Set 8 questions or comments to Dan Fetter 1

5.4 The Poisson Distribution.

Homework 3 Solutions

Solution Series 9. i=1 x i and i=1 x i.

Phys460.nb Solution for the t-dependent Schrodinger s equation How did we find the solution? (not required)

The Simply Typed Lambda Calculus

Partial Differential Equations in Biology The boundary element method. March 26, 2013

C.S. 430 Assignment 6, Sample Solutions

Asymptotic distribution of MLE

Lecture 7: Overdispersion in Poisson regression

Section 8.3 Trigonometric Equations

Matrices and Determinants

Homework for 1/27 Due 2/5

Inverse trigonometric functions & General Solution of Trigonometric Equations

HOMEWORK 4 = G. In order to plot the stress versus the stretch we define a normalized stretch:

Lecture 2: Dirac notation and a review of linear algebra Read Sakurai chapter 1, Baym chatper 3

P AND P. P : actual probability. P : risk neutral probability. Realtionship: mutual absolute continuity P P. For example:

Εργαστήριο Ανάπτυξης Εφαρμογών Βάσεων Δεδομένων. Εξάμηνο 7 ο

Απόκριση σε Μοναδιαία Ωστική Δύναμη (Unit Impulse) Απόκριση σε Δυνάμεις Αυθαίρετα Μεταβαλλόμενες με το Χρόνο. Απόστολος Σ.

CHAPTER 25 SOLVING EQUATIONS BY ITERATIVE METHODS

Math 446 Homework 3 Solutions. (1). (i): Reverse triangle inequality for metrics: Let (X, d) be a metric space and let x, y, z X.

Parametrized Surfaces

Areas and Lengths in Polar Coordinates

Queensland University of Technology Transport Data Analysis and Modeling Methodologies

Section 8.2 Graphs of Polar Equations

A Note on Intuitionistic Fuzzy. Equivalence Relation

Divergence for log concave functions

Concrete Mathematics Exercises from 30 September 2016

Exercises 10. Find a fundamental matrix of the given system of equations. Also find the fundamental matrix Φ(t) satisfying Φ(0) = I. 1.

Chapter 6: Systems of Linear Differential. be continuous functions on the interval

Abstract Storage Devices

Every set of first-order formulas is equivalent to an independent set

Higher Derivative Gravity Theories

Numerical Analysis FMN011

Areas and Lengths in Polar Coordinates

DiracDelta. Notations. Primary definition. Specific values. General characteristics. Traditional name. Traditional notation

4.6 Autoregressive Moving Average Model ARMA(1,1)

Uniform Convergence of Fourier Series Michael Taylor

The ε-pseudospectrum of a Matrix

6.1. Dirac Equation. Hamiltonian. Dirac Eq.

Module 5. February 14, h 0min

forms This gives Remark 1. How to remember the above formulas: Substituting these into the equation we obtain with

Math 6 SL Probability Distributions Practice Test Mark Scheme

2. THEORY OF EQUATIONS. PREVIOUS EAMCET Bits.

ω ω ω ω ω ω+2 ω ω+2 + ω ω ω ω+2 + ω ω+1 ω ω+2 2 ω ω ω ω ω ω ω ω+1 ω ω2 ω ω2 + ω ω ω2 + ω ω ω ω2 + ω ω+1 ω ω2 + ω ω+1 + ω ω ω ω2 + ω

Heisenberg Uniqueness pairs

Reminders: linear functions

Srednicki Chapter 55

2. Let H 1 and H 2 be Hilbert spaces and let T : H 1 H 2 be a bounded linear operator. Prove that [T (H 1 )] = N (T ). (6p)

Paper Reference. Paper Reference(s) 6665/01 Edexcel GCE Core Mathematics C3 Advanced. Thursday 11 June 2009 Morning Time: 1 hour 30 minutes

Lecture 34 Bootstrap confidence intervals

ANSWERSHEET (TOPIC = DIFFERENTIAL CALCULUS) COLLECTION #2. h 0 h h 0 h h 0 ( ) g k = g 0 + g 1 + g g 2009 =?

Υπολογιστική Φυσική Στοιχειωδών Σωματιδίων

Problem Set 3: Solutions

Ordinal Arithmetic: Addition, Multiplication, Exponentiation and Limit

Jesse Maassen and Mark Lundstrom Purdue University November 25, 2013

Theorem 8 Let φ be the most powerful size α test of H

SCITECH Volume 13, Issue 2 RESEARCH ORGANISATION Published online: March 29, 2018

Lecture 2. Soundness and completeness of propositional logic

Introduction to the ML Estimation of ARMA processes

Statistics 104: Quantitative Methods for Economics Formula and Theorem Review

Survival Analysis: One-Sample Problem /Two-Sample Problem/Regression. Lu Tian and Richard Olshen Stanford University

On the summability of divergent power series solutions for certain first-order linear PDEs Masaki HIBINO (Meijo University)

CHAPTER 101 FOURIER SERIES FOR PERIODIC FUNCTIONS OF PERIOD

= λ 1 1 e. = λ 1 =12. has the properties e 1. e 3,V(Y

DESIGN OF MACHINERY SOLUTION MANUAL h in h 4 0.

Overview. Transition Semantics. Configurations and the transition relation. Executions and computation

ΗΜΥ 220: ΣΗΜΑΤΑ ΚΑΙ ΣΥΣΤΗΜΑΤΑ Ι Ακαδημαϊκό έτος Εαρινό Εξάμηνο Κατ οίκον εργασία αρ. 2

Parametric proportional hazards and accelerated failure time models

Differential equations

Test Data Management in Practice

Section 9.2 Polar Equations and Graphs

k A = [k, k]( )[a 1, a 2 ] = [ka 1,ka 2 ] 4For the division of two intervals of confidence in R +

EE512: Error Control Coding

Mock Exam 7. 1 Hong Kong Educational Publishing Company. Section A 1. Reference: HKDSE Math M Q2 (a) (1 + kx) n 1M + 1A = (1) =

1. A fully continuous 20-payment years, 30-year term life insurance of 2000 is issued to (35). You are given n A 1

Appendix to On the stability of a compressible axisymmetric rotating flow in a pipe. By Z. Rusak & J. H. Lee

TMA4115 Matematikk 3

Finite Field Problems: Solutions

Transcript:

IISA Talk, DeKalb, June 02 Efficient Semiparametric Estimators via Preliminary Estimators for Kullback-Leibler Minimizers Eric Slud, Math Dept, Univ of Maryland (Ongoing joint project with Ilia Vonta, Univ. of Cyprus.) OUTLINE I. MOTIVATION Semiparametric Frailty Problems II. APPROACH Finite-Dimensional Version III. BACKGROUND LITERATURE Profile Likelihood, Semiparametrics, Frailty Models IV. Semiparametric Examples V. Efficient Estimator in Frailty Case 1

Survival Models with Frailties Variables: T i Survival times, Discrete Covariates Z i C i Censoring Times, cond surv fcn R z (c) given (T i, C i ) DATA: iid triples (min(t i, C i ), I [ T i C i ], Z i ) Observable processes: N i z(t) = I [Zi =z, T i min(c i,t)], Y i z (t) = I [min(ti, C i ) t] TRANSF. MODEL: S T Z (t z) = exp( G(e β z Λ)) G known, β R m, Λ cumulative-hazard fcn PROBLEM: efficient estimation of β. Special Cases: (1) Cox 1972: G(x) x (2) Frailty: unobserved random intercept β 0 = ξ i, G x = G(x) = log 0 e sx df (s) (3) Clayton-Cuzick 1986: G(x) 1 b log(1 + bx) 2

Finite-dimensional Case X i, i = 1,..., n iid f(x, β, λ), β R m, λ R d unknown, with true values (β 0, λ 0 ) loglik(β, λ) = n i=1 log f(x i, β, λ) Profile Likelihood = loglik(β, ˆλ β ) with restricted MLE ˆλβ = arg max λ loglik(β, λ) Min Kullback-Leibler Modified Profile Approach (Severini and Wong 1992) K(β, λ) E β0,λ 0 ( log f(x 1, β, λ)) = {log f(x, β, λ)} f(x, β 0, λ 0 ) dx Define: λ β = arg max λ K(β, λ) Then: λβ estimates curve λ β Candidate Estimator β arg max β loglik(β, λ β ) 3

Key mathematical features of this approach are the convenience of restricting attention to nuisance parameters such as hazards or density functions which satisfy smoothness restrictions; the replacement of operator-inversion within (blocks of) the generalized information operator by differentiation of the restricted Kullback-Leibler minimizer; and diminished need for high-order consistency of estimation, when consistent estimators of Kullback-Leibler minimizers and their derivatives with respect to structural parameters are available. NB: Kullback-Leibler Functional = K(β, λ) 4

Sketch of (Finite-dimensional) Theory Fix β 0, λ 0 and f 0 (x) = f(x, β 0, λ 0 ). Information Matrix : I(β, λ) = = 2 A β,λ Bβ,λ β log f(x, β, λ) 2 βλ log f(x, β, λ) 2 λβ log f(x, β, λ) 2 λ log f(x, β, λ) B β,λ C β,λ f 0 (x)dx λ β = arg max λ K(β, λ) satisfies: λ K(β, λ β ) = 0 Differentiate implicitly (total deriv) wrt β to find: T β [ λ K(β, λ β ))] = B + C β λ β = 0 This implies β λ β = C 1 B and (β, λ β ) is a least-favorable nuisance-parameterization and usual Information about β for this model is I 0 β = A β0,λ 0 ( β λ β ) C β0,λ 0 ( β λ β ) = A β0,λ 0 B β0,λ 0 C 1 β 0,λ 0 B β 0,λ 0 5

Theory, continued Information Ineq says: λ β0 = λ 0, β K(β 0, λ 0 ) = 0 Now assume λ β, 2 β β consistent for λ β, 2 β λ β unif on nbhd of β 0. Can check successively: (A) ( T β ) 2 loglik(β, λ β ) neg-def unif on β 0 nbhd (B) n 1 loglik(β 0, λ β0 ) P 0 (C) β unique loc sol n of T β loglik(β, λ β ) = 0, consistent for β 0. (D) n 1 λ loglik( β, λ β) = n 1 λ loglik(β 0, λ 0 ) + o P ( β β 0 ) [because C 1 B = β λβ ] (E) n 1 β loglik( β, λ β) = n 1 β loglik(β 0, λ 0 ) I 0 β ( β β 0 ) + o P ( β β 0 ) (F) (G) n ( β β0 ) = (I 0 β) 1 1 n ( β loglik(β 0, λ 0 ) + λ loglik(β 0, λ 0 ) β λ β0 ) n ( β β0 ) D N (0, (I 0 β) 1 ) 6

LITERATURE Profile Likelihood Cox, D. R. & Reid, N. (1987) JRSSB McCullagh, P. and Tibshirani, R. (1990) JRSSB Severini, T. and Wong, W. (1992) Ann Stat Semiparametrics Bickel, Klaassen, Ritov, & Wellner: 1993 Book Cox, D. R. (1972) Cox-Model paper JRSSB Owen, A. (1988) Biometrika: Empirical likelihood Qin, J. & Lawless, J. (1994) Ann Stat: EL & GEE Murphy & van der Vaart (2000) JASA Transformation/Frailty Models Cheng, Wei, & Ying (1995) Biometrika Clayton and Cuzick (1986) ISI Centenary Session Slud, E. and Vonta, I. (2002) submitted Parner (1998) Ann. Stat. 7

-dim Examples & Applications (I). Mean estimation in the location model X i λ 0 (x β 0 ), β 0 = E(X 1 ) λ 0 L 1 (dx) compactly supported 0-mean density. (Easy example, X efficient. Owen 1988, Qin & Lawless 1994 studied in connection with empirical likelihood.) For β β 0, λ β (x) = λ 0 (x + β)/(1 αx) with α solving x λ β (x) dx = 0 Define λ β λ β0 (x) = 1 n h n first at β 0 using density estimator n i=1 λ β (x) = λ β0 (x + β) 1 αx φ((x X i + X)/h n ), h n 0, x λ β0 (x + β) 1 αx dx = 0 (II). Cox model. For q z (t) = p Z (z)r z (t) exp( e z β 0 Λ 0 (t)), Λ β (t) t 0 λ β(s) ds = z q z (x) e z β 0 λ 0 (x) z e z β q z (x) 8

Example (II), cont d. First let λ 0 be consistent density estimate of λ 0 (x) = Λ 0(x) (eg by smoothing and differentiating the Kaplan- Meier cumulative-hazard estimator on data in a z = 0 data-stratum.) Then estimate elsewhere by λ β (t) = z ez β 0 Y z (t) λ β0 (t) / z ez β Y z (t) NB. In this example, any β estimator produced in this way collapses to the usual Cox Max Partial Likelihood Estimator! (III). Transformation/Frailty Models In the general G transformation model case, must assume for some finite time τ 0 with Λ 0 (tau 0 ) < that all data are censored at τ 0. In this model, Slud and Vonta (2002) characterize the K-optimizing hazard intensity λ β in its integrated form L = Λ β = 0 λ β (x) dx, through the second order ODE system: 9

Example (III), cont d. dl dλ 0 (s) = z e z β 0 q z (s) G (e z β 0 Λ 0 (s)) z e z β q z (s) G (e z β L(s)) + Q(s) dq dλ 0 (s) = z ez β q z (s) G G e z β L(s) (e z β 0 G (e z β 0 Λ 0 (s)) (e z β G ((e z β L(s)) dl dλ 0 (s)) subject to the initial/terminal conditions L(0) = 0, Q(τ 0 ) = 0 Slud and Vonta (2002) show that these ODE s have unique solutions, smooth with respect to β and differentiable in t, which (with λ β L ) maximize the functional J (β, λ β ) as desired. Consistent preliminary estimators λβ can be developed by substitute for β 0, Λ 0 in those equations (smoothed wrt t) consistent preliminary estimators. Punchline: the new estimator β = arg max β loglik(β, λ β ) is efficient! 10