Section 4: Conditional Likelihood: Sufficiency and Ancillarity in the Presence of Nuisance Parameters


In this section we will:
(i) explore decomposing the information about the parameter of interest into conditional and marginal likelihood parts;
(ii) give conditions (ancillarities) under which the conditional likelihood is efficient.

Section 4.1 Partial sufficiency and ancillarity

Suppose that X ~ p(x; θ_0) ∈ P = {p(x; θ) : θ ∈ Θ}. We assume θ = (γ, λ) and Θ = Γ × Λ, where Γ is an open interval of the real line and Λ is an open set in R^k. The parameter of interest is γ and the nuisance parameter is λ.

Let T(X) be a statistic. Then we know that

p(x; θ) = h(x | T(x); θ) f(T(x); θ),

where h is the conditional density of X given T(X) and f is the marginal density of T(X).

Partial Sufficiency

Definition: If h(x | T(x); θ) = h(x | T(x); λ) and f(T(x); θ) = f(T(x); γ), i.e.,

p(x; θ) = h(x | T(x); λ) f(T(x); γ),

then we say that T(X) is partially sufficient for γ.

The term "partial" is introduced because sufficiency must be established for each fixed λ. Inference about γ can then be made using only the marginal distribution of T(X), with no loss of information: if the score for γ is in the class of unbiased estimating functions, then the score is the most efficient member of the class. See Basu (JASA, 1977, pp. 355-366).

Example 4.1: Partial Sufficiency

Suppose that X is a single observation with density

p(x; θ) = (1 - γ) λ exp(λx) for x ≤ 0,
p(x; θ) = γ λ exp(-λx) for x > 0.

Here Γ = (0, 1) and Λ = R_+. Show that T(X) = I(X > 0) is partially sufficient for γ.

To see this, note that

h(x | T(x); θ) = (λ exp(-λx))^{T(x)} (λ exp(λx))^{1 - T(x)},
f(T(x); θ) = γ^{T(x)} (1 - γ)^{1 - T(x)}.
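
A quick way to see the factorization at work (an illustrative sketch; the helper name sample_x and the parameter values are chosen for illustration): the marginal of T(X) = I(X > 0) is Bernoulli(γ) whatever the value of λ, so γ can be recovered from the indicator alone.

```python
import numpy as np

def sample_x(n, gamma, lam, rng):
    """Draw n observations from p(x; gamma, lambda):
    (1-gamma)*lam*exp(lam*x) for x <= 0 and gamma*lam*exp(-lam*x) for x > 0."""
    positive = rng.random(n) < gamma        # P(X > 0) = gamma
    mags = rng.exponential(1.0 / lam, n)    # |X| is Exponential(rate = lam) on either side
    return np.where(positive, mags, -mags)

rng = np.random.default_rng(0)
x = sample_x(100_000, gamma=0.3, lam=2.0, rng=rng)
t = (x > 0).astype(float)

# The sample mean of T estimates gamma with no reference to lambda
# (partial sufficiency of T for gamma).
print("estimate of gamma from T alone:", t.mean())
```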

Partial Ancillarity

Suppose h(x | T(x); θ) = h(x | T(x); γ), while the marginal f(T(x); θ) may depend on all of θ; i.e.,

p(x; θ) = h(x | T(x); γ) f(T(x); θ).

We make inference about γ via the conditional distribution, and we will impose assumptions (ancillarities) on the marginal distribution of T(X) in order to guarantee that there is no loss of information.

Section 4.2 Important types of partial ancillarity

S-Ancillarity

Definition: T(X) is said to be S-ancillary for γ if f(T(x); θ) depends only on λ, i.e.,

p(x; θ) = h(x | T(x); γ) f(T(x); λ).

This definition is equivalent to letting T(X) be partially sufficient for λ. See Basu (JASA, 1977, pp. 355-366).

R-Ancillarity

Definition: T(X) is said to be R-ancillary for γ if there exists a reparameterization between θ = (γ, λ) and (γ, φ) such that f(T(x); θ) depends on θ only through φ. That is,

p(x; θ) = h(x | T(x); γ) f(T(x); φ).

See Basawa (Biometrika, 1981, pp. 153-164).

C-Ancillarity

Definition: T(X) is said to be C-ancillary for γ if, for all γ ∈ Γ, the class {f(T(x); γ, λ) : λ ∈ Λ} is complete.

Completeness: E_θ[m(T(X); γ)] = 0 for all λ ∈ Λ implies P_θ[m(T(X); γ) = 0] = 1 for all λ ∈ Λ.

See Godambe (Biometrika, 1976, pp. 277-284).

A (Weak)-Ancillarity

Definition: T(X) is said to be A-ancillary if for any given θ_0 ∈ Θ and any other γ ∈ Γ, there exists a λ = λ(γ, θ_0) such that

f(t; θ_0) = f(t; γ, λ) for all t.

If this condition holds, then the family {f(T(x); γ, λ) : λ ∈ Λ} is the same whatever the value of γ. Intuitively, observation of T(X) cannot give us any information about γ when λ is unknown. See Andersen (JRSS-B, 1970, pp. 283-301).

Example 4.2: Partial Ancillarity

Let X = (Y, Z), where Y and Z are independent normal random variables with variance 1 and means γ and γ + λ, respectively. Here Γ = Λ = R. Let T(X) = Z. Then we know that Y | T(X) ~ N(γ, 1) and T(X) ~ N(γ + λ, 1).

T(X) is not S-ancillary for γ.

Let φ = γ + λ, so that T(X) ~ N(φ, 1). Then T(X) is R-ancillary for γ.

For fixed γ, we see that

f(t; θ) = (2π)^{-1/2} exp(-(1/2)(t² - 2γt - 2λt + (γ + λ)²))
        = (2π)^{-1/2} exp(-(1/2)(t² - 2γt)) exp(-(1/2)(γ + λ)²) exp(λt).

By exponential family results, we know that for fixed γ the family {f(t; γ, λ) : λ ∈ R} is complete. Thus, we know that T(X) is C-ancillary.

For fixed θ_0 and given γ, define λ = λ(γ, θ_0) = -γ + γ_0 + λ_0. Then for all t,

f(t; θ_0) = f(t; γ, λ).

So, T(X) is A-ancillary.
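
A numerical illustration of the A-ancillarity construction (a minimal sketch; the parameter values are chosen for illustration): whatever γ we entertain, taking λ = -γ + γ_0 + λ_0 reproduces exactly the same N(γ_0 + λ_0, 1) marginal law for T(X) = Z, so T alone cannot distinguish values of γ when λ is free to vary.

```python
import numpy as np

def normal_pdf(t, mean):
    # N(mean, 1) density
    return np.exp(-0.5 * (t - mean) ** 2) / np.sqrt(2.0 * np.pi)

gamma0, lambda0 = 1.0, 2.0              # a fixed "true" theta_0 = (gamma0, lambda0)
t_grid = np.linspace(-3.0, 9.0, 25)
f_theta0 = normal_pdf(t_grid, gamma0 + lambda0)

for gamma in (-5.0, 0.0, 3.7):          # any value of gamma whatsoever
    lam = -gamma + gamma0 + lambda0     # lambda(gamma, theta_0) from the slide
    assert np.allclose(f_theta0, normal_pdf(t_grid, gamma + lam))

print("marginal law of T = Z is identical for every gamma: A-ancillarity holds")
```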

Section 4.3 Information decomposition

Suppose h(x | T(x); θ) = h(x | T(x); γ), while the marginal f(T(x); θ) may depend on all of θ; i.e.,

p(x; θ) = h(x | T(x); γ) f(T(x); θ).

Recall: the Fisher information for γ was defined as

I_γ(θ) = I_γγ(θ) - I_γλ(θ) I_λλ(θ)^{-1} I_λγ(θ) = E_θ[(ψ_γ(X; θ) - Π[ψ_γ(X; θ) | Λ_θ])²].

The generalized Fisher information for γ from a class G of unbiased estimating functions, denoted I*_γ(θ), was defined as

I*_γ(θ) = E_θ[(ψ_γ(X; θ) - Π[ψ_γ(X; θ) | G_γ])²].

Let ψ^C_γ(X; γ) and ψ^M_γ(X; θ) be the scores for γ based on the conditional and marginal parts of the factorization of the density of X. Let ψ_λ(X; θ) = ψ^M_λ(X; θ) be the score for λ from the marginal part of the factorization. Note that the nuisance tangent space Λ_θ is the same as the nuisance tangent space from the marginal density of T(X) (since λ enters the likelihood only through the marginal part).

The Fisher information for γ contained in the conditional distribution h is

I^C_γ(θ) = E_θ[ψ^C_γ(X; γ)²].

The generalized Fisher information for γ from a class G of unbiased estimating functions, based on the conditional distribution h, is

I*^C_γ(θ) = E_θ[(ψ^C_γ(X; γ) - Π[ψ^C_γ(X; γ) | G_γ])²].

If ψ^C_γ(X; γ) ∈ G_γ, then I*^C_γ(θ) = I^C_γ(θ).

The Fisher information for γ contained in the marginal distribution f is

I^M_γ(θ) = E_θ[(ψ^M_γ(X; θ) - Π[ψ^M_γ(X; θ) | Λ_θ])²].

The generalized Fisher information for γ from a class G of unbiased estimating functions, based on the marginal distribution f, is

I*^M_γ(θ) = E_θ[(ψ^M_γ(X; θ) - Π[ψ^M_γ(X; θ) | G_γ])²].

Theorem 4.1: If ψ^C_γ(X; γ) ∈ G_γ, then I*_γ(θ) = I*^C_γ(θ) + I*^M_γ(θ).

Proof:

I*_γ(θ) = E_θ[(ψ_γ(X; θ) - Π[ψ_γ(X; θ) | G_γ])²]
        = E_θ[(ψ^C_γ(X; γ) + ψ^M_γ(X; θ) - Π[ψ^M_γ(X; θ) | G_γ])²]
        = I^C_γ(θ) + I*^M_γ(θ) + 2 E_θ[ψ^C_γ(X; γ)(ψ^M_γ(X; θ) - Π[ψ^M_γ(X; θ) | G_γ])]
        = I^C_γ(θ) + I*^M_γ(θ) + 2 E_θ[ψ^C_γ(X; γ) ψ^M_γ(X; θ)]
        = I^C_γ(θ) + I*^M_γ(θ) = I*^C_γ(θ) + I*^M_γ(θ),

where the final equality uses ψ^C_γ(X; γ) ∈ G_γ.
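
In both decomposition proofs the key step is the last one: the conditional score is uncorrelated with the marginal score for γ. Spelled out (a standard argument under the usual regularity conditions): since ∫ h(x | t; γ) dµ(x) = 1 for every t, differentiating under the integral sign with respect to γ gives E_θ[ψ^C_γ(X; γ) | T(X)] = 0, and ψ^M_γ(X; θ) is a function of T(X) alone, so

E_θ[ψ^C_γ(X; γ) ψ^M_γ(X; θ)] = E_θ{E_θ[ψ^C_γ(X; γ) | T(X)] ψ^M_γ(X; θ)} = 0.

The same identity disposes of the cross term in Theorem 4.2 below.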

Theorem 4.2: I_γ(θ) = I^C_γ(θ) + I^M_γ(θ).

Proof: Note that ψ^C_γ(X; γ) ⊥ Λ_θ.

I_γ(θ) = E_θ[(ψ_γ(X; θ) - Π[ψ_γ(X; θ) | Λ_θ])²]
       = E_θ[(ψ^C_γ(X; γ) + ψ^M_γ(X; θ) - Π[ψ^M_γ(X; θ) | Λ_θ])²]
       = I^C_γ(θ) + I^M_γ(θ) + 2 E_θ[ψ^C_γ(X; γ)(ψ^M_γ(X; θ) - Π[ψ^M_γ(X; θ) | Λ_θ])]
       = I^C_γ(θ) + I^M_γ(θ) + 2 E_θ[ψ^C_γ(X; γ) ψ^M_γ(X; θ)]
       = I^C_γ(θ) + I^M_γ(θ).

The information decompositions spell out how the information about γ is partitioned between the conditional and marginal parts of the factorization.
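
As a concrete check of Theorem 4.2, consider Example 4.2 again (the numbers below are a routine calculation for that model, included for illustration). With Y ~ N(γ, 1) independent of T(X) = Z ~ N(γ + λ, 1), the full-data scores are ψ_γ = (Y - γ) + (Z - γ - λ) and ψ_λ = Z - γ - λ, so I_γγ = 2, I_γλ = I_λλ = 1, and

I_γ(θ) = I_γγ - I_γλ I_λλ^{-1} I_λγ = 2 - 1 = 1.

The conditional part h(x | T(x); γ) is just the N(γ, 1) density of Y, so ψ^C_γ = Y - γ and I^C_γ(θ) = 1. The marginal score for γ is ψ^M_γ = Z - γ - λ = ψ_λ ∈ Λ_θ, so its projection residual vanishes and I^M_γ(θ) = 0. Indeed I_γ(θ) = I^C_γ(θ) + I^M_γ(θ) = 1 + 0: all of the information about γ sits in the conditional part, consistent with T(X) being R-ancillary.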

Section 4.4 Optimality of the conditional score under ancillarities

What is the efficiency of ψ^C_γ(X; γ)? Assume that ψ^C_γ(X; γ) ∈ G_γ. Then

Eff_θ[ψ^C_γ(X; γ)] ≡ {E_θ[∂ψ^C_γ(X; γ)/∂γ]}² / E_θ[ψ^C_γ(X; γ)²]
                   = {E_θ[∂² log h(X | T(X); γ)/∂γ²]}² / E_θ[(∂ log h(X | T(X); γ)/∂γ)²]
                   = E_θ[ψ^C_γ(X; γ)²] = I^C_γ(θ) = I*^C_γ(θ).

Now, we say a UEF g_0 is optimal for γ if Eff_θ[g_0] = I*_γ(θ). By Theorem 4.1, the conditional score function will be optimal if I*^M_γ(θ) = 0. We will now give conditions under which this latter condition holds.
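
In more detail, the middle equality above is the information (Bartlett) identity applied to the conditional model: differentiating ∫ h(x | t; γ) dµ(x) = 1 twice with respect to γ gives, under the usual regularity conditions,

E_θ[∂² log h(X | T(X); γ)/∂γ²] = -E_θ[(∂ log h(X | T(X); γ)/∂γ)²],

so the squared numerator cancels one factor of the denominator and the efficiency reduces to E_θ[ψ^C_γ(X; γ)²] = I^C_γ(θ).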

Lemma 4.3: If T(X) is S-ancillary, then the conditional score function is optimal.

Proof:

I*^M_γ(θ) = E_θ[(ψ^M_γ(X; θ) - Π[ψ^M_γ(X; θ) | G_γ])²] = 0,

since ψ^M_γ(X; θ) = 0 (under S-ancillarity the marginal of T(X) does not depend on γ).

Lemma 4.4: If T(X) is R-ancillary, then the conditional score function is optimal.

Proof: It is sufficient to show that I^M_γ(θ) = 0, since we know that 0 ≤ I*^M_γ(θ) ≤ I^M_γ(θ).

Since T(X) is R-ancillary, we know that there exists a one-to-one reparameterization between θ = (γ, λ) and (γ, φ) such that f(T(x); θ) depends on θ only through φ. That is,

f(t; θ) = f*(t; φ(θ)).

Under suitable regularity conditions, we know that

ψ^M_γ(θ) = ∂ log f*(T(X); φ(θ))/∂φ · ∂φ(θ)/∂γ,
ψ^M_λ(θ) = ∂ log f*(T(X); φ(θ))/∂φ · ∂φ(θ)/∂λ.

Assuming that ∂φ(θ)/∂λ is nonsingular, we know that

∂ log f*(T(X); φ(θ))/∂φ = ψ^M_λ(θ) [∂φ(θ)/∂λ]^{-1}.

This implies that

ψ^M_γ(θ) = ψ^M_λ(θ) [∂φ(θ)/∂λ]^{-1} ∂φ(θ)/∂γ,

i.e., ψ^M_γ(θ) is a linear combination, with coefficients depending on θ only, of the components of the nuisance score ψ^M_λ(θ). Note that ψ^M_γ(θ) ∈ Λ_θ. This implies that Π[ψ^M_γ(θ) | Λ_θ] = ψ^M_γ(θ). So,

I^M_γ(θ) = E_θ[(ψ^M_γ(θ) - Π[ψ^M_γ(θ) | Λ_θ])²] = 0.

Lemma 4.5: If T(X) is C-ancillary, then the conditional score function is optimal.

Proof: It suffices to show that ψ^M_γ(X; θ) is orthogonal to G_γ. Since T(X) is C-ancillary, we know that for all γ ∈ Γ the family {f(t; γ, λ) : λ ∈ Λ} is complete. That is,

E_θ[m(T(X); γ)] = 0 for all λ ∈ Λ implies P_θ[m(T(X); γ) = 0] = 1 for all λ ∈ Λ.

Now, for a UEF g(X; γ), since ψ^M_γ(X; θ) is a function of T(X),

E_θ[ψ^M_γ(X; θ) g(X; γ)] = E_θ[ψ^M_γ(X; θ) E_θ[g(X; γ) | T(X)]].

Note that E_θ[g(X; γ) | T(X)] is a function of T(X) and γ with mean zero for every λ ∈ Λ. By completeness, E_θ[g(X; γ) | T(X)] = 0 a.e. So

E_θ[ψ^M_γ(X; θ) g(X; γ)] = 0,

which implies that ψ^M_γ(X; θ) ⊥ G_γ.

Lemma 4.6: If T(X) is A-ancillary, then it is R-ancillary.

Proof: If T(X) is A-ancillary, then for any given θ_0 ∈ Θ and any other γ ∈ Γ, there exists a λ = λ(γ, θ_0) such that

f(t; θ_0) = f(t; γ, λ(γ, θ_0)) for all t.

For any γ, the distribution of T(X) therefore depends only on φ = λ(γ, θ_0). So there is a transformation between θ and (γ, φ) such that f(T(x); θ) depends only on φ. That is, T(X) is R-ancillary.

Corollary 4.7: If T(X) is A-ancillary, then the conditional score function is optimal.

Proof: If T(X) is A-ancillary, then it is R-ancillary (Lemma 4.6), and R-ancillarity implies that the conditional score is optimal (Lemma 4.4).

Four Examples of Ancillarity

Example 4.3: Let X = (Y_1, Y_2), where Y_1 and Y_2 are independent Poisson random variables with means µ_1 and µ_2. Let γ = µ_1/(µ_1 + µ_2) and λ = µ_1 + µ_2. Let T(X) = Y_1 + Y_2. Show that T(X) is S-ancillary for γ.

We know that T(X) ~ Poisson(λ). The conditional distribution of X given T(X) is equal to

h(x | T(x); θ) = P[Y_1 = y_1, Y_2 = y_2, T(X) = t] / P[T(X) = t]
              = P[Y_1 = y_1, Y_2 = t - y_1] I(y_1 ≤ t) / P[Y_1 + Y_2 = t]
              = {exp(-µ_1) µ_1^{y_1} exp(-µ_2) µ_2^{t - y_1} I(y_1 ≤ t) / [y_1! (t - y_1)!]} / {exp(-µ_1 - µ_2) (µ_1 + µ_2)^t / t!}
              = [t! / (y_1! (t - y_1)!)] (µ_1/(µ_1 + µ_2))^{y_1} (µ_2/(µ_1 + µ_2))^{t - y_1} I(y_1 ≤ t)
              = [t! / (y_1! (t - y_1)!)] γ^{y_1} (1 - γ)^{t - y_1} I(y_1 ≤ t).

So p(x; θ) = h(x | T(x); γ) f(T(x); λ), and T(X) is S-ancillary.
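
A small simulation makes the factorization concrete (an illustrative sketch; the parameter values are chosen for illustration): conditioning two independent Poisson counts on their sum gives a Binomial(t, γ) count for Y_1, whose law involves only γ, while the sum itself is Poisson(λ).

```python
import numpy as np

rng = np.random.default_rng(1)
mu1, mu2 = 3.0, 5.0
gamma, lam = mu1 / (mu1 + mu2), mu1 + mu2

y1 = rng.poisson(mu1, 500_000)
y2 = rng.poisson(mu2, 500_000)
t = y1 + y2

# Conditional on T = t0, Y1 should be Binomial(t0, gamma): compare means.
t0 = 8
sel = (t == t0)
print("E[Y1 | T = 8] (simulated):", y1[sel].mean(), " vs  t0 * gamma:", t0 * gamma)

# Marginal of T is Poisson(lambda), free of gamma.
print("E[T] (simulated):", t.mean(), " vs  lambda:", lam)
```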

Example 4.4: Let X = (X_1, X_2, ..., X_n) be i.i.d. Normal random variables with mean µ and variance σ². Let γ = σ² and λ = µ. Let T(X) = X̄, the sample mean.

We know that X̄ ~ N(λ, γ/n). By exponential family results, we know that for fixed γ, T(X) is complete for λ. The conditional distribution of X given T(X) is equal to

h(x | T(X) = t; θ) = (2πγ)^{-n/2} exp(-Σ_{i=1}^n (x_i - λ)²/(2γ)) I(Σ_{i=1}^n x_i = nt) / [(n/(2πγ))^{1/2} exp(-n(t - λ)²/(2γ))]
                   = (2πγ)^{-(n-1)/2} n^{-1/2} exp(-(Σ_{i=1}^n x_i² - nt²)/(2γ)).

So p(x; θ) = h(x | T(x); γ) f(T(x); θ), and T(X) is C-ancillary.
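
It is worth noting what the conditional likelihood buys here (a routine calculation, included for illustration). Since Σ_{i=1}^n x_i² - nt² = Σ_{i=1}^n (x_i - x̄)², the conditional log-likelihood for γ is

log h = const - ((n - 1)/2) log γ - Σ_{i=1}^n (x_i - x̄)²/(2γ),

which is maximized at γ̂_C = Σ_{i=1}^n (x_i - x̄)²/(n - 1), the unbiased variance estimator. The full-data MLE divides by n instead and is biased downward; conditioning on X̄ removes the effect of estimating the nuisance mean, which is essentially the idea behind REML estimation of variance components.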

Example 4.5: Suppose that X_1 ~ Binomial(n_1, p_1) and X_2 ~ Binomial(n_2, p_2), where n_1 and n_2 are fixed sample sizes. We assume X_1 is independent of X_2. Let X = (X_1, X_2), q_1 = 1 - p_1, q_2 = 1 - p_2, γ = log((p_2/q_2)/(p_1/q_1)), and λ = log(p_1/q_1). There is a one-to-one mapping between (p_1, p_2) and (γ, λ). Let T(X) = X_1 + X_2. What is the distribution of T(X)?

f(t; θ) = Σ_{u=0}^{min(t, n_2)} P[X_1 + X_2 = t | X_2 = u] P[X_2 = u]
        = Σ_{u=max(0, t-n_1)}^{min(t, n_2)} P[X_1 = t - u] P[X_2 = u]
        = Σ_{u=max(0, t-n_1)}^{min(t, n_2)} (n_1 choose t-u) p_1^{t-u} q_1^{n_1-t+u} (n_2 choose u) p_2^u q_2^{n_2-u}
        = Σ_{u=max(0, t-n_1)}^{min(t, n_2)} (n_1 choose t-u) (n_2 choose u) exp(γu + λt) q_1^{n_1} q_2^{n_2}.

T(X) is C-ancillary, so the conditional score function is optimal. T(X) is not S-ancillary (obvious), nor is it R- or A-ancillary.

To be thorough, we compute the conditional density of X given T(X). This is given by

h(x | T(X) = t; θ) = P[X_1 = x_1, X_2 = x_2, T(X) = t] / P[T(X) = t]
 = (n_1 choose x_1)(n_2 choose x_2) exp(γx_2 + λt) q_1^{n_1} q_2^{n_2} / Σ_{u=max(0, t-n_1)}^{min(t, n_2)} (n_1 choose t-u)(n_2 choose u) exp(γu + λt) q_1^{n_1} q_2^{n_2}
 = (n_1 choose x_1)(n_2 choose x_2) exp(γx_2) / Σ_{u=max(0, t-n_1)}^{min(t, n_2)} (n_1 choose t-u)(n_2 choose u) exp(γu).

This is Fisher's noncentral hypergeometric distribution; it depends on θ only through γ.
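
The conditional law above is easy to compute directly. The sketch below (illustrative parameter values; the helper names are assumptions for this example) evaluates h(x_2 | T = t; γ) and checks that it agrees with the brute-force conditional computed from the two binomial pmfs, and that it is unchanged when λ (equivalently p_1) is varied with γ held fixed.

```python
import numpy as np
from math import comb, exp, log

def conditional_pmf(t, n1, n2, gamma):
    """Fisher noncentral hypergeometric pmf of X2 given X1 + X2 = t."""
    lo, hi = max(0, t - n1), min(t, n2)
    u = np.arange(lo, hi + 1)
    w = np.array([comb(n1, t - k) * comb(n2, k) * exp(gamma * k) for k in u])
    return u, w / w.sum()

def conditional_from_joint(t, n1, n2, p1, p2):
    """Same conditional, computed by brute force from the two binomial pmfs."""
    lo, hi = max(0, t - n1), min(t, n2)
    u = np.arange(lo, hi + 1)
    w = np.array([comb(n1, t - k) * p1 ** (t - k) * (1 - p1) ** (n1 - t + k)
                  * comb(n2, k) * p2 ** k * (1 - p2) ** (n2 - k) for k in u])
    return u, w / w.sum()

n1, n2, t, gamma = 10, 12, 7, 0.8
_, h1 = conditional_pmf(t, n1, n2, gamma)

# Two (p1, p2) pairs with the same log odds ratio gamma but different lambda:
for p1 in (0.2, 0.6):
    lam = log(p1 / (1 - p1))
    p2 = 1.0 / (1.0 + exp(-(gamma + lam)))   # so that log(p2/q2) = gamma + lambda
    _, h2 = conditional_from_joint(t, n1, n2, p1, p2)
    assert np.allclose(h1, h2)               # the conditional depends only on gamma

print("conditional pmf sums to", h1.sum())
```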

One advantage of using the conditional score function is that we may consider situations where there is an infinite-dimensional nuisance parameter (i.e., a semiparametric model). Suppose that there are n independent 2 × 2 tables. Let λ_i be the baseline log odds for the i-th table, but let γ be the common log odds ratio across the n tables. Assume that λ_i ~ L. In this case, the parameters are (γ, L).

Let X = (X_1, ..., X_n), T(X) = (T_1(X_1), ..., T_n(X_n)), T_i(X_i) = X_{1i} + X_{2i}, and X_i = (X_{1i}, X_{2i}), where X_{1i} and X_{2i} are independent Binomial random variables with fixed sample sizes n_{1i} and n_{2i} and (random) success probabilities p_{1i} and p_{2i}, respectively. Let q_{1i} = 1 - p_{1i} and q_{2i} = 1 - p_{2i}. So we know that λ_i = log(p_{1i}/q_{1i}) and γ = log((p_{2i}/q_{2i})/(p_{1i}/q_{1i})) for all i. The conditional distribution of X given T(X) does not depend on L.
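
Concretely (writing out what the last sentence asserts), conditioning each table on its margin T_i = t_i and multiplying across the independent tables gives the conditional likelihood

L_C(γ) = Π_{i=1}^n [ (n_{1i} choose x_{1i}) (n_{2i} choose x_{2i}) exp(γ x_{2i}) / Σ_{u=max(0, t_i - n_{1i})}^{min(t_i, n_{2i})} (n_{1i} choose t_i - u) (n_{2i} choose u) exp(γu) ],

which involves γ alone: the table-specific nuisance parameters λ_1, ..., λ_n, and hence L, have been conditioned away. This is the conditional likelihood maximized in conditional logistic regression for matched sets, and it is closely related to the Mantel-Haenszel analysis of a common odds ratio.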

Example 4.6: Consider the semiparametric truncation model. Let Y and T be independent non-negative random variables. Suppose that Y ~ k(·) and T ~ l(·). In the right truncation problem, (Y, T) is only observed if Y ≤ T.

Lagakos (Biometrika, 1988) describes a study population of subjects infected with HIV from contaminated blood transfusions. The date of infection was known for all subjects, and only those subjects who contracted AIDS by a fixed date were included in the analysis. Interest is in the incubation time of AIDS, i.e., the time from infection to AIDS. Let Y denote the incubation time and T the time from infection to the fixed cutoff date. Note that Y and T are only observed if Y ≤ T.

Let X = (Y, T) be the observed (Y, T). Then

p(x) = P[Y = y, T = t] = P[Y = y, T = t | Y ≤ T] = k(y) l(t) I(y ≤ t) / β,

where, with K and L denoting the distribution functions of Y and T,

β = P[Y ≤ T] = ∫_0^∞ K(t) l(t) dt = ∫_0^∞ (1 - L(y)) k(y) dy.

Let T(X) = T be the conditioning statistic. Then

h(x | T(x) = t) = P[Y = y | T = t] = P[Y = y | T = t, Y ≤ T] = k(y) I(y ≤ t) / K(t)

and

f(t) = P[T = t] = P[T = t | Y ≤ T] = K(t) l(t) / β.

Suppose that we parameterize the law of Y via a parameter γ and leave the distribution of T unspecified. So we have a semiparametric model with parameters (γ, l), and

p(x; γ, l) = [k_γ(y) I(y ≤ t) / K_γ(t)] [K_γ(t) l(t) / β_{γ,l}],

where

β_{γ,l} = ∫_0^∞ K_γ(t) l(t) dt = ∫_0^∞ (1 - L(y)) k_γ(y) dy.

Claim 4.8: T(X) is A-ancillary for γ.

Proof: For any given (γ_0, l_0) and any given γ, there exists l = l(·; γ, γ_0, l_0) such that

f(t; γ_0, l_0) = f(t; γ, l(·; γ, γ_0, l_0)) for all t,

where f(t; γ, l) = K_γ(t) l(t) / β_{γ,l}. If we take

l(t; γ, γ_0, l_0) = K_{γ_0}(t) l_0(t) / (K_γ(t) c), where c = ∫_0^∞ [K_{γ_0}(t) l_0(t) / K_γ(t)] dt,

then the above equality holds. This implies that the conditional score function is optimal.
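
To verify the claim (a short calculation along the lines of the construction above): the constant c makes l(·; γ, γ_0, l_0) a genuine density, and with this choice

K_γ(t) l(t; γ, γ_0, l_0) = K_{γ_0}(t) l_0(t) / c   and   β_{γ,l} = ∫_0^∞ K_γ(t) l(t; γ, γ_0, l_0) dt = β_{γ_0,l_0} / c,

so that

f(t; γ, l) = K_γ(t) l(t) / β_{γ,l} = [K_{γ_0}(t) l_0(t) / c] / [β_{γ_0,l_0} / c] = K_{γ_0}(t) l_0(t) / β_{γ_0,l_0} = f(t; γ_0, l_0)

for every t, as required. Whatever value of γ we entertain, the marginal law of T can be matched exactly by a suitable choice of the unspecified truncation density, so T alone carries no information about γ; inference about γ is therefore based on the conditional likelihood k_γ(y) I(y ≤ t) / K_γ(t).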