Various types of likelihood

Various types of likelihood
1. likelihood, marginal likelihood, conditional likelihood, profile likelihood, adjusted profile likelihood, Bayesian asymptotics
2. quasi-likelihood, composite likelihood
3. semi-parametric likelihood, partial likelihood
4. empirical likelihood, penalized likelihood
5. bootstrap likelihood, h-likelihood, weighted likelihood, pseudo-likelihood, local likelihood, sieve likelihood, simulated likelihood
STA 4508: Topics in Likelihood Inference, January 14, 2014

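Nuisance parameters: notation
$\theta = (\psi, \lambda) = (\psi_1, \dots, \psi_q, \lambda_1, \dots, \lambda_{d-q})$
$U(\theta) = \begin{pmatrix} U_\psi(\theta) \\ U_\lambda(\theta) \end{pmatrix}, \qquad U_\lambda(\psi, \hat\lambda_\psi) = 0$
$i(\theta) = \begin{pmatrix} i_{\psi\psi} & i_{\psi\lambda} \\ i_{\lambda\psi} & i_{\lambda\lambda} \end{pmatrix}, \qquad j(\theta) = \begin{pmatrix} j_{\psi\psi} & j_{\psi\lambda} \\ j_{\lambda\psi} & j_{\lambda\lambda} \end{pmatrix}$
$i^{-1}(\theta) = \begin{pmatrix} i^{\psi\psi} & i^{\psi\lambda} \\ i^{\lambda\psi} & i^{\lambda\lambda} \end{pmatrix}, \qquad j^{-1}(\theta) = \begin{pmatrix} j^{\psi\psi} & j^{\psi\lambda} \\ j^{\lambda\psi} & j^{\lambda\lambda} \end{pmatrix}$
$i^{\psi\psi}(\theta) = \{ i_{\psi\psi}(\theta) - i_{\psi\lambda}(\theta)\, i_{\lambda\lambda}^{-1}(\theta)\, i_{\lambda\psi}(\theta) \}^{-1}$
$l_p(\psi) = l(\psi, \hat\lambda_\psi), \qquad j_p(\psi) = -l_p''(\psi)$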

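Nuisance parameters: approximate pivots
$w_u(\psi) = U_\psi(\psi, \hat\lambda_\psi)^T \{ i^{\psi\psi}(\psi, \hat\lambda_\psi) \} U_\psi(\psi, \hat\lambda_\psi) \mathrel{\dot\sim} \chi^2_q$
$w_e(\psi) = (\hat\psi - \psi)^T \{ i^{\psi\psi}(\hat\psi, \hat\lambda) \}^{-1} (\hat\psi - \psi) \mathrel{\dot\sim} \chi^2_q$
$w(\psi) = 2\{ l(\hat\psi, \hat\lambda) - l(\psi, \hat\lambda_\psi) \} = 2\{ l_p(\hat\psi) - l_p(\psi) \} \mathrel{\dot\sim} \chi^2_q$
$r_u(\psi) = l_p'(\psi)\, j_p^{-1/2}(\hat\psi) \mathrel{\dot\sim} N(0, 1)$
$r_e(\psi) = (\hat\psi - \psi)\, j_p^{1/2}(\hat\psi) \mathrel{\dot\sim} N(0, 1)$
$r(\psi) = \mathrm{sign}(\hat\psi - \psi) [ 2\{ l_p(\hat\psi) - l_p(\psi) \} ]^{1/2} \mathrel{\dot\sim} N(0, 1)$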
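
As a minimal sketch of the three scalar pivots, consider a model that is not on the slide and is assumed here only for illustration: y_1, ..., y_n iid N(mu, sigma^2), with interest parameter psi = mu and nuisance parameter lambda = sigma^2. In that case the constrained estimate is the mean of (y_i - mu)^2, the profile log-likelihood is -(n/2) log of that quantity plus a constant, and j_p(mu_hat) = n / sigma2_hat. The function name `first_order_pivots` is hypothetical.

```python
import math

def first_order_pivots(y, mu0):
    """Return (r_u, r_e, r) for psi = mu0 in the assumed N(mu, sigma^2) model.

    Assumes the data are not all equal, so that the variance estimates are positive.
    """
    n = len(y)
    ybar = sum(y) / n
    s2_hat = sum((v - ybar) ** 2 for v in y) / n   # unconstrained MLE of sigma^2
    s2_mu0 = sum((v - mu0) ** 2 for v in y) / n    # constrained MLE (lambda-hat at psi = mu0)
    j_p = n / s2_hat                               # profile information, -l_p''(mu_hat)
    score = n * (ybar - mu0) / s2_mu0              # profile score, l_p'(mu0)
    w = max(n * math.log(s2_mu0 / s2_hat), 0.0)    # 2{l_p(mu_hat) - l_p(mu0)}, guarded against rounding
    r_u = score / math.sqrt(j_p)                   # score pivot
    r_e = (ybar - mu0) * math.sqrt(j_p)            # Wald pivot
    r = math.copysign(math.sqrt(w), ybar - mu0)    # signed likelihood root
    return r_u, r_e, r

y = [4.1, 5.3, 6.0, 4.8, 5.6]
print(first_order_pivots(y, 4.5))
```

All three pivots carry the sign of mu_hat - mu0 and agree to first order; in this particular model their magnitudes are ordered |r_u| <= |r| <= |r_e|, which is a property of the example rather than a general rule.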

Nuisance parameters: properties of likelihood
maximum likelihood estimates are equivariant: $\widehat{h(\theta)} = h(\hat\theta)$ for one-to-one $h(\cdot)$
question: which of $w_e$, $w_u$, $w$ are invariant under reparametrization of the full parameter: $\varphi(\theta)$?
question: which of $r_e$, $r_u$, $r$ are invariant under interest-respecting reparametrizations $(\psi, \lambda) \to \{\psi, \eta(\psi, \lambda)\}$?
consistency of maximum likelihood estimate
equivalence of maximum likelihood estimate and root of score equation
observed vs. expected information
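
A numerical check can make the first question concrete. The sketch below assumes a one-parameter model that does not appear on the slide (y_1, ..., y_n iid exponential with rate theta, no nuisance parameter) and compares the likelihood ratio statistic w and the Wald statistic w_e in the rate parametrization and in phi = log(theta). All function names are hypothetical.

```python
import math

# Assumed model: l(theta) = n*log(theta) - theta*total, with total = sum(y).

def lr_stat_rate(theta0, n, total):
    """w = 2{l(theta_hat) - l(theta0)} in the rate parametrization."""
    theta_hat = n / total
    def loglik(t):
        return n * math.log(t) - t * total
    return 2.0 * (loglik(theta_hat) - loglik(theta0))

def lr_stat_log(phi0, n, total):
    """The same statistic computed in phi = log(theta)."""
    phi_hat = math.log(n / total)
    def loglik(p):
        return n * p - math.exp(p) * total
    return 2.0 * (loglik(phi_hat) - loglik(phi0))

def wald_stat_rate(theta0, n, total):
    """w_e = (theta_hat - theta0)^2 * i(theta_hat), with i(theta) = n / theta^2."""
    theta_hat = n / total
    return (theta_hat - theta0) ** 2 * n / theta_hat ** 2

def wald_stat_log(phi0, n, total):
    """w_e in phi = log(theta), where the expected information equals n."""
    phi_hat = math.log(n / total)
    return (phi_hat - phi0) ** 2 * n

n, total, theta0 = 10, 5.0, 1.0
print(lr_stat_rate(theta0, n, total), lr_stat_log(math.log(theta0), n, total))
print(wald_stat_rate(theta0, n, total), wald_stat_log(math.log(theta0), n, total))
```

In this example the two values of w coincide while the two values of w_e differ, which illustrates (but does not prove) that w is invariant under reparametrization and w_e is not.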

Various types of likelihood
1. likelihood, marginal likelihood, conditional likelihood, profile likelihood, adjusted profile likelihood
2. quasi-likelihood, composite likelihood
3. semi-parametric likelihood, partial likelihood
4. empirical likelihood, penalized likelihood
5. bootstrap likelihood, h-likelihood, weighted likelihood, pseudo-likelihood, local likelihood, sieve likelihood, simulated likelihood

Marginal and conditional likelihoods
Example: $Y \sim N(X\beta, \sigma^2)$, $Y \in \mathbb{R}^n$
Example: $Y_{ij} \sim N(\mu_i, \sigma^2)$, $j = 1, \dots, k$; $i = 1, \dots, m$
Example: $Y_{ij} \sim N(\mu, \sigma_i^2)$, $j = 1, \dots, k_i$; $i = 1, \dots, m$
Example: $Y_{i1}, Y_{i2} \sim \mathrm{Bernoulli}(p_{i1}, p_{i2})$, $i = 1, \dots, n$
Example: $Y_{i1}, Y_{i2} \sim \mathrm{Exponential}(\lambda_i \psi, \lambda_i / \psi)$ or $(\psi \lambda_i, \psi / \lambda_i)$

Frequentist inference, nuisance parameters
first-order pivotal quantities
$r_u(\psi) = l_p'(\psi)\, j_p(\hat\psi)^{-1/2} \mathrel{\dot\sim} N(0, 1)$
$r_e(\psi) = (\hat\psi - \psi)\, j_p(\hat\psi)^{1/2} \mathrel{\dot\sim} N(0, 1)$
$r(\psi) = \mathrm{sign}(\hat\psi - \psi) [ 2\{ l_p(\hat\psi) - l_p(\psi) \} ]^{1/2} \mathrel{\dot\sim} N(0, 1)$
all based on treating profile log-likelihood as a one-parameter log-likelihood
example: $y = X\beta + \epsilon$, $\epsilon \sim N(0, \sigma^2)$
$\hat\sigma^2 = (y - X\hat\beta)^T (y - X\hat\beta) / n$

[Figure: log-likelihood plotted against ψ]

Eliminating nuisance parameters
by using marginal density: $f(y; \psi, \lambda) \propto f_m(t_1; \psi)\, f_c(t_2 \mid t_1; \psi, \lambda)$
Example $N(X\beta, \sigma^2 I)$: $f(y; \beta, \sigma^2) \propto f_m(\mathrm{RSS}; \sigma^2)\, f_c(\hat\beta \mid \mathrm{RSS}; \beta, \sigma^2)$
by using conditional density: $f(y; \psi, \lambda) \propto f_c(t_1 \mid t_2; \psi)\, f_m(t_2; \psi, \lambda)$
Example $N(X\beta, \sigma^2 I)$: $f(y; \beta, \sigma^2) \propto f_c(\mathrm{RSS} \mid \hat\beta; \sigma^2)\, f_m(\hat\beta; \beta, \sigma^2)$
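
For the normal linear model example, the marginal factor can be written down explicitly because RSS / sigma^2 has a chi-squared distribution on n - p degrees of freedom, where p is the number of regression coefficients. The sketch below assumes a simple linear regression with p = 2 and made-up data, and compares the profile log-likelihood for sigma^2 with the marginal log-likelihood based on RSS, both up to additive constants. The function names and the data are illustrative assumptions, not part of the slides.

```python
import math

def fit_rss(x, y):
    """Least-squares fit of y = a + b*x; returns the residual sum of squares."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))

def profile_loglik(sigma2, rss, n):
    """l_p(sigma^2) = l(beta_hat, sigma^2), dropping constants."""
    return -0.5 * n * math.log(sigma2) - rss / (2.0 * sigma2)

def marginal_loglik(sigma2, rss, n, p):
    """log f_m(RSS; sigma^2) as a function of sigma^2, dropping constants."""
    return -0.5 * (n - p) * math.log(sigma2) - rss / (2.0 * sigma2)

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # made-up covariate values
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1]   # made-up responses
n, p = len(y), 2
rss = fit_rss(x, y)
print("profile maximizer  RSS/n     =", rss / n)
print("marginal maximizer RSS/(n-p) =", rss / (n - p))
```

The profile log-likelihood is maximized at RSS / n, while the marginal log-likelihood is maximized at RSS / (n - p), the usual degrees-of-freedom-corrected estimate.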

Linear exponential families
conditional density free of nuisance parameter
$f(y_i; \psi, \lambda) = \exp\{ \psi^T s(y_i) + \lambda^T t(y_i) - k(\psi, \lambda) \}\, h(y_i)$
$f(y; \psi, \lambda) =$
$s =$
$t =$
$f(s, t; \psi, \lambda) =$
$f(s \mid t; \psi) =$
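
The right-hand sides are left open on the slide. One standard way to complete them, assuming an independent sample y_1, ..., y_n and writing $\tilde h$, $h_t$ and $k_t$ for quantities introduced here only for the derivation, is sketched below; it is not taken from the slides.

```latex
\begin{align*}
f(y; \psi, \lambda) &= \prod_{i=1}^n f(y_i; \psi, \lambda)
   = \exp\{ \psi^T s + \lambda^T t - n k(\psi, \lambda) \} \prod_{i=1}^n h(y_i), \\
s &= \sum_{i=1}^n s(y_i), \qquad t = \sum_{i=1}^n t(y_i), \\
f(s, t; \psi, \lambda) &= \exp\{ \psi^T s + \lambda^T t - n k(\psi, \lambda) \}\, \tilde h(s, t), \\
f(t; \psi, \lambda) &= \exp\{ \lambda^T t - n k(\psi, \lambda) \} \int \exp(\psi^T s')\, \tilde h(s', t)\, ds', \\
f(s \mid t; \psi) &= \frac{f(s, t; \psi, \lambda)}{f(t; \psi, \lambda)}
   = \exp\{ \psi^T s - k_t(\psi) \}\, h_t(s), \\
k_t(\psi) &= \log \int \exp(\psi^T s')\, \tilde h(s', t)\, ds', \qquad h_t(s) = \tilde h(s, t).
\end{align*}
```

The factor $\exp\{\lambda^T t - n k(\psi, \lambda)\}$ cancels in the ratio, which is why the conditional density of s given t does not involve the nuisance parameter $\lambda$. For discrete data the integrals are replaced by sums.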

Adjusted profile log-likelihood
$l_A(\psi) = l_p(\psi) + A(\psi) = l(\psi, \hat\lambda_\psi) + A(\psi)$
$A(\psi)$ assumed to be $O_p(1)$
generic form is $A_{FR}(\psi) = +\tfrac{1}{2} \log | j_{\lambda\lambda}(\psi, \hat\lambda_\psi) | - \log \left| \dfrac{d(\lambda)}{d\hat\lambda_\psi} \right|$ (Fraser, 2003)
closely related: $A_{BN}(\psi) = -\tfrac{1}{2} \log | j_{\lambda\lambda}(\psi, \hat\lambda_\psi) | + \log \left| \dfrac{d\hat\lambda}{d\hat\lambda_\psi} \right|$ (SM 12.4.1, BN 1983)
if $i_{\psi\lambda}(\theta) = 0$, then $\hat\lambda_\psi = \hat\lambda + O_p(n^{-1})$, suggesting we ignore last term
if $\psi$ is scalar, then in principle we can find a parametrization $(\psi, \lambda)$ in which $i_{\psi\lambda}(\theta) = 0$ (SM 12.4.2)
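
A small worked case shows what the adjustment does when the last term is ignored. The sketch assumes a model not given on the slide: y_1, ..., y_n iid N(mu, sigma^2) with interest parameter psi = sigma^2 and nuisance parameter lambda = mu. Here i_{psi lambda} = 0, the constrained estimate of mu is the sample mean for every sigma^2, and j_{lambda lambda}(psi, lambda_hat_psi) = n / sigma^2, so the adjusted profile log-likelihood is l_p(sigma^2) - (1/2) log(n / sigma^2). The function names and the grid search are illustrative choices.

```python
import math

def profile_loglik(sigma2, y):
    """l_p(sigma^2) = l(sigma^2, mu_hat) in the assumed normal model, dropping constants."""
    n = len(y)
    ybar = sum(y) / n
    ss = sum((v - ybar) ** 2 for v in y)
    return -0.5 * n * math.log(sigma2) - ss / (2.0 * sigma2)

def adjusted_profile_loglik(sigma2, y):
    """l_A = l_p - (1/2) log j_{lambda lambda}, with the Jacobian term ignored."""
    j_lambda_lambda = len(y) / sigma2   # minus the second derivative of l in mu
    return profile_loglik(sigma2, y) - 0.5 * math.log(j_lambda_lambda)

def grid_argmax(f, lo, hi, steps=10000):
    """Crude maximizer over an equally spaced grid, enough for a one-dimensional check."""
    best = max(range(steps + 1), key=lambda k: f(lo + (hi - lo) * k / steps))
    return lo + (hi - lo) * best / steps

y = [4.1, 5.3, 6.0, 4.8, 5.6]
n = len(y)
ss = sum((v - sum(y) / n) ** 2 for v in y)
print("profile maximizer :", grid_argmax(lambda s: profile_loglik(s, y), 0.05, 3.0), "vs SS/n =", ss / n)
print("adjusted maximizer:", grid_argmax(lambda s: adjusted_profile_loglik(s, y), 0.05, 3.0), "vs SS/(n-1) =", ss / (n - 1))
```

In this example the adjustment moves the maximizer from SS / n to SS / (n - 1), so the adjusted profile log-likelihood recovers the usual degrees-of-freedom correction.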

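Asymptotics for Bayesian inference
$\pi(\theta \mid y) = \dfrac{\exp\{l(\theta; y)\}\, \pi(\theta)}{\int \exp\{l(\theta; y)\}\, \pi(\theta)\, d\theta}$
expand numerator and denominator about $\hat\theta$, assuming $l'(\hat\theta) = 0$:
$\pi(\theta \mid y) \doteq N\{\hat\theta, j^{-1}(\hat\theta)\}$
expand denominator only about $\hat\theta$; result:
$\pi(\theta \mid y) \doteq \dfrac{1}{(2\pi)^{d/2}}\, |j(\hat\theta)|^{+1/2} \exp\{ l(\theta; y) - l(\hat\theta; y) \}\, \dfrac{\pi(\theta)}{\pi(\hat\theta)}$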

Posterior is asymptotically normal
$\pi(\theta \mid y) \mathrel{\dot\sim} N\{\hat\theta, j^{-1}(\hat\theta)\}$, $\theta \in \mathbb{R}$, $y = (y_1, \dots, y_n)$
careful statement

... posterior is asymptotically normal
$\pi(\theta \mid y) \mathrel{\dot\sim} N\{\hat\theta, j^{-1}(\hat\theta)\}$, $\theta \in \mathbb{R}$, $y = (y_1, \dots, y_n)$
equivalently $l_\pi(\theta) =$

... posterior is asymptotically normal
In fact, if $\pi(\theta) > 0$ and $\pi'(\theta)$ is continuous in a neighbourhood of $\theta_0$, there exist constants $D$ and $n_y$ such that
$| F_n(\xi) - \Phi(\xi) | < D n^{-1/2}$ for all $n > n_y$,
on an almost-sure set with respect to $\pi(\theta_0) f(y; \theta_0)$, where $y = (y_1, \dots, y_n)$ is a sample from $f(y; \theta_0)$, and $\theta_0$ is an observation from the prior density $\pi(\theta)$.
$F_n(\xi) = \Pr\{ (\theta - \hat\theta)\, j^{1/2}(\hat\theta) \le \xi \mid y \}$
Johnson (1970); Datta & Mukerjee (2004)

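Laplace approximation
$\pi(\theta \mid y) \doteq \dfrac{1}{(2\pi)^{1/2}}\, |j(\hat\theta)|^{+1/2} \exp\{ l(\theta; y) - l(\hat\theta; y) \}\, \dfrac{\pi(\theta)}{\pi(\hat\theta)}$
$\pi(\theta \mid y) = \dfrac{1}{(2\pi)^{1/2}}\, |j(\hat\theta)|^{+1/2} \exp\{ l(\theta; y) - l(\hat\theta; y) \}\, \dfrac{\pi(\theta)}{\pi(\hat\theta)}\, \{1 + O_p(n^{-1})\}$
$y = (y_1, \dots, y_n)$, $\theta \in \mathbb{R}^1$
$\pi(\theta \mid y) = \dfrac{1}{(2\pi)^{1/2}}\, |j_\pi(\hat\theta_\pi)|^{+1/2} \exp\{ l_\pi(\theta; y) - l_\pi(\hat\theta_\pi; y) \}\, \{1 + O_p(n^{-1})\}$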
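
To see the size of the error in a case where the exact posterior is available, the sketch below assumes a model that is not on the slide: y_1, ..., y_n iid exponential with rate theta and a flat prior on theta, so that the exact posterior is a gamma density with shape n + 1 and rate sum(y). It evaluates the exact density, the normal approximation N{theta_hat, 1 / j(theta_hat)}, and the Laplace approximation in which only the denominator is expanded. The function names are hypothetical.

```python
import math

def log_lik(theta, n, total):
    """Exponential log-likelihood, l(theta) = n*log(theta) - theta*total."""
    return n * math.log(theta) - theta * total

def exact_posterior(theta, n, total):
    """Gamma(shape = n + 1, rate = total) density, the posterior under a flat prior."""
    log_dens = ((n + 1) * math.log(total) + n * math.log(theta)
                - theta * total - math.lgamma(n + 1))
    return math.exp(log_dens)

def normal_approx(theta, n, total):
    """Density of N(theta_hat, 1 / j(theta_hat)) with j(theta_hat) = n / theta_hat^2."""
    theta_hat = n / total
    j_hat = n / theta_hat ** 2
    return math.sqrt(j_hat / (2.0 * math.pi)) * math.exp(-0.5 * j_hat * (theta - theta_hat) ** 2)

def laplace_approx(theta, n, total):
    """(2*pi)^(-1/2) * j^(1/2) * exp{l(theta) - l(theta_hat)}; the flat-prior ratio equals 1."""
    theta_hat = n / total
    j_hat = n / theta_hat ** 2
    return math.sqrt(j_hat / (2.0 * math.pi)) * math.exp(
        log_lik(theta, n, total) - log_lik(theta_hat, n, total))

n, total = 20, 10.0
for theta in (1.5, 2.0, 2.5, 3.0):
    print(theta, exact_posterior(theta, n, total),
          normal_approx(theta, n, total), laplace_approx(theta, n, total))
```

In this example the Laplace approximation has exactly the shape of the posterior and differs from it only through the normalizing constant, by a factor n! e^n / {(2 pi n)^{1/2} n^n}, which is 1 + O(1/n) by Stirling's formula. The normal approximation, by contrast, is symmetric about theta_hat and misses the skewness of the posterior.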

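Posterior tail area
$\displaystyle \int_{-\infty}^{\theta} \pi(\vartheta \mid y)\, d\vartheta \doteq \int_{-\infty}^{\theta} \frac{1}{(2\pi)^{1/2}}\, e^{l(\vartheta; y) - l(\hat\vartheta; y)}\, |j(\hat\vartheta)|^{1/2}\, \frac{\pi(\vartheta)}{\pi(\hat\vartheta)}\, d\vartheta$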

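Posterior cdf
$\displaystyle \int_{-\infty}^{\theta} \pi(\vartheta \mid y)\, d\vartheta \doteq \int_{-\infty}^{\theta} \frac{1}{(2\pi)^{1/2}}\, e^{l(\vartheta; y) - l(\hat\vartheta; y)}\, |j(\hat\vartheta)|^{1/2}\, \frac{\pi(\vartheta)}{\pi(\hat\vartheta)}\, d\vartheta$
SM, 11.3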

BDR, Ch.3, Cauchy with flat prior

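Nuisance parameters
$y = (y_1, \dots, y_n) \sim f(y; \theta)$, $\theta = (\psi, \lambda)$
$\pi_m(\psi \mid y) = \displaystyle \int \pi(\psi, \lambda \mid y)\, d\lambda = \frac{\int \exp\{l(\psi, \lambda; y)\}\, \pi(\psi, \lambda)\, d\lambda}{\iint \exp\{l(\psi, \lambda; y)\}\, \pi(\psi, \lambda)\, d\psi\, d\lambda}$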

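... nuisance parameters
$y = (y_1, \dots, y_n) \sim f(y; \theta)$, $\theta = (\psi, \lambda)$
$\pi_m(\psi \mid y) = \displaystyle \int \pi(\psi, \lambda \mid y)\, d\lambda = \frac{\int \exp\{l(\psi, \lambda; y)\}\, \pi(\psi, \lambda)\, d\lambda}{\iint \exp\{l(\psi, \lambda; y)\}\, \pi(\psi, \lambda)\, d\psi\, d\lambda}$
$|j(\hat\theta)| = |j^{\psi\psi}(\hat\theta)|^{-1}\, |j_{\lambda\lambda}(\hat\theta)|$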

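Posterior marginal cdf, d = 1
$\Pi_m(\psi \mid y) = \displaystyle \int_{-\infty}^{\psi} \pi_m(\xi \mid y)\, d\xi \doteq \int_{-\infty}^{\psi} \frac{1}{(2\pi)^{1/2}}\, e^{l_p(\xi) - l_p(\hat\xi)}\, j_p^{1/2}(\hat\xi)\, \frac{\pi(\xi, \hat\lambda_\xi)}{\pi(\hat\xi, \hat\lambda)}\, \frac{|j_{\lambda\lambda}(\hat\xi, \hat\lambda)|^{1/2}}{|j_{\lambda\lambda}(\xi, \hat\lambda_\xi)|^{1/2}}\, d\xi$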

... posterior marginal cdf, d = 1
$\Pi_m(\psi \mid y) \doteq \Phi(r_B) = \Phi\left\{ r + \dfrac{1}{r} \log\left( \dfrac{q_B}{r} \right) \right\}$
$r = r(\psi) =$
$q_B = q_B(\psi) =$

[Figure: normal circle, k = 2; p-value plotted against ψ]

[Figure: normal circle, k = 2, 5, 10; p-value plotted against ψ]

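Link to adjusted log-likelihoods
$\pi_m(\psi \mid y) \doteq \dfrac{1}{(2\pi)^{d/2}}\, e^{l_p(\psi) - l_p(\hat\psi)}\, |j_p(\hat\psi)|^{1/2}\, \dfrac{\pi(\psi, \hat\lambda_\psi)}{\pi(\hat\psi, \hat\lambda)}\, \dfrac{|j_{\lambda\lambda}(\hat\psi, \hat\lambda)|^{1/2}}{|j_{\lambda\lambda}(\psi, \hat\lambda_\psi)|^{1/2}}$
$\pi_m(\psi \mid y) \doteq c \exp\{ l_p(\psi) - \tfrac{1}{2} \log |j_{\lambda\lambda}(\psi, \hat\lambda_\psi)| + \log \pi(\psi, \hat\lambda_\psi) \}$
$l_A(\psi) = l_p(\psi) - \tfrac{1}{2} \log |j_{\lambda\lambda}(\psi, \hat\lambda_\psi)| + \log \left| \dfrac{d\hat\lambda}{d\hat\lambda_\psi} \right|$
if $i_{\psi\lambda}(\theta) = 0$, then $\hat\lambda_\psi = \hat\lambda + O_p(n^{-1})$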