Lecture 3: Asymptotic Normality of M-estimators

Lecture 3: Asymptotic Normality. Instructor: Department of Economics, Stanford University. Prepared by Wenbo Zhou, Renmin University.

References: Takeshi Amemiya, 1985, Advanced Econometrics, Harvard University Press; Newey and McFadden, 1994, Chapter 36, Volume 4, The Handbook of Econometrics.

Asymptotic Normality: The General Framework

Everything is just some form of first-order Taylor expansion. The estimator $\hat\theta$ solves the first-order condition $\partial Q_n(\hat\theta)/\partial\theta = 0$. Expanding around the true value $\theta_0$, with $\theta^*$ between $\theta_0$ and $\hat\theta$:

$$0 = \frac{\partial Q_n(\hat\theta)}{\partial\theta} = \frac{\partial Q_n(\theta_0)}{\partial\theta} + \frac{\partial^2 Q_n(\theta^*)}{\partial\theta\,\partial\theta'}(\hat\theta - \theta_0),$$

so that

$$\sqrt{n}(\hat\theta - \theta_0) = -\left[\frac{\partial^2 Q_n(\theta^*)}{\partial\theta\,\partial\theta'}\right]^{-1}\sqrt{n}\,\frac{\partial Q_n(\theta_0)}{\partial\theta} \xrightarrow{d} N\!\left(0, A^{-1}BA^{-1}\right),$$

where $A = E\left[\dfrac{\partial^2 Q(\theta_0)}{\partial\theta\,\partial\theta'}\right]$ and $B = \mathrm{Var}\left(\sqrt{n}\,\dfrac{\partial Q_n(\theta_0)}{\partial\theta}\right)$.
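The sandwich formula above can be checked numerically. The following is an illustrative sketch (not from the lecture): for the exponential-rate MLE, $Q_n(\theta) = \log\theta - \theta\bar z$, so $\hat\theta = 1/\bar z$, $A = -1/\theta_0^2$, $B = 1/\theta_0^2$, and the sandwich variance is $A^{-1}BA^{-1} = \theta_0^2$.

```python
import numpy as np

# Monte Carlo check of sqrt(n)(theta_hat - theta_0) -> N(0, A^{-1} B A^{-1})
# for the exponential-rate MLE: theta_hat = 1/mean(z), sandwich variance = theta_0^2.
rng = np.random.default_rng(0)
theta0, n, reps = 2.0, 2000, 3000

z = rng.exponential(scale=1.0 / theta0, size=(reps, n))
theta_hat = 1.0 / z.mean(axis=1)                     # MLE in each replication
mc_var = np.var(np.sqrt(n) * (theta_hat - theta0))   # Monte Carlo variance
sandwich = theta0**2                                 # A^{-1} B A^{-1}
print(mc_var, sandwich)
```

The Monte Carlo variance should be close to $\theta_0^2 = 4$ for moderate $n$.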

Asymptotic Normality for MLE

In MLE, $Q_n(\theta) = \frac{1}{n}\log L(\theta)$, so $\frac{\partial^2 Q_n(\theta)}{\partial\theta\,\partial\theta'} = \frac{1}{n}\frac{\partial^2\log L(\theta)}{\partial\theta\,\partial\theta'}$. The information matrix equality,

$$-E\left[\frac{\partial^2\log L(\theta_0)}{\partial\theta\,\partial\theta'}\right] = E\left[\frac{\partial\log L(\theta_0)}{\partial\theta}\frac{\partial\log L(\theta_0)}{\partial\theta'}\right],$$

follows by interchanging integration and differentiation. So $B = -A$, and

$$\sqrt{n}(\hat\theta - \theta_0) \xrightarrow{d} N\!\left(0, (-A)^{-1}\right) = N\!\left(0, \left(\lim_n -\frac{1}{n}E\frac{\partial^2\log L(\theta_0)}{\partial\theta\,\partial\theta'}\right)^{-1}\right),$$

the inverse Fisher information. What if interchanging integration and differentiation is not possible? Example: if the support of $f(y;\theta)$ depends on $\theta$ (e.g. $y\sim U(0,\theta)$), the equality fails, and the MLE need not be asymptotically normal.
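The information matrix equality can also be verified by simulation. A minimal sketch (not from the lecture), again using the exponential density $f(z;\theta) = \theta e^{-\theta z}$, where the score is $1/\theta - z$ and $-\partial^2\log f/\partial\theta^2 = 1/\theta^2$ exactly:

```python
import numpy as np

# Check E[score^2] = -E[Hessian of log f] for f(z; theta) = theta * exp(-theta z).
# Interchange of integration and differentiation is valid here (support is fixed).
rng = np.random.default_rng(1)
theta0, n = 2.0, 200_000

z = rng.exponential(scale=1.0 / theta0, size=n)
score = 1.0 / theta0 - z
outer = np.mean(score**2)       # sample analogue of E[score^2]
neg_hess = 1.0 / theta0**2      # -E[d^2 log f / d theta^2], exact for this family
print(outer, neg_hess)
```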

Asymptotic Normality for GMM

$Q_n(\theta) = g_n(\theta)'Wg_n(\theta)$, with $g_n(\theta) = \frac{1}{n}\sum_{t=1}^n g(z_t,\theta)$. Asymptotic normality holds when the moment functions only have first derivatives. Denote $G_n(\theta) = \frac{\partial g_n(\theta)}{\partial\theta'}$, $\theta^*\in[\theta_0,\hat\theta]$, $\hat G \equiv G_n(\hat\theta)$, $G^* \equiv G_n(\theta^*)$, $G = E\,G_n(\theta_0)$, $\Omega = E\left[g(z,\theta_0)g(z,\theta_0)'\right]$. The first-order condition gives

$$0 = \hat G'Wg_n(\hat\theta) = \hat G'W\left(g_n(\theta_0) + G^*(\hat\theta - \theta_0)\right)$$
$$\Rightarrow\quad \sqrt{n}(\hat\theta - \theta_0) = -(\hat G'WG^*)^{-1}\hat G'W\sqrt{n}\,g_n(\theta_0) \stackrel{LD}{=} -(G'WG)^{-1}G'W\sqrt{n}\,g_n(\theta_0)$$
$$\stackrel{LD}{=} -(G'WG)^{-1}G'W\,N(0,\Omega) = N\!\left(0, (G'WG)^{-1}G'W\Omega WG(G'WG)^{-1}\right).$$
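The GMM sandwich is a pure matrix computation, so it can be sketched directly (the matrices below are arbitrary illustrations, not from the lecture). Setting $W = \Omega^{-1}$ should collapse the sandwich to the efficient variance $(G'\Omega^{-1}G)^{-1}$, and any other $W$ should give a weakly larger variance:

```python
import numpy as np

# GMM sandwich (G'WG)^{-1} G'W Omega W G (G'WG)^{-1}.
def gmm_avar(G, W, Omega):
    """Asymptotic variance of sqrt(n)(theta_hat - theta_0) for GMM with weight W."""
    bread = np.linalg.inv(G.T @ W @ G) @ G.T @ W
    return bread @ Omega @ bread.T

# Toy dimensions: 3 moments, 2 parameters.
G = np.array([[1.0, 0.5], [0.2, 1.0], [0.3, 0.1]])
Omega = np.array([[2.0, 0.3, 0.1], [0.3, 1.5, 0.2], [0.1, 0.2, 1.0]])

V_eff = gmm_avar(G, np.linalg.inv(Omega), Omega)          # efficient W = Omega^{-1}
V_closed = np.linalg.inv(G.T @ np.linalg.inv(Omega) @ G)  # (G' Omega^{-1} G)^{-1}
V_identity = gmm_avar(G, np.eye(3), Omega)                # a generic (inefficient) W
print(V_eff, V_closed)
```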

Examples

Efficient choice of $W = \Omega^{-1}$ (or $W \xrightarrow{p} \Omega^{-1}$): $\sqrt{n}(\hat\theta - \theta_0) \xrightarrow{d} N\!\left(0, (G'\Omega^{-1}G)^{-1}\right)$. When $G$ is invertible, $W$ is irrelevant: $\sqrt{n}(\hat\theta - \theta_0) \xrightarrow{d} N\!\left(0, G^{-1}\Omega G^{-1\prime}\right) = N\!\left(0, (G'\Omega^{-1}G)^{-1}\right)$. When $\Omega = \alpha G$ (or $G \propto \Omega$): $\sqrt{n}(\hat\beta - \beta_0) \xrightarrow{d} N\!\left(0, \alpha G^{-1}\right)$.

Least squares (LS): $g(z,\beta) = x(y - x'\beta)$. $G = -Exx'$, $\Omega = E\varepsilon^2 xx'$, then

$$\sqrt{n}(\hat\beta - \beta_0) \xrightarrow{d} N\!\left(0, (Exx')^{-1}(E\varepsilon^2 xx')(Exx')^{-1}\right),$$

the so-called White heteroskedasticity-consistent standard error. If $E[\varepsilon^2|x] = \sigma^2$, then $\Omega = \sigma^2 Exx'$ and $\sqrt{n}(\hat\beta - \beta_0) \xrightarrow{d} N\!\left(0, \sigma^2(Exx')^{-1}\right)$.

Weighted LS: $g(z,\beta) = \frac{1}{E(\varepsilon^2|x)}x(y - x'\beta)$. Then $G = -E\frac{xx'}{E(\varepsilon^2|x)} = -\Omega$, so $\sqrt{n}(\hat\beta - \beta_0) \xrightarrow{d} N\!\left(0, \Omega^{-1}\right)$.
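In sample form, the White sandwich is easy to compute by hand. A minimal sketch (not from the lecture): with homoskedastic errors the sandwich and the conventional $\hat\sigma^2(X'X/n)^{-1}$ estimate should agree up to sampling noise.

```python
import numpy as np

# White's heteroskedasticity-consistent variance vs. the homoskedastic formula.
rng = np.random.default_rng(2)
n = 20_000
x = np.column_stack([np.ones(n), rng.normal(size=n)])
beta0 = np.array([1.0, -0.5])
eps = rng.normal(size=n)                 # homoskedastic errors, so the two should agree
y = x @ beta0 + eps

beta_hat = np.linalg.solve(x.T @ x, x.T @ y)
u = y - x @ beta_hat
Sxx_inv = np.linalg.inv(x.T @ x / n)
meat = (x * u[:, None]**2).T @ x / n     # (1/n) sum u_t^2 x_t x_t'
V_white = Sxx_inv @ meat @ Sxx_inv       # sandwich estimate of Avar(sqrt(n)(beta_hat - beta0))
V_homo = u.var() * Sxx_inv               # sigma_hat^2 (Exx')^{-1}
print(V_white, V_homo)
```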

Linear 2SLS: $g(z,\beta) = z(y - x'\beta)$. $G = -Ezx'$, $\Omega = E\varepsilon^2 zz'$, $W = (Ezz')^{-1}$, then $\sqrt{n}(\hat\beta - \beta_0) \xrightarrow{d} N(0, V)$ with the sandwich $V = (G'WG)^{-1}G'W\Omega WG(G'WG)^{-1}$. If $E\varepsilon^2 zz' = \sigma^2 Ezz'$, this collapses to $V = \sigma^2\left[Exz'(Ezz')^{-1}Ezx'\right]^{-1}$.

Linear 3SLS: $g(z,\beta) = z(y - x'\beta)$. $G = -Ezx'$, $\Omega = E\varepsilon^2 zz'$, $W = (E\varepsilon^2 zz')^{-1}$, then $\sqrt{n}(\hat\beta - \beta_0) \xrightarrow{d} N(0, V)$ for $V = \left[Exz'(E\varepsilon^2 zz')^{-1}Ezx'\right]^{-1}$.

MLE as GMM: $g(z,\theta) = \frac{\partial\log f(z,\theta)}{\partial\theta}$. $G = E\frac{\partial^2\log f(z,\theta_0)}{\partial\theta\,\partial\theta'} = -\Omega$ with $\Omega = E\left[\frac{\partial\log f}{\partial\theta}\frac{\partial\log f}{\partial\theta'}\right]$, then $\sqrt{n}(\hat\theta - \theta_0) \xrightarrow{d} N\!\left(0, G^{-1}\Omega G^{-1}\right) = N\!\left(0, \Omega^{-1}\right)$.
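Solving the 2SLS moment condition gives the familiar closed form $\hat\beta = [X'Z(Z'Z)^{-1}Z'X]^{-1}X'Z(Z'Z)^{-1}Z'y$. A small sketch (not from the lecture) with a built-in sanity check: when the instruments are the regressors themselves, 2SLS reduces to OLS.

```python
import numpy as np

# Linear 2SLS from g(z, beta) = z(y - x'beta) with W = (Ezz')^{-1}.
def tsls(Z, X, y):
    A = X.T @ Z @ np.linalg.inv(Z.T @ Z) @ Z.T
    return np.linalg.solve(A @ X, A @ y)

rng = np.random.default_rng(3)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
beta_2sls = tsls(X, X, y)   # exogenous case: instruments = regressors
print(beta_ols, beta_2sls)
```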

GMM again: take linear combinations of the moment conditions so that the number of moments equals the number of parameters. In particular, take $h(z,\theta) = G'Wg(z,\theta)$ and use $h(z,\theta)$ as the new moment conditions; then

$$\hat\theta = \arg\min_\theta\left[\frac{1}{n}\sum_{t=1}^n h(z_t,\theta)\right]'\left[\frac{1}{n}\sum_{t=1}^n h(z_t,\theta)\right]$$

is asymptotically equivalent to $\hat\theta = \arg\min_\theta g_n'Wg_n$, where $G_h = E\frac{\partial h(z,\theta_0)}{\partial\theta'} = G'WG$ and $\Omega_h = E\left[h(z,\theta_0)h(z,\theta_0)'\right] = G'W\Omega WG$.

Quantile regression as GMM: $g(z,\beta) = (\tau - 1(y \le x'\beta))x$, and $W$ is irrelevant. $G = E\frac{\partial g(z,\beta_0)}{\partial\beta'} = -E\frac{\partial 1(y\le x'\beta_0)x}{\partial\beta'}$. Proceed in a quick and dirty way -- take the expectation before taking the derivative:

$$G = -\frac{\partial}{\partial\beta'}E\left[1(y\le x'\beta)x\right] = -\frac{\partial}{\partial\beta'}E\left[xF(x'\beta|x)\right] = -Ef_y(x'\beta_0|x)xx' = -Ef_u(0|x)xx'.$$

Conditional on $x$, $\tau - 1(y\le x'\beta_0) = \tau - 1(u\le 0)$ is a centered Bernoulli r.v., so $E\left[(\tau - 1(y\le x'\beta_0))^2|x\right] = \tau(1-\tau)$ and

$$\Omega = E\left\{E\left[(\tau - 1(y\le x'\beta_0))^2|x\right]xx'\right\} = \tau(1-\tau)Exx'.$$

Quantile regression as GMM (continued):

$$\sqrt{n}(\hat\beta - \beta_0) \xrightarrow{d} N\!\left(0, \tau(1-\tau)\left[Ef_u(0|x)xx'\right]^{-1}Exx'\left[Ef_u(0|x)xx'\right]^{-1}\right).$$

If the errors are homoskedastic, $f_u(0|x) = f(0)$ and $V = \frac{\tau(1-\tau)}{f(0)^2}(Exx')^{-1}$.

Consistent estimation of $G$ and $\Omega$: estimate $\Omega$ by $\hat\Omega = \frac{1}{n}\sum_{t=1}^n g(z_t,\hat\theta)g(z_t,\hat\theta)'$. For nonsmooth problems such as quantile regression, approximate the second derivative numerically by

$$\frac{Q_n(\hat\theta + 2h_n) + Q_n(\hat\theta - 2h_n) - 2Q_n(\hat\theta)}{4h_n^2},$$

which requires $h_n = o(1)$ and $1/h_n = o(\sqrt{n})$. For stationary data, heteroskedasticity and dependence only affect estimation of $\Omega$: for independent data, use White's heteroskedasticity-consistent estimate; for dependent data, use Newey-West's autocorrelation-consistent estimate.
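The quantile-regression variance can be checked in its simplest special case. An illustrative sketch (not from the lecture): intercept-only median regression ($\tau = 1/2$) of $y = u$, $u\sim N(0,1)$, is just the sample median, whose asymptotic variance is $\tau(1-\tau)/f(0)^2 = \tfrac{1}{4}\cdot 2\pi = \pi/2$.

```python
import numpy as np

# Monte Carlo check of V = tau(1-tau)/f(0)^2 for the sample median of N(0,1) data.
rng = np.random.default_rng(4)
n, reps = 2000, 4000

u = rng.normal(size=(reps, n))
med = np.median(u, axis=1)               # the tau = 0.5 "regression" estimate
mc_var = np.var(np.sqrt(n) * med)        # variance of sqrt(n)(median - 0)
theory = np.pi / 2                       # tau(1-tau)/f(0)^2 with f(0) = 1/sqrt(2*pi)
print(mc_var, theory)
```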

Iteration and One-Step Estimation

From the initial guess $\tilde\theta$ to the next-round guess $\check\theta$: Newton-Raphson uses a quadratic approximation of $Q_n(\theta)$; Gauss-Newton uses a linear approximation of the first-order condition, e.g. in GMM. If the initial guess is a $\sqrt{n}$-consistent estimate, more iteration will not increase (first-order) asymptotic efficiency: e.g. if $\sqrt{n}(\tilde\theta - \theta_0) = O_p(1)$, then $\sqrt{n}(\check\theta - \theta_0) \stackrel{LD}{=} \sqrt{n}(\hat\theta - \theta_0)$ for $\hat\theta = \arg\max_\theta Q_n(\theta)$.

1. Newton-Raphson: use a quadratic approximation of $Q_n(\theta)$ around $\tilde\theta$,

$$Q_n(\theta) \approx Q_n(\tilde\theta) + \frac{\partial Q_n(\tilde\theta)}{\partial\theta'}(\theta - \tilde\theta) + \frac{1}{2}(\theta - \tilde\theta)'\frac{\partial^2 Q_n(\tilde\theta)}{\partial\theta\,\partial\theta'}(\theta - \tilde\theta).$$

The first-order condition $\frac{\partial Q_n(\tilde\theta)}{\partial\theta} + \frac{\partial^2 Q_n(\tilde\theta)}{\partial\theta\,\partial\theta'}(\check\theta - \tilde\theta) = 0$ gives

$$\check\theta = \tilde\theta - \left[\frac{\partial^2 Q_n(\tilde\theta)}{\partial\theta\,\partial\theta'}\right]^{-1}\frac{\partial Q_n(\tilde\theta)}{\partial\theta}.$$

2. Gauss-Newton: use a linear approximation of the first-order condition, e.g. in GMM, $g_n(\theta) \approx g_n(\tilde\theta) + \tilde G(\theta - \tilde\theta)$ with $\tilde G = G_n(\tilde\theta)$, so the condition $\tilde G'W\left(g_n(\tilde\theta) + \tilde G(\check\theta - \tilde\theta)\right) = 0$ gives

$$\check\theta = \tilde\theta - (\tilde G'W\tilde G)^{-1}\tilde G'Wg_n(\tilde\theta).$$
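The Newton-Raphson update is easy to run on a model with a known answer. A minimal sketch (not from the lecture): for the exponential log-likelihood $Q_n(\theta) = \log\theta - \theta\bar z$, we have $Q_n' = 1/\theta - \bar z$ and $Q_n'' = -1/\theta^2$, and the iteration should converge to the closed-form MLE $1/\bar z$.

```python
import numpy as np

# Newton-Raphson for Q_n(theta) = log(theta) - theta*zbar.
# Update: theta <- theta - Q''^{-1} Q' = 2*theta - theta^2*zbar; fixed point is 1/zbar.
rng = np.random.default_rng(5)
z = rng.exponential(scale=0.5, size=1000)   # true rate theta_0 = 2
zbar = z.mean()

theta = 1.0                                 # crude initial guess
for _ in range(20):
    grad = 1.0 / theta - zbar
    hess = -1.0 / theta**2
    theta = theta - grad / hess             # Newton step
print(theta, 1.0 / zbar)
```

Convergence is quadratic, so 20 iterations are far more than enough here.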

If the initial guess is a $\sqrt{n}$-consistent estimate, i.e. $\sqrt{n}(\tilde\theta - \theta_0) = O_p(1)$, then $\sqrt{n}(\check\theta - \theta_0) \stackrel{LD}{=} \sqrt{n}(\hat\theta - \theta_0)$ for $\hat\theta = \arg\max_\theta Q_n(\theta)$. More iteration will not increase (first-order) asymptotic efficiency:

1. For Newton-Raphson:

$$\sqrt{n}(\check\theta - \theta_0) = \sqrt{n}(\tilde\theta - \theta_0) - \left[\frac{\partial^2 Q_n(\tilde\theta)}{\partial\theta\,\partial\theta'}\right]^{-1}\sqrt{n}\,\frac{\partial Q_n(\tilde\theta)}{\partial\theta}$$
$$= \sqrt{n}(\tilde\theta - \theta_0) - \left[\frac{\partial^2 Q_n(\tilde\theta)}{\partial\theta\,\partial\theta'}\right]^{-1}\left[\sqrt{n}\,\frac{\partial Q_n(\theta_0)}{\partial\theta} + \frac{\partial^2 Q_n(\theta^*)}{\partial\theta\,\partial\theta'}\sqrt{n}(\tilde\theta - \theta_0)\right]$$
$$= \left(I - \left[\frac{\partial^2 Q_n(\tilde\theta)}{\partial\theta\,\partial\theta'}\right]^{-1}\frac{\partial^2 Q_n(\theta^*)}{\partial\theta\,\partial\theta'}\right)\sqrt{n}(\tilde\theta - \theta_0) - \left[\frac{\partial^2 Q_n(\tilde\theta)}{\partial\theta\,\partial\theta'}\right]^{-1}\sqrt{n}\,\frac{\partial Q_n(\theta_0)}{\partial\theta}$$
$$= o_p(1) + \sqrt{n}(\hat\theta - \theta_0).$$

2. For Gauss-Newton:

$$\sqrt{n}(\check\theta - \theta_0) = \sqrt{n}(\tilde\theta - \theta_0) - (\tilde G'W\tilde G)^{-1}\tilde G'W\left[\sqrt{n}\,g_n(\theta_0) + G^*\sqrt{n}(\tilde\theta - \theta_0)\right]$$
$$= \left(I - (\tilde G'W\tilde G)^{-1}\tilde G'WG^*\right)\sqrt{n}(\tilde\theta - \theta_0) - (\tilde G'W\tilde G)^{-1}\tilde G'W\sqrt{n}\,g_n(\theta_0)$$
$$= o_p(1) + \sqrt{n}(\hat\theta - \theta_0).$$
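The one-step result can be seen numerically. An illustrative sketch (not from the lecture): for the exponential rate, start from the consistent but inefficient estimator $\tilde\theta = \log 2/\mathrm{median}(z)$; a single Newton step lands within $O_p(1/n)$ of the full MLE $1/\bar z$.

```python
import numpy as np

# One Newton step from a root-n-consistent initial guess is first-order
# equivalent to the full MLE. Exponential example: step is 2*theta - theta^2*zbar.
rng = np.random.default_rng(6)
z = rng.exponential(scale=0.5, size=10_000)   # true rate theta_0 = 2
zbar = z.mean()

theta_mle = 1.0 / zbar
theta_tilde = np.log(2.0) / np.median(z)               # consistent initial estimate
theta_check = 2.0 * theta_tilde - theta_tilde**2 * zbar  # one Newton-Raphson step
print(theta_tilde, theta_check, theta_mle)
```

Algebraically, $|\check\theta - \hat\theta| = \bar z(\tilde\theta - \hat\theta)^2$: one step squares the initial error.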

Influence Function

$\phi(z_t)$ is called the influence function if

$$\sqrt{n}(\hat\theta - \theta_0) = \frac{1}{\sqrt{n}}\sum_{t=1}^n\phi(z_t) + o_p(1), \qquad E\phi(z_t) = 0, \qquad E\phi(z_t)\phi(z_t)' < \infty.$$

Think of $\sqrt{n}(\hat\theta - \theta_0)$ as distributed like a normalized average of the $\phi(z_t)$: $N(0, E\phi\phi')$. The representation is used for discussing asymptotic efficiency, two-step or multi-step estimation, etc.

Examples

For MLE,

$$\phi(z_t) = \left[-E\frac{\partial^2\ln f(y_t,\theta_0)}{\partial\theta\,\partial\theta'}\right]^{-1}\frac{\partial\ln f(y_t,\theta_0)}{\partial\theta} = \left[E\frac{\partial\ln f(y_t,\theta_0)}{\partial\theta}\frac{\partial\ln f(y_t,\theta_0)}{\partial\theta'}\right]^{-1}\frac{\partial\ln f(y_t,\theta_0)}{\partial\theta}.$$

For GMM, $\phi = -(G'WG)^{-1}G'Wg(z_t,\theta_0)$, or $\phi = -\left(E\frac{\partial h(z,\theta_0)}{\partial\theta'}\right)^{-1}h(z_t,\theta_0)$ for $h(z_t,\theta_0) = G'Wg(z_t,\theta_0)$.

Quantile regression: $\phi(z_t) = \left[Ef_u(0|x)xx'\right]^{-1}(\tau - 1(u_t \le 0))x_t$.

Asymptotic Efficiency

Is MLE efficient among all asymptotically normal estimators? Superefficient estimator: suppose $\sqrt{n}(\hat\theta - \theta_0) \xrightarrow{d} N(0,V)$ for all $\theta$. Now define

$$\bar\theta = \begin{cases}\hat\theta & \text{if } |\hat\theta| \ge n^{-1/4},\\ 0 & \text{if } |\hat\theta| < n^{-1/4}.\end{cases}$$

Then $\sqrt{n}(\bar\theta - \theta_0) \xrightarrow{d} N(0,0)$ if $\theta_0 = 0$, and $\sqrt{n}(\bar\theta - \theta_0) \stackrel{LD}{=} \sqrt{n}(\hat\theta - \theta_0) \xrightarrow{d} N(0,V)$ if $\theta_0 \ne 0$. To rule out such estimators, restrict attention to regular ones: $\hat\theta$ is regular if, for any data generated by $\theta_n = \theta_0 + \delta/\sqrt{n}$ with $\delta \ne 0$, $\sqrt{n}(\hat\theta - \theta_n)$ has a limit distribution that does not depend on $\delta$.
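The superefficiency phenomenon is easy to simulate. An illustrative sketch (not from the lecture), using the sample mean of $N(\theta,1)$ data as $\hat\theta$: the thresholded estimator has scaled risk near 0 at $\theta_0 = 0$ but matches the mean's risk of 1 elsewhere.

```python
import numpy as np

# Hodges-type superefficient estimator: theta_bar = xbar * 1(|xbar| >= n^{-1/4}).
rng = np.random.default_rng(7)
n, reps = 4096, 2000

def hodges_risk(theta0):
    x = theta0 + rng.normal(size=(reps, n))
    xbar = x.mean(axis=1)
    theta_bar = np.where(np.abs(xbar) >= n**-0.25, xbar, 0.0)  # threshold at n^{-1/4}
    return np.mean(n * (theta_bar - theta0)**2)                # n * E(theta_bar - theta0)^2

risk_at_zero = hodges_risk(0.0)   # superefficiency: essentially 0
risk_at_one = hodges_risk(1.0)    # regular point: close to Var = 1
print(risk_at_zero, risk_at_one)
```

(The flip side, not visible at fixed $\theta_0$, is that the risk blows up along local sequences $\theta_n = \delta/\sqrt{n}$, which is exactly what the regularity condition excludes.)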

For regular estimators, consider influence function representations indexed by $\tau$:

$$\sqrt{n}(\hat\theta(\tau) - \theta_0) \stackrel{LD}{=} \frac{1}{\sqrt{n}}\sum_t\phi(z_t,\tau) \xrightarrow{d} N\!\left(0, E\phi(\tau)\phi(\tau)'\right).$$

$\hat\theta(\bar\tau)$ is more efficient than $\hat\theta(\tau)$ if it has a smaller variance-covariance matrix. A necessary condition is that $\mathrm{Cov}\left(\phi(z,\tau) - \phi(z,\bar\tau),\, \phi(z,\bar\tau)\right) = 0$ for all $\tau$, including $\bar\tau$. The following are equivalent:

$\mathrm{Cov}\left(\phi(z,\tau) - \phi(z,\bar\tau),\, \phi(z,\bar\tau)\right) = 0$;
$\mathrm{Cov}\left(\phi(z,\tau),\, \phi(z,\bar\tau)\right) = \mathrm{Var}\left(\phi(z,\bar\tau)\right)$;
$E\phi(z,\tau)\phi(z,\bar\tau)' = E\phi(z,\bar\tau)\phi(z,\bar\tau)'$.

Newey's efficiency framework: classify estimators into the GMM framework with $\phi(z,\tau) = -D(\tau)^{-1}m(z,\tau)$. For the class indexed by $\tau = W$, given a vector of moments $g(z,\theta_0)$: $D(\tau) \equiv D(W) = G'WG$ and $m(z,\tau) \equiv m(z,W) = G'Wg(z,\theta_0)$. To consider MLE among the class of GMM estimators, let $\tau$ index any vector of moment functions $h$ having the same dimension as $\theta$; in this case $D(\tau) \equiv D(h) = E\frac{\partial h(z,\theta_0)}{\partial\theta'}$ and $m(z,\tau) = h(z,\theta_0)$.

For this particular case where $\phi(z,\tau) = -D(\tau)^{-1}m(z,\tau)$, the condition $E\phi(z,\tau)\phi(z,\bar\tau)' = E\phi(z,\bar\tau)\phi(z,\bar\tau)'$ becomes

$$D(\tau)^{-1}E\left[m(z,\tau)m(z,\bar\tau)'\right]D(\bar\tau)^{-1\prime} = D(\bar\tau)^{-1}E\left[m(z,\bar\tau)m(z,\bar\tau)'\right]D(\bar\tau)^{-1\prime}.$$

If $\bar\tau$ satisfies $E\left[m(z,\tau)m(z,\bar\tau)'\right] = cD(\tau)$ for all $\tau$ (with the same constant $c$), then both sides above equal $cD(\bar\tau)^{-1\prime}$, and $\hat\theta(\bar\tau)$ is efficient.

Examples: check $E\left[m(z,\tau)m(z,\bar\tau)'\right] = cD(\tau)$. GMM with the optimal weighting matrix: $D(\tau) = G'WG$, $m(z,\tau) = m(z,W) = G'Wg(z,\theta_0)$, so $E\left[m(z,W)m(z,\bar W)'\right] = G'W\Omega\bar W'G$. The condition $G'W\Omega\bar W'G = G'WG$ for all $W$ requires $\Omega\bar W' = I$, i.e. $\bar W = \Omega^{-1}$.

MLE is better than any GMM: here $D(\tau) = E\frac{\partial h(z,\theta_0)}{\partial\theta'}$ and $m(z,\tau) = h(z,\theta_0)$. To verify the condition against the score, use the generalized information matrix equality: differentiating $0 = Eh(z,\theta) = \int h(z,\theta)f(z,\theta)\,dz$ with respect to $\theta$,

$$0 = \int\frac{\partial h(z,\theta)}{\partial\theta'}f(z,\theta)\,dz + \int h(z,\theta)\frac{\partial\ln f(z,\theta)}{\partial\theta'}f(z,\theta)\,dz = E\frac{\partial h(z,\theta_0)}{\partial\theta'} + E\left[h(z,\theta_0)\frac{\partial\ln f(z,\theta_0)}{\partial\theta'}\right].$$

So $E\left[h(z,\theta_0)s(z,\theta_0)'\right] = -E\frac{\partial h(z,\theta_0)}{\partial\theta'} = -D(h)$ for every candidate $h$, where $s(z,\theta_0) = \frac{\partial\ln f(z,\theta_0)}{\partial\theta}$ is the score. Up to sign, the efficiency condition therefore holds with $\bar m = s$, the score function -- whose moment condition is exactly the MLE.

Two-Step Estimators

General framework: the first-step estimator $\hat\gamma$ admits $\sqrt{n}(\hat\gamma - \gamma_0) = \frac{1}{\sqrt{n}}\sum_{t=1}^n\phi(z_t) + o_p(1)$. The second-step estimator $\hat\theta$ solves the moment condition

$$h_n(\hat\theta,\hat\gamma) = \frac{1}{n}\sum_{t=1}^n h(z_t,\hat\theta,\hat\gamma) = 0.$$

Let $H(z,\theta,\gamma) = \frac{\partial h(z,\theta,\gamma)}{\partial\theta'}$ and $\Gamma(z,\theta,\gamma) = \frac{\partial h(z,\theta,\gamma)}{\partial\gamma'}$, and write $H = EH(z,\theta_0,\gamma_0)$, $\Gamma = E\Gamma(z,\theta_0,\gamma_0)$, $h = h(z,\theta_0,\gamma_0)$.

1. Then just Taylor expand:

$$0 = \frac{1}{\sqrt{n}}\sum_t h(z_t,\hat\theta,\hat\gamma) = \sqrt{n}\,h_n(\theta_0,\hat\gamma) + H_n(\theta^*,\hat\gamma)\sqrt{n}(\hat\theta - \theta_0)$$
$$\Rightarrow\quad \sqrt{n}(\hat\theta - \theta_0) = -\left[H_n(\theta^*,\hat\gamma)\right]^{-1}\sqrt{n}\,h_n(\theta_0,\hat\gamma)$$
$$\stackrel{LD}{=} -H^{-1}\left[\sqrt{n}\,h_n(\theta_0,\gamma_0) + \Gamma_n(\theta_0,\gamma^*)\sqrt{n}(\hat\gamma - \gamma_0)\right]$$
$$\stackrel{LD}{=} -H^{-1}\left[\frac{1}{\sqrt{n}}\sum_t h(z_t) + \Gamma\frac{1}{\sqrt{n}}\sum_t\phi(z_t)\right] + o_p(1) = -H^{-1}\frac{1}{\sqrt{n}}\sum_t\left[h(z_t) + \Gamma\phi(z_t)\right] + o_p(1).$$

So $\sqrt{n}(\hat\theta - \theta_0) \xrightarrow{d} N(0,V)$ for $V = H^{-1}E\left[(h + \Gamma\phi)(h + \Gamma\phi)'\right]H^{-1\prime}$.
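The first-step correction term $\Gamma\phi$ matters in practice. An illustrative sketch (not from the lecture): estimate $\theta_0 = E(z - \gamma_0)^3$ with second-step moment $h(z,\theta,\gamma) = (z - \gamma)^3 - \theta$ and first-step $\hat\gamma = \bar z$, so $\phi(z) = z - \gamma_0$. For $z\sim N(0,1)$: $H = -1$, $\Gamma = -3E(z-\gamma_0)^2 = -3$, and the corrected variance is $E[(z^3 - 3z)^2] = 6$, while naively ignoring the first step would give $E[(z^3)^2] = 15$.

```python
import numpy as np

# Two-step variance correction: Var(sqrt(n)*theta_hat) matches V = 6, not the naive 15.
rng = np.random.default_rng(8)
n, reps = 500, 4000

z = rng.normal(size=(reps, n))
gamma_hat = z.mean(axis=1, keepdims=True)
theta_hat = np.mean((z - gamma_hat)**3, axis=1)   # two-step estimate of E(z - gamma_0)^3 = 0
mc_var = np.var(np.sqrt(n) * theta_hat)
V_corrected = 6.0    # H^{-1} E[(h + Gamma*phi)^2] H^{-1} = E[(z^3 - 3z)^2]
V_naive = 15.0       # ignores the first step: E[(z^3)^2]
print(mc_var, V_corrected, V_naive)
```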

2. GMM in both the first stage ($\hat\gamma$) and the second stage ($\hat\theta$): $\phi = -M^{-1}m(z)$ for some first-stage moment condition $m(z,\gamma)$, and $h(z,\theta,\gamma) = G'Wg(z,\theta,\gamma)$, so that $H = G'WG$ and $\Gamma = G'WG_\gamma$ for $G_\gamma = E\frac{\partial g}{\partial\gamma'}$. Plug these into the general formula above. If $W = I$ and $G$ is invertible, this simplifies to

$$V = G^{-1}\left[\Omega + (Eg\phi')G_\gamma' + G_\gamma(E\phi g') + G_\gamma(E\phi\phi')G_\gamma'\right]G^{-1\prime}.$$

Again, if you have trouble differentiating $\frac{\partial g(\theta,\gamma)}{\partial\theta'}$ or $\frac{\partial g(\theta,\gamma)}{\partial\gamma'}$, simply take the expectation before differentiating: replace $H$ and $\Gamma$ by $\frac{\partial Eg(\theta,\gamma)}{\partial\theta'}$ and $\frac{\partial Eg(\theta,\gamma)}{\partial\gamma'}$.