More Notes on Testing
Ulrich Müller, Economics 2110, Fall 2008

Large Sample Properties of the Likelihood Ratio Statistic

Let $X_i$ be i.i.d. with density $f(x, \theta)$. We are interested in testing $H_0: \theta = \theta_0$ against $H_1: \theta \neq \theta_0$, where $\theta$ is $k \times 1$, using a likelihood ratio test. To carry out the test, we need to determine the appropriate critical value $c$. Recall that $c$ is determined by the requirement that $P(LR > c \mid H_0) = \alpha$. In order to determine the critical value, we thus need to determine the distribution of $LR$ when the null hypothesis is true. We now develop a large sample approximation to solve this problem.

Let $\hat{\theta} = \arg\max_\theta L(\theta)$ denote the mle, and write the maximized likelihood ratio statistic as
$$LR = \frac{L(\hat{\theta})}{L(\theta_0)}.$$
Define the statistic
$$\xi_{LR} = 2 \ln(LR) = 2\left(L_n(\hat{\theta}) - L_n(\theta_0)\right)$$
where $L_n(\theta) = \ln L(\theta)$. Since $\xi_{LR}$ is a monotonic transformation of $LR$, the LR test can be implemented by rejecting the null hypothesis when $\xi_{LR}$ is large.

To find the approximate distribution of $\xi_{LR}$ under the null hypothesis, write
$$L_n(\theta_0) = L_n(\hat{\theta}) + (\theta_0 - \hat{\theta})' \frac{\partial L_n(\hat{\theta})}{\partial \theta} + \tfrac{1}{2} (\theta_0 - \hat{\theta})' \frac{\partial^2 L_n(\bar{\theta})}{\partial \theta \partial \theta'} (\theta_0 - \hat{\theta})$$
where $\bar{\theta}(\omega)$ is between $\theta_0$ and $\hat{\theta}(\omega)$. Since $\partial L_n(\hat{\theta})/\partial \theta = 0$ by the FOC of the mle, we have
$$\xi_{LR} = -(\theta_0 - \hat{\theta})' \frac{\partial^2 L_n(\bar{\theta})}{\partial \theta \partial \theta'} (\theta_0 - \hat{\theta}) = \sqrt{n}(\hat{\theta} - \theta_0)' \left( -\frac{1}{n} \frac{\partial^2 L_n(\bar{\theta})}{\partial \theta \partial \theta'} \right) \sqrt{n}(\hat{\theta} - \theta_0).$$
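
To make the construction concrete, here is a minimal numerical sketch (not part of the original notes) for an i.i.d. exponential sample with density $f(x, \theta) = \theta e^{-\theta x}$ and $H_0: \theta = \theta_0$; the mle has the closed form $\hat{\theta} = 1/\bar{X}$, and the $\chi^2_k$ critical value used below is the one justified by the argument that follows. All names in the code are chosen for this example only.

```python
# Illustrative sketch: xi_LR for an i.i.d. exponential model, testing H0: theta = theta_0.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
theta_0, n = 1.0, 200
x = rng.exponential(scale=1 / theta_0, size=n)      # data generated under H0

def log_lik(theta, x):
    """Log likelihood L_n(theta) of an i.i.d. exponential(theta) sample."""
    return np.sum(np.log(theta) - theta * x)

theta_hat = 1 / np.mean(x)                           # closed-form mle

xi_LR = 2 * (log_lik(theta_hat, x) - log_lik(theta_0, x))
c = stats.chi2.ppf(0.95, df=1)                       # chi^2_k critical value with k = 1
print(f"xi_LR = {xi_LR:.3f}, reject H0 at the 5% level: {xi_LR > c}")
```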

Proceeding as in our derivations of the properties of the maximum likelihood estimator, $\sqrt{n}(\hat{\theta} - \theta_0) \Rightarrow N(0, \bar{I}(\theta_0)^{-1})$ and
$$-\frac{1}{n} \frac{\partial^2 L_n(\bar{\theta})}{\partial \theta \partial \theta'} = -\frac{1}{n} \frac{\partial S(\bar{\theta})}{\partial \theta'} = -\frac{1}{n} \sum_{i=1}^{n} \frac{\partial s_i(\bar{\theta})}{\partial \theta'} \xrightarrow{p} \bar{I}(\theta_0),$$
so that by Slutsky and the CMT,
$$\xi_{LR} \overset{H_0}{\Rightarrow} \chi^2_k.$$

An asymptotically justified level $1 - \alpha$ confidence set based on the LR statistic is hence of the form
$$\left\{\theta^* : n(\hat{\theta} - \theta^*)' \hat{V}^{-1} (\hat{\theta} - \theta^*) < c\right\}$$
where $\hat{V} = \left( -\frac{1}{n} \frac{\partial^2 L_n(\bar{\theta})}{\partial \theta \partial \theta'} \right)^{-1}$ and $c$ solves $P(\chi^2_k > c) = \alpha$. In large samples, where $\hat{V} \simeq V = \bar{I}(\theta_0)^{-1}$, this confidence set may be recognized as the interior of an ellipse centered at $\theta^* = \hat{\theta}$. In the one-dimensional case, we obtain a confidence interval
$$\left[\hat{\theta} - c^* (\hat{V}/n)^{1/2},\ \hat{\theta} + c^* (\hat{V}/n)^{1/2}\right]$$
where $c^*$ is the positive number that solves $P(N(0,1) > c^*) = \alpha/2$.

Large Sample Properties of the Wald Statistic

A close cousin of the LR statistic is the Wald statistic
$$\xi_W = \sqrt{n}(\hat{\theta} - \theta_0)' \left( -\frac{1}{n} \frac{\partial^2 L_n(\hat{\theta})}{\partial \theta \partial \theta'} \right) \sqrt{n}(\hat{\theta} - \theta_0)$$
which differs from $\xi_{LR}$ only because the estimated information matrix is evaluated at $\hat{\theta}$ rather than $\bar{\theta}$. Note that we can compute the Wald statistic without doing any computations under the null hypothesis. Since both $\hat{\theta}$ and $\bar{\theta}$ converge in probability to $\theta_0$ under the null hypothesis,
$$\xi_W - \xi_{LR} \xrightarrow{p, H_0} 0.$$
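
Continuing the illustrative exponential example from above (a sketch under the same assumptions, not from the notes): for that model $\partial^2 L_n(\theta)/\partial \theta^2 = -n/\theta^2$, so $\left(-\frac{1}{n}\partial^2 L_n(\hat{\theta})/\partial \theta^2\right)^{-1} = \hat{\theta}^2$, which gives both the Wald statistic and the one-dimensional confidence interval in closed form.

```python
# Illustrative sketch: Wald statistic and confidence interval for the exponential
# example, with the Hessian of the log likelihood evaluated at the mle theta_hat.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
theta_0, n = 1.0, 200
x = rng.exponential(scale=1 / theta_0, size=n)
theta_hat = 1 / np.mean(x)

V_hat = theta_hat**2            # (-(1/n) d^2 L_n(theta_hat)/d theta^2)^(-1) = theta_hat^2

xi_W = n * (theta_hat - theta_0)**2 / V_hat
c = stats.chi2.ppf(0.95, df=1)
print(f"xi_W = {xi_W:.3f}, reject H0 at the 5% level: {xi_W > c}")

# One-dimensional 95% interval [theta_hat +/- c* (V_hat/n)^(1/2)],
# where c* solves P(N(0,1) > c*) = alpha/2.
c_star = stats.norm.ppf(1 - 0.05 / 2)
half = c_star * np.sqrt(V_hat / n)
print(f"95% confidence interval: ({theta_hat - half:.3f}, {theta_hat + half:.3f})")
```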

The motivation of the Wald statistic is that under the null hypothesis, the difference between the estimator $\hat{\theta}$ and $\theta_0$ satisfies $\sqrt{n}(\hat{\theta} - \theta_0) \overset{a}{\sim} N(0, \bar{I}(\theta_0)^{-1})$, and $-\frac{1}{n} \frac{\partial^2 L_n(\hat{\theta})}{\partial \theta \partial \theta'}$ consistently estimates $\bar{I}(\theta_0)$. Under the alternative, $\hat{\theta} - \theta_0$ is large, and we reject. (Note that the Wald statistic is not invariant to reparametrizations of the parameters, unlike the LR statistic.)

Power considerations: Let's consider power against alternatives of the form $\theta_{1n} = \theta_0 + g/\sqrt{n}$ for some nonzero $k \times 1$ vector $g$. (Strictly speaking, a true value of $\theta$ that depends on $n$ makes the data a double array with observations $X_{n,j}$, where for any $n$, $X_{n,1}, X_{n,2}, \ldots, X_{n,n}$ are i.i.d. with density $f(x, \theta_{1n})$. We haven't talked about that before, but our proofs of LLNs and CLTs work equally well for double arrays, so this ends up being only a notational nuisance.) Proceeding as above, we now find $\sqrt{n}(\hat{\theta} - \theta_{1n}) \Rightarrow N(0, \bar{I}(\theta_0)^{-1})$, which implies $\sqrt{n}(\hat{\theta} - \theta_0) \Rightarrow N(g, \bar{I}(\theta_0)^{-1})$, so that $\xi_W$ converges to a quadratic form in multivariate normals with a nonzero mean. The factor $n^{-1/2}$ reflects that in larger samples there is more information about the parameter $\theta$, so that we have to let alternatives shrink to $\theta_0$ to obtain nontrivial asymptotic results. The appropriate rate for this to happen is sometimes called Pitman drift.

Large Sample Properties of the LM Statistic

Another approximation to $\xi_{LR}$ is given by the Lagrange Multiplier test statistic
$$\xi_{LM} = n^{-1/2} S_n(\theta_0)' \left( -\frac{1}{n} \frac{\partial^2 L_n(\theta_0)}{\partial \theta \partial \theta'} \right)^{-1} n^{-1/2} S_n(\theta_0) = \left( n^{-1/2} \sum_{i=1}^{n} s_i(\theta_0) \right)' \left( -\frac{1}{n} \frac{\partial^2 L_n(\theta_0)}{\partial \theta \partial \theta'} \right)^{-1} \left( n^{-1/2} \sum_{i=1}^{n} s_i(\theta_0) \right)$$
with the advantage that we do not need to compute $\hat{\theta}$ in order to compute $\xi_{LM}$.
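
For the same illustrative exponential model (again a sketch, not from the notes), the score of one observation is $s_i(\theta) = 1/\theta - X_i$ and $-\frac{1}{n}\partial^2 L_n(\theta_0)/\partial \theta^2 = 1/\theta_0^2$, so $\xi_{LM}$ only needs quantities evaluated at $\theta_0$; the $\chi^2_1$ critical value used below is justified by the limit derived in the next paragraph.

```python
# Illustrative sketch: the LM (score) statistic for the exponential example.
# Only theta_0 is needed; the mle is never computed.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
theta_0, n = 1.0, 200
x = rng.exponential(scale=1 / theta_0, size=n)

S_n = np.sum(1 / theta_0 - x)          # S_n(theta_0) = sum_i s_i(theta_0)
info_0 = 1 / theta_0**2                # -(1/n) d^2 L_n(theta_0)/d theta^2

xi_LM = (S_n / np.sqrt(n)) ** 2 / info_0
c = stats.chi2.ppf(0.95, df=1)
print(f"xi_LM = {xi_LM:.3f}, reject H0 at the 5% level: {xi_LM > c}")
```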

Since under the null hypothesis $n^{-1/2} \sum_{i=1}^{n} s_i(\theta_0) \Rightarrow N(0, \bar{I}(\theta_0))$ and $-\frac{1}{n} \frac{\partial^2 L_n(\theta_0)}{\partial \theta \partial \theta'} \xrightarrow{p} \bar{I}(\theta_0)$, we also find $\xi_{LM} \overset{H_0}{\Rightarrow} \chi^2_k$. (In fact, $\xi_{LM} - \xi_{LR} \xrightarrow{p, H_0} 0$, as you should check for yourself.)

Nuisance Parameters in LR, Wald and LM Tests

In many applications, a null hypothesis specifies values for some of the parameters, but leaves others unrestricted. How does this affect the LR, Wald and LM testing procedures? Let $\theta = (\beta', \gamma')'$, where $\beta$ is $p \times 1$ and $\theta$ is $k \times 1$, $p < k$. Suppose we are interested in testing $H_0: \beta = \beta_0$ against $H_1: \beta \neq \beta_0$ with $\gamma$ unrestricted. Let $\hat{\theta} = (\hat{\beta}', \hat{\gamma}')'$ be the mle, and let $\hat{\theta}_r = (\beta_0', \hat{\gamma}_r')'$ be the estimator that maximizes $L_n(\theta)$ under the restriction that $\beta = \beta_0$.

The Wald test is entirely straightforward to implement, based on the test statistic
$$\xi_W = \sqrt{n}(\hat{\beta} - \beta_0)' \hat{V}_{\beta\beta}^{-1} \sqrt{n}(\hat{\beta} - \beta_0)$$
with $\hat{V}_{\beta\beta}$ the upper left $p \times p$ block of $\left( -\frac{1}{n} \frac{\partial^2 L_n(\hat{\theta})}{\partial \theta \partial \theta'} \right)^{-1}$. Under the null hypothesis, $\sqrt{n}(\hat{\beta} - \beta_0) \Rightarrow N(0, V_{\beta\beta})$ with $V_{\beta\beta}$ the $p \times p$ upper left block of $\bar{I}(\theta_0)^{-1}$, and $\hat{V}_{\beta\beta} \xrightarrow{p} V_{\beta\beta}$, so that $\xi_W \Rightarrow \chi^2_p$ under the null hypothesis.

We now consider the properties of the generalized likelihood ratio statistic
$$\xi_{LR} = 2 \ln(LR_{gen}) = 2\left(\sup_\theta L_n(\theta) - \sup_\gamma L_n(\beta_0, \gamma)\right) = 2\left(L_n(\hat{\theta}) - L_n(\hat{\theta}_r)\right).$$
We have
$$L_n(\hat{\theta}_r) = L_n(\hat{\theta}) + (\hat{\theta}_r - \hat{\theta})' \frac{\partial L_n(\hat{\theta})}{\partial \theta} + \tfrac{1}{2}(\hat{\theta}_r - \hat{\theta})' \frac{\partial^2 L_n(\bar{\theta})}{\partial \theta \partial \theta'} (\hat{\theta}_r - \hat{\theta})$$
so that with $-\frac{1}{n} \frac{\partial^2 L_n(\bar{\theta})}{\partial \theta \partial \theta'} \xrightarrow{p} \bar{I}(\theta_0)$ and $\partial L_n(\hat{\theta})/\partial \theta = 0$,
$$\xi_{LR} = \sqrt{n}(\hat{\theta} - \hat{\theta}_r)' \bar{I}(\theta_0) \sqrt{n}(\hat{\theta} - \hat{\theta}_r) + o_p(1).$$
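
As a sketch of the generalized LR statistic with a nuisance parameter (an illustrative example, not from the notes), take $X_i$ i.i.d. $N(\beta, \gamma)$ with $\theta = (\beta, \gamma)'$ and test $H_0: \beta = \beta_0$ with $\gamma$ unrestricted. Both mles have closed forms: $\hat{\beta} = \bar{X}$, $\hat{\gamma} = n^{-1}\sum (X_i - \bar{X})^2$, and $\hat{\gamma}_r = n^{-1}\sum (X_i - \beta_0)^2$.

```python
# Illustrative sketch: generalized LR statistic with a nuisance variance parameter,
# for an i.i.d. N(beta, gamma) sample and H0: beta = beta_0.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
beta_0, n = 0.0, 200
x = rng.normal(loc=beta_0, scale=1.0, size=n)    # data generated under H0

def log_lik(beta, gamma, x):
    """Log likelihood L_n(theta) for theta = (beta, gamma)."""
    return np.sum(stats.norm.logpdf(x, loc=beta, scale=np.sqrt(gamma)))

# Unrestricted mle (beta_hat, gamma_hat) and restricted mle (beta_0, gamma_hat_r).
beta_hat = np.mean(x)
gamma_hat = np.mean((x - beta_hat)**2)
gamma_hat_r = np.mean((x - beta_0)**2)

xi_LR = 2 * (log_lik(beta_hat, gamma_hat, x) - log_lik(beta_0, gamma_hat_r, x))
c = stats.chi2.ppf(0.95, df=1)                   # chi^2_p critical value with p = 1
print(f"xi_LR = {xi_LR:.3f}, reject H0 at the 5% level: {xi_LR > c}")
```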

Also, from familiar arguments,
$$n^{-1/2} S_n(\hat{\theta}) = 0 = n^{-1/2} S_n(\hat{\theta}_r) - \bar{I}(\theta_0) \sqrt{n}(\hat{\theta} - \hat{\theta}_r) + o_p(1)$$
so that we only need to figure out what the properties of $n^{-1/2} S_n(\hat{\theta}_r)$ are. Let subscripts $\beta$ and $\gamma$ denote the appropriate subvectors and submatrices corresponding to $\beta$ and $\gamma$ in the score and the information matrix. By another Taylor expansion,
$$n^{-1/2} S_n(\hat{\theta}_r) = n^{-1/2} S_n(\theta_0) - \bar{I}(\theta_0) \sqrt{n}(\hat{\theta}_r - \theta_0) + o_p(1).$$
Note that by the FOC of $\hat{\theta}_r$, the last $k - p$ elements of $S_n(\hat{\theta}_r)$ are zero ($S_{\gamma n}(\hat{\theta}_r) = 0$), so that we only need to compute $S_{\beta n}(\hat{\theta}_r)$. Note also that the first $p$ elements of $(\hat{\theta}_r - \theta_0)$ are zero. Define the $k \times k$ matrix
$$W = \begin{pmatrix} I_p & -\bar{I}_{\beta\gamma}(\theta_0) \bar{I}_{\gamma\gamma}(\theta_0)^{-1} \\ 0 & 0 \end{pmatrix}.$$
Then
$$n^{-1/2} W S_n(\hat{\theta}_r) = n^{-1/2} S_n(\hat{\theta}_r) = n^{-1/2} W S_n(\theta_0) - W \bar{I}(\theta_0) \sqrt{n}(\hat{\theta}_r - \theta_0) + o_p(1) = n^{-1/2} W S_n(\theta_0) + o_p(1)$$
since the upper right block of $W \bar{I}(\theta_0)$ is zero. Note that
$$n^{-1/2} W S_n(\theta_0) \Rightarrow N\left(0, W \bar{I}(\theta_0) W'\right) = N\left(0, \begin{pmatrix} \bar{I}_{\beta\beta}(\theta_0) - \bar{I}_{\beta\gamma}(\theta_0) \bar{I}_{\gamma\gamma}(\theta_0)^{-1} \bar{I}_{\gamma\beta}(\theta_0) & 0 \\ 0 & 0 \end{pmatrix}\right).$$
We conclude that
$$\xi_{LR} = \sqrt{n}(\hat{\theta} - \hat{\theta}_r)' \bar{I}(\theta_0) \bar{I}(\theta_0)^{-1} \bar{I}(\theta_0) \sqrt{n}(\hat{\theta} - \hat{\theta}_r) + o_p(1) = n^{-1/2} S_n(\theta_0)' W' \bar{I}(\theta_0)^{-1} W n^{-1/2} S_n(\theta_0) + o_p(1) \Rightarrow \chi^2_p$$
under the null hypothesis, since the upper left $p \times p$ block of $\bar{I}(\theta_0)^{-1}$ is equal to $\left(\bar{I}_{\beta\beta}(\theta_0) - \bar{I}_{\beta\gamma}(\theta_0) \bar{I}_{\gamma\gamma}(\theta_0)^{-1} \bar{I}_{\gamma\beta}(\theta_0)\right)^{-1}$.
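
The two block facts used in this argument, that $W \bar{I}(\theta_0) W'$ has the Schur complement $\bar{I}_{\beta\beta} - \bar{I}_{\beta\gamma}\bar{I}_{\gamma\gamma}^{-1}\bar{I}_{\gamma\beta}$ as its upper left block and zeros elsewhere, and that the upper left $p \times p$ block of $\bar{I}(\theta_0)^{-1}$ is the inverse of that same Schur complement, can be checked numerically. Below is a small sketch (not from the notes) using an arbitrary symmetric positive definite matrix in place of $\bar{I}(\theta_0)$.

```python
# Numerical check of the block identities behind the chi^2_p limit.
import numpy as np

rng = np.random.default_rng(4)
p, k = 2, 5
A = rng.normal(size=(k, k))
I_full = A @ A.T + k * np.eye(k)     # a symmetric positive definite "information matrix"

I_bb, I_bg = I_full[:p, :p], I_full[:p, p:]
I_gb, I_gg = I_full[p:, :p], I_full[p:, p:]

schur = I_bb - I_bg @ np.linalg.solve(I_gg, I_gb)

# Upper left p x p block of the inverse vs. inverse of the Schur complement.
upper_left = np.linalg.inv(I_full)[:p, :p]
print(np.allclose(upper_left, np.linalg.inv(schur)))                  # True

# W = [[I_p, -I_bg I_gg^(-1)], [0, 0]];  W I W' = [[schur, 0], [0, 0]].
W = np.zeros((k, k))
W[:p, :p] = np.eye(p)
W[:p, p:] = -I_bg @ np.linalg.inv(I_gg)
WIW = W @ I_full @ W.T
print(np.allclose(WIW[:p, :p], schur),
      np.allclose(WIW[p:, :], 0),
      np.allclose(WIW[:, p:], 0))                                     # True True True
```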