557: MATHEMATICAL STATISTICS II HYPOTHESIS TESTING

Σχετικά έγγραφα
557: MATHEMATICAL STATISTICS II RESULTS FROM CLASSICAL HYPOTHESIS TESTING

ST5224: Advanced Statistical Theory II

Other Test Constructions: Likelihood Ratio & Bayes Tests

Statistical Inference I Locally most powerful tests

Econ 2110: Fall 2008 Suggested Solutions to Problem Set 8 questions or comments to Dan Fetter 1

Theorem 8 Let φ be the most powerful size α test of H

STAT200C: Hypothesis Testing

2 Composition. Invertible Mappings

Every set of first-order formulas is equivalent to an independent set

Ordinal Arithmetic: Addition, Multiplication, Exponentiation and Limit

Homework 3 Solutions

SOLUTIONS TO MATH38181 EXTREME VALUES AND FINANCIAL RISK EXAM

EE512: Error Control Coding

An Introduction to Signal Detection and Estimation - Second Edition Chapter II: Selected Solutions

Solution Series 9. i=1 x i and i=1 x i.

Areas and Lengths in Polar Coordinates

Problem Set 3: Solutions

Exercise 2: The form of the generalized likelihood ratio

Areas and Lengths in Polar Coordinates

Solutions to Exercise Sheet 5

Uniform Convergence of Fourier Series Michael Taylor

C.S. 430 Assignment 6, Sample Solutions

Matrices and Determinants

Example Sheet 3 Solutions

CHAPTER 25 SOLVING EQUATIONS BY ITERATIVE METHODS

Section 8.3 Trigonometric Equations

Fractional Colorings and Zykov Products of graphs

Μηχανική Μάθηση Hypothesis Testing

6.3 Forecasting ARMA processes

Second Order RLC Filters

A Note on Intuitionistic Fuzzy. Equivalence Relation

SOLUTIONS TO MATH38181 EXTREME VALUES AND FINANCIAL RISK EXAM

Fourier Series. MATH 211, Calculus II. J. Robert Buchanan. Spring Department of Mathematics

The Simply Typed Lambda Calculus

3.4 SUM AND DIFFERENCE FORMULAS. NOTE: cos(α+β) cos α + cos β cos(α-β) cos α -cos β

Jesse Maassen and Mark Lundstrom Purdue University November 25, 2013

4.6 Autoregressive Moving Average Model ARMA(1,1)

Inverse trigonometric functions & General Solution of Trigonometric Equations

HOMEWORK 4 = G. In order to plot the stress versus the stretch we define a normalized stretch:

Απόκριση σε Μοναδιαία Ωστική Δύναμη (Unit Impulse) Απόκριση σε Δυνάμεις Αυθαίρετα Μεταβαλλόμενες με το Χρόνο. Απόστολος Σ.

ANSWERSHEET (TOPIC = DIFFERENTIAL CALCULUS) COLLECTION #2. h 0 h h 0 h h 0 ( ) g k = g 0 + g 1 + g g 2009 =?

Last Lecture. Biostatistics Statistical Inference Lecture 19 Likelihood Ratio Test. Example of Hypothesis Testing.

Approximation of distance between locations on earth given by latitude and longitude

Finite Field Problems: Solutions

Second Order Partial Differential Equations

Lecture 2. Soundness and completeness of propositional logic

derivation of the Laplacian from rectangular to spherical coordinates

k A = [k, k]( )[a 1, a 2 ] = [ka 1,ka 2 ] 4For the division of two intervals of confidence in R +

D Alembert s Solution to the Wave Equation

Concrete Mathematics Exercises from 30 September 2016

Lecture 2: Dirac notation and a review of linear algebra Read Sakurai chapter 1, Baym chatper 3

Math221: HW# 1 solutions

PARTIAL NOTES for 6.1 Trigonometric Identities

Partial Differential Equations in Biology The boundary element method. March 26, 2013

Srednicki Chapter 55

Parametrized Surfaces

6. MAXIMUM LIKELIHOOD ESTIMATION

Chapter 6: Systems of Linear Differential. be continuous functions on the interval

Reminders: linear functions

forms This gives Remark 1. How to remember the above formulas: Substituting these into the equation we obtain with

ω ω ω ω ω ω+2 ω ω+2 + ω ω ω ω+2 + ω ω+1 ω ω+2 2 ω ω ω ω ω ω ω ω+1 ω ω2 ω ω2 + ω ω ω2 + ω ω ω ω2 + ω ω+1 ω ω2 + ω ω+1 + ω ω ω ω2 + ω

Coefficient Inequalities for a New Subclass of K-uniformly Convex Functions

Math 446 Homework 3 Solutions. (1). (i): Reverse triangle inequality for metrics: Let (X, d) be a metric space and let x, y, z X.

The challenges of non-stable predicates

5. Choice under Uncertainty

A Two-Sided Laplace Inversion Algorithm with Computable Error Bounds and Its Applications in Financial Engineering

ECE598: Information-theoretic methods in high-dimensional statistics Spring 2016

Homework 8 Model Solution Section

Nowhere-zero flows Let be a digraph, Abelian group. A Γ-circulation in is a mapping : such that, where, and : tail in X, head in

Lecture 21: Properties and robustness of LSE

Lecture 34 Bootstrap confidence intervals

Tridiagonal matrices. Gérard MEURANT. October, 2008

1 String with massive end-points

Aquinas College. Edexcel Mathematical formulae and statistics tables DO NOT WRITE ON THIS BOOKLET

SOME PROPERTIES OF FUZZY REAL NUMBERS

Some new generalized topologies via hereditary classes. Key Words:hereditary generalized topological space, A κ(h,µ)-sets, κµ -topology.

Notes on the Open Economy

Lecture 13 - Root Space Decomposition II

Phys460.nb Solution for the t-dependent Schrodinger s equation How did we find the solution? (not required)

w o = R 1 p. (1) R = p =. = 1

ENGR 691/692 Section 66 (Fall 06): Machine Learning Assigned: August 30 Homework 1: Bayesian Decision Theory (solutions) Due: September 13

1. Introduction and Preliminaries.

2. Let H 1 and H 2 be Hilbert spaces and let T : H 1 H 2 be a bounded linear operator. Prove that [T (H 1 )] = N (T ). (6p)

Quadratic Expressions

Depth versus Rigidity in the Design of International Trade Agreements. Leslie Johns

Section 7.6 Double and Half Angle Formulas

ORDINAL ARITHMETIC JULIAN J. SCHLÖDER

Main source: "Discrete-time systems and computer control" by Α. ΣΚΟΔΡΑΣ ΨΗΦΙΑΚΟΣ ΕΛΕΓΧΟΣ ΔΙΑΛΕΞΗ 4 ΔΙΑΦΑΝΕΙΑ 1

6.1. Dirac Equation. Hamiltonian. Dirac Eq.

Statistics 104: Quantitative Methods for Economics Formula and Theorem Review

SCITECH Volume 13, Issue 2 RESEARCH ORGANISATION Published online: March 29, 2018

Exercises 10. Find a fundamental matrix of the given system of equations. Also find the fundamental matrix Φ(t) satisfying Φ(0) = I. 1.

Practice Exam 2. Conceptual Questions. 1. State a Basic identity and then verify it. (a) Identity: Solution: One identity is csc(θ) = 1

( y) Partial Differential Equations

MINIMAL CLOSED SETS AND MAXIMAL CLOSED SETS

Congruence Classes of Invertible Matrices of Order 3 over F 2

F A S C I C U L I M A T H E M A T I C I

Math 6 SL Probability Distributions Practice Test Mark Scheme

CE 530 Molecular Simulation

Estimation for ARMA Processes with Stable Noise. Matt Calder & Richard A. Davis Colorado State University

Transcript:

557: MATHEMATICAL STATISTICS II HYPOTHESIS TESTING A statistical hypothesis test is a decision rule that takes as an input observed sample data and returns an action relating to two mutually exclusive hypotheses that reflect two competing hypothetical states of nature. The decision rule partitions the sample space X into two regions that respectively reflect support for the two hypotheses. The following terminology is used: Two hypotheses characterize the two possible states of nature. The null hypothesis is denoted H 0, the alternative hypothesis is denoted H 1. In parametric models, the null and alternative hypotheses define a partition of the (effective) parameter space Θ. Suppose that disjoint subsets Θ 0, Θ 1 correspond to H 0 and H 1 respectively. We write H 0 : θ Θ 0 H 1 : θ Θ 1 A test, T, of H 0 versus H 1 defines a partition of sample space X into two regions. The hypothesis H 0 is rejected in favour of H 1 in the test depending on where the data x (or a suitably chosen statistic T (x )) fall within X. A test statistic, T (x ), is the function of data x used in a statistical hypothesis test. The critical region, R, is the region within which T (x ) must lie in order for hypothesis H 0 to be rejected in favour of H 1 The complement of R will be written R. A test function, φ R (T (x )), is an indicator function that reports the result of the test, { 1 T (x ) R φ R (T (x )) = 0 T (x ) R A Type I error occurs when the null hypothesis H 0 is rejected when it is in fact true. A Type II error occurs when the null hypothesis H 0 is accepted when it is in fact false. For test with test statistic T and critical region R X, and θ Θ 0, define the Type I error probability ξ(θ) by ξ(θ) = Pr [T R θ] θ Θ 0 (1) If Θ 0 comprises a single value, then The size of a statistical test is ξ = Pr [T R θ = θ 0 ] α = sup θ Θ 0 ξ(θ) which is equal to ξ if Θ 0 comprises a single value. Suppose α α. If T (x ) R, then H 0 is rejected at level α, and rejected at level α + ɛ for ɛ > 0. The power function, β(θ), is defined by β(θ) = Pr [T R θ] θ Θ so that β(θ) = ξ(θ) for θ Θ 0. Note that this notation is not universally used; commonly the power of a statistical test is denoted 1 β(θ) and computed for θ Θ 1, whereas the Type II error probability is β(θ) for θ Θ 1. 1

Most Powerful Tests: The Neyman-Pearson Lemma To construct and assess the quality of a statistical test, we consider the power function β(θ). Consider a family tests C for testing H 0 and H 1 with corresponding subsets Θ 0 and Θ 1. The uniformly most powerful (UMP) test T is the test whose power function β(θ) dominates the power function, β (θ), of any other test T C at all θ Θ 1, A test with power function β(θ) is unbiased if β(θ) β (θ) θ Θ 1. β(θ 1 ) β(θ 0 ) for all θ 0 Θ 0, θ 1 Θ 1 A simple hypothesis is one which specifies the distribution of the data completely. Consider a parametric model f X θ (x θ) with parameter space Θ = {θ 0, θ 1 }, and the test of Then both H 0 and H 1 are simple hypotheses. H 0 : θ = θ 0 H 1 : θ = θ 1 A parametric model f X θ (x θ) for θ Θ is identifiable if f X θ (x θ 0 ) = f X θ (x θ 1 ) for all x R θ 0 = θ 1. Theorem (The Neyman-Pearson Lemma) Consider a parametric model f X θ (x θ) with parameter space Θ = {θ 0, θ 1 }. A test of H 0 : θ = θ 0 H 1 : θ = θ 1 is required. Consider a test T with rejection region R that satisfies f X θ(x θ 1 ) > kf X θ(x θ 0 ) = x R f X θ(x θ 1 ) < kf X θ(x θ 0 ) = x R for some k 0, and Pr[X R θ = θ 0 ] = α. Then T is UMP in the class, C α, of tests at level α. Further, if such a test exists with k > 0, then all tests at level α also have size α (that is, α is the least upper bound of the power function β(θ)), and have rejection region identical to that of T, except perhaps if x A and Pr[X A θ = θ 0 ] = Pr[X A θ = θ 1 ] = 0. Proof As Pr[X R θ = θ 0 ] = α, the test T has size and level α. Consider the test function φ R (x ) for this test, and φ R (x ) be the test function for any other α level test, T. Denote by β(θ) and β (θ) be the power functions for these two tests. Now g(x ) = (φ R (x ) φ R (x ))(f X θ(x θ 1 ) kf X θ(x θ 0 )) 0 as R R x = φ R (x ) = φ R (x ) = 1 g(x ) = 0 R R x = φ R (x ) = 1, φ R (x ) = 0, f X θ(x θ 1 ) > kf X θ(x θ 0 ) g(x ) > 0 R x R = φ R (x ) = 0, φ R (x ) = 1, f X θ(x θ 1 ) < kf X θ(x θ 0 ) g(x ) > 0 R x R = φ R (x ) = φ R (x ) = 0 g(x ) = 0. 2

Thus X (φ R (x ) φ R (x ))(f X θ(x θ 1 ) kf X θ(x θ 0 )) dx 0 but this inequality can be written in terms of the power functions as (β(θ 1 ) β (θ 1 )) k(β(θ 0 ) β (θ 0 )) 0 (2) As β(θ) and β (θ) are bounded above by α, and β(θ 0 ) = α as T is a size α, we have that β(θ 0 ) β (θ 0 ) = α β (θ 0 ) 0 β(θ 1 ) β (θ 1 ) 0 Thus β(θ 1 ) β (θ 1 ), and hence T is UMP, as θ 1 is the only point in Θ 1, and the test with power function β is arbitrarily chosen. Now consider any UMP test T C α. By the result above, T is UMP at level α, so β(θ 1 ) = β (θ 1 ). In this case, if k > 0, we have from equation (2) that β(θ 0 ) β (θ 0 ) = α β (θ 0 ) 0. But, by assumption, T is a level α test, so we also have α β (θ 0 ) 0 and hence β (θ 0 ) = α, that is, T is also a size α test. Therefore (φ R (x ) φ R (x ))(f X θ(x θ 1 ) kf X θ(x θ 0 )) = 0 (3) dx X where the integrand in equation (3) is a non-negative function. Let A be the collection of sets of probability (that is, density) zero under both f X θ(x θ 0 ) and f X θ(x θ 1 ), then A (φ R (x ) φ R (x ))(f X θ(x θ 1 ) kf X θ(x θ 0 )) dx = 0 A A irrespective of the nature of R, so the functions φ R (x ) and φ R (x ) may not be equal for x in such a set A. Apart from that specific case, the integral in equation (3) can only be zero if at least one of the two factors is identically zero for all x. The second factor cannot be identically zero for all x, as the densities must integrate to one. Thus, for all x X \ A and hence R, satisfies the same conditions as R. φ R (x ) = φ R (x ), To evaluate the value of constant k that appears in the Theorem, we need to compute Pr [ X R θ 0 ] for a fixed level/size α. It is possible that, for given alternative hypotheses, no UMP test exists. Also, for discrete data, it may not be possible to solve the equation Pr [ X R θ 0 ] = α for every value of α, and hence only specific values of α may be attained. The test can be reformulated in terms of the statistic λ(x ) where λ(x ) = f X θ(x θ 1 ) f X θ(x θ 0 ) where x R λ(x ) R λ, where R λ {t R + : t > k} 3

If T (X ) is a sufficient statistic for θ, then by the Neyman factorization theorem so that f X θ(x θ 1 ) f X θ(x θ 0 ) = g(t (x ) θ 1)h(x ) g(t (x ) θ 0 )h(x ) = g(t (x ) θ 1) g(t (x ) θ 0 ) λ(x ) R λ T (x ) R T say. Thus any test based on T (x ) with critical region R T is a UMP α level test, and α = Pr[T (X ) R T θ 0 ] Composite Null Hypotheses Often the null and alternative hypotheses do not specify the distribution of the data completely. For example, the specification H 0 : θ = θ 0 H 1 : θ θ 0 could be of interest. If, in general, a UMP test of size α is required, then its power must equal the power of the most powerful test of for all θ 1 Θ 1. H 0 : θ = θ 0 H 1 : θ = θ 1 For one class of models, finding UMP tests for composite hypotheses is possible in general. A parametric family F of probability models indexed by parameter θ Θ has a monotone likelihood ratio if for θ 2 > θ 1, and for x in the union of the supports of the two densities f X θ (x θ 1 ) and f X θ (x θ 2 ), is a monotone function of x. λ(x) = f X θ(x θ 2 ) f X θ (x θ 1 ) Theorem (Karlin-Rubin Theorem) Suppose that a test of the hypotheses H 0 : θ θ 0 H 1 : θ > θ 0 is required. Suppose that T (X ) is a sufficient statistic for θ, and that f T θ for θ Θ has a monotone non-decreasing likelihood ratio, that is for θ 2 θ 1 and t 2 t 1 f T θ (t 2 θ 2 ) f T θ (t 2 θ 1 ) f T θ(t 1 θ 2 ) f T θ (t 1 θ 1 ). Then for any t 0, the test T with critical region R T defined by T (x ) > t 0 = T (x ) R T T (x ) t 0 = T (x ) R T is a UMP α level test, where α = Pr[T > t 0 θ 0 ]. 4

Proof Let β(θ) be the power function of T. Now, for t 2 t 1, f T θ (t 2 θ 2 ) f T θ (t 2 θ 1 ) f T θ(t 1 θ 2 ) f T θ (t 1 θ 1 ) f T θ (t 1 θ 1 )f T θ (t 2 θ 2 ) f T θ (t 1 θ 2 )f T θ (t 2 θ 1 ) (4) Integrating both sides with respect to t 1 on (, t 2 ), we obtain F T θ (t 2 θ 1 )f T θ (t 2 θ 2 ) F T θ (t 2 θ 2 )f T θ (t 2 θ 1 ) f T θ (t 2 θ 2 ) f T θ (t 2 θ 1 ) F T θ(t 2 θ 2 ) F T θ (t 2 θ 1 ). Alternatively, integrating both sides of equation (4) with respect to t 2 on (t 1, ), we similarly obtain f T θ (t 1 θ 2 ) f T θ (t 1 θ 1 ) 1 F T θ(t 1 θ 2 ) 1 F T θ (t 1 θ 1 ) But setting t 1 = t 2 = t in these two inequalities yields which, on rearrangement yields 1 F T θ (t θ 2 ) 1 F T θ (t θ 1 ) F T θ(t θ 2 ) F T θ (t θ 1 ) 1 F T θ (t θ 2 ) F T θ (t θ 2 ) 1 F T θ(t θ 1 ) F T θ (t θ 1 ) F T θ (t θ 2 ) F T θ (t θ 1 ) (5) as F T θ (t θ) is non-decreasing in t, and the function g(x) = (1 x)/x is non-increasing for 0 < x < 1. Finally, β(θ 2 ) β(θ 1 ) = Pr[T > t 0 θ 2 ] Pr[T > t 0 θ 1 ] = (1 F T θ (t θ 2 )) (1 F T θ (t θ 1 )) = F T θ (t θ 1 ) F T θ (t θ 2 ) 0 so β(θ) is non-decreasing in θ. Hence sup θ θ 0 β(θ) = β(θ 0 ) = Pr[T > t 0 θ 0 ] = α so T is an α level test. Now, let θ > θ 0, and consider the simple hypotheses H0 : θ = θ 0 H1 : θ = θ. Let k be defined by k = inf t T 0 f T θ (t θ ) f T θ (t θ 0 ) where T 0 = {t : t > t 0, and f T θ (t θ ) > 0 or f T θ (t θ 0 ) > 0}. Then T > t 0 f T θ(t θ ) f T θ (t θ 0 ) > k so that, by the Neyman-Pearson Lemma, T is UMP for testing H 0 versus H 1 ; for any other test T of H 0 at level α with power function β that satisfies β (θ 0 ) α, we have that β(θ ) β (θ ). But for any α level test T of H 0, we have β (θ 0 ) α. Thus taking T T, we can conclude that β(θ ) β (θ ). This inequality holds for all θ Θ 1, so T must be UMP at level α. 5

The Likelihood Ratio Test The Likelihood Ratio Test (LRT) statistic for testing H 0 against H 1 is based on the statistic λ X (x ) = H 0 : θ Θ 0 H 1 : θ Θ 1 θ Θ 0 sup f = L( θ 0 x ) X θ(x θ) L( θ 1 x ) θ Θ 1 and H 0 is rejected if λ X (x ) is small enough, that is, λ X (x ) k for some k to be defined. Theorem If T (X ) is a sufficient statistic for θ, then λ X (x ) = λ T (T (x )) = sup f T θ (T (x ) θ) θ Θ 0 sup f T θ (T (x ) θ) θ Θ 1 x X Proof As T (X ) is sufficient, for any θ 0, θ 1, L(θ 0 x ) L(θ 1 x ) = f θ 0 ) X θ(x f θ 1 ) = g(t (x ) θ 0)h(x ) g(t (x ); θ 1 )h(x ) = g(t (x ) θ 0) g(t (x ) θ 1 ) = f T θ(t (x ) θ 0 ) f T θ (T (x ) θ 1 ) X θ(x by the Neyman factorization theorem, where the last equality follows as the normalizing constants in numerator and denominator are identical. Hence, at the suprema, the LRT statistics are equal. Union and Intersection Tests Suppose first that we require a test T for the null hypothesis expressed as H 0 : θ Θ 0 where Θ γ, γ G are a collection of subsets of Θ. Suppose that T γ is a test for the hypotheses H 0γ : θ Θ γ H 1γ : θ Θ γ with test statistic T γ ) and critical region R γ. Then the rejection region for T is (X R G R γ = T rejects H 0 if x {T γ (x ) R γ } that is, if any one of the T γ rejects H 0γ. This test is termed a Union-Intersection Test (UIT). Suppose now that we require a test T for the null hypothesis expressed as H 0 : θ Θ 0 Then, by the same logic as above, the rejection region for T is R G R γ = T rejects H 0 if x {T γ (x ) R γ } 6 Θ γ Θ γ

that is, if all of the T γ reject H 0γ. This test is termed an Intersection-Union Test (IUT). Note that if α γ is the size of the test of H 0γ, then the IUT is a level α test, where as, for each γ and for any θ Θ 0, α = sup α γ α α γ = Pr[X R γ θ] Pr[X R θ] Theorem Consider testing H 0 : θ Θ 0 Θ γ using the global likelihood ratio statistic H 1 : θ Θ 0 λ(x ) = θ Θ 0 θ Θ 1 equipped with the usual critical region R : λ(x ) < c}, and the collection of likelihood ratio statistics λ γ (x ) {x θ Θ 0γ λ γ (x ) = θ Θ 1γ Define statistic T (x ) = inf λ γ(x ), and consider the critical region Then (a) T (x ) λ(x ) for all x. R G {x : λ γ (x ) < c, some γ G} {x : T (x ) < c}, (b) If β T and β λ are the power functions for the tests based on T (X ) and λ(x ) respectively, then β T (θ) β λ (θ) for all θ Θ (c) If the test based on λ(x ) is an α level test, then the test based on T (X ) is also an α level test. Proof For (a), as Θ 0 Θ γ, we have and thus for (b), for any θ, λ γ (x ) λ(x ) for each γ T (x ) = inf λ γ(x ) λ(x ) β T (θ) = Pr[T (X ) < c θ] Pr[λ(X ) < c θ] = β λ (θ). Hence which proves (c). sup β T (θ) sup β λ (θ) α θ Θ 0 θ Θ 0 7

P-values Consider a test of hypothesis H 0 defined by region Θ 0 of the parameter space. A p-value, p(x ), is a test statistic such that 0 p(x ) 1 for each x. A p-value is valid if, for every θ Θ 0 and 0 α 1 Pr[p(X ) α θ] α. That is, a valid p-value is a test statistic that produces a test at level α of the form p(x ) α = x R p(x ) > α = x R The most common construction of a valid p-value is given by the following theorem. Theorem Suppose that T (X ) is a test statistic constructed so that a large value of T (X ) supports H 1. Then the statistic p(x ) given for each x X by say, is a valid p-value. p(x ) = sup θ Θ 0 Pr[T (X ) T (x ) θ] = sup θ Θ 0 p θ (X ) (6) Proof For θ Θ 0, we have p θ (x ) = Pr[T (X ) T (x ) θ] = Pr[ T (X ) T (x ) θ] = F θ ( T (x )) F S (s) say, defining F S F θ as the cdf of S = T (X ); clearly 0 p(x ) 1. This recalls a result from distribution theory; if X F X, the U = F X (X) Uniform(0, 1). Suppressing the dependence on θ for convenience, define random variable Y by Y = F θ ( T (X )) F S (S) (= p θ (X )) and let A y {s : F S (s) y}. If A y is a half-closed interval (, s y ], then F Y (y) = Pr[Y y] = Pr[F S (S) y] = Pr[S A y ] = F S (s y ) y by definition of A y, as s y A y. If A y is a half-open interval (, s y ) F Y (y) = Pr[Y y] = Pr[F S (S) y] = Pr[S A y ] = by continuity of probability. Putting the components together, for 0 α 1, Pr[p θ (X ) α θ] Pr[Y α] α But by the definition in equation (6), p(x ) p θ (x ), so and the result follows. Pr[p(X ) α θ] Pr[p θ (X ) α θ] α lim F S (s) y s s y 8