Lecture 34 Bootstrap confidence intervals

Σχετικά έγγραφα
Other Test Constructions: Likelihood Ratio & Bayes Tests

ST5224: Advanced Statistical Theory II

Homework 3 Solutions

Ordinal Arithmetic: Addition, Multiplication, Exponentiation and Limit

2 Composition. Invertible Mappings

Solution Series 9. i=1 x i and i=1 x i.

Statistical Inference I Locally most powerful tests

CHAPTER 25 SOLVING EQUATIONS BY ITERATIVE METHODS

C.S. 430 Assignment 6, Sample Solutions

HISTOGRAMS AND PERCENTILES What is the 25 th percentile of a histogram? What is the 50 th percentile for the cigarette histogram?

5.4 The Poisson Distribution.

Phys460.nb Solution for the t-dependent Schrodinger s equation How did we find the solution? (not required)

EE512: Error Control Coding

SOLUTIONS TO MATH38181 EXTREME VALUES AND FINANCIAL RISK EXAM

Lecture 21: Properties and robustness of LSE

ORDINAL ARITHMETIC JULIAN J. SCHLÖDER

Pg The perimeter is P = 3x The area of a triangle is. where b is the base, h is the height. In our case b = x, then the area is

Inverse trigonometric functions & General Solution of Trigonometric Equations

SCHOOL OF MATHEMATICAL SCIENCES G11LMA Linear Mathematics Examination Solutions

SOLUTIONS TO MATH38181 EXTREME VALUES AND FINANCIAL RISK EXAM

Theorem 8 Let φ be the most powerful size α test of H

Απόκριση σε Μοναδιαία Ωστική Δύναμη (Unit Impulse) Απόκριση σε Δυνάμεις Αυθαίρετα Μεταβαλλόμενες με το Χρόνο. Απόστολος Σ.

Chapter 6: Systems of Linear Differential. be continuous functions on the interval

Math 446 Homework 3 Solutions. (1). (i): Reverse triangle inequality for metrics: Let (X, d) be a metric space and let x, y, z X.

Econ 2110: Fall 2008 Suggested Solutions to Problem Set 8 questions or comments to Dan Fetter 1

Example Sheet 3 Solutions

k A = [k, k]( )[a 1, a 2 ] = [ka 1,ka 2 ] 4For the division of two intervals of confidence in R +

Problem Set 3: Solutions

3.4 SUM AND DIFFERENCE FORMULAS. NOTE: cos(α+β) cos α + cos β cos(α-β) cos α -cos β

Section 8.3 Trigonometric Equations

Exercise 2: The form of the generalized likelihood ratio

Μηχανική Μάθηση Hypothesis Testing

Every set of first-order formulas is equivalent to an independent set

Congruence Classes of Invertible Matrices of Order 3 over F 2

4.6 Autoregressive Moving Average Model ARMA(1,1)

derivation of the Laplacian from rectangular to spherical coordinates

ω ω ω ω ω ω+2 ω ω+2 + ω ω ω ω+2 + ω ω+1 ω ω+2 2 ω ω ω ω ω ω ω ω+1 ω ω2 ω ω2 + ω ω ω2 + ω ω ω ω2 + ω ω+1 ω ω2 + ω ω+1 + ω ω ω ω2 + ω

Partial Differential Equations in Biology The boundary element method. March 26, 2013

Lecture 2. Soundness and completeness of propositional logic

6. MAXIMUM LIKELIHOOD ESTIMATION

Section 7.6 Double and Half Angle Formulas

Second Order RLC Filters

Concrete Mathematics Exercises from 30 September 2016

Math 6 SL Probability Distributions Practice Test Mark Scheme

Figure A.2: MPC and MPCP Age Profiles (estimating ρ, ρ = 2, φ = 0.03)..

Uniform Convergence of Fourier Series Michael Taylor

6.3 Forecasting ARMA processes

Approximation of distance between locations on earth given by latitude and longitude

Estimation for ARMA Processes with Stable Noise. Matt Calder & Richard A. Davis Colorado State University

Lecture 2: Dirac notation and a review of linear algebra Read Sakurai chapter 1, Baym chatper 3

An Inventory of Continuous Distributions

The challenges of non-stable predicates

5. Choice under Uncertainty

The Simply Typed Lambda Calculus

ΕΙΣΑΓΩΓΗ ΣΤΗ ΣΤΑΤΙΣΤΙΚΗ ΑΝΑΛΥΣΗ

Finite Field Problems: Solutions

Main source: "Discrete-time systems and computer control" by Α. ΣΚΟΔΡΑΣ ΨΗΦΙΑΚΟΣ ΕΛΕΓΧΟΣ ΔΙΑΛΕΞΗ 4 ΔΙΑΦΑΝΕΙΑ 1

HOMEWORK#1. t E(x) = 1 λ = (b) Find the median lifetime of a randomly selected light bulb. Answer:

Homework 8 Model Solution Section

Matrices and Determinants

Chapter 6: Systems of Linear Differential. be continuous functions on the interval

Fourier Series. MATH 211, Calculus II. J. Robert Buchanan. Spring Department of Mathematics

forms This gives Remark 1. How to remember the above formulas: Substituting these into the equation we obtain with

Math221: HW# 1 solutions

HOMEWORK 4 = G. In order to plot the stress versus the stretch we define a normalized stretch:

Exercises 10. Find a fundamental matrix of the given system of equations. Also find the fundamental matrix Φ(t) satisfying Φ(0) = I. 1.

Fractional Colorings and Zykov Products of graphs

( y) Partial Differential Equations

ANSWERSHEET (TOPIC = DIFFERENTIAL CALCULUS) COLLECTION #2. h 0 h h 0 h h 0 ( ) g k = g 0 + g 1 + g g 2009 =?

ΚΥΠΡΙΑΚΗ ΕΤΑΙΡΕΙΑ ΠΛΗΡΟΦΟΡΙΚΗΣ CYPRUS COMPUTER SOCIETY ΠΑΓΚΥΠΡΙΟΣ ΜΑΘΗΤΙΚΟΣ ΔΙΑΓΩΝΙΣΜΟΣ ΠΛΗΡΟΦΟΡΙΚΗΣ 19/5/2007

A Note on Intuitionistic Fuzzy. Equivalence Relation

Statistics 104: Quantitative Methods for Economics Formula and Theorem Review

Υπολογιστική Φυσική Στοιχειωδών Σωματιδίων

Second Order Partial Differential Equations

Nowhere-zero flows Let be a digraph, Abelian group. A Γ-circulation in is a mapping : such that, where, and : tail in X, head in

t-distribution t a (ν) s N μ = where X s s x = ν 2 FD ν 1 FD a/2 a/2 t-distribution normal distribution for ν>120

Notes on the Open Economy

CHAPTER 48 APPLICATIONS OF MATRICES AND DETERMINANTS

Bounding Nonsplitting Enumeration Degrees

CRASH COURSE IN PRECALCULUS

New bounds for spherical two-distance sets and equiangular lines

6.1. Dirac Equation. Hamiltonian. Dirac Eq.

Chapter 3: Ordinal Numbers

DERIVATION OF MILES EQUATION FOR AN APPLIED FORCE Revision C

Areas and Lengths in Polar Coordinates

2. THEORY OF EQUATIONS. PREVIOUS EAMCET Bits.

Durbin-Levinson recursive method

Reminders: linear functions

Areas and Lengths in Polar Coordinates

About these lecture notes. Simply Typed λ-calculus. Types

Section 9.2 Polar Equations and Graphs

Homework for 1/27 Due 2/5

557: MATHEMATICAL STATISTICS II RESULTS FROM CLASSICAL HYPOTHESIS TESTING

Exercises to Statistics of Material Fatigue No. 5

Solutions to Exercise Sheet 5

2. Let H 1 and H 2 be Hilbert spaces and let T : H 1 H 2 be a bounded linear operator. Prove that [T (H 1 )] = N (T ). (6p)

STAT200C: Hypothesis Testing

= λ 1 1 e. = λ 1 =12. has the properties e 1. e 3,V(Y

Lecture 15 - Root System Axiomatics

Forced Pendulum Numerical approach

Transcript:

Lecture 34 Bootstrap confidence intervals

Confidence Intervals θ: an unknown parameter of interest We want to find limits θ and θ such that Gt = P nˆθ θ t If G 1 1 α is known, then P θ θ = P θ θ = 1 α θ = ˆθ G 1 1 α/ n is an exact 1001 α% lower confidence limit for θ Traditional asymptotic approach: Gt Φt/σ, n Φ: the standard normal cdf σ: an unknown scale parameter ˆσ: a consistent estimator of σ, Gˆσt Φt Normal approximation NA 1001 α% confidence limits are θ N = ˆθ ˆσz 1 α / n, θ N = ˆθ + ˆσz 1 α / n z a = Φ 1 a

Bootstrap Confidence Intervals 1 The hybrid bootstrap HB A bootstrap estimator of Gt = P nˆθ θ t is Ĝt = P nˆθ ˆθ t G 1 1 α can be estimated by Ĝ 1 1 α HB lower and upper confidence limits: θ HB = ˆθ Ĝ 1 1 α/ n θ HB = ˆθ Ĝ 1 α/ n If Gt is nearly symmetric, then Ĝ 1 α can be replaced by Ĝ 1 1 α Hybrid: Use bootstrap estimate in the construction of confidence interval based on the sampling distribution of ˆθ θ.

2 Bootstrap-t BT Gt = P nˆθ θ t Φt/σ ˆσ: a consistent estimator of σ Studentized statistic: nˆθ θ/ˆσ Ht = P nˆθ θ/ˆσ t Φt ˆσ : a bootstrap analog of ˆσ A bootstrap estimator of Ht: Ĥt = P nˆθ ˆθ/ˆσ t BT lower and upper confidence limits: θ BT = ˆθ ˆσĤ 1 1 α/ n θ BT = ˆθ + ˆσĤ 1 1 α/ n This is also a hybrid method It is called BT because it is based on the studentized variable nˆθ θ/ˆσ

3 The bootstrap percentile BP Bootstrap distribution histogram Kt = P ˆθ t 1 B 1001 α% BP lower confidence limit # of times ˆθ b t θ BP = K 1 α = inf{t : Kt α} ˆθ B θ BP αbth ordered value of ˆθ 1,..., 1001 α% BP upper confidence limit θ BP = K 1 1 α = inf{t : Kt 1 α} θ BP 1 αbth ordered value of ˆθ 1,..., There are two improvements over the BP ˆθ B

Accuracy: Justification Properties of the bootstrap percentile BP Lower confidence bound level 1 α θ BP = K 1 α, Kt = P ˆθ t Assumption A: There is a monotone transformation φ such that P ˆφ φ t = Ψt for all t and all P φ = φθ, ˆφ = φˆθ Ψ: a continuous, increasing cdf symmetric about 0 Result 1. If φ is known, then θ E = φ 1 ˆφ + z α, z α = Ψ 1 α, is an exact 1001 α% lower confidence bound, i.e., P θ E θ = 1 α

Proof P θ E θ = P φθ E φθ = P φφ 1 ˆφ + z α φ = P ˆφ + zα φ = P ˆφ φ zα = Ψ z α = 1 Ψz α = 1 α

Result 2. Under Assumption A, θ BP = θ E. This means that the BP interval is exactly correct under Assumption A To use the BP, we don t need to know φ The derivation of φ is replaced by computation If Assumption A holds approximately when n is large, then the BP intervals are approximately correct

Proof Let w = φθ BP ˆφ and ˆφ = φˆθ Since Assumption A holds for all P including P, Ψw = P ˆφ ˆφ w = P ˆφ ˆφ φθ BP ˆφ = P φˆθ φθ BP = P ˆθ θ BP = Kθ BP = α By the uniqueness of the quantile, w = Ψ 1 α = z α Then Hence φθ BP = ˆφ + z α θ BP = φ 1 ˆφ + z α = θ E

The bootstrap bias-corrected percentile BC Is Assumption A too strong? Assumption B: There is a monotone transformation φ and a constant z 0 such that P ˆφ φ + z 0 t = Ψt for all t and all P Ψ: a continuous, increasing cdf symmetric about 0 If z 0 = 0, then Assumption B reduces to Assumption A. z 0 is a bias correction. Since Assumption B holds for all P including P, i.e., z 0 = Ψ 1 Kˆθ Kˆθ = P ˆθ ˆθ = P ˆφ ˆφ = P ˆφ ˆφ + z 0 z 0 = Ψz 0

Result 3. θ E = φ 1 ˆφ + z α + z 0 is an exact 1001 α% lower confidence limit for θ under Assumption B Proof: Similar to the proof of result 1. We need to know φ and Ψ in order to use θ E The BC confidence limits are θ BC = K 1 Ψz α + 2z 0 = K 1 Ψz α + 2Ψ 1 Kˆθ θ BC = K 1 Ψz 1 α + 2Ψ 1 Kˆθ When bootstrap Monte Carlo of size B is applied, θ BC BΨz α + 2z 0 th ordered value of ˆθ 1,..., ˆθ B θ BC BΨz 1 α + 2z 0 th ordered value of ˆθ 1,..., ˆθ B BC = BP if Kˆθ = 0.5, since Ψ 1 0.5 = 0 i.e., ˆθ is the median of Kt bootstrap histogram Thus, BC is a bias-corrected BP It shifts the BP bound towards left or right according to z 0 BC does not need the explicit form of φ But BC requires the form of Ψ typically, Ψ = Φ

Result 4. Under Assumption B, θ BC = θ E BC is exactly correct if Assumption B holds. If Assumption B holds approximately, then BC is approximately correct Proof: For any α, 1 α = Ψz 1 α = P ˆφ ˆφ + z 0 z 1 α = P ˆφ ˆφ + Ψ 1 1 α z 0 = P ˆθ φ 1 ˆφ + Ψ 1 1 α z 0 = Kφ 1 ˆφ + Ψ 1 1 α z 0 Hence, for all t, K 1 t = φ 1 ˆφ + Ψ 1 t z 0 Let t = Ψz α + 2z 0, K 1 Ψz α + 2z 0 = φ 1 ˆφ + z α + 2z 0 z 0 = φ 1 ˆφ + z α + z 0 = θ E

The bootstrap accelerated bias-corrected percentile BC a Assumption B is weaker than Assumption A Assumption B is still too strong in some cases Weaker assumptions? Assumption C: There is a monotone transformation φ and constants z 0 and a acceleration constant such that ˆφ φ P 1 + aφ + z 0 t = Ψt for all t and all P Ψ: a continuous, increasing cdf symmetric about 0 If a = 0, Assumption C reduces to Assumption B. a corrects skewness

Result 5. Under Assumption C, θ E = φ 1 ˆφ + z α + z 0 1 + a ˆφ 1 az α + z 0 is an exact 1001 α% lower confidence limit for θ Proof: 1 α = Ψ z α ˆφ φ = P 1 + aφ + z 0 z α = P ˆφ φ zα + z 0 1 + aφ ˆφ + zα + z 0 = P 1 az α + z 0 φ = P ˆφ + z α + z 0 1 + a ˆφ φ 1 az α + z 0 = P θ E θ

We need to know a, Ψ, and φ in order to use θ E Under Assumption C, z 0 = Ψ 1 Kˆθ First, assume a is known The BC a lower confidence bound is θ BCa = K Ψ 1 z α + z 0 z 0 + 1 az α + z 0 The BC a upper confidence bound is θ BCa = K Ψ 1 z 1 α + z 0 z 0 + 1 az 1 α + z 0 When bootstrap Monte Carlo of size B is applied, θ BCa BΨ z 0 + zα+z 0 1 az α+z 0 th ordered value of ˆθ 1,..., ˆθ B θ BCa BΨ z 0 + z 1 α+z 0 1 az 1 α +z 0 th ordered value of ˆθ 1,..., ˆθ B We don t need the explicit form of φ But we need to know a and Ψ If a = 0, BC a reduces to BC

Result 6. If a is known, then θ BCa = θ E. Hence, BC a is exactly correct if Assumption C holds. If Assumption C holds approximately, then BC a is approximately correct Proof. 1 α = Ψz 1 α ˆφ = P ˆφ 1 + a ˆφ + z 0 z 1 α = P ˆφ ˆφ + z 1 α z 0 1 + a ˆφ ˆθ = P φ 1 ˆφ + z 1 α z 0 1 + a ˆφ = K φ 1 ˆφ + z 1 α z 0 1 + a ˆφ = K φ 1 ˆφ + Ψ 1 1 α z 0 1 + a ˆφ This shows that K 1 t = φ 1 ˆφ + Ψ 1 t z 0 1 + a ˆφ

Let t = Ψ z 0 + zα+z 0 1 az α+z 0, θ E = φ 1 ˆφ + z α + z 0 1 + a ˆφ 1 az α + z 0 = K Ψ 1 z 0 + = θ BCa z α + z 0 1 az α + z 0 The constant a is unknown and has to be estimated â: an estimator of a θ BCâ and θ BCâ are BCâ confidence limits They are also called ABC confidence limits

Asymptotic Accuracy A confidence set C is first order accurate if and second order accurate if P θ C = 1 α + On 1/2 P θ C = 1 α + On 1 For the case of ˆθ is a smooth function of sample means, we have shown the following summary: 1. The BT and bootstrap BC a one-sided confidence intervals are second order accurate. 2. The the BP, BC, HB, and NA one-sided confidence intervals are in general first order accurate. 3. The equal-tail two-sided confidence intervals produced by all five bootstrap methods and the normal approximation are second order accurate errors cancel each other.