Online Appendix for Measuring the Dark Matter in Asset Pricing Models

Σχετικά έγγραφα
Homework for 1/27 Due 2/5

Lecture 17: Minimum Variance Unbiased (MVUB) Estimators

Solutions: Homework 3

Last Lecture. Biostatistics Statistical Inference Lecture 19 Likelihood Ratio Test. Example of Hypothesis Testing.

1. For each of the following power series, find the interval of convergence and the radius of convergence:

Lecture 3: Asymptotic Normality of M-estimators

Other Test Constructions: Likelihood Ratio & Bayes Tests

Three Classical Tests; Wald, LM(Score), and LR tests

Proof of Lemmas Lemma 1 Consider ξ nt = r

Μια εισαγωγή στα Μαθηματικά για Οικονομολόγους

SUPPLEMENT TO ROBUSTNESS, INFINITESIMAL NEIGHBORHOODS, AND MOMENT RESTRICTIONS (Econometrica, Vol. 81, No. 3, May 2013, )


Introduction of Numerical Analysis #03 TAGAMI, Daisuke (IMI, Kyushu University)

SUPERPOSITION, MEASUREMENT, NORMALIZATION, EXPECTATION VALUES. Reading: QM course packet Ch 5 up to 5.6

Diane Hu LDA for Audio Music April 12, 2010

4.6 Autoregressive Moving Average Model ARMA(1,1)

The Heisenberg Uncertainty Principle

n r f ( n-r ) () x g () r () x (1.1) = Σ g() x = Σ n f < -n+ r> g () r -n + r dx r dx n + ( -n,m) dx -n n+1 1 -n -1 + ( -n,n+1)

ST5224: Advanced Statistical Theory II

Statistical Inference I Locally most powerful tests

Supplemental Material: Scaling Up Sparse Support Vector Machines by Simultaneous Feature and Sample Reduction

true value θ. Fisher information is meaningful for families of distribution which are regular: W (x) f(x θ)dx

1. Matrix Algebra and Linear Economic Models

Outline. Detection Theory. Background. Background (Cont.)

LAD Estimation for Time Series Models With Finite and Infinite Variance

Ψηφιακή Επεξεργασία Εικόνας

Biorthogonal Wavelets and Filter Banks via PFFS. Multiresolution Analysis (MRA) subspaces V j, and wavelet subspaces W j. f X n f, τ n φ τ n φ.

Adaptive Covariance Estimation with model selection

IIT JEE (2013) (Trigonomtery 1) Solutions

A study on generalized absolute summability factors for a triangular matrix

p n r

On Generating Relations of Some Triple. Hypergeometric Functions

Every set of first-order formulas is equivalent to an independent set

Probability theory STATISTICAL MODELING OF MULTIVARIATE EXTREMES, FMSN15/MASM23 TABLE OF FORMULÆ. Basic probability theory

Bessel function for complex variable

INTEGRATION OF THE NORMAL DISTRIBUTION CURVE

Uniform Convergence of Fourier Series Michael Taylor

Uniform Estimates for Distributions of the Sum of i.i.d. Random Variables with Fat Tail in the Threshold Case

Supplement to A theoretical framework for Bayesian nonparametric regression: random series and rates of contraction

Degenerate Perturbation Theory

The Equivalence Theorem in Optimal Design

Parameter Estimation Fitting Probability Distributions Bayesian Approach

Solution Series 9. i=1 x i and i=1 x i.

Binet Type Formula For The Sequence of Tetranacci Numbers by Alternate Methods

Endogoneity and All That

Factorial. Notations. Specific values. Traditional name. Traditional notation. Mathematica StandardForm notation. Specialized values

Αναγνώριση Προτύπων. Non Parametric

Econ 2110: Fall 2008 Suggested Solutions to Problem Set 8 questions or comments to Dan Fetter 1

On Certain Subclass of λ-bazilevič Functions of Type α + iµ

Lecture 21: Properties and robustness of LSE

Partial Differential Equations in Biology The boundary element method. March 26, 2013

Homework 4.1 Solutions Math 5110/6830

FREE VIBRATION OF A SINGLE-DEGREE-OF-FREEDOM SYSTEM Revision B

Estimation for ARMA Processes with Stable Noise. Matt Calder & Richard A. Davis Colorado State University

2 Composition. Invertible Mappings

CHAPTER 103 EVEN AND ODD FUNCTIONS AND HALF-RANGE FOURIER SERIES

Μηχανική Μάθηση Hypothesis Testing

Higher Order Properties of Bootstrap and Jackknife Bias Corrected Maximum Likelihood Estimators

The Simply Typed Lambda Calculus

α β

Στα επόμενα θεωρούμε ότι όλα συμβαίνουν σε ένα χώρο πιθανότητας ( Ω,,P) Modes of convergence: Οι τρόποι σύγκλισης μιας ακολουθίας τ.μ.

C.S. 430 Assignment 6, Sample Solutions

Supplemental Material to Comparison of inferential methods in partially identified models in terms of error in coverage probability

Lecture 34 Bootstrap confidence intervals

MATH 38061/MATH48061/MATH68061: MULTIVARIATE STATISTICS Solutions to Problems on Matrix Algebra

COMMON RANDOM FIXED POINT THEOREMS IN SYMMETRIC SPACES

Second-order asymptotic comparison of the MLE and MCLE of a natural parameter for a truncated exponential family of distributions

Inertial Navigation Mechanization and Error Equations

Research Article Finite-Step Relaxed Hybrid Steepest-Descent Methods for Variational Inequalities

Phys460.nb Solution for the t-dependent Schrodinger s equation How did we find the solution? (not required)

CHAPTER 25 SOLVING EQUATIONS BY ITERATIVE METHODS

Fractional Colorings and Zykov Products of graphs

A Note on Intuitionistic Fuzzy. Equivalence Relation

Outline. M/M/1 Queue (infinite buffer) M/M/1/N (finite buffer) Networks of M/M/1 Queues M/G/1 Priority Queue

Matrices and Determinants

Lecture 12: Pseudo likelihood approach

Problem Set 3: Solutions

6.1. Dirac Equation. Hamiltonian. Dirac Eq.

On Inclusion Relation of Absolute Summability

Ordinal Arithmetic: Addition, Multiplication, Exponentiation and Limit

Homework 3 Solutions

6.3 Forecasting ARMA processes

EE512: Error Control Coding

Supplementary Materials: Trading Computation for Communication: Distributed Stochastic Dual Coordinate Ascent

Dimension-free PAC-Bayesian bounds for matrices, vectors, and linear least squares regression.

Approximation of distance between locations on earth given by latitude and longitude

1. Introduction. Main Result. 1. It is well known from Donsker and Varadhan [1], the LDP for occupation measures. I(ξ k A) (1.

EE 570: Location and Navigation

Variance Covariance Matrices for Linear Regression with Errors in both Variables. by J.W. Gillard and T.C. Iles

Bayesian statistics. DS GA 1002 Probability and Statistics for Data Science.

Nowhere-zero flows Let be a digraph, Abelian group. A Γ-circulation in is a mapping : such that, where, and : tail in X, head in

Concrete Mathematics Exercises from 30 September 2016

A New Class of Analytic p-valent Functions with Negative Coefficients and Fractional Calculus Operators

Presentation of complex number in Cartesian and polar coordinate system

Theorem 8 Let φ be the most powerful size α test of H

Example Sheet 3 Solutions

= λ 1 1 e. = λ 1 =12. has the properties e 1. e 3,V(Y

On Second-Order Error Terms In Fully Exponential Laplace Approximations.

Areas and Lengths in Polar Coordinates

w o = R 1 p. (1) R = p =. = 1

Transcript:

Olie Appedix for Measurig the Dark Matter i Asset Pricig Models Hui Che Wisto Wei Dou Leoid Koga April 15, 017 Cotets 1 Iformatio-Theoretic Iterpretatio for Model Fragility 1.1 Bayesia Aalysis with Limited-Iformatio Likelihoods................ 5 1. The Effective-Sample Size................................. 8 1.3 Geeric Notatios ad Defiitios............................ 13 1.4 Regularity Coditios for Theoretical Results...................... 14 1.5 Lemmas........................................... 18 1.6 Basic Properties of Limited-Iformatio Likelihoods.................. 3 1.7 Iformatio Matrices of Limited-Iformatio Likelihoods................ 9 1.8 Properties of Posteriors Based o Limited-Iformatio Likelihoods.......... 34 1.9 Proof of Theorem 1..................................... 43 1.10 Proof of Theorem..................................... 43 1.11 Proof of Theorem 3..................................... 54 Disaster risk model 70.1 The Euler Equatio.................................... 70. Fisher fragility measure.................................. 71.3 Posteriors.......................................... 7.4 ABC Method ad Implemetatio............................ 73.5 Results............................................ 75 3 Log-ru Risk Model: Solutios ad Momet Coditios 76 3.1 The Model Solutio.................................... 76 Che: MIT Sloa ad NBER (huiche@mit.edu). Dou: Wharto (wdou@wharto.upe.edu). Koga: MIT Sloa ad NBER (lkoga@mit.edu). 1

3. Calibratios: Simulated ad Empirical Momets.................... 80 3.3 Geeralized Methods of Momets............................. 8 4 Iformatio-Theoretic Iterpretatio for Model Fragility Based o Cheroff Rates 84 1 Iformatio-Theoretic Iterpretatio for Model Fragility I Che, Dou, ad Koga (017), we provide a ecoometric justificatio for our Fisher fragility measure: it essetially quatifies the over-fittig tedecy of the fuctioal-form specificatios for certai structural compoets of a model. I this sectio, we formalize the ituitio that excessive iformativeess of cross-equatio restrictios is fudametally associated with model fragility. More precisely, we formally show that structural ecoomic models are fragile whe the cross-equatio restrictios appear excessively iformative about certai combiatios of model parameters that are otherwise difficult to estimate (we refer to such parameter combiatios as dark matter ). To do so, we itroduce a measure of iformativeess of the cross-equatio restrictios, ad express it i terms of the effective sample size. This iformatioal measure of cross-equatio restrictios is iterpretable i fiite samples ad provides ecoomic ituitio for the Fisher fragility measure. We develop our aalysis i a Bayesia framework. Startig with a prior o (θ, ψ), deoted by π(θ, ψ), we obtai posterior distributios through the baselie model ad the structural model, respectively. The, the discrepacy betwee the two posteriors shows how the cross-equatio restrictios affect the iferece about θ. We assume that the stochastic process x t is strictly statioary ad ergodic with a statioary distributio P. The true joit distributio for x (x 1,, x ) is P. Similarly, we assume that the joit stochastic process x t, y t is strictly statioary ad ergodic with a statioary distributio Q. The ecoometricia does ot eed to specify the full fuctioal form of the joit distributio of (x, y ) (x t, y t ) : t = 1,,, which we deote by Q. The ukow joit desity is q(x, y ). We evaluate the performace of a structural model uder the Geeralized Method of Momets (GMM) framework. The semial paper by Hase ad Sigleto (198) pioeers the literature of applyig GMM to evaluate ratioal expectatio asset pricig models. Specifically, we assume that the model builder is cocered with the model s i-sample ad out-of-sample performaces as represeted by a set of momet coditios, 1 based o a D Q 1 vector of fuctios g Q (θ, ψ; x, y) of 1 We ca also adopt the CUE method of Hase, Heato, ad Yaro (1996) or its modificatio Hausma, Lewis, Mezel, ad Newey (011) s RCUE method, or some other extesio of GMM with the same first-order efficiecy ad possibly superior higher-order asymptotic properties. This will lead to alterative but coceptually similar measures of overfittig. To simplify the compariso with the Fisher fragility measure, we chose to use the origial GMM framework.

data observatios (x t, y t ) ad the parameter vectors θ ad ψ satisfyig the followig coditios: E g Q (θ 0, ψ 0 ; x t, y t )] = 0. (1) The baselie momet fuctios g P (θ; x t ) characterize the momet coditios of the baselie model. They costitute the first D P elemets of the whole vector of momet fuctios g Q (θ, ψ; x t, y t ). Thus, the baselie momets ca be represeted by the full set of momets weighted by a special matrix: g P (θ; x t ) = Γ P g Q (θ, ψ; x t, y t ) where Γ P I DP, O DP (D Q D P )]. () The momet fuctios g P (θ; x t ) deped oly o parameters θ, sice all parameters of the baselie model are icluded i θ. Accordigly, the momet coditios for the baselie model is E g P (θ 0 ; x t )] = 0. (3) Deote the empirical momet coditios for the full model ad the baselie model by ĝ Q, (θ, ψ) 1 g Q (θ, ψ; x t, y t ) ad ĝ P, (θ) 1 g P (θ; x t ), respectively. The, the optimal GMM estimator (ˆθ Q, ˆψ Q ) of the full model ad that of the baselie model ˆθ P miimize, respectively, Ĵ,SQ (θ, ψ) ĝ Q, (θ, ψ) T S 1 Q ĝq,(θ, ψ) ad Ĵ,S P (θ) ĝ P, (θ) T S 1 P ĝp,(θ). (4) Here, Ĵ,SQ (θ, ψ) ad Ĵ,S P (θ) are ofte referred to as the J-distaces, ad S Q ad S P have the followig explicit formulae (see Hase, 198), S Q S P + l= + l= E g Q (θ 0, ψ 0 ; x t, y t )g Q (θ 0, ψ 0 ; x t l, y t l ) T ], ad (5) E g P (θ 0 ; x t )g P (θ 0 ; x t l ) T ], respectively. (6) The matrix S Q ad S P are the covariace matrices of the momet coditios at the true parameter values. I practice, whe S Q or S P is ukow, we ca replace it with a cosistet estimator ŜQ, or ŜP,, respectively. The cosistet estimators of the covariace matrices are provided by Newey ad West (1987), Adrews (1991), ad Adrews ad Moaha (199). We use GMM to evaluate model performace because of the cocer of likelihood mis-specificatio. The GMM approach gives the model builder flexibility to choose which aspects of the model to emphasize whe estimatig model parameters ad evaluatig model specificatios. This is i cotrast to the likelihood approach, which relies o the full probability distributio implied by the 3

structural model. Fially, we itroduce some further otatio. We deote the GMM Fisher iformatio matrix for the baselie model as I P (θ) (see Hase, 198; Hah, Newey, ad Smith, 011), ad I P (θ) G P (θ) T S 1 P G P(θ), (7) where G P (θ) E g P (θ; x t )],ad for brevity, we deote G P G P (θ 0 ). We deote the aalog for the structural model as I Q (θ, ψ), I Q (θ, ψ) G Q (θ, ψ) T S 1 Q G Q(θ, ψ), (8) where G Q (θ, ψ) E g Q (θ, ψ; x t, y t )], ad for brevity, we deote G Q G Q (θ 0, ψ 0 ). Computig the expectatio G Q (θ) ad G Q (θ, ψ) requires kowig the distributio Q. I cases whe Q is ukow, G P (θ) i (7) ad G Q (θ, ψ) i (8) ca be replaced by their cosistet estimators g P (θ; x t ) ad g Q (θ, ψ; x t, y t ). For the full model Q, we will focus o its implied Fisher iformatio matrix I Q (θ ψ): I Q (θ ψ) Γ Θ I Q (θ, ψ) 1 Γ T Θ] 1, where ΓΘ I DΘ, O DΘ D Ψ ]. (9) More precisely, the Fisher iformatio matrix I Q (θ, ψ) ca be partitioed ito a two-by-two block matrix accordig to θ ad ψ: I Q (θ, ψ) = I (1,1) Q (θ, ψ), I (1,) Q (θ, ψ) I (,1) Q (θ, ψ), I (,) Q (θ, ψ) ], (10) where I (1,1) Q (θ, ψ) is the D Θ D Θ iformatio matrix correspodig to baselie parameters θ, I (,) Q (θ, ψ) is the D Ψ D Ψ iformatio matrix correspodig to uisace parameters ψ, ad I (1,) Q (θ, ψ) = I (,1) Q (θ, ψ) T is the D Θ D Ψ cross-iformatio matrix correspodig to θ ad ψ. The I Q (θ ψ) ca be writte as I Q (θ ψ) = I (1,1) Q (θ, ψ) I (1,) Q (θ, ψ)i (,) Q (θ, ψ) 1 I (1,) Q (θ, ψ) T, (11) which geerally is ot equal to the Fisher iformatio sub-matrix I (1,1) Q (θ, ψ) for baselie parameters θ, except the special case i which I (1,) Q (θ, ψ) = 0, i.e. the kowledge of θ ad that of ψ are ot iformative about each other. We assume that the iformatio matrices are osigular i this paper (Assumptio A4 i Appedix 1.4). We first itroduce a momet-based Bayesia method with limited-iformatio likelihoods i Subsectio 1.1. The, we itroduce the defiitio of effective sample size to gauge the iformativeess of the cross-equatio restrictios ad state the mai results i Subsectio 1.. Third, we predefie the ecessary special otatios i Subsectio 1.3. Fourth, we itroduce the stadard regularity coditios i Subsectio 1.4. Fifth, we prove the basic lemmas i Subsectio 1.5, which are themselves iterestig ad geeral. Sixth, i Subsectio 1.6, we state ad prove propositios which 4

serve as itermediate steps for the proof of the mai results. Fially, the mai results stated i Subsectio 1. are proved i Subsectio 1.9-1.11. 1.1 Bayesia Aalysis with Limited-Iformatio Likelihoods Whe likelihood-based methods are difficult or ureliable, oe robust tool for the ecoometricia to derive the posteriors of baselie model π P (θ x ) ad full structural model π Q (θ, ψ x, y ) is the limited-iformatio likelihood (LIL). It relies o certai momet coditios used for model evaluatio. Of course, the full likelihood fuctio ca be used whe likelihood methods are possible. Here, restricted to the momet costraits, we embed a likelihood that is closest to the uderlyig true distributio ito the Bayesia paradigm. We use the Kullback-Leibler divergece (also kow as the relative etropy) to gauge the discrepacy betwee probability measures. 3 We first focus o the limited-iformatio likelihood for the full structural model. The largesample results of Kim (00) provide a asymptotic Bayesia iterpretatio of GMM, usig the expoetial quadratic form ad igorig the fiite-sample validity ad a valid parametric family for likelihoods. However, our iformatio-theoretic ratioale to model fragility ideed requires a valid fiite-sample iterpretatio. To achieve this goal, we adopt the framework of Kitamura ad Stutzer (1997), ad we focus o the case that momet fuctios are stable autoregressive processes. 4 More precisely, there exists a set of autoregressive (AR) coefficiets ω 0,, ω mg of the followig stable AR regressio have zero meas ad zero serial correlatios: m g such that the error terms gq ω (θ 0, ψ 0 ; z t ) ω j g Q (θ 0, ψ 0 ; x t j, y t j ) (1) j=0 where z t x t j, y t j : j = 0,, m g. This assumptio does ot offer the most geeral settig; however, it should provide a fair approximatio to the dyamics of momet coditios i may cases studied i fiace ad ecoomics, ad more importat, it allows a factorizatio of the likelihood for fiite-sample iterpretatio while guarateeig the first-order asymptotic efficiecy of the aalogous maximum likelihood estimator. Importatly, as a result of stability, the origial momet coditio E g Q (θ 0, ψ 0 ; x t, y t )] = 0 is equivalet to the momet coditio: E g ω Q (γ 0; z t )] = 0, (13) The idea of ackowledgig that it is ofte very difficult to come up with precise distributios to be used as likelihood fuctios ad thus choosig a likelihood (or a prior) amog a set of samplig models (or priors) i Bayesia aalysis is referred to as robust Bayesia aalysis (see, e.g. Wasserma, 199; Berger, 1994; Pericchi ad Pérez, 1994). 3 Alteratively, the empirical likelihood (EL) of Owe (1988, 1990, 1991) ad the expoetial tilted empirical likelihood (ETEL) of Kitamura ad Stutzer (1997) ad Scheach (007) ca also be used as the likelihood part of the Bayesia iferece. I fact, they are the Bayesia empirical likelihood (BEL) of Lazar (003) ad the Bayesia expoetial tilted empirical likelihood (BETEL) of Scheach (005). Oe applicatio of BEL ad BETEL is the statistical aalysis of disaster risk models i Julliard ad Ghosh (01). 4 The same assumptio is also adopted by Kim (00, Remarks 1 ad 3). More details ca be foud i Assumptio A9 i Appedix 1.4. The assumptio ca be weakeed to the case that the lag size m g icreases with sample size. 5

where we defie γ (θ, ψ) ad γ 0 (θ 0, ψ 0 ) for otatioal simplicity. Give each γ, we defie the set of probability measures, deoted by Q(γ; ), such that Q(γ; ) Q : E Q g ω Q (γ; z t)] = 0, t = 1,,. (14) The true distributio of z = (z 1,, z ), deoted by Q, belogs to the distributio set Q(γ 0 ; ). The limited-iformatio likelihood Q γ, for Bayesia aalysis of full model is chose accordig to the priciple of miimum Kullback-Leibler divergece: Q γ, = argmi D KL (Q Q ) = argmi Q Q(γ;) Q Q(γ;) l (dq /dq ) dq (15) where dq /dq is the Rado-Nikodym derivative (or desity) of Q with respect to the true probability measure Q. It is well-kow that the limited-iformatio likelihood Q γ, has the followig Gibbs caoical desity (see, e.g., Csiszár, 1975; Cover ad Thomas, 1991, Chapter 11): dq γ, /dq = exp η Q (γ) T gq ω (γ; z t) A Q (γ), γ (16) ] where A Q (γ) l E e η Q(γ) T gq ω(γ;z t), ad the Lagragia multipliers η Q (γ) are chose to make the momet coditios satisfied: 0 = E gq ω (γ; z)eη Q(γ) T gq ω(γ;z)], γ. (17) We deote π Q (z t ; γ) exp η Q (γ) T g ω Q (γ; z t) A Q (γ). The posterior desity is π Q (γ x, y ) π(γ) exp η Q (γ) T gq ω (γ; z t) A Q (γ). (18) More discussios o the validity of the limited-iformatio likelihood ad Bayesia aalysis ca be foud i Appedix 1.1. Aalogously, the limited-iformatio likelihood P θ, for the baselie model is costructed based o the baselie momet coditios E g P (θ, x t )] = 0 ad the priciple of miimum Kullback-Leibler divergece (15). It has the desity fuctio: dp θ, /dp = exp η P (θ) T gp ω (θ; w t) A P (θ) (19) where w t = x t j : j = 0,, m g with uderlyig true distributio P ; gp ω(θ; w t) is the correspodig sub-vector of smoothed momets gq ω(θ; z t) for the baselie model; the fuctios η P (θ) ad A P (θ) are defied aalogous to η Q (γ) ad A Q (γ). We deote π P (w t ; θ) exp η P (θ) T gp ω(θ; w t) A P (θ). The MLE for P θ,, deoted by ˆθ ML P, also satisfies asymptotic ormality with variace I P(θ 0 ) 1. 6

Similar to full model, the posterior of baselie model is π P (θ x ) π(θ) exp η P (θ) T gp ω (θ; w t) A P (θ). (0) Without loss of geerality, we assume that g P (θ; x t ) = gp ω(θ; w t) ad g Q (γ; x t, y t ) = gq ω(γ; z t). It should also be oted that the fuctios i the limited-iformatio likelihoods η P (θ), A P (θ), η Q (γ), ad A Q (γ) are ot kow. It is iocuous for our theoretical exercises ad results. However, i practice, they eed to be replaced by their approximatig couterparts. For example, η P (θ) ca be estimated by solvig the followig equatio, for each θ, 0 = 1 g P (θ; x )e η P(θ) T g P (θ;x t ) t (1) ad the we estimate A P (θ) by ÂP(θ) as follows: Â P (θ) = l 1 e η P(θ) T g P (θ;x t ) ]. () The fuctioal estimators η Q (γ) ad ÂQ(γ) ca be costructed i the similar way. Uder the regularity coditios, the fuctioal estimators coverge to the true fuctios uiformly i probability. Discussio: The Validity of Limited-Iformatio Likelihood Q γ, The GMM Bayesia methods of Kim (00) ad Cherozhukov ad Hog (003) are primarily for large sample aalysis ad potetially ivalid for fiite-sample iterpretatio due to the fact that the GMM framework based o a quadratic form empirical momets uder geeral assumptios are justified asymptotically (see, Hase, 198). Specifically, Cherozhukov ad Hog (003) motivate the expoetial of momet coditios quadratic form as the limited-iformatio likelihood maily from a computatioal perspective, without providig a compellig theoretical ratioalizatio. Kim (00) justifies the particular expoetial quadratic form by appealig to the priciple of miimum Kullback-Leibler divergece yet based o momet coditios i a asymptotic sese. To guaratee a valid fiite-sample iterpretatio ad valid parametric family for likelihoods, we adopt the framework of Kitamura ad Stutzer (1997). Here are several reasos for the limited-iformatio likelihood i (16) to be used as a valid parametric family for likelihood withi the Bayesia paradigm. First, the set of distributios Q γ, characterize a proper parametric family of likelihoods for statistical iferece, sice Q γ, is the true distributio if ad oly if γ = γ 0, that is η Q (γ 0 ) = A Q (γ 0 ) = 0. Further, the MLE of the parametric family Q γ,, deoted by ˆγ ML Q, is asymptotically first-order equivalet to the GMM estimator Hase (198) ad the ET estimator Kitamura ad Stutzer (1997) with a ormal asymptotic distributio 7

(see Sectio 1.6): ) wlim (ˆγ Q ML γ 0 = N(0, I Q (γ 0 ) 1 ), with I Q (γ 0 ) = E l (dq γ, /dq ) ]. It should be oted that Kitamura ad Stutzer (1997) propose the expoetial tilted estimator based o Fechel duality: ] η Q (γ) = argmi E e ηt g Q (γ;z), γ. (3) η Their motivatio is computatioal. But we are ot proposig ew estimatio method here. Thus, we directly cosider the maximum likelihood estimator for the limited-iformatio likelihood family Q γ,. Secod, the limited-iformatio likelihood (16) ca be factorized properly so as to be cosistet with the statioary Markovia properties of uderlyig time series, sice the sequetial depedece of Q γ, are fully captured by the true distributio Q. Third, the Bayesia aalysis based o (16) is equivalet to Bayesia expoetially tilted empirical likelihood (BETEL) asymptotically (see, e.g., Julliard ad Ghosh, 01). Scheach (005) provides a probabilistic iterpretatio of the expoetial tilted empirical likelihood that justifies its use i Bayesia iferece. 1. The Effective-Sample Size We quatify the discrepacy betwee probability distributios usig a stadard statistical measure, the relative etropy (also kow as the Kullback-Leibler divergece). The relative etropy betwee π P (θ x ) ad the margial posterior π Q (θ x, y ) is D KL π Q (θ x, y ) π P (θ x )] = l ( πq (θ x, y ) ) π P (θ x π Q (θ x, y )dθ. (4) ) Ituitively, we ca thik of the log posterior ratio l(π Q (θ x, y )/π P (θ x )) as a measure of the discrepacy betwee the two posteriors at a give θ. The the relative etropy is the average discrepacy betwee the two posteriors over all possible θ, where the average is computed uder the costraied posterior. D KL (π Q (θ x, y ) π P (θ x )) is fiite if ad oly if the support of the posterior π Q (θ x, y ) is a subset of the support of the posterior π P (θ x ), that is, Assumptio A6 i Subsectio 1.4 holds. The magitude of relative etropy is difficult to iterpret directly, ad we propose a ituitive effective sample size iterpretatio. Istead of imposig the cross-equatio restrictios from the structural model, oe ca gai extra iformatio about θ withi the baselie model with additioal data. We evaluate the amout of additioal data uder the baselie model eeded to match the iformativeess of cross-equatio restrictios. 8

Suppose we draw additioal data x m of sample size m from the Bayesia predictive distributio π P ( x m x ) π P ( x m θ)π P (θ x )dθ, (5) where the effective sample x m is idepedet of observed sample x give baselie parameters θ, that is, π P ( x m, θ x ) = π P ( x m θ)π P (θ x ). Agai, we measure the gai i iformatio from this additioal sample x m usig relative etropy, D KL (π P (θ x m, x ) π P (θ x )) = ( πp (θ x m, x ) ) l π P (θ x π P (θ x m, x )dθ. (6) ) D KL (π P (θ x m, x ) π P (θ x )) depeds o the realizatio of the additioal sample of data x m. The average relative etropy (iformatio gai) over possible future samples x m accordig to the Bayesia predictive distributio π P ( x m x ) equals the mutual iformatio betwee x m ad θ give x : I( x m ; θ x ) E xm x DKL (π P (θ x m, x ) π P (θ x )) ] = D KL (π P (θ x m, x ) π P (θ x )) π P ( x m θ) π P (θ x ) d x m dθ. (7) Like the relative etropy, the mutual iformatio is always positive. It is easy to check that I( x m ; θ x ) = 0 whe m = 0. Uder the assumptio that the prior distributio is osigular ad the parameters i the likelihood fuctio are well idetified, ad additioal geeral regularity coditios, I( x m ; θ x ) is mootoically icreasig i m ad coverges to ifiity as m icreases. These properties esure that we ca fid a extra sample size m that equates (approximately, due to the fact that m is a iteger) D KL (π Q (θ x, y ) π P (θ x )) with I( x m ; θ x ). It is oly meaigful to match 1-dimesioal distributios eve i asymptotic seses. Thus, the results i this sectio are valid oly for scalar feature fuctios. Defiitio 1 (Effective-Sample Size Iformatio Measure). For a feature fuctio vector f : R D Θ R, we defie the effective-sample measure of the iformativeess of the cross-equatio restrictios as ϱ f KL (x, y ) = + m f, (8) where m f eables the matchig of two iformatio quatities I( x m f ; f(θ) x ) D KL (π Q (f(θ) x, y ) π P (f(θ) x )) < I( x m f +1 ; f(θ) x ), (9) with D KL (π Q (f(θ) x, y ) π P (f(θ) x )) beig the relative etropy betwee the costraied ad ucostraied posteriors of f(θ) ad I( x m ; f(θ) x ) beig the coditioal mutual iformatio betwee the additioal sample of data x m ad the trasformed parameter f(θ) give the existig sample of data x. 9

For scalar-valued feature fuctios (D f = 1), there exists a direct coectio betwee our Fisher fragility measure ad the relative-etropy based iformativeess measure (i.e. effective-sample size iformatio measure i Defiitio 1). The effective-sample size ratio ϱ f KL (x, y ) is defied with fiite-sample validity. The followig theorem establishes asymptotic equivalece betwee the Fisher fragility measure ϱ v (θ 0 ψ 0 ) ad the effective-sample size iformatio measure ϱ f KL (x, y ). Ituitively, the extra effective-sample size m f icreases proportioally i the observed sample size where the limitig proportioal growth rate wlim ϱ f KL (x, y ) is stochastic. Theorem 1. Cosider a feature fuctio f : R D Θ regularity coditios A1 - A8 stated i Subsectio 1.4, it must hold that R with v = f(θ 0 ). Uder the stadard wlim l ϱf KL (x, y ) = l ϱ v (θ 0 ψ 0 )] + 1 ϱ v (θ 0 ψ 0 ) 1] (χ 1 1), (30) where χ 1 is a chi-square radom variable with degrees of freedom 1. It immediately implies that ] E wlim l ϱf KL (x, y ) = l ϱ v (θ 0 ψ 0 )]. The result of Theorem 1 follows immediately from the followig approximatio results summarized i Theorem ad Theorem 3. Without loss of geerality, we assume that g ω Q (θ, ψ; z t) g Q (θ, ψ; x t, y t ) or equivaletly m g = 0 i Equatio (1). Also, because of Assumptio A7 (the regular feature fuctio coditio), as well as the fact that the defiitio of our dark matter measure i Che, Dou, ad Koga (017) ad regularity assumptios A1 - A6 i Appedix 1.4 are ivariat uder ivertible ad secod-order smooth trasformatios of parameters, we ca assume that D Θ = 1 i the ituitive proofs. The full proofs are i the olie appedix. Theorem (Kullback-Leibler Divergece). Cosider a feature fuctio f : R D Θ R with v = f(θ 0 ). Let the MLE for the limited-iformatio likelihood of the baselie model P be ˆθ P ML, ad the aalogy of the structural model Q be ˆγ ML Q = (ˆθ ML Q, ˆψ ML Q ). Uder the regularity coditios stated i Appedix 1.4, D KL (π Q (f(θ) x, y ) π P (f(θ) x )) 1 l where covergece is i probability uder Q. vi Q (θ 0 ψ 0 ) 1 v T (f( θ P ML) f( θ Q ML)) vi P(θ 0 ) 1 v T vi Q (θ 0 ψ 0 ) 1 v T + 1 vi Q (θ 0 ψ 0 ) 1 v T vi P (θ 0 ) 1 v T 1/, (31) This theorem geeralizes the results i Li, Pittma, ad Clarke (007, Theorem 3) i two importat aspects. It focuses o the results of the momet-based LIL framework, istead of the stadard likelihood-based framework. Ad also, it allows for geeral weak depedece amog the observatios which makes our results applicable to time series models i fiace ad ecoomics. 10

Here we provide a ituitive proof. The detailed ad rigorous techical proof ca be foud i Subsectio 1.10. Whe is large, it followig approximatios hold uder the stadard coditios: π P (f(θ) x ) N π Q (f(θ) x, y ) N Therefore, whe is large, the asymptotic approximatio (31) holds. ( f(ˆθ P ML), 1 vi P (θ 0 ) 1 v T ), ad (3) ( f(ˆθ Q ML), 1 vi Q (θ 0 ψ 0 ) 1 v T ). (33) Corollary 1. Cosider the feature fuctio f : R D Θ R with v = f(θ 0 )/ θ. Uder the assumptios i Subsectio 1.4, if we defie the it follows that Moreover, wlim + λ vt I P (θ 0 ) 1 v v T I Q (θ 0 ) 1 v, v T I Q (θ 0 ) 1 v (f( θ P ) f( θ Q )) = (1 λ 1 )χ 1. (34) wlim D KL(π Q (f(θ) x, y ) π P (f(θ) x )) = 1 λ 1 (χ + 1 1) + 1 l(λ), (35) The asymptotic distributio result i (34) is directly from Propositio 11. The asymptotic result i (35) is based o Theorem, limit result (34), ad the Slutsky Theorem. Theorem 3 (Mutual Iformatio). Cosider a feature fuctio f : R D Θ R with v = f(θ 0 ). Uder the assumptios i Subsectio 1.4, if m/ ς (0, ) as both m ad approach ifiity, where covergece is i probability uder Q. I( x m ; f(θ) x ) 1 ( ) m + l 0, (36) There exist related approximatio results for mutual iformatio I( x m ; θ x ), which cosider large m while holdig the observed sample size fixed. For more details, see Clarke ad Barro (1990, 1994) ad refereces therei. See also the case of o-idetically distributed observatios by Polso (199), amog others. Our results differ i that we allow both m ad to grow. Ours is a techically otrivial extesio of the existig results. Here we provide a ituitive proof. The detailed ad rigorous techical proof ca be foud i 11

Subsectio 1.10. By defiitio i (7), the mutual iformatio give x ca be rewritte as I( x m ; θ x ) = π P ( x m θ)π P (θ x ) l π P( x m, x θ)π P (x ) π P ( x m, x )π P (x θ) d xm dθ = π P ( x m θ)π P (θ x ) l π P( x m, x θ) π P ( x m, x ) d xm dθ π P (θ x ) l π P(x θ) π P (x ) dθ. The followig approximatio is stadard i the iformatio-theoretic literature (see, e.g., Clarke ad Barro, 1990, 1994): whe m ad are large, it holds that l π P( x m, x θ) π P ( x m, x ) 1 S m+ (θ) T I P (θ) 1 Sm+ (θ) + 1 l m + π + l 1 π P (θ) + 1 l I P(θ), where S m+ (θ) Similarly, it also holds that 1 m + l π P (x t ; θ) + ] m l π P ( x t ; θ). (37) l π P(x θ) π P (x ) 1 S (θ) T I P (θ) 1 S (θ) + 1 l π + l 1 π P (θ) + 1 l I P(θ), (38) where It follows from (37) ad (39) that where S (θ) 1 l π P (x t ; θ). (39) m S m+ (θ) = m + S (θ) + m + S m (θ), (40) S m (θ) 1 m l π P ( x t ; θ). (41) m Now recall that the family of limited-iformatio likelihoods π P (x; θ) satisfy the stadard properties: θ, E π P (x; θ)] = 1 ad E π P (x; θ) l π P (x; θ)] = 0 ad E π P (x; θ) l π P (x; θ)] l π P (x; θ)] T = E π P (x; θ) l π P (x; θ) = I P (θ). Thus, the followig importat idetity ca be derived: S m+ (θ) T I P (θ) 1 Sm+ (θ)π P ( x m θ)π P (θ x )d x m dθ = S (θ) T I P (θ) 1 S (θ)π P (θ x )dθ + m m + m +. 1

Usig the Taylor expasio of S (θ) aroud ˆθ ML P ad the ormal approximatio of posterior (3), S (θ) T I P (θ) 1 S (θ)π P (θ x )dθ I P (θ 0 ) 1 1 1 ] ( l π P (x t ; ˆθ ML) P θ ˆθ P ML) ϕp, (θ)dθ where ϕ P, (θ) = det 1 I P (θ 0 )] /(π) D Θ exp 1 (θ ˆθ P ML )T I P (θ 0 )] (θ ˆθ P ML ). Therefore, whe m ad are large, the mutual iformatio ca be approximated as follows I( x m ; θ x ) 1 l m + + m m + 1 l m +. S (θ) T I P (θ) 1 S (θ)π P (θ x )dθ m ] m + Now we have completed the ituitive proof. The legthy full proof i Subsectio 1.9 follows the same mai idea, yet with rigorous establishmets of the approximatio sigs above. 1.3 Geeric Notatios ad Defiitios First, we itroduce some otatios for the matrices. For ay real symmetric o-egative defiite matrix A, we defie λ M (A) to be the largest eigevalue of A ad defie λ m (A) to be the smallest eigevalue of A. For a matrix A, we defie the spectral orm of A to be A S. By defiitio of spectral orm, we kow that for ay real matrix A, A S λ M (A T A). Deote λ(θ) ad λ(θ) to be the largest eigevalue ad the smallest eigevalue of I P (θ), respectively. That is, λ(θ) = λ m (I P (θ)) ad λ(θ) = λ M (I P (θ)). If the matrix I P (θ) is cotiuous i θ, λ(θ) ad λ(θ) are cotiuous i θ. We defie upper boud ad lower boud to be λ sup λ(θ), ad λ if λ(θ). (4) θ Θ θ Θ Secod, we itroduce some otatios related to subsets i Euclidea spaces. We defie the Euclidea distace betwee two sets S 1, S R D Θ as follows d L (S 1, S ) if s 1 s : s i S i, i = 1,. (43) For θ R D Θ, we deote θ (1) to be the first elemet of θ ad deote θ ( 1) to be the D Θ 1 dimesioal vector cotaiig all elemets of θ other tha θ (1). Defie the ope ball cetered at θ with radius r 13

to be Deote Ω(θ, r) ϑ : ϑ θ < r ad Ω(θ, r) ϑ : ϑ θ r Ω (1) (θ, r) = Ω(θ (1), r) R 1, ad Ω ( 1) (θ, r) = Ω(θ ( 1), r) R D Θ 1. I additio, we defie Θ 1 (θ (1) ) θ ( 1) R D Θ 1 ( θ (1) θ ( 1) ) Θ, (44) ad we deote V Θ Vol(Θ) < +, V Θ (θ (1) ) Vol(Θ 1 (θ (1) )) < +, ad V Θ,1 sup θ (1) Θ (1) V Θ (θ (1) ) < +. Third, we itroduce some otatios o metrics of probability measures. Cosider two probability measures P ad Q with desities p ad q with respect to Lebesgue measure, respectively. The Helliger affiity betwee P ad Q is deoted as α H (P, Q) p(x)q(x)dx. The total variatio distace betwee P ad Q is deoted as P Q T V p(x) q(x) dx. Fourth, we itroduce otatios for time series. The maximal correlatio coefficiet ad the uiform mixig coefficiet are defied as ρ max (F 1, F ) sup Corr(f 1, f ), f 1 L real (F 1),f L real (F ) ad φ(f 1, F ) sup P (A A 1 ) P (A ), A 1 F 1,A F where L real(f i ) deote the space of square-itegrable. F i -measurable, real-valued radom variables, for ay sub σ fields F i F. 1.4 Regularity Coditios for Theoretical Results The regularity coditios we choose to impose o the behavior of the data are iflueced by three major cosideratios. First, our assumptios are chose to allow processes of sequetial depedece. I particular, the processes allowed should be relevat to itertemporal asset pricig models. Secod, our assumptios are required to meet the aalytical tractability. Third, our assumptios are 14

sufficiet coditios i the sese that we are ot tryig to provide the weakest coditios or high level coditios to guaratee the results; but istead, we chose those regularity coditios which are relatively straightforward to check i practice. Assumptio A1 (Statioarity Coditio) The uderlyig time series (x t, y t ) with t = 1,, follow a m S -order strictly statioary Markov process. Remark. This assumptio implies that the margial coditioal desity for x t ca be specified as π P (x t θ, x t 1,, x t ms ). Defie the stacked vectors x t = (x t,, x t ms +1) T ad y t = (y t,, y t ms +1) T. The the margial coditioal desity from the parametric family specified for the baselie model ca be rewritte as π P (x t ; θ), ad the stacked vectors (x t, y t ) follow a first-order Markov process. Assumptio A (Mixig Coditio) There exists costat λ D d D /(d D 1), where d D is the costat i Assumptio A3 (domiace coditio), such that (x t, y t ) for t = 1,,, is uiform mixig ad there exists a costat φ such that the uiform mixig coefficiets satisfy φ(m) φm λ D for all possible probabilistic models, where φ(m) is the uiform mixig coefficiet. Its defiitio is stadard ad ca be foud, for example, i White ad Domowitz (1984) or Bradley (005). Remark. Followig the literature (see, e.g. White ad Domowitz, 1984; Newey, 1985b; Newey ad West, 1987), we adopt the mixig coditios as a coveiet way of describig ecoomic ad fiacial data which allows time depedece ad heteroskedasticity. The mixig coditios basically restrict the memory of a process to be weak, while allowig heteroskedasticity, so that large sample properties of the process are preserved. I particular, we employ the uiform mixig which is discussed i White ad Domowitz (1984). Assumptio A3 (Domiace Coditio) The fuctio g Q (θ, ψ; x, y) is twice cotiuously differetiable i (θ, ψ) almost surely. There exist domiatig measurable fuctios a 1 (x, y) ad a (x, y), ad costat d D > 1, such that almost everywhere g Q (θ, ψ; x, y) a 1 (x, y), g Q (θ, ψ; x, y) S a 1(x, y), g Q,(i) (θ, ψ; x, y) S a 1(x, y), for i = 1,, D g, q(x, y) a (x, y), q(x 1, y 1, x t, y t ) a (x 1, y 1 )a (x t, y t ), for t, a 1 (x, y)] d D a (x, y)dxdy < +, a (x, y)dxdy < +, where S is the spectral orm of matrices. 15

Remark. The domiatig fuctio assumptio is widely adopted i the literature of geeralized method of momets (Newey, 1985a,b; Newey ad West, 1987). The domiatig assumptio, together with the uiform mixig assumptio ad statioarity assumptio, imply the stochastic equicotiuity coditio (iv) i Propositio 1 of Cherozhukov ad Hog (003). I the semial GMM paper by Hase (198), the momet cotiuity coditio ca also be derived from the domiace coditios. Remark. Followig the discussio of Sectio 1, for each pair of j ad k, it holds that for some costat ζ > 0 ad large costat C > 0, E sup l π P (w; θ) θ j θ k E θ Θ sup γ Θ Ψ +ζ l π Q (z; γ) γ j γ k < C, ad E sup l π P (w; θ) θ j +ζ θ Θ < C, ad E sup γ Θ Ψ +ζ l π Q (z; γ) γ j < C, ad (45) +ζ < C, (46) where γ = (θ, ψ), ad π P ad π Q are defied i Sectio 1. The domiace coditio, together with the uiform mixig assumptio ad statioarity assumptio, implies the stochastic equicotiuity coditio (i) i Propositio 3 of Cherozhukov ad Hog (003). Assumptio A4 (Nosigular Coditio) The Fisher iformatio matrices I P (θ) ad I Q (θ, ψ) are positive defiite for all θ, ψ. Remark. It implies that the covariace matrices S P ad S Q are positive defiite, ad the expected momet fuctio gradiets G P (θ) ad G Q (θ, ψ) have full rak for all θ ad ψ. Assumptio A5 (Idetificatio Coditio) The true baselie parameter vector θ 0 is idetified by the baselie momet coditios i the sese that E g P (θ; x)] = 0 oly if θ = θ 0. Ad, the true parameters (θ 0, ψ 0 ) of the full model is idetified by the momet coditios i the sese that E g Q (θ, ψ; x, y)] = 0 oly if θ = θ 0 ad ψ = ψ 0. Remark. Cosider the discussio of limited-iformatio likelihood i Sectio 1. The cotiuous differetiability of momet fuctios, together with the idetificatio coditio, imply that the parametric family of limited-iformatio distributios P θ ad Q γ, as well as the momet coditios, are soud: the covergece of a sequece of parameter values is equivalet to the weak covergece of the distributios: θ θ 0 P θ P θ0 E l (dp θ /dp)] E l (dp θ0 /dp)] = 0, ad (47) γ γ 0 Q γ Q γ0 E l (dq γ /dq)] E l (dq γ0 /dq)] = 0, (48) where γ = (θ, ψ) ad γ 0 = (θ 0, ψ 0 ). Ad, the covergece of a sequece of parameter values is equivalet to the covergece of the momet coditios: γ γ 0 E g Q (γ; x, y)] E g Q (γ 0 ; x, y)] = 0. (49) 16

Assumptio A6 (Regular Bayesia Coditio) Suppose the parameter set is Θ Ψ R D Θ+D Ψ with Θ ad Ψ beig compact. Ad, the prior is absolutely cotiuous with respect to the Lebesgue measure with Rado-Nykodim desity π(θ, ψ), which is twice cotiuously differetiable ad positive. Deote π max θ Θ,ψ Ψ π(θ, ψ) ad π mi θ Θ,ψ Ψ π(θ, ψ). The probability measure defied by the limited-iformatio posterior desity π Q (θ x, y ) is domiated by the probability measure defied by the baselie limited-iformatio posterior desity π P (θ x ), for almost every x, y uder Q 0. Remark. Compactess implies total boudess. I our diaster risk model, the parameter set for the prior is ot compact due to the adoptio of uiformative prior. However, i that umerical example, we ca trucate the parameter set at very large values which will ot affect the mai umerical results. Remark. The cocept of domiatig measure here is the oe i measure theory. More precisely, this regularity coditio requires that for ay measurable set which has zero measure uder π Q (θ x, y ), it must also have zero measure uder π P (θ x ). This assumptio is just to guaratee that D KL (π Q (θ x, y ) π P (θ x )) to be well defied. Assumptio A7 (Regular Feature Fuctio Coditio) The feature fuctio f = (f 1,, f Df ) : Θ R D f is a twice cotiuously differetiable vectorvalued fuctio. We assume that there exist D Θ D f twice cotiuously differetiable fuctios f Df +1,, f DΘ o Θ such that F = (f 1, f,, f DΘ ) : Θ R D Θ is a oe-to-oe mappig (i.e. ijectio) ad F (Θ) is a coected ad compact D Θ -dimesioal subset of R D Θ. Remark. A simple sufficiet coditio for the regular feature fuctio coditio to hold is that each fuctio f i (i = 1,, D f ) is a proper ad twice cotiuously differetiable fuctio o R D Θ f(θ) ad (θ (1),, θ (Df )) > 0 at each θ RD Θ. I this case, we ca simply choose f k (θ) θ (k) for k = D f + 1,, D Θ. The, the Jacobia determiat of F is ozero at each θ R D Θ ad F is proper ad twice differetiable mappig R D Θ R D Θ. Accordig to the Hadamard s Global Iverse Fuctio Theorem (e.g. Kratz ad Parks, 013), F is a oe-to-oe mappig ad F (Θ) is a coected ad compact D Θ -dimesioal subset of R D Θ. Assumptio A8 (Expoetial Coditio) For sufficietly small δ > 0, E sup γ Ω(γ,δ) exp v T g Q (γ; x, y) ] <, for all vectors v i a eighborhood of the origi. Remark. This assumptio is eeded to guaratee that the limited-iformatio likelihood Q γ i (16) is well-defied ad the related Fechel duality holds. While this assumptio is stroger tha the momet existece assumptio i Hase (198), it is commoly adopted i the literature ivolvig Kullback-Leibler divergece (see, e.g. Csiszár, 1975; Kitamura ad Stutzer, 1997) ad geeral expoetial-family models (see, e.g. Berk, 197; Dou, Pollard, ad Zhou, 01). 17

1.5 Lemmas As emphasized by White ad Domowitz (1984), the mixig coditios serve as a operatig assumptio for ecoomic ad fiacial processes, because mixig assumptios are difficult to verify or test. However, White ad Domowitz (1984) argues that the restrictio is ot so restrictive i the sese that a wide class of trasformatios of mixig processes are themselves mixig. Lemma 1. Let z t = Z ((x t+τ0, y t+τ0 ),, (x t+τ, y t+τ )) where Z is a measurable fuctio oto R Dz ad two itegers τ 0 < τ. If x t, y t is uiform mixig with uiform mixig coefficiets φ(m) Cm λ for some λ > 0 ad C > 0, the z t is uiform mixig with uiform mixig coefficiets φ z (m) C z m λ for some C z > 0. Proof. It ca be derived directly from the defiitio of uiform mixig. The followig classical result is put here for easy referece. Ad, it easily leads to a corollary which will be used repeatedly. Lemma. For ay two σ fields F 1 ad F, it holds that ρ max (F 1, F ) φ(f 1, F )] 1/. Proof. The proof ca be foud i Ibragimov (196) or Doob (1950, Lemma 7.1). Corollary. Let z t be strictly statioary process satisfyig uiform mixig such that φ(m) Cm λ for some λ > 0 ad C > 0, the the autocorrelatio fuctio where z (i),t is the i-th elemet of z t. max Corr(z (i),t, z (j),t+m ) Cm λ/ i,j Proof. It directly follows from Lemma 1 ad Lemma. Lemma 3. Let z t be a sequece of strictly statioary radom vectors such that E P0 z t < + ad it satisfies the uiform mixig coditio with φ(m) Cm λ for some λ > ad C > 0. The, lim var 0 ( 1 z t) = V0 < +. Proof. Let z (i),t be the i-th elemet of vector z t. Deote σ i var 0(z (i),t ) for each i. Ad, we deote the cross correlatio to be ρ i,j (τ) Corr(z (i),t, z (j),t+τ ) for all t, τ, i ad j. The, we have, for each pair of i ad j, Cov 0 ( 1 z (i),t, 1 ) z (j),t = σ i σ j ρ i,j (0) + 1 ρ i,j(1) + + 1 ] ρ i,j( 1). Accordig to Corollary, we kow that ρ i,j (m) = o(m 1 ). Thus, by verifyig the Cauchy coditio, we kow that ρ i,j (0) + 1 ρ i,j(1) + + 1 ρ i,j( 1) coverges to a fiite costat. 18

The followig two lemmas are extesios of Propositios 6.1-6. i Clarke ad Barro (1990, Page 468-470). Lemma 4 shows that aalogs of the soudess coditio for certai metrics o probability measures imply the existece of strogly uiformly expoetially cosistet (SUEC) hypothesis tests. A composite hypothesis test is called uiformly expoetially cosistecy (UEC) if its type-i ad type-ii errors are uiformly upper bouded by e ξ for some positive ξ, over all alteratively (see e.g. Barro, 1989). A strogly uiformly expoetially cosistet (SUEC) test is a hypothesis test whose type-i ad type-ii errors are upper bouded by e ξ for some positive costat ξ, uiformly over all alteratives ad all ull parametric models over two subsets i the probability measure space. Lemma 5 shows that metrics with the desirable cosistecy property exists, which exteds Propositio 6. i Clarke ad Barro (1990). Lemma 4. Suppose d G is a metric o the space of probability measures o X with the property that for ay ɛ > 0, there exists ξ > 0 ad C > 0 such that P d G ( P, P) > ɛ Ce ξ, (50) uiformly over all probability measures P, where P is the empirical distributio. Ad, the metric d G also satisfies d G (P θ, P θ ) 0 θ θ. (51) The, for ay δ > δ 1 0 ad for each θ Ω(θ 0, δ 1 ), there exists a SUEC hypothesis test of θ Ω(θ 0, δ 1 ) versus alterative Ω(θ 0, δ). Proof. The proof is a extesio based o that of Lemma 6.1 i Clarke ad Barro (1990). From (51), for ay give δ > δ 1 0, there exists ɛ 1 > 0 such that d L (θ, Ω(θ 0, δ) ) > δ δ 1 > 0 implies that d G (P θ, P θ ) > ɛ 1 for all θ Ω(θ 0, δ). Thus, for ay δ > δ 1 0, there exists ɛ 1 > 0 such that, d G (P θ, P θ ) > ɛ 1 for all θ Ω(θ 0, δ 1 ) ad θ Ω(θ 0, δ). Therefore, for each θ Ω(θ 0, δ 1 ), if we have a SUEC test of H 0 : P = P θ versus H A : P P : d G ( P, P θ ) > ɛ 1, the we have a SUEC test of H 0 : θ = θ versus H A : P P θ : θ θ > δ δ 1. Let P be the empirical distributio. We choose ɛ = ɛ 1 / ad let A θ, x : d G ( P, P θ ) ɛ be the acceptace regio. By the coditio (50), we have that the probability of type-i error satisfies, for some ξ > 0 ad C > 0, P θ, A θ, Ce ξ, uiformly over θ Ω(θ 0, δ 1 ). 19

We wat to show that the probability of type-ii error P θ A θ, is expoetially small uiformly over θ Ω(θ 0, δ) ad θ Ω(θ 0, δ 1 ). More precisely, usig triagular iequality, we kow that for ay θ Ω(θ 0, δ 1 ) ad θ Ω(θ 0, δ) ɛ d G (P θ, P θ ) d G (P θ, P ) + d G ( P, P θ ). (5) By defiitio ad iequality (5), for each θ Ω(θ 0, δ 1 ), o the acceptace regio A θ,, we have ɛ < ɛ + d G ( P, P θ ) for ay θ Ω(θ 0, δ). (53) Thus, for each θ Ω(θ 0, δ 1 ), o the acceptace regio A θ,, it holds that d G ( P, P θ ) > ɛ, for ay θ Ω(θ 0, δ). Therefore, for each θ Ω(θ 0, δ 1 ) ad each θ Ω(θ 0, δ), we have P θ,a θ, P θ, d G ( P, P θ ) > ɛ Ce ξ. Lemma 5. For a space of probability measures deoted by P(Ψ, λ), o a separable metric space X such that the mixig coefficiet φ(m) Ψm λ with Ψ > 0 ad λ beig uiversal costats, there exists a metric d G (P, Q) for probability measure P ad Q i P(Ψ, λ) that satisfies the property (50) ad such that covergece i d G implies the weak covergece of the measures. Thus, i particular, for probability measures i a parametric family d G (P θ, P θ ) 0 P θ P θ. (54) Proof. We exted the proof for the existece result i Propositio 6. i Clarke ad Barro (1990) to allow for weak depedece. Let F i : i = 1,, be the coutable field of sets geerated by balls of the form x : d X (x, s j1 ) 1/j for j 1, j = 1,,, where d X deotes the metric for the space X ad s 1, s, is a coutable dese sequece of poits i X. Defie a metric o the space of probability measures as follows d G (P, Q) = i P F i QF i. i=1 Accordig to Gray (1988, Page 51 53), if d G ( P, P) 0, the P coverges weakly to P. Now, for ay ɛ > 0, we choose k 1 l (ɛ) / l(). Thus, d G ( P, P) k i P F i PF i + i max P F i PF i + ɛ/. i=1 i=k+1 1 i k 0

The, we have P d G ( P, P) > ɛ P max 1 i k P F i PF i > ɛ/ k i=1 P P F i PF i > ɛ/ The Hoeffdig-type iequality for uiform mixig process (see, e.g. Roussas, 1996, Theorem 4.1) guaratees that there exists C 1 > 0 ad ξ > 0 such that sup P P F i PF i > ɛ/ C 1 e ξ. P P(Ψ,λ) Thus, with C = kc 1. P d G ( P, P) > ɛ Ce ξ, (55) Lemma 6 exteds the large deviatio result of Schwartz (1965, Lemma 6.1) to allow time depedece i the data process. Let z = (z 1,, z ) be strictly statioary ad uiform mixig with φ(m) = O ( m λ) for some positive λ. The process z have the joit desity P θ,. Deote the mixture distributio of the parametric family P θ, with respect to the coditioal prior distributio π P ( N ) to be P N, with desity π P(z N ). More precisely, we defie π P (z N ) π P (z θ)π P (θ N )dθ. (56) N Lemma 6. Assume that the mixig coefficiet power λ >. Suppose there exist strogly uiformly expoetially cosistet (SUEC) tests of hypothesis θ N 0 agaist the alterative θ N such that N 0 N with d L (N 0, N ) δ for some δ > 0. The, there exists ξ > 0 ad a positive iteger k such that, for all θ N 0, P θ, P N, (1 e ξm ), where m + k. T V Proof. We assume that there exists a sequece of SUEC tests, deoted by A, for ay sample with size. The, there exists a positive iteger k such that for all k P θ, A < 1 8 for all θ N 0 ad (57) For each j = 1,,, we defie P θ,a > 1 1 8 for all θ N. (58) A k,j = A k (z j+1,, z j+k ), (59) 1

the, accordig to Lemma 1, Y m = 1 m A k,j (60) m j=1 is a average of strictly statioary ad uiform mixig such that φ(m) φ m λ. The expectatio of Y m, uder distributio P θ,, is µ(θ) with µ(θ) = < 1 8 if θ N 0 > 7 8 if θ N. By Corollary ad the uiform mixig coditios with λ >, together with the fact that A k,j 0, 1], we kow that the assumptios of Theorem.4 i White ad Domowitz (1984) holds ad hece the CLT for the time series holds, i.e. (61) m 1/ Y m d N(µ(θ), V (θ)), (6) where V (θ) = lim m + P θ, m 1/ m j=1 (A k,j µ(θ))]. From Lemma 3, we kow that V (θ) V <. Therefore, the momet geeratig fuctios coverge m 1 l P θ, e tmym 1 t V (θ). (63) O the oe had, whe θ N, we have µ(θ) > 1 4, accordig to Theorem 8.1.1 of Taiguchi ad Kakizawa (000), we ca achieve the followig large deviatio result Therefore, there exists ξ 1 > 0 such that lim m + m 1 l P θ, Y m 1 = 1 4 3V (θ) 1 3V. (64) P θ, Y m 1 e ξ1m for all θ N. (65) 4 Thus, P N, Y m 1 = P θ, Y m 1 π P (θ N )dθ e ξ1m, for m + k. (66) 4 N 4 O the other had, whe θ N 0, we have µ(θ) < 1 4, accordig to Theorem 8.1.1 of Taiguchi ad Kakizawa (000), we obtai the large deviatio result Therefore, there exists ξ > 0 such that lim m + m 1 l P θ, Y m 1 = 1 4 3V (θ) 1 3V. (67) P θ, Y m 1 e ξm. (68) 4

For ξ = mi(ξ 1, ξ ), it follows that P θ, P N, = sup P θ, A P N, A (1 e ξm ) (69) T V A F for m + k by cosiderig A = Y m 1 4. We itroduce Le Cam s theory o hypothesis testig (see, e.g. Le Cam ad Yag, 000, Chapter 8). Lemma 7. Uder the regularity coditios i Subsectio 1.4, there are test fuctios A ad positive coefficiets C, ξ, ɛ ad K such that P 0, (1 A ) 0 ad P θ, A Ce ξ θ θ 0 / for all θ such that K/ θ θ 0 ɛ. Proof. For all z R K(Dx+Dy), defie the rectagular F z (, z 1 ] (, z ] (, z K(Dx+Dy)]. The empirical process is defied as Π (z) P F z = 1 1 z t F z. We defie Π θ (z) P θ F z. Accordig to Le Cam ad Yag (000, Page 50), there exists positive costats c ad ɛ such that sup x Π θ0 (x) Π θ (x) > c θ θ 0 for θ θ 0 ɛ. Deote the expectatio to be µ (θ) E Pθ sup x Π θ0 (x) Π θ (x). By the classical result of weak covergece for the Kolmogorov- Smirov statistic sup x Π θ0 (x) Π θ (x), we kow that there exists a large costat M such that µ (θ) M c for all θ θ 0 ɛ. We choose K 4M c. Cosider the test fuctios A = sup z Π θ0 (z) Π θ (z) < K/. Usig triagular iequality, we obtai P θ, A P θ, c θ θ 0 sup z Π (z) Π θ (z) µ (θ) Usig Dvoretzky-Kiefer-Wolfowitz type iequality for uiform mixig variables i Samso (000, Theorem 3), we ca show there exists ξ > 0 such that P θ, c θ θ 0 sup z for all K/ θ θ 0 ɛ. Thus, P θ, A e ξ θ θ 0 / Π (z) Π θ (z) µ (θ) e ξ θ θ 0 / same iequality, it is straightforward to get P 0, (1 A ) 0. for all K/ θ θ 0 ɛ. By the 1.6 Basic Properties of Limited-Iformatio Likelihoods The MLE for Limited-Iformatio Likelihood Q γ, is defied as L (γ) 1 The empirical log-likelihood fuctio l π Q (x t, y t ; γ), γ. (70) The maximum likelihood estimator ˆθ Q is defied as follows: ] ˆγ ML Q argmax L (γ) = argmax η Q (γ) T 1 g Q (γ; x t, y t ) A Q (γ). (71) γ γ 3

We shall show that the MLE ˆγ ML Q is cosistet ad asymptotic ormal with asymptotic efficiet variace-covariace matrix. This is importat to justify the use of our limited-iformatio likelihood Q γ of (16) i the Bayesia paradigm as a valid likelihood. The first-order coditio is 0 = L (ˆγ ML Q ) which is ] T 0 = η Q (ˆγ ML) Q 1 ] g Q (ˆγ ML; Q x t, y t ) + 1 ] T g Q (ˆγ ML; Q x t, y t ) η Q (ˆγ ML) Q A Q (ˆγ ML). Q The defiitio of ˆγ ML Q i (71) ad the properties of expected log-likelihood L(γ) plays a vital role i establishig the cosistecy of ˆγ ML Q. The first-order coditio (7) is crucial for obtaiig the asymptotic ormality of ˆγ ML Q. Now, let s cosider the expected log-likelihood of Q γ : (7) L(γ) η Q (γ) T g Q (γ) A Q (γ). (73) where g Q (γ) E g Q (γ; x, y)]. It holds that L(γ 0 ) = 0 sice η Q (γ 0 ) = 0 ad A Q (γ 0 ) = 0. Further, it holds that L(γ 0 ) = 0 sice L(γ 0 ) = η Q (γ 0 )] T g Q (γ 0 ) + g Q (γ 0 )] T η Q (γ 0 ) A Q (γ 0 ), ad g Q (γ 0 ) = 0 ad A Q (γ 0 ) = 0. The Jacobia matrix of L(γ) evaluated at γ 0 is equal to the egative Fisher iformatio matrix I Q (γ 0 ): L(γ 0 ) = A Q (γ 0 ) = η Q (γ 0 ) T g Q (γ 0 ) = G T Q S 1 Q G Q = I Q (γ 0 ). (74) Here the first equality is simply implied by the formula of L(γ) from (73). The secod equality of (74) is due to ] A Q (γ)e AQ(γ) = E η Q (γ) T g Q (γ; x, y)e η Q(γ) T g Q (γ;x,y) γ, (75) which is implied by the defiitio of A Q (γ) ad the coditio (17). The third equality of (74) is due to the relatio g Q (γ 0 ) = G Q ad the followig relatio implied by (17): η Q (γ 0 ) = S 1 Q G Q. (76) Ad, the fourth equality of (74) is simply due to the defiitio of I Q (γ 0 ). Aother importat property of expected log-likelihood of Q γ is the idetificatio of γ 0 i the followig sese: L(γ) L(γ 0 ) = 0, γ, (77) where the iequality is due to the Jese s iequality applied to A Q (γ). The Jese s iequality holds with equality if ad oly if η Q (γ) T g Q (γ; x, y) is costat almost surely uder Q which is true oly whe γ = γ 0. 4

Cosistecy We first prove the cosistecy of the estimator ˆγ ML Q. To our kowledge, the theoretical result for the MLE of limited-iformatio likelihood Q γ based o the priciple of miimum Kullback-Leibler divergece is ew, though we appeal to the stadard approach of Wald (1949) ad Wolfowitz (1949). Propositio 1. Uder Assumptios A1 - A5 ad A8 - A9, ˆγ Q ML coverges to γ 0 i probability. Proof. Our goal is to show that for ay ɛ > 0 there exists N such that Q ˆγ ML Q Ω(γ 0, ɛ) < ɛ for N, (78) where Ω(γ 0, ɛ) γ : γ γ 0 > ɛ is the ope ball cetered at γ 0 with radius ɛ. For all ɛ > 0, it holds that (due to the defiitio of ˆγ ML Q i (71)) Moreover, for all ɛ > 0 ad h > 0, it holds that 1ˆγ ML Q Ω(γ 0, ɛ) 1 sup L (γ) L (γ 0 ). (79) γ Ω(γ 0,ɛ) 1 sup L (γ) L (γ 0 ) = 1 L (γ 0 ) < h1 sup L (γ) L (γ 0 ) γ Ω(γ 0,ɛ) γ Ω(γ 0,ɛ) + 1 L (γ 0 ) h1 sup γ Ω(γ 0,ɛ) L (γ) L (γ 0 ) 1 L (γ 0 ) < h + 1 sup L (γ) h. (80) γ Ω(γ 0,ɛ) Combiig (78) ad (80), it suffices to show that for all ɛ > 0 there exists h > 0 ad N such that for N max Q L (γ 0 ) < h, Q sup L γ Ω(γ0,ɛ) (γ) h < ɛ/. (81) The first probabilistic boud i (81) is straightforward due to the LLN result wlim L (γ 0 ) = L(γ 0 ) = 0. (8) Now, we cosider the secod probabilistic boud i (81). From the domiace coditio (Assumptio A3) ad the remark of idetificatio coditio (Assumptio A5), we kow that lim sup δ 0 γ Ω(γ,δ) Thus, we have the followig Domiace Covergece result: l π Q (x, y; γ ) = l π Q (x, y; γ) a.s. Q for all γ. (83) lim E δ 0 sup l π Q (x, y; γ ) γ Ω(γ,δ) ] = L(γ). (84) 5