Adaptive Covariance Estimation with model selection


Rolando Biscay, Hélène Lescornel and Jean-Michel Loubes

arXiv:1203.0107v1 [math.ST], Mar 2012

Abstract

We provide in this paper a fully adaptive penalized procedure to select a covariance among a collection of models, observing i.i.d. replications of the process at fixed observation points. For this we generalize the results of [3] and propose to use a data-driven penalty to obtain an oracle inequality for the estimator. We prove that this method is an extension to the matricial regression model of the work by Baraud in [1].

Keywords: covariance estimation, model selection, adaptive procedure.

1 Introduction

Estimating the covariance function of stochastic processes is a fundamental issue in statistics with many applications, ranging from geostatistics and financial series to epidemiology (we refer to [10], [8] or [5] for general references). While parametric methods have been extensively studied in the statistical literature (see [5] for a review), nonparametric procedures have only recently received attention; see for instance [6, 3, 4, 2] and references therein.

In [3], a model selection procedure is proposed to construct a nonparametric estimator of the covariance function of a stochastic process under mild assumptions. However, their method heavily relies on prior knowledge of the variance. In this paper, we extend this procedure and propose a fully data-driven penalty which leads to selecting the best covariance among a collection of models. This result constitutes a generalization to the matricial regression model of the selection methodology provided in [1].

Consider a stochastic process $(X(t))_{t \in T}$ taking its values in $\mathbb{R}$ and indexed by $T \subset \mathbb{R}^d$, $d \in \mathbb{N}$. We assume that $E[X(t)] = 0$ for all $t \in T$ and we aim at estimating its covariance function $\sigma(s,t) = E[X(s)X(t)] < \infty$ for all $t, s \in T$. We assume we observe $X_i(t_j)$ where $i \in \{1, \dots, n\}$ and $j \in \{1, \dots, p\}$. Note that the observation points $t_j$ are fixed and that the $X_i$'s are independent copies of the process $X$. Set $x_i = (X_i(t_1), \dots, X_i(t_p))^\top$ for all $i \in \{1, \dots, n\}$ and denote by $\Sigma$ the covariance matrix of $X$ at the observation points:

$$\Sigma = E\left(x_i x_i^\top\right) = \left(\sigma(t_j, t_k)\right)_{1 \le j \le p,\ 1 \le k \le p}.$$

Following the methodology presented in [3], we approximate the process $X$ by its projection onto some finite dimensional model. For this, consider a countable set of functions $(g_\lambda)_{\lambda \in \Lambda}$, which may be for instance a basis of $L^2(T)$, and choose a collection of models $\mathcal{M} \subset \mathcal{P}(\Lambda)$. For $m \in \mathcal{M}$, a finite number of indices, the process can be

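approximated by $X(t) \approx \sum_{\lambda \in m} a_\lambda g_\lambda(t)$. Such an approximation leads to an estimator which depends on the collection of functions $m$, denoted by $\hat\Sigma_m$. Our objective is to select, in a data-driven way, the best model, i.e. the one close to an oracle $m_0$ defined as the minimizer of the quadratic risk, namely

$$m_0 \in \arg\min_{m \in \mathcal{M}} R(m) = \arg\min_{m \in \mathcal{M}} E\left[\|\Sigma - \hat\Sigma_m\|^2\right].$$

This result is achieved using a model selection procedure.

The paper falls into the following parts. The description of the statistical framework of the matrix regression is given in Section 2. Section 3 is devoted to the main statistical results. Namely, we recall the results on the estimate given in [3] and prove an oracle inequality with a fully data-driven penalty. Section 4 states technical results which are used throughout the paper, while the proofs are postponed to the Appendix.

2 Statistical model and notations

We consider an $\mathbb{R}$-valued process $X(t)$ indexed by $T$, a subset of $\mathbb{R}^d$, with expectation equal to 0. We are interested in its covariance function, denoted by $\sigma(s,t) = E[X(s)X(t)]$. We have at hand the observations $x_i = (X_i(t_1), \dots, X_i(t_p))^\top$ for $1 \le i \le n$, where the $X_i$ are independent copies of the process and the $t_j$ are deterministic points. We note $\Sigma \in \mathbb{R}^{p \times p}$ the covariance matrix of the vector $x_i$. Hence we observe

$$x_i x_i^\top = \Sigma + U_i, \quad 1 \le i \le n, \qquad (1)$$

where the $U_i$ are i.i.d. error matrices with expectation 0. We denote by $S$ the empirical covariance of the sample: $S = \frac{1}{n}\sum_{i=1}^n x_i x_i^\top$.

We use the Frobenius norm $\|\cdot\|$ defined by $\|A\|^2 = \mathrm{Tr}(AA^\top)$ for every matrix $A$. Recall that for a given matrix $A \in \mathbb{R}^{p \times q}$, $\mathrm{vec}(A)$ is the vector in $\mathbb{R}^{pq}$ obtained by stacking the columns of $A$ on top of one another. We denote by $A^-$ the reflexive generalized inverse of the matrix $A$, see for instance [9] or [7].

The idea is to consider that we have a quite good approximation of the process in the following form:

$$X(t) \approx \sum_{\lambda \in m} a_\lambda g_\lambda(t),$$

where $m$ is a finite subset of a countable set $\Lambda$, $(a_\lambda)_{\lambda \in \Lambda}$ are random coefficients in $\mathbb{R}$ and $(g_\lambda)_{\lambda \in \Lambda}$ are real valued functions. We will consider models $m$ among a finite collection denoted by $\mathcal{M}$. We note $G_m \in \mathbb{R}^{p \times |m|}$, where $(G_m)_{j\lambda} = g_\lambda(t_j)$, and $a_m$ the random vector of $\mathbb{R}^{|m|}$ with coefficients $(a_\lambda)_{\lambda \in m}$. Hence, we obtain the following approximations:

$$x = (X(t_1), \dots, X(t_p))^\top \approx G_m a_m, \qquad x x^\top \approx G_m a_m a_m^\top G_m^\top, \qquad \Sigma \approx G_m E\left[a_m a_m^\top\right] G_m^\top. \qquad (2)$$

Thus, this point of view leads us to approximate $\Sigma$ by a matrix in the subset

$$\mathcal{S}(G_m) = \left\{ G_m \Psi G_m^\top \ /\ \Psi \text{ symmetric in } \mathbb{R}^{|m| \times |m|} \right\} \subset \mathbb{R}^{p \times p}. \qquad (3)$$

Hence, for a model $m$, a natural estimator for $\Sigma$ is given by the projection of $S$ onto $\mathcal{S}(G_m)$. We can prove using standard algebra (see [3] for a general proof) that it has the following form:

$$\hat\Sigma_m = \Pi_m S \Pi_m \in \mathbb{R}^{p \times p}, \quad m \in \mathcal{M}, \qquad (4)$$

where

$$\Pi_m = G_m \left(G_m^\top G_m\right)^- G_m^\top \in \mathbb{R}^{p \times p} \qquad (5)$$

are orthogonal projection matrices. Set $D_m = \mathrm{Tr}(\Pi_m \otimes \Pi_m)$, which is the dimension of $\mathcal{S}(G_m)$, assumed to be positive, and $\Sigma_m = \Pi_m \Sigma \Pi_m$, the projection of $\Sigma$ onto this subspace. Hence we obtain the model selection procedure defined in [3].

The estimation error for a model $m \in \mathcal{M}$ is given by

$$E\|\Sigma - \hat\Sigma_m\|^2 = \|\Sigma - \Pi_m \Sigma \Pi_m\|^2 + \frac{\delta_m^2 D_m}{n}, \qquad (6)$$

where

$$\delta_m^2 = \frac{\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)}{D_m}, \qquad \Phi = V\left(\mathrm{vec}(x_1 x_1^\top)\right).$$

Given $\theta > 0$, it is thus natural to define the penalized covariance estimator $\tilde\Sigma = \hat\Sigma_{\hat m}$ by

$$\hat m = \arg\min_{m \in \mathcal{M}} \left\{ \frac{1}{n}\sum_{i=1}^n \left\|x_i x_i^\top - \hat\Sigma_m\right\|^2 + \mathrm{pen}(m) \right\},$$

where

$$\mathrm{pen}(m) = (1+\theta)\frac{\delta_m^2 D_m}{n}. \qquad (7)$$

The following result, proved in [3], states an oracle inequality for the estimator $\tilde\Sigma$.

Theorem 2.1. Let $q > 0$ be given such that there exists $\kappa > 2(1+q)$ satisfying $E\|x_1 x_1^\top\|^\kappa < \infty$. Then, for some constants $K(\theta) > 1$ and $C(\theta, \kappa, q) > 0$, we have that

$$\left(E\|\Sigma - \tilde\Sigma\|^{2q}\right)^{1/q} \le 2^{(q^{-1}-1)_+}\left[K(\theta)\inf_{m \in \mathcal{M}}\left(\|\Sigma - \Pi_m \Sigma \Pi_m\|^2 + \frac{\delta_m^2 D_m}{n}\right) + \frac{\Delta_\kappa}{n}\delta_{\sup}^2\right],$$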

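where

$$\Delta_\kappa^q = C(\theta, \kappa, q)\, E\|x_1 x_1^\top\|^\kappa \left(\sum_{m \in \mathcal{M}} \delta_m^{-\kappa} D_m^{-(\kappa/2 - 1 - q)}\right)$$

and $\delta_{\sup}^2 = \max\{\delta_m^2 : m \in \mathcal{M}\}$.

However, the penalty defined here depends on the quantity $\delta_m$, which is unknown in practice since it relies on the matrix $\Phi = V(\mathrm{vec}(x_1 x_1^\top))$. Our objective is to study a covariance estimator built with a new penalty involving an estimator of $\Phi$. More precisely, we will replace $\mathrm{pen}(m)$ by an empirical version $\widehat{\mathrm{pen}}(m)$, where

$$\widehat{\mathrm{pen}}(m) = (1+\theta)\frac{\hat\delta_m^2 D_m}{n}, \qquad (8)$$

and

$$\hat\delta_m^2 = \frac{\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\hat\Phi\right)}{D_m},$$

with $\hat\Phi$ an estimator of $\Phi$. The objective is to generalize Theorem 2.1 and to construct a fully adaptive penalized procedure to estimate the covariance function.

3 Main result: adaptive penalized covariance estimation

Here we state the oracle inequality obtained for the new covariance estimator introduced previously. Set $y_i = \mathrm{vec}(x_i x_i^\top)$, $1 \le i \le n$, which are vectors in $\mathbb{R}^{p^2}$, and denote by $S_{vec} = \frac{1}{n}\sum_{i=1}^n y_i$ their empirical mean. Consider the following constant

$$C_{\inf} = \inf_{m \in \mathcal{M}} \mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right),$$

and assume that the collection of models is chosen such that $C_{\inf} > 0$. Set

$$\hat\Phi = \frac{1}{n}\sum_{i=1}^n y_i y_i^\top - S_{vec} S_{vec}^\top, \qquad \hat\delta_m^2 = \frac{\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\hat\Phi\right)}{D_m}.$$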
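The quantities defined so far can be put together in a short numerical sketch. The code below is only an illustration under stated assumptions, not the authors' implementation: the function names `projection` and `select_covariance`, the polynomial design matrices in the usage lines, and the choice `theta=1.0` are all hypothetical, and $\hat\Phi$ is normalised by $1/n$ (an assumption of this sketch). It builds $\Pi_m$ with a pseudo-inverse, forms $\hat\Sigma_m = \Pi_m S \Pi_m$, estimates $\Phi$ from the vectors $y_i$, and minimises the empirical fit plus the data-driven penalty $(1+\theta)\,\mathrm{Tr}((\Pi_m \otimes \Pi_m)\hat\Phi)/n$.

```python
import numpy as np

def projection(G):
    """Orthogonal projector Pi = G (G^T G)^- G^T onto the column space of G."""
    return G @ np.linalg.pinv(G.T @ G) @ G.T

def select_covariance(X, designs, theta=1.0):
    """X: (n, p) array whose rows are the centred observations x_i.
    designs: dict mapping a model label to its design matrix G_m of shape (p, |m|).
    Returns the selected label and the corresponding estimate Pi_m S Pi_m."""
    n, p = X.shape
    S = X.T @ X / n                                   # empirical covariance
    # y_i = vec(x_i x_i^T); the outer product is symmetric, so the
    # row-major reshape coincides with column stacking.
    Y = np.einsum("ij,ik->ijk", X, X).reshape(n, p * p)
    S_vec = Y.mean(axis=0)
    Phi_hat = Y.T @ Y / n - np.outer(S_vec, S_vec)    # plug-in estimate of Phi
    best = None
    for label, G in designs.items():
        Pi = projection(G)
        Sigma_m = Pi @ S @ Pi
        # Tr((Pi ⊗ Pi) Phi_hat) equals delta_hat_m^2 * D_m
        trace_term = np.trace(np.kron(Pi, Pi) @ Phi_hat)
        pen_hat = (1.0 + theta) * trace_term / n
        # (1/n) * sum_i ||x_i x_i^T - Sigma_m||_F^2
        fit = np.mean([np.linalg.norm(np.outer(x, x) - Sigma_m, "fro") ** 2
                       for x in X])
        crit = fit + pen_hat
        if best is None or crit < best[0]:
            best = (crit, label, Sigma_m)
    return best[1], best[2]

# Hypothetical usage: nested polynomial models on 5 fixed design points.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 5)
designs = {k: np.vander(t, k + 1, increasing=True) for k in range(3)}
X = rng.normal(size=(200, 2)) @ np.vander(t, 2, increasing=True).T
name, Sigma = select_covariance(X, designs)
print("selected model:", name)
```

Forming the full Kronecker product costs $O(p^4)$ memory, which is acceptable for a small illustration; for larger $p$ one would evaluate the trace without materialising $\Pi_m \otimes \Pi_m$.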

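Given $\theta > 0$, we consider the covariance estimator $\tilde\Sigma = \hat\Sigma_{\hat m}$ with

$$\hat m = \arg\min_{m \in \mathcal{M}} \left\{ \frac{1}{n}\sum_{i=1}^n \left\|x_i x_i^\top - \hat\Sigma_m\right\|^2 + \widehat{\mathrm{pen}}(m) \right\},$$

where

$$\widehat{\mathrm{pen}}(m) = (1+\theta)\frac{\hat\delta_m^2 D_m}{n}. \qquad (9)$$

Theorem 3.1. Let $q > 0$ be given such that there exists $\beta > \max(2(1+q),\, 3+q)$ satisfying $E\|x_1 x_1^\top\|^\beta < \infty$. Then, for a constant $C$ depending on $\theta$, $\beta$ and $q$, we have for $n \ge n(\beta, \theta, C_{\inf}, \Sigma)$ and $\kappa \in \left]2(1+q);\ \min(\beta, 2\beta - 4)\right[$:

$$\left(E\|\Sigma - \tilde\Sigma\|^{2q}\right)^{1/q} \le C\left[\inf_{m \in \mathcal{M}}\left(\|\Sigma - \Sigma_m\|^2 + \frac{\delta_m^2 D_m}{n}\right) + \frac{1}{n}\left(E\left[\|x_1 x_1^\top\|^\beta\right]^{2/\beta} + \|\Sigma\|^2\right)\tilde\Delta_\beta^{\,1/q - 2/\kappa} + \frac{\Delta_\kappa}{n}\delta_{\sup}^2\right], \qquad (10)$$

where

$$\tilde\Delta_\beta = c(\theta, \beta, q)\, E\|x_1 x_1^\top\|^\beta \left(\sum_{m \in \mathcal{M}} \delta_m^{-\beta} D_m^{-\beta/2}\right), \qquad \Delta_\kappa^q = C(\theta, \kappa, q)\, E\|x_1 x_1^\top\|^\kappa \left(\sum_{m \in \mathcal{M}} \delta_m^{-\kappa} D_m^{-(\kappa/2 - 1 - q)}\right)$$

and $\delta_{\sup}^2 = \max\{\delta_m^2 : m \in \mathcal{M}\}$.

We have obtained in Theorem 3.1 an oracle inequality, since the estimator $\tilde\Sigma$ has the same quadratic risk as the oracle estimator except for an additive term of order $O(1/n)$ and a constant factor. Hence, the selection procedure is optimal in the sense that it behaves as if the true model were at hand.

The proof of this theorem is divided into two parts. First, as in the proof of Theorem 2.1 given in [3], we will consider a vectorized version of the model (1). In this technical part we will obtain an oracle inequality under some particular assumptions for a general penalty. In a second part, we will prove that our particular penalty verifies these assumptions by using properties of the estimator $\hat\Phi$.

4 Technical results

4.1 Vectorized model

Here we consider the vectorized version of model (1). In this case, we observe the following vectors in $\mathbb{R}^{p^2}$:

$$y_i = f_i + \varepsilon_i, \quad 1 \le i \le n. \qquad (11)$$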

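Here $y_i$ corresponds to $\mathrm{vec}(x_i x_i^\top)$ in the model (1), $f_i$ to $\mathrm{vec}(\Sigma)$ and $\varepsilon_i$ to $\mathrm{vec}(U_i)$. We set $f = (f_1^\top, \dots, f_n^\top)^\top$, $y = (y_1^\top, \dots, y_n^\top)^\top$ and $\varepsilon = (\varepsilon_1^\top, \dots, \varepsilon_n^\top)^\top$, which are vectors in $\mathbb{R}^{np^2}$. We estimate $f$ by an estimator of the form

$$\hat f_m = P_m y, \quad m \in \mathcal{M},$$

where $P_m$ is the orthogonal projection onto a subspace $S_m$ of dimension $D_m$. We note $f_m = P_m f$ and we consider the empirical norm $\|f\|_n^2 = \frac{1}{n}\sum_{i=1}^n f_i^\top f_i$ with the corresponding scalar product $\langle \cdot, \cdot \rangle_n$.

First we state the vectorized form of Theorem 2.1. Write

$$\delta_m^2 = \frac{\mathrm{Tr}\left(P_m (I_n \otimes \Phi)\right)}{D_m}, \qquad \delta_{\sup}^2 = \max\{\delta_m^2 : m \in \mathcal{M}\}.$$

Given $\theta > 0$, define the penalized estimator $\tilde f = \hat f_{\hat m}$, where

$$\hat m = \arg\min_{m \in \mathcal{M}} \left\{ \|y - \hat f_m\|_n^2 + \mathrm{pen}(m) \right\}, \quad \text{with } \mathrm{pen}(m) = (1+\theta)\frac{\delta_m^2 D_m}{n}. \qquad (12)$$

Then, the proof of Theorem 2.1 relies on the following proposition proved in [3]:

Proposition 4.1. Let $q > 0$ be given such that there exists $\kappa > 2(1+q)$ satisfying $E\|\varepsilon_1\|^\kappa < \infty$. Then, for some constants $K(\theta) > 1$ and $C(\theta, \kappa, q) > 0$, we have that

$$\left(E\|f - \tilde f\|_n^{2q}\right)^{1/q} \le 2^{(q^{-1}-1)_+}\left[K(\theta)\inf_{m \in \mathcal{M}}\left(\|f - P_m f\|_n^2 + \frac{\delta_m^2 D_m}{n}\right) + \frac{\Delta_\kappa}{n}\delta_{\sup}^2\right], \qquad (13)$$

where

$$\Delta_\kappa^q = C(\theta, \kappa, q)\, E\|\varepsilon_1\|^\kappa \left(\sum_{m \in \mathcal{M}} \delta_m^{-\kappa} D_m^{-(\kappa/2 - 1 - q)}\right).$$

The new estimator $\tilde\Sigma$ defined previously corresponds here to the estimator $\tilde f = \hat f_{\hat m}$, where

$$\hat m = \arg\min_{m \in \mathcal{M}} \left\{ \|y - \hat f_m\|_n^2 + \widehat{\mathrm{pen}}(m) \right\}, \quad \text{with } \widehat{\mathrm{pen}}(m) = (1+\theta)\frac{\hat\delta_m^2 D_m}{n},$$

and $\hat\delta_m^2$ is some estimator of $\delta_m^2$. The next proposition gives an oracle inequality for this estimator under new assumptions on the model. Like Proposition 4.1, it is inspired by the paper [1].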
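The link between the matrix model and its vectorized version can be checked numerically. The sketch below is illustrative only and rests on an assumption: it takes $P_m$ to have the block form $\frac{1}{n}\mathbf{1}_n\mathbf{1}_n^\top \otimes (\Pi_m \otimes \Pi_m)$, which is consistent with the block matrix $A_m$ used in the appendix. The random design `G`, the sizes `n` and `p`, and the helper `vec` are hypothetical choices. Under that assumption, each block of $P_m y$ equals $\mathrm{vec}(\Pi_m S \Pi_m)$, and the empirical norm $\|y - \hat f_m\|_n^2$ coincides with the average Frobenius error used in the matrix criterion.

```python
import numpy as np

def vec(A):
    """Stack the columns of A on top of one another."""
    return A.reshape(-1, order="F")

rng = np.random.default_rng(1)
n, p = 4, 3
G = rng.normal(size=(p, 2))                      # hypothetical design matrix G_m
Pi = G @ np.linalg.pinv(G.T @ G) @ G.T           # projector Pi_m
X = rng.normal(size=(n, p))                      # rows play the role of x_i
S = X.T @ X / n                                  # empirical covariance

# Stacked observation vector y = (y_1^T, ..., y_n^T)^T in R^{n p^2}
y = np.concatenate([vec(np.outer(x, x)) for x in X])

K = np.kron(Pi, Pi)                              # Pi_m ⊗ Pi_m
P_m = np.kron(np.ones((n, n)) / n, K)            # assumed block form of P_m

f_hat = P_m @ y                                  # vectorized estimator
Sigma_m = Pi @ S @ Pi                            # matrix estimator

# Empirical norm of the residual versus the matrix criterion
emp_norm = np.sum((y - f_hat) ** 2) / n
frob_avg = np.mean([np.linalg.norm(np.outer(x, x) - Sigma_m, "fro") ** 2
                    for x in X])
print(np.allclose(f_hat[: p * p], vec(Sigma_m)), np.isclose(emp_norm, frob_avg))
```

The check relies on the identity $\mathrm{vec}(AXB) = (B^\top \otimes A)\,\mathrm{vec}(X)$ applied with $A = B = \Pi_m$.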

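Proposition 4.2. Let $q > 0$ be given such that there exists $\kappa > 2(1+q)$ satisfying $E\|\varepsilon_1\|^\kappa < \infty$. For $\alpha \in \left]0;1\right[$, set $\Omega = \bigcap_{m \in \mathcal{M}} \{\hat\delta_m^2 \ge \alpha\,\delta_m^2\}$. Assume that

(A1) $E[\hat\delta_m^2] \le \delta_m^2$,

(A2) $P(\Omega^c) \le \dfrac{C(\alpha)}{n^\gamma}$ for some $\gamma \ge \dfrac{q}{1 - 2q/\kappa}$.

Then, for a constant $C$ depending on $\kappa$, $\theta$ and $q$, we have

$$\left(E\|f - \tilde f\|_n^{2q}\right)^{1/q} \le C\left[\inf_{m \in \mathcal{M}}\left(\|f - P_m f\|_n^2 + \frac{\delta_m^2 D_m}{n}\right) + \frac{1}{n}\left(E\left[\|\varepsilon_1\|^\kappa\right]^{2/\kappa} + \|f\|_n^2\right) C(\alpha)^{1/q - 2/\kappa} + \frac{\Delta_\kappa}{n}\delta_{\sup}^2\right], \qquad (14)$$

where $\alpha = \alpha(\theta)$ is fixed in $\left]0;1\right[$ and

$$\Delta_\kappa^q = C(\theta, \kappa, q)\, E\|\varepsilon_1\|^\kappa \left(\sum_{m \in \mathcal{M}} \delta_m^{-\kappa} D_m^{-(\kappa/2 - 1 - q)}\right). \qquad (15)$$

Theorem 3.1 is thus a direct application of Proposition 4.2. Hence only the two assumptions (A1) and (A2) remain to be checked.

4.2 Auxiliary concentration type lemmas

Here we state some propositions required in the proofs of the previous results. To our knowledge, the first is due to von Bahr and Esseen in [11].

Lemma 4.3. Let $U_1, \dots, U_n$ be independent centred variables with values in $\mathbb{R}$. For any $1 \le s \le 2$ we have:

$$E\left[\left|\sum_{i=1}^n U_i\right|^s\right] \le 8 \sum_{i=1}^n E\left[|U_i|^s\right].$$

The next proposition is proved in [3].

Proposition 4.4. Given $N, k \in \mathbb{N}$, let $\tilde A \in \mathbb{R}^{Nk \times Nk} \setminus \{0\}$ be a non-negative definite and symmetric matrix and $\varepsilon_1, \dots, \varepsilon_N$ i.i.d. random vectors in $\mathbb{R}^k$ with $E(\varepsilon_1) = 0$ and $V(\varepsilon_1) = \Phi$. Write $\varepsilon = (\varepsilon_1^\top, \dots, \varepsilon_N^\top)^\top$, $\zeta(\varepsilon) = \sqrt{\varepsilon^\top \tilde A \varepsilon}$, and $\delta^2 = \dfrac{\mathrm{Tr}(\tilde A (I_N \otimes \Phi))}{\mathrm{Tr}(\tilde A)}$. For all $\beta \ge 2$ such that $E\|\varepsilon_1\|^\beta < \infty$ it holds that, for all $x > 0$,

$$P\left(\zeta^2(\varepsilon) \ge \delta^2\,\mathrm{Tr}(\tilde A) + 2\delta^2\sqrt{\mathrm{Tr}(\tilde A)\,\rho(\tilde A)\,x} + \delta^2 \rho(\tilde A)\,x\right) \le C(\beta)\,\frac{E\|\varepsilon_1\|^\beta\ \mathrm{Tr}(\tilde A)}{\delta^\beta\, \rho(\tilde A)\, x^{\beta/2}}, \qquad (16)$$

where the constant $C(\beta)$ depends only on $\beta$.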
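As a quick sanity check of the moment inequality in Lemma 4.3, the sketch below computes the left-hand side exactly for a toy case. Everything in it is an illustrative assumption rather than part of the paper: the variables are taken to be independent Rademacher signs, the function name `abs_moment_of_sum` is hypothetical, and the exponent range $1 \le s \le 2$ is the one from the classical von Bahr and Esseen statement. Since $|U_i| = 1$ here, the right-hand side is simply $8n$.

```python
from itertools import product

def abs_moment_of_sum(n, s):
    """Exact E|U_1 + ... + U_n|^s for independent signs U_i = +1 or -1,
    obtained by enumerating all 2^n equally likely sign patterns."""
    total = 0.0
    for signs in product((-1, 1), repeat=n):
        total += abs(sum(signs)) ** s
    return total / 2 ** n

n = 10
results = {}
for s in (1.0, 1.5, 2.0):
    lhs = abs_moment_of_sum(n, s)
    rhs = 8 * n                  # 8 * sum_i E|U_i|^s, because |U_i| = 1
    results[s] = (lhs, rhs)
    print(s, lhs, rhs, lhs <= rhs)
```

For $s = 2$ the left-hand side equals the variance of the sum, namely $n$, so the bound is far from tight in this toy case; the point of the lemma is that it holds with a constant independent of $n$ and of the distribution.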

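5 Appendix

5.1 Proof of Proposition 4.2

This proof follows the guidelines of the proof of Theorem 6.1 in [1]. The following lemma will be helpful for the proof of this proposition.

Lemma 5.1. Choose $\eta = \eta(\theta) > 0$ and $\alpha = \alpha(\theta) \in \left]0;1\right[$ such that $(1+\theta)\alpha \ge (1+\eta)$. Set

$$H_m(f) = \left[\|f - \tilde f\|_n^2 - \theta'\left(\|f - f_m\|_n^2 + \frac{D_m}{n}\hat\delta_m^2\right)\right]_+, \quad \text{where } \theta' = \left(2 + \frac{4}{\eta}\right)(1+\theta).$$

Then, for $m_0$ minimizing $m \mapsto \|f - f_m\|_n^2 + \frac{D_m}{n}\delta_m^2$ in $m \in \mathcal{M}$,

$$E\left[H_{m_0}(f)^q\, \mathbf{1}_\Omega\right] \le \Delta_\kappa^q\, \frac{\delta_{\sup}^{2q}}{n^q}, \qquad (17)$$

where $\Delta_\kappa$ was defined in Proposition 4.2.

Proof of Lemma 5.1. First, remark that on the set $\Omega$, for all $m \in \mathcal{M}$,

$$\widehat{\mathrm{pen}}(m) = (1+\theta)\frac{\hat\delta_m^2 D_m}{n} \ge \alpha(1+\theta)\frac{\delta_m^2 D_m}{n} \ge (1+\eta)\frac{\delta_m^2 D_m}{n}.$$

Set $\mathrm{pen}(m) = (1+\eta)\frac{\delta_m^2 D_m}{n}$, which corresponds to the penalty of Proposition 4.1. The proof of this lemma is based on the proof of Proposition 4.1 given in [3]. In fact, it is sufficient to prove that for each $x > 0$ and $\kappa \ge 2$,

$$P\left(H(f)\,\mathbf{1}_\Omega \ge \left(1 + \frac{2}{\eta}\right)\frac{x}{n}\,\delta_{\sup}^2\right) \le c(\kappa, \eta)\, E\|\varepsilon_1\|^\kappa \sum_{m \in \mathcal{M}} \frac{1}{\delta_m^\kappa}\, \frac{D_m}{(D_m + x)^{\kappa/2}}, \qquad (18)$$

where we have set

$$H(f) = \left[\|f - \tilde f\|_n^2 - \left(2 + \frac{4}{\eta}\right)\left\{\|f - f_{m_0}\|_n^2 + \widehat{\mathrm{pen}}(m_0)\right\}\right]_+.$$

Indeed, since

$$\|f - f_{m_0}\|_n^2 + \widehat{\mathrm{pen}}(m_0) = \|f - f_{m_0}\|_n^2 + (1+\theta)\frac{D_{m_0}}{n}\hat\delta_{m_0}^2 \le (1+\theta)\left(\|f - f_{m_0}\|_n^2 + \frac{D_{m_0}}{n}\hat\delta_{m_0}^2\right),$$

we get that for all $q > 0$,

$$H^q(f)\,\mathbf{1}_\Omega \ge H_{m_0}^q(f)\,\mathbf{1}_\Omega. \qquad (19)$$

Using the equality

$$E\left[H^q(f)\,\mathbf{1}_\Omega\right] = \int_0^\infty q\,u^{q-1}\, P\left(H(f)\,\mathbf{1}_\Omega > u\right) du$$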

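and following the proof of Proposition 4.1 given in [3], we obtain the upper bound (17) of Lemma 5.1.

Now we turn to the proof of (18). For any $g \in \mathbb{R}^{np^2}$ we define the empirical quadratic loss function by

$$\gamma_n(g) = \|y - g\|_n^2.$$

Using the definition of $\gamma_n$ we have that for all $g \in \mathbb{R}^{np^2}$,

$$\|f - g\|_n^2 = \gamma_n(g) + 2\langle g - y, \varepsilon\rangle_n + \|\varepsilon\|_n^2,$$

and therefore

$$\|f - \tilde f\|_n^2 - \|f - P_{m_0} f\|_n^2 = \gamma_n(\tilde f) - \gamma_n(P_{m_0} f) + 2\langle \tilde f - P_{m_0} f, \varepsilon\rangle_n. \qquad (20)$$

Using the definition of $\tilde f$, we know that

$$\gamma_n(\tilde f) + \widehat{\mathrm{pen}}(\hat m) \le \gamma_n(g) + \widehat{\mathrm{pen}}(m_0)$$

for all $g \in S_{m_0}$. Then

$$\gamma_n(\tilde f) - \gamma_n(P_{m_0} f) \le \widehat{\mathrm{pen}}(m_0) - \widehat{\mathrm{pen}}(\hat m). \qquad (21)$$

So we get from (20) and (21) that

$$\|f - \tilde f\|_n^2 \le \|f - P_{m_0} f\|_n^2 + \widehat{\mathrm{pen}}(m_0) - \widehat{\mathrm{pen}}(\hat m) + 2\langle f - P_{m_0} f, \varepsilon\rangle_n + 2\langle P_{\hat m} f - f, \varepsilon\rangle_n + 2\langle \tilde f - P_{\hat m} f, \varepsilon\rangle_n. \qquad (22)$$

In the following we set, for each $m \in \mathcal{M}$,

$$B_m = \{g \in S_m : \|g\|_n \le 1\}, \qquad G_m = \sup_{t \in B_m}\langle t, \varepsilon\rangle_n = \|P_m \varepsilon\|_n,$$

$$u_m = \begin{cases} \dfrac{P_m f - f}{\|P_m f - f\|_n} & \text{if } \|P_m f - f\|_n \ne 0, \\ 0 & \text{otherwise.} \end{cases}$$

Since $\tilde f = P_{\hat m} f + P_{\hat m}\varepsilon$, (22) gives

$$\|f - \tilde f\|_n^2 \le \|f - P_{m_0} f\|_n^2 + \widehat{\mathrm{pen}}(m_0) - \widehat{\mathrm{pen}}(\hat m) + 2\|f - P_{m_0} f\|_n\, |\langle u_{m_0}, \varepsilon\rangle_n| + 2\|f - P_{\hat m} f\|_n\, |\langle u_{\hat m}, \varepsilon\rangle_n| + 2 G_{\hat m}^2. \qquad (23)$$

Using repeatedly the following elementary inequality, which holds for all positive numbers $\nu, x, z$,

$$2xz \le \nu x^2 + \frac{1}{\nu} z^2, \qquad (24)$$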

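we get for any $m \in \mathcal{M}$

$$2\|f - P_m f\|_n\, |\langle u_m, \varepsilon\rangle_n| \le \nu\|f - P_m f\|_n^2 + \frac{1}{\nu}|\langle u_m, \varepsilon\rangle_n|^2. \qquad (25)$$

By Pythagoras' theorem we have

$$\|f - \tilde f\|_n^2 = \|f - P_{\hat m} f\|_n^2 + \|P_{\hat m} f - \tilde f\|_n^2 = \|f - P_{\hat m} f\|_n^2 + G_{\hat m}^2. \qquad (26)$$

We derive from (23) and (25) that for any $\nu > 0$

$$\|f - \tilde f\|_n^2 \le \|f - P_{m_0} f\|_n^2 + \nu\|f - P_{m_0} f\|_n^2 + \frac{1}{\nu}\langle u_{m_0}, \varepsilon\rangle_n^2 + \nu\|f - P_{\hat m} f\|_n^2 + \frac{1}{\nu}\langle u_{\hat m}, \varepsilon\rangle_n^2 + 2G_{\hat m}^2 + \widehat{\mathrm{pen}}(m_0) - \widehat{\mathrm{pen}}(\hat m).$$

Now taking into account that, by equation (26), $\|f - P_{\hat m} f\|_n^2 = \|f - \tilde f\|_n^2 - G_{\hat m}^2$, the above inequality is equivalent to

$$(1-\nu)\|f - \tilde f\|_n^2 \le (1+\nu)\|f - P_{m_0} f\|_n^2 + \frac{1}{\nu}\langle u_{m_0}, \varepsilon\rangle_n^2 + \frac{1}{\nu}\langle u_{\hat m}, \varepsilon\rangle_n^2 + (2-\nu)G_{\hat m}^2 + \widehat{\mathrm{pen}}(m_0) - \widehat{\mathrm{pen}}(\hat m). \qquad (27)$$

We choose $\nu = \frac{2}{2+\eta} \in \left]0,1\right[$, but for the sake of simplicity we keep using the notation $\nu$. Let $p_1$ and $p_2$ be two functions depending on $\nu$ mapping $\mathcal{M}$ into $\mathbb{R}_+$. They will be specified as in [3] to satisfy

$$\mathrm{pen}(m) \ge (2-\nu)\,p_1(m) + \frac{1}{\nu}\,p_2(m) \quad \forall m \in \mathcal{M}. \qquad (28)$$

Remember that on $\Omega$, $\widehat{\mathrm{pen}}(m) \ge \mathrm{pen}(m)$ for all $m \in \mathcal{M}$. Since $\frac{1}{\nu}p_2(m) \le \mathrm{pen}(m)$ and $1 + \nu \le 2$, we get from (27) and (28) that on the set $\Omega$

$$\begin{aligned}
(1-\nu)\|f - \tilde f\|_n^2 &\le (1+\nu)\|f - P_{m_0} f\|_n^2 + \widehat{\mathrm{pen}}(m_0) + \frac{1}{\nu}p_2(m_0) + (2-\nu)\left(G_{\hat m}^2 - p_1(\hat m)\right) \\
&\quad + \frac{1}{\nu}\left(\langle u_{\hat m}, \varepsilon\rangle_n^2 - p_2(\hat m)\right) + \frac{1}{\nu}\left(\langle u_{m_0}, \varepsilon\rangle_n^2 - p_2(m_0)\right) \\
&\le 2\left(\|f - P_{m_0} f\|_n^2 + \widehat{\mathrm{pen}}(m_0)\right) + (2-\nu)\left(G_{\hat m}^2 - p_1(\hat m)\right) \\
&\quad + \frac{1}{\nu}\left(\langle u_{\hat m}, \varepsilon\rangle_n^2 - p_2(\hat m)\right) + \frac{1}{\nu}\left(\langle u_{m_0}, \varepsilon\rangle_n^2 - p_2(m_0)\right). \qquad (29)
\end{aligned}$$

As $\frac{2}{1-\nu} = 2 + \frac{4}{\eta}$, we obtain that

$$\begin{aligned}
(1-\nu)H(f)\,\mathbf{1}_\Omega &= \left[(1-\nu)\|f - \tilde f\|_n^2 - (1-\nu)\left(2 + \frac{4}{\eta}\right)\left\{\|f - P_{m_0} f\|_n^2 + \widehat{\mathrm{pen}}(m_0)\right\}\right]_+ \mathbf{1}_\Omega \\
&= \left[(1-\nu)\|f - \tilde f\|_n^2 - 2\left(\|f - P_{m_0} f\|_n^2 + \widehat{\mathrm{pen}}(m_0)\right)\right]_+ \mathbf{1}_\Omega \\
&\le \left[(2-\nu)\left(G_{\hat m}^2 - p_1(\hat m)\right) + \frac{1}{\nu}\left(\langle u_{\hat m}, \varepsilon\rangle_n^2 - p_2(\hat m)\right) + \frac{1}{\nu}\left(\langle u_{m_0}, \varepsilon\rangle_n^2 - p_2(m_0)\right)\right]_+.
\end{aligned}$$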

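For any $x > 0$,

$$\begin{aligned}
P\left((1-\nu)H(f)\,\mathbf{1}_\Omega \ge \frac{x\,\delta_{\sup}^2}{n}\right) &\le P\left(\exists m \in \mathcal{M} : (2-\nu)\left(G_m^2 - p_1(m)\right) \ge \frac{x\,\delta_m^2}{3n}\right) + P\left(\exists m \in \mathcal{M} : \frac{1}{\nu}\left(\langle u_m, \varepsilon\rangle_n^2 - p_2(m)\right) \ge \frac{x\,\delta_m^2}{3n}\right) \\
&\le \sum_{m \in \mathcal{M}} P\left((2-\nu)\left(\|P_m\varepsilon\|_n^2 - p_1(m)\right) \ge \frac{x\,\delta_m^2}{3n}\right) + \sum_{m \in \mathcal{M}} P\left(\frac{1}{\nu}\left(\langle u_m, \varepsilon\rangle_n^2 - p_2(m)\right) \ge \frac{x\,\delta_m^2}{3n}\right) \\
&:= \sum_{m \in \mathcal{M}} P_{1,m}(x) + \sum_{m \in \mathcal{M}} P_{2,m}(x). \qquad (30)
\end{aligned}$$

From now on, the proof of Lemma 5.1 is exactly the same as the end of the proof of Proposition 4.1 given in [3], with $L_m = \nu$.

Proof of Proposition 4.2. We first provide an upper bound for $E\left[\|f - \tilde f\|_n^{2q}\,\mathbf{1}_\Omega\right]$, where the set $\Omega$ depends on $\alpha$ chosen as in Lemma 5.1. As $q \le 1$, we have $(a+b)^q \le a^q + b^q$. Together with Lemma 5.1 we deduce that

$$E\left[\|f - \tilde f\|_n^{2q}\,\mathbf{1}_\Omega\right] \le \Delta_\kappa^q\,\frac{\delta_{\sup}^{2q}}{n^q} + E\left[\theta'^{\,q}\left(\|f - f_{m_0}\|_n^2 + \frac{D_{m_0}}{n}\hat\delta_{m_0}^2\right)^q\right].$$

Using the convexity of $x \mapsto x^{1/q}$ together with the Jensen inequality, we obtain

$$E\left[\|f - \tilde f\|_n^{2q}\,\mathbf{1}_\Omega\right]^{1/q} \le 2^{1/q - 1}\frac{\Delta_\kappa}{n}\delta_{\sup}^2 + 2^{1/q - 1} E\left[\theta'\left(\|f - f_{m_0}\|_n^2 + \frac{D_{m_0}}{n}\hat\delta_{m_0}^2\right)\right],$$

and by using the assumption (A1) we have that

$$E\left[\|f - \tilde f\|_n^{2q}\,\mathbf{1}_\Omega\right]^{1/q} \le 2^{1/q - 1}\frac{\Delta_\kappa}{n}\delta_{\sup}^2 + 2^{1/q - 1}\,\theta'\left(\|f - f_{m_0}\|_n^2 + \frac{D_{m_0}}{n}\delta_{m_0}^2\right). \qquad (31)$$

Now we need to find an upper bound for the quantity $E\left[\|f - \tilde f\|_n^{2q}\,\mathbf{1}_{\Omega^c}\right]$. First, remark that

$$\|f - \tilde f\|_n^2 = \|f - P_{\hat m} y\|_n^2 = \|f - P_{\hat m} f\|_n^2 + \|P_{\hat m}(f - y)\|_n^2 \le \|f - P_{\hat m} f\|_n^2 + \|f - y\|_n^2 = \|f - P_{\hat m} f\|_n^2 + \|\varepsilon\|_n^2.$$

And thus

$$\|f - \tilde f\|_n^2 \le \|f\|_n^2 + \|\varepsilon\|_n^2.$$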

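So we have

$$E\left[\|f - \tilde f\|_n^{2q}\,\mathbf{1}_{\Omega^c}\right] \le \|f\|_n^{2q}\,P(\Omega^c) + E\left[\|\varepsilon\|_n^{2q}\,\mathbf{1}_{\Omega^c}\right].$$

Using Hölder's inequality with $\frac{\kappa}{2q} > 1$ we obtain

$$E\left[\|\varepsilon\|_n^{2q}\,\mathbf{1}_{\Omega^c}\right] \le E\left[\|\varepsilon\|_n^{\kappa}\right]^{2q/\kappa} P(\Omega^c)^{1 - 2q/\kappa}.$$

But $E\left[\|\varepsilon\|_n^\kappa\right] = E\left[\left(\frac{1}{n}\sum_{i=1}^n \|\varepsilon_i\|^2\right)^{\kappa/2}\right]$, and as $\frac{\kappa}{2} \ge 1$, we can use Minkowski's inequality to obtain

$$E\left[\|\varepsilon\|_n^\kappa\right]^{2/\kappa} \le \frac{1}{n}\sum_{i=1}^n E\left[\|\varepsilon_i\|^\kappa\right]^{2/\kappa} = E\left[\|\varepsilon_1\|^\kappa\right]^{2/\kappa},$$

that is,

$$E\left[\|\varepsilon\|_n^\kappa\right] \le E\left[\|\varepsilon_1\|^\kappa\right].$$

So we have

$$E\left[\|f - \tilde f\|_n^{2q}\,\mathbf{1}_{\Omega^c}\right] \le \left(E\left[\|\varepsilon_1\|^\kappa\right]^{2q/\kappa} + \|f\|_n^{2q}\right) P(\Omega^c)^{1 - 2q/\kappa},$$

and with assumption (A2)

$$E\left[\|f - \tilde f\|_n^{2q}\,\mathbf{1}_{\Omega^c}\right] \le \left(E\left[\|\varepsilon_1\|^\kappa\right]^{2q/\kappa} + \|f\|_n^{2q}\right)\left(\frac{C(\alpha)}{n^\gamma}\right)^{1 - 2q/\kappa}.$$

As $\gamma \ge \frac{q}{1 - 2q/\kappa}$, we deduce that

$$E\left[\|f - \tilde f\|_n^{2q}\,\mathbf{1}_{\Omega^c}\right] \le \frac{2^{1-q}}{n^q}\left(E\left[\|\varepsilon_1\|^\kappa\right]^{2/\kappa} + \|f\|_n^2\right)^q C(\alpha)^{1 - 2q/\kappa}. \qquad (32)$$

To conclude, we use again the convexity of $x \mapsto x^{1/q}$ and the inequalities (31) and (32) to get

$$E\left[\|f - \tilde f\|_n^{2q}\right]^{1/q} \le \frac{4^{1/q - 1}}{n}\left(E\left[\|\varepsilon_1\|^\kappa\right]^{2/\kappa} + \|f\|_n^2\right) C(\alpha)^{1/q - 2/\kappa} + 4^{1/q - 1}\frac{\Delta_\kappa}{n}\delta_{\sup}^2 + 4^{1/q - 1}\,\theta'\left(\|f - f_{m_0}\|_n^2 + \frac{D_{m_0}}{n}\delta_{m_0}^2\right).$$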

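5.2 Proof of Theorem 3.1

Recall that $\beta > \max(2(1+q),\, 3+q)$ and $\kappa \in \left]2(1+q);\ \min(\beta, 2\beta - 4)\right[$. In order to use Proposition 4.2, we need to prove the following inequalities:

(A1) $E[\hat\delta_m^2] \le \delta_m^2$,

(A2) $P(\Omega^c) \le \dfrac{C(\alpha)}{n^\gamma}$ for $\gamma \ge \dfrac{q}{1 - 2q/\kappa}$.

First we prove (A1). Remember that $\hat\delta_m^2 = \dfrac{\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\hat\Phi\right)}{D_m}$. By using the linearity of the trace and the equality $E[\hat\Phi] = \frac{n-1}{n}\Phi$, we obtain that $E[\hat\delta_m^2] = \frac{n-1}{n}\delta_m^2 \le \delta_m^2$, which proves the result.

For the second, write $\Omega^c = \bigcup_{m \in \mathcal{M}} \{\hat\delta_m^2 < \alpha\,\delta_m^2\}$. We bound the quantity $P(\hat\delta_m^2 \le \alpha\,\delta_m^2)$ in the following proposition.

Proposition 5.2. For all $m \in \mathcal{M}$, $\alpha \in \left]0;1\right[$ and $n \ge n(\beta, \alpha, C_{\inf}, \Sigma)$, we have for some constants $C_1(\beta)$, $C_2(\beta)$:

$$P\left(\hat\delta_m^2 \le \alpha\,\delta_m^2\right) \le \frac{C_1(\beta)\,2^{\beta} + 2^{\beta+1}\,C_2(\beta)}{n^\gamma}\ \frac{E\left[\|x_1 x_1^\top\|^\beta\right]}{(1-\alpha)^{\beta/2}\,\delta_m^\beta\, D_m^{\beta/2}}, \quad \text{for } \gamma \ge \frac{q}{1 - 2q/\kappa}.$$

This proposition concludes the proof of (A2) with

$$C(\alpha) = \left(C_1(\beta)\,2^{\beta} + 2^{\beta+1}\,C_2(\beta)\right)\frac{E\left[\|x_1 x_1^\top\|^\beta\right]}{(1-\alpha)^{\beta/2}} \sum_{m \in \mathcal{M}} \frac{1}{\delta_m^\beta\, D_m^{\beta/2}}.$$

Proof of Proposition 5.2. Write $\mu = E[y_1] = \mathrm{vec}(\Sigma)$, so that $E[y_1 y_1^\top] = \Phi + \mu\mu^\top$. We start by dividing $P_m = P(\hat\delta_m^2 \le \alpha\,\delta_m^2)$ into two parts, with one of them involving a sum of independent variables with expectation equal to 0:

$$\begin{aligned}
P_m &= P\left(\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\hat\Phi\right) \le \alpha\,\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)\right) \\
&= P\left(\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)(\hat\Phi - \Phi + \mu\mu^\top - \mu\mu^\top)\right) \le -(1-\alpha)\,\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)\right) \\
&\le P\left(\left|\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Big(\frac{1}{n}\sum_{i=1}^n y_i y_i^\top - \Phi - \mu\mu^\top + \mu\mu^\top - S_{vec}S_{vec}^\top\Big)\right)\right| \ge (1-\alpha)\,\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)\right) \\
&\le P\left(\left|\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Big(\frac{1}{n}\sum_{i=1}^n y_i y_i^\top - \Phi - \mu\mu^\top\Big)\right)\right| \ge \frac{1-\alpha}{2}\,\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)\right) \\
&\quad + P\left(\left|\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)(\mu\mu^\top - S_{vec}S_{vec}^\top)\right)\right| \ge \frac{1-\alpha}{2}\,\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)\right).
\end{aligned}$$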

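Set

$$Q_1 = P\left(\left|\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Big(\frac{1}{n}\sum_{i=1}^n y_i y_i^\top - \Phi - \mu\mu^\top\Big)\right)\right| \ge \frac{1-\alpha}{2}\,\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)\right)$$

and

$$Q_2 = P\left(\left|\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)(\mu\mu^\top - S_{vec}S_{vec}^\top)\right)\right| \ge \frac{1-\alpha}{2}\,\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)\right).$$

Study of $Q_1$. To shorten the formulas, write $Z_i = \mathrm{Tr}\left((\Pi_m \otimes \Pi_m)(y_i y_i^\top - \Phi - \mu\mu^\top)\right)$, $1 \le i \le n$, which are i.i.d. centred real variables. First we use Markov's inequality to obtain

$$Q_1 \le \frac{2^{\beta/2}\, E\left[\left|\frac{1}{n}\sum_{i=1}^n Z_i\right|^{\beta/2}\right]}{(1-\alpha)^{\beta/2}\,\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)^{\beta/2}}.$$

We must consider the two following cases.

If $\frac{\beta}{2} \ge 2$, Rosenthal's inequality gives

$$E\left[\left|\frac{1}{n}\sum_{i=1}^n Z_i\right|^{\beta/2}\right] \le \frac{C'(\beta)}{n^{\beta/2 - 1}}\,E\left[|Z_1|^{\beta/2}\right] + \frac{C'(\beta)}{n^{\beta/4}}\,E\left[|Z_1|^2\right]^{\beta/4}.$$

As $\frac{\beta}{4} \ge 1$, we can use Jensen's inequality on the second term to obtain

$$E\left[\left|\frac{1}{n}\sum_{i=1}^n Z_i\right|^{\beta/2}\right] \le \frac{2\,C'(\beta)}{n^{\beta/4}}\,E\left[|Z_1|^{\beta/2}\right].$$

If $\frac{\beta}{2} \le 2$, we use Lemma 4.3 of subsection 4.2 to get

$$E\left[\left|\frac{1}{n}\sum_{i=1}^n Z_i\right|^{\beta/2}\right] \le \frac{8}{n^{\beta/2 - 1}}\,E\left[|Z_1|^{\beta/2}\right].$$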

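In both cases, we can use the fact that $x \mapsto x^{\beta/2}$ is a convex and increasing function on $\mathbb{R}_+$ to obtain

$$E\left[\left|\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)(y_1 y_1^\top - \Phi - \mu\mu^\top)\right)\right|^{\beta/2}\right] \le 2^{\beta/2 - 1}\left(E\left[\left|\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\,y_1 y_1^\top\right)\right|^{\beta/2}\right] + \left|\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)(\Phi + \mu\mu^\top)\right)\right|^{\beta/2}\right).$$

And by using Jensen's inequality on the second term we have that

$$E\left[\left|\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)(y_1 y_1^\top - \Phi - \mu\mu^\top)\right)\right|^{\beta/2}\right] \le 2^{\beta/2}\, E\left[\left|\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\,y_1 y_1^\top\right)\right|^{\beta/2}\right].$$

Now consider the following lemma.

Lemma 5.3. If $\Psi$ is symmetric non-negative definite, then

$$\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Psi\right) \in \left[0;\ \mathrm{Tr}(\Psi)\right]. \qquad (33)$$

From this fact we get that

$$\left|\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\,y_1 y_1^\top\right)\right|^{\beta/2} \le \mathrm{Tr}\left(y_1 y_1^\top\right)^{\beta/2} = \|y_1\|^\beta = \|x_1 x_1^\top\|^\beta.$$

In conclusion, we have

$$Q_1 \le \frac{C_1(\beta)\,2^{\beta}\, E\left[\|x_1 x_1^\top\|^\beta\right]}{(1-\alpha)^{\beta/2}\,\delta_m^\beta\, D_m^{\beta/2}\, n^{\gamma_1}}, \qquad (34)$$

with $\gamma_1 = \min\left(\frac{\beta}{4},\ \frac{\beta}{2} - 1\right)$, $C_1(\beta) = 2\,C'(\beta)$ if $\beta \ge 4$, where $C'(\beta)$ is the constant in Rosenthal's inequality, and $C_1(\beta) = 8$ if $\beta \le 4$. Remark that $\frac{\beta}{4} \ge \frac{\kappa}{4}$ and $\frac{\beta}{2} - 1 \ge \frac{\kappa}{4}$, so $\gamma_1 \ge \frac{\kappa}{4}$.

Study of $Q_2$. Recall that

$$Q_2 = P\left(\left|\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)(\mu\mu^\top - S_{vec}S_{vec}^\top)\right)\right| \ge \frac{1-\alpha}{2}\,\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)\right).$$

Set

$$B = \mathrm{Tr}\left((\Pi_m \otimes \Pi_m)(\mu\mu^\top - S_{vec}S_{vec}^\top)\right).$$

Using the properties of the trace, we can write

$$B = \mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\mu\mu^\top\right) - \mathrm{Tr}\left((\Pi_m \otimes \Pi_m)S_{vec}S_{vec}^\top\right) = \mathrm{Tr}\left(\mu^\top(\Pi_m \otimes \Pi_m)\mu\right) - \mathrm{Tr}\left(S_{vec}^\top(\Pi_m \otimes \Pi_m)S_{vec}\right).$$

But $\Pi_m \otimes \Pi_m$ is an orthogonal projection matrix, then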

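$$\begin{aligned}
B &= \mathrm{Tr}\left(\mu^\top(\Pi_m \otimes \Pi_m)^\top(\Pi_m \otimes \Pi_m)\mu\right) - \mathrm{Tr}\left(S_{vec}^\top(\Pi_m \otimes \Pi_m)^\top(\Pi_m \otimes \Pi_m)S_{vec}\right) \\
&= \|(\Pi_m \otimes \Pi_m)\mu\|^2 - \|(\Pi_m \otimes \Pi_m)S_{vec}\|^2 \\
&= \left(\|(\Pi_m \otimes \Pi_m)\mu\| - \|(\Pi_m \otimes \Pi_m)S_{vec}\|\right)\left(\|(\Pi_m \otimes \Pi_m)\mu\| + \|(\Pi_m \otimes \Pi_m)S_{vec}\|\right).
\end{aligned}$$

Hence

$$\begin{aligned}
|B| &\le \|(\Pi_m \otimes \Pi_m)(\mu - S_{vec})\|\left(\|(\Pi_m \otimes \Pi_m)\mu\| + \|(\Pi_m \otimes \Pi_m)S_{vec}\|\right) \\
&\le \|(\Pi_m \otimes \Pi_m)(\mu - S_{vec})\|^2 + 2\,\|(\Pi_m \otimes \Pi_m)(\mu - S_{vec})\|\,\|(\Pi_m \otimes \Pi_m)\mu\| \\
&\le \|(\Pi_m \otimes \Pi_m)(\mu - S_{vec})\|^2 + 2\,\|(\Pi_m \otimes \Pi_m)(\mu - S_{vec})\|\,\|\mu\|.
\end{aligned}$$

Finally

$$Q_2 \le P\left(\|(\Pi_m \otimes \Pi_m)(\mu - S_{vec})\|^2 \ge \frac{1-\alpha}{4}\,\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)\right) + P\left(\|(\Pi_m \otimes \Pi_m)(\mu - S_{vec})\| \ge \frac{1-\alpha}{8\|\mu\|}\,\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)\right). \qquad (35)$$

Now we need to provide an upper bound for the quantities $P\left(\|(\Pi_m \otimes \Pi_m)(\mu - S_{vec})\|^2 \ge t\right)$. For this we will use the deviation bound provided by Proposition 4.4 stated in subsection 4.2. Denote by $\mathbf{1}_n$ the vector of $\mathbb{R}^n$ with all entries equal to 1, and set

$$G = \frac{1}{n}\begin{pmatrix} \mathrm{Id}_{p^2} & \cdots & \mathrm{Id}_{p^2} \\ \vdots & & \vdots \\ \mathrm{Id}_{p^2} & \cdots & \mathrm{Id}_{p^2} \end{pmatrix} \in \mathbb{R}^{np^2 \times np^2}.$$

Then $G(y - f) = \mathbf{1}_n \otimes (S_{vec} - \mu)$. Now, if

$$H_m = \mathrm{Id}_n \otimes (\Pi_m \otimes \Pi_m) = \begin{pmatrix} \Pi_m \otimes \Pi_m & & 0 \\ & \ddots & \\ 0 & & \Pi_m \otimes \Pi_m \end{pmatrix} \in \mathbb{R}^{np^2 \times np^2},$$

we have

$$H_m\left(\mathbf{1}_n \otimes (S_{vec} - \mu)\right) = \mathbf{1}_n \otimes \left((\Pi_m \otimes \Pi_m)(S_{vec} - \mu)\right).$$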

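In conclusion, with

$$A_m = H_m G = \frac{1}{n}\begin{pmatrix} \Pi_m \otimes \Pi_m & \cdots & \Pi_m \otimes \Pi_m \\ \vdots & & \vdots \\ \Pi_m \otimes \Pi_m & \cdots & \Pi_m \otimes \Pi_m \end{pmatrix} \in \mathbb{R}^{np^2 \times np^2},$$

we have that

$$A_m(y - f) = \mathbf{1}_n \otimes \left((\Pi_m \otimes \Pi_m)(S_{vec} - \mu)\right).$$

Moreover, $A_m$ is an orthogonal projection matrix and we have the following equalities (norms on $\mathbb{R}^{np^2}$ being Euclidean):

$$n\,\|(\Pi_m \otimes \Pi_m)(S_{vec} - \mu)\|^2 = \|A_m(y - f)\|^2 = (y - f)^\top A_m (y - f),$$

$$\mathrm{Tr}(A_m) = \mathrm{Tr}(\Pi_m \otimes \Pi_m) = D_m,$$

$$\mathrm{Tr}\left(A_m(\mathrm{Id}_n \otimes \Phi)\right) = \frac{1}{n}\, n\,\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right) = \mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right).$$

Now we can use Proposition 4.4 with $\tilde A = A_m$, $\varepsilon_i = y_i - \mu$, $\mathrm{Tr}(A_m) = D_m$, $\rho(A_m) = 1$, $\delta^2 = \delta_m^2$ and $\beta$. This gives for all $x > 0$

$$P\left((y - f)^\top A_m (y - f) \ge \mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)\left[1 + \sqrt{\frac{x}{D_m}}\right]^2\right) \le C_2(\beta)\,\frac{E\left[\|y_1 - \mu\|^\beta\right] D_m^{\beta/2 + 1}}{\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)^{\beta/2} x^{\beta/2}},$$

that is,

$$P\left(\|(\Pi_m \otimes \Pi_m)(S_{vec} - \mu)\|^2 \ge \frac{\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)}{n}\left[1 + \sqrt{\frac{x}{D_m}}\right]^2\right) \le C_2(\beta)\,\frac{E\left[\|y_1 - \mu\|^\beta\right] D_m^{\beta/2 + 1}}{\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)^{\beta/2} x^{\beta/2}}.$$

In order to use this deviation bound to obtain the inequalities

$$P\left(\|(\Pi_m \otimes \Pi_m)(\mu - S_{vec})\|^2 \ge \frac{1-\alpha}{4}\,\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)\right) \le \frac{C}{n^\gamma}, \qquad P\left(\|(\Pi_m \otimes \Pi_m)(\mu - S_{vec})\| \ge \frac{1-\alpha}{8\|\mu\|}\,\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)\right) \le \frac{C}{n^\gamma}$$

with $\gamma \ge \frac{q}{1 - 2q/\kappa}$, we need to find $x > 0$ satisfying the three following facts for all $m \in \mathcal{M}$:

$$\frac{1}{n}\left[1 + \sqrt{\frac{x}{D_m}}\right]^2 \le \frac{1-\alpha}{4}, \qquad (36)$$

$$\frac{1}{n}\left[1 + \sqrt{\frac{x}{D_m}}\right]^2 \le \left(\frac{1-\alpha}{8\|\mu\|}\right)^2 C_{\inf} \le \left(\frac{1-\alpha}{8\|\mu\|}\right)^2 \mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right), \qquad (37)$$

$$\frac{D_m^{\beta/2 + 1}}{\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Phi\right)^{\beta/2} x^{\beta/2}} = \frac{D_m}{\delta_m^\beta\, x^{\beta/2}} \le \frac{C}{n^\gamma}. \qquad (38)$$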

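(36) and (37) hold for the choice $x = n^r$ with $r < 1$ and if $n$ is large enough to have

$$\frac{1}{n}\left[1 + n^{r/2}\right]^2 \le \frac{1-\alpha}{4} \qquad (39)$$

and

$$\frac{1}{n}\left[1 + n^{r/2}\right]^2 \le \left(\frac{1-\alpha}{8\|\mu\|}\right)^2 C_{\inf}. \qquad (40)$$

In order to obtain (38) with $x = n^r$, we use the inequality [...] which gives a bound on $\dfrac{D_m}{\delta_m^\beta\, x^{\beta/2}}$ with denominator $\delta_m^\beta\, D_m^{\beta/2}$ times a power of $n$ governed by $r\beta/2$. Moreover

$$E\left[\|y_1 - \mu\|^\beta\right] \le E\left[\left(\|y_1\| + \|\mu\|\right)^\beta\right],$$

and by using properties of convexity we obtain

$$E\left[\|y_1 - \mu\|^\beta\right] \le 2^{\beta - 1}\left(E\left[\|y_1\|^\beta\right] + \|\mu\|^\beta\right).$$

With Jensen's inequality we get:

$$E\left[\|y_1 - \mu\|^\beta\right] \le 2^\beta\, E\left[\|y_1\|^\beta\right].$$

In conclusion, with $r$ = [...] $< 1$ we obtain for $n \ge n(\beta, \alpha, C_{\inf}, \Sigma)$

$$Q_2 \le 2^{\beta + 1}\, C_2(\beta)\,\frac{E\left[\|x_1 x_1^\top\|^\beta\right]}{\delta_m^\beta\, D_m^{\beta/2}\, n^{\kappa/4}}, \qquad (41)$$

where $C_2(\beta)$ is the constant which appears in Proposition 4.4.

In conclusion, combining (34) and (41),

$$P\left(\hat\delta_m^2 \le \alpha\,\delta_m^2\right) \le \frac{C_1(\beta)\,2^{\beta} + 2^{\beta+1}\,C_2(\beta)}{n^{\kappa/4}}\ \frac{E\left[\|x_1 x_1^\top\|^\beta\right]}{(1-\alpha)^{\beta/2}\,\delta_m^\beta\, D_m^{\beta/2}}$$

for $n \ge n(\beta, \alpha, C_{\inf}, \Sigma)$. To conclude, the exponent $\frac{\kappa}{4}$ obtained here plays the role of $\gamma$ in (A2), which requires $\gamma \ge \frac{q}{1 - 2q/\kappa}$ [...].

Proof of Lemma 5.3. Recall that $\Pi_m \otimes \Pi_m$ is an orthogonal projection matrix. Hence there exists an orthogonal matrix $P_m$ such that $P_m^\top(\Pi_m \otimes \Pi_m)P_m = D$, with $D$ a diagonal matrix such that $D_{ii} = 1$ if $i \le D_m$, and $D_{ii} = 0$ otherwise. Then, if $\Psi$ is symmetric non-negative definite, we have:

$$\mathrm{Tr}\left((\Pi_m \otimes \Pi_m)\Psi\right) = \mathrm{Tr}\left(D\, P_m^\top \Psi P_m\right)$$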

$$= \sum_{l=1}^{p^2}\sum_{k=1}^{p^2} D_{kl}\left(P_m^\top \Psi P_m\right)_{kl} = \sum_{l=1}^{p^2} D_{ll}\left(P_m^\top \Psi P_m\right)_{ll} = \sum_{l=1}^{D_m}\left(P_m^\top \Psi P_m\right)_{ll} \in \left[0;\ \mathrm{Tr}(\Psi)\right].$$

Indeed, $P_m^\top \Psi P_m$ is non-negative definite, so all its diagonal entries are non-negative, and $\mathrm{Tr}(P_m^\top \Psi P_m) = \mathrm{Tr}(\Psi)$.

References

[1] Y. Baraud. Model selection for regression on a fixed design. Probab. Theory Related Fields, 117(4):467–493, 2000.

[2] J. Bigot, R. Biscay, J.-M. Loubes, and L. M. Alvarez. Group lasso estimation of high-dimensional covariance matrices. Journal of Machine Learning Research, 2011.

[3] J. Bigot, R. Biscay, J.-M. Loubes, and L. Muñiz-Alvarez. Nonparametric estimation of covariance functions by model selection. Electron. J. Stat., 4:822–855, 2010.

[4] J. Bigot, R. Biscay Lirio, J.-M. Loubes, and L. Muñiz Alvarez. Adaptive estimation of spectral densities via wavelet thresholding and information projection. Preprint hal-0044044, May 2010.

[5] N. A. C. Cressie. Statistics for spatial data. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. John Wiley & Sons Inc., New York, 1993. Revised reprint of the 1991 edition, A Wiley-Interscience Publication.

[6] S. N. Elogne, O. Perrin, and C. Thomas-Agnan. Non parametric estimation of smooth stationary covariance functions by interpolation methods. Stat. Inference Stoch. Process., 11(2):177–205, 2008.

[7] H. Engl, M. Hanke, and A. Neubauer. Regularization of inverse problems, volume 375 of Mathematics and its Applications. Kluwer Academic Publishers Group, Dordrecht, 1996.

[8] A. G. Journel. Kriging in terms of projections. J. Internat. Assoc. Mathematical Geol., 9(6):563–586, 1977.

[9] G. A. F. Seber. A matrix handbook for statisticians. Wiley Series in Probability and Statistics. Wiley-Interscience [John Wiley & Sons], Hoboken, NJ, 2008.

[10] M. L. Stein. Interpolation of spatial data. Springer Series in Statistics. Springer-Verlag, New York, 1999. Some theory for Kriging.

[11] B. von Bahr and C.-G. Esseen. Inequalities for the rth absolute moment of a sum of random variables, 1 ≤ r ≤ 2. Ann. Math. Statist., 36:299–303, 1965.