Dimension-free PAC-Bayesian bounds for matrices, vectors, and linear least squares regression.


Olivier Catoni and Ilaria Giulini

CREST, CNRS UMR 9194, Université Paris Saclay, France; olivier.catoni@ensae.fr
Laboratoire de Probabilités et Modèles Aléatoires, Université Paris Diderot, France; giulini@math.univ-paris-diderot.fr

December

Abstract: This paper is focused on dimension-free PAC-Bayesian bounds, under weak polynomial moment assumptions, allowing for heavy tailed sample distributions. It covers the estimation of the mean of a vector or a matrix, with applications to least squares linear regression. Special efforts are devoted to the estimation of the Gram matrix, due to its prominent role in high-dimension data analysis.

Key words: PAC-Bayesian bounds, sub-Gaussian mean estimator, random vector, random matrix, least squares regression, dimension-free bounds.

MSC2010: 62J10, 62J05, 62H12, 62H20, 62F35, 15B52.

1 Introduction

The subject of this paper is to discuss dimension-free PAC-Bayesian bounds for matrices and vectors. It comes after Catoni (2016) and Giulini (2017a), the first paper discussing dimension dependent bounds and the second one dimension-free bounds, under a kurtosis like assumption about the data distribution. Here, in contrast, we envision even weaker assumptions, and focus on dimension-free bounds only.

Our main objective is the estimation of the mean of a random vector and of a random matrix. Finding sub-Gaussian estimators for the mean of a non necessarily sub-Gaussian random vector has been the subject of much research in the last few years, with important contributions from Joly, Lugosi and Oliveira (2017), Lugosi and Mendelson (2017) and Minsker (2015). While in Joly, Lugosi and Oliveira (2017) the statistical error bound still has a residual dependence on the dimension of the ambient space, in Lugosi and Mendelson (2017) this dependence is removed, for an estimator of the median of means type. However, this estimator is not easy to compute and the bound contains large constants.

We propose here another type of estimator, that can be seen as a multidimensional extension of Catoni (2012). It provides a nonasymptotic confidence region with the same diameter, including the values of the constants, as the Gaussian concentration inequality stated in equation (1.1) of Lugosi and Mendelson (2017), although in our case the confidence region is not necessarily a ball, but still a convex set. The Gaussian bound concerns the estimation of the expectation of a Gaussian random vector by the mean of an i.i.d. sample, whereas in our case we only assume that the variance is finite, a much weaker hypothesis.

In Minsker (2016), the question of estimating the mean of a random matrix is addressed. The author uses exponential matrix inequalities in order to extend Catoni (2012) to matrices and to control the operator norm of the error. In the bounds at confidence level $1 - \delta$, the complexity term is multiplied by $\log(\delta^{-1})$. Here, we extend Catoni (2012) using PAC-Bayesian bounds to measure complexity, and define an estimator with a bound where the term $\log(\delta^{-1})$ is multiplied by some directional variance term only, and not by the complexity factor, that is larger.

After recalling in Section 2 the PAC-Bayesian inequality that will be at the heart of many of our proofs, we deal successively with the estimation of a random vector (Section 3) and of a random matrix (Section 4). Section 6 is devoted to the estimation of the Gram matrix, due to its prominent role in multidimensional data analysis. In Section 7 we introduce some applications to least squares regression.

2 Some well known PAC-Bayesian inequality

This is a preliminary section, where we state the PAC-Bayesian inequality that we will use throughout this paper to obtain deviation inequalities holding uniformly with respect to some parameter.

Consider a random variable $X \in \mathcal{X}$ and a measurable parameter space $\Theta$. Let $\mu \in \mathcal{M}_+^1(\Theta)$ be a probability measure on $\Theta$ and $f : \Theta \times \mathcal{X} \to \mathbb{R}$ a bounded measurable function. For any other probability measure $\rho$ on $\Theta$, define the Kullback divergence function $K(\rho, \mu)$ as usual by the formula
\[ K(\rho, \mu) = \begin{cases} \displaystyle \int \log \Bigl( \frac{d\rho}{d\mu} \Bigr) \, d\rho, & \rho \ll \mu, \\ + \infty, & \text{otherwise.} \end{cases} \]
Let $(X_1, \dots, X_n)$ be $n$ independent copies of $X$.

Proposition 2.1. For any $\delta \in ]0,1[$, with probability at least $1 - \delta$, for any probability measure $\rho \in \mathcal{M}_+^1(\Theta)$,
\[ \frac{1}{n} \sum_{i=1}^n \int f(\theta, X_i) \, d\rho(\theta) \le \int \log \bigl[ \mathbb{E} \exp f(\theta, X) \bigr] \, d\rho(\theta) + \frac{K(\rho, \mu) + \log(\delta^{-1})}{n}. \]

Proof. It is a consequence of equation (5.2.1) page 159 of Catoni (2004). Indeed, let us recall the identity
\[ \log \int \exp \bigl( h(\theta) \bigr) \, d\mu(\theta) = \sup_{\rho} \Bigl\{ \int h(\theta) \, d\rho(\theta) - K(\rho, \mu) \Bigr\}, \]

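where $h$ may be any bounded measurable function (extensions to unbounded $h$ are possible but will not be required in this paper), and where the supremum in $\rho$ is taken on all probability measures on the measurable parameter space $\Theta$. The proof may be found in Catoni (2004, page 159). Combined with Fubini's lemma, it yields
\[ \mathbb{E} \exp \sup_{\rho} \Bigl\{ \int \Bigl[ \sum_{i=1}^n f(\theta, X_i) - n \log \mathbb{E} \exp f(\theta, X) \Bigr] d\rho(\theta) - K(\rho, \mu) \Bigr\} \]
\[ = \mathbb{E} \int \exp \Bigl( \sum_{i=1}^n f(\theta, X_i) - n \log \mathbb{E} \exp f(\theta, X) \Bigr) d\mu(\theta) = \int \mathbb{E} \exp \Bigl( \sum_{i=1}^n f(\theta, X_i) - n \log \mathbb{E} \exp f(\theta, X) \Bigr) d\mu(\theta) \]
\[ = \int \prod_{i=1}^n \frac{\mathbb{E} \exp f(\theta, X_i)}{\mathbb{E} \exp f(\theta, X)} \, d\mu(\theta) = 1. \]
Since $\mathbb{E} \exp(W) \le 1$ implies that
\[ \mathbb{P} \bigl( W \ge \log(\delta^{-1}) \bigr) = \mathbb{E} \bigl( \mathbb{1} [ \delta \exp(W) \ge 1 ] \bigr) \le \mathbb{E} \bigl( \delta \exp(W) \bigr) \le \delta, \]
we obtain the desired result, considering
\[ W = \sup_{\rho} \Bigl\{ \int \Bigl[ \sum_{i=1}^n f(\theta, X_i) - n \log \mathbb{E} \exp f(\theta, X) \Bigr] d\rho(\theta) - K(\rho, \mu) \Bigr\}. \]

3 Estimation of the mean of a random vector

Let $X \in \mathbb{R}^d$ be a random vector and let $(X_1, \dots, X_n)$ be $n$ independent copies of $X$. In this section, we will estimate the mean $\mathbb{E}(X)$ and obtain dimension-free non-asymptotic bounds for the estimation error.

Let $S_d = \{ \theta \in \mathbb{R}^d : \lVert \theta \rVert = 1 \}$ be the unit sphere of $\mathbb{R}^d$ and let $I_d$ be the identity matrix of size $d \times d$. Let $\rho_\theta = \mathcal{N}(\theta, \beta^{-1} I_d)$ be the normal distribution centered at $\theta \in \mathbb{R}^d$, whose covariance matrix is $\beta^{-1} I_d$, where $\beta$ is a positive real parameter.

Instead of estimating directly the mean vector $\mathbb{E}(X)$, our strategy will be rather to estimate its component $\langle \theta, \mathbb{E}(X) \rangle$ in each direction $\theta \in S_d$ of the unit sphere. For this, we introduce the estimator of $\langle \theta, \mathbb{E}(X) \rangle$ defined as
\[ \mathcal{E}(\theta) = \frac{1}{n \lambda} \sum_{i=1}^n \int \psi \bigl( \lambda \langle \theta', X_i \rangle \bigr) \, d\rho_\theta(\theta'), \qquad \theta \in S_d, \ \lambda > 0, \]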

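where $\psi$ is the symmetric influence function
\[ \psi(t) = \begin{cases} t - t^3/6, & -\sqrt{2} \le t \le \sqrt{2}, \\ 2 \sqrt{2} / 3, & t > \sqrt{2}, \\ - 2 \sqrt{2} / 3, & t < - \sqrt{2}, \end{cases} \tag{1} \]
and where the positive constants $\lambda$ and $\beta$ will be chosen afterward. As stated in the following lemma, we chose this influence function because it is close to the identity in a neighborhood of zero and is such that $\exp(\psi(t))$ is bounded by polynomial functions.

Lemma 3.1. For any $t \in \mathbb{R}$,
\[ - \log \bigl( 1 - t + t^2/2 \bigr) \le \psi(t) \le \log \bigl( 1 + t + t^2/2 \bigr). \]

Proof. Put $f(t) = \log(1 + t + t^2/2)$. Remark that
\[ f'(t) = \frac{1 + t}{1 + t + t^2/2}, \quad t \in \mathbb{R}, \]
and that $\psi'(t) = 1 - t^2/2$ for $t \in [-\sqrt{2}, \sqrt{2}]$. We have $\psi(0) = f(0) = 0$ and
\[ \bigl[ f'(t) - \psi'(t) \bigr] \bigl( 1 + t + t^2/2 \bigr) = \frac{t^3}{2} + \frac{t^4}{4}, \]
proving that
\[ \psi'(t) \le f'(t), \quad 0 \le t \le \sqrt{2}, \qquad \psi'(t) \ge f'(t), \quad - \sqrt{2} \le t \le 0, \]
and therefore that
\[ \psi(t) \le f(t), \quad - \sqrt{2} \le t \le \sqrt{2}. \]
Since $f$ is increasing on $[\sqrt{2}, + \infty[$ and decreasing on $]- \infty, - \sqrt{2}]$, while $\psi$ is constant on these two intervals, the above inequality can be extended to all $t \in \mathbb{R}$. From the symmetry $\psi(-t) = - \psi(t)$, we deduce the converse inequality
\[ - f(-t) \le \psi(t), \quad t \in \mathbb{R}, \]
that ends the proof.

Since $\lambda \langle \theta', X_i \rangle$ follows, under $\rho_\theta$, a normal distribution with mean $\lambda \langle \theta, X_i \rangle$ and standard deviation $\lambda \beta^{-1/2} \lVert X_i \rVert$, and since the influence function $\psi$ is piecewise polynomial, the estimator $\mathcal{E}$ can be computed explicitly in terms of the standard normal distribution function. This is done in the following lemma.

Lemma 3.2. Let $W \sim \mathcal{N}(0, 1)$ be a standard Gaussian real valued random variable. For any $m \in \mathbb{R}$ and any $\sigma \in \mathbb{R}_+$, define
\[ \varphi(m, \sigma) = \mathbb{E} \bigl[ \psi( m + \sigma W ) \bigr]. \]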

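The function $\varphi$ can be computed as
\[ \varphi(m, \sigma) = m \bigl( 1 - \sigma^2/2 \bigr) - m^3/6 + r(m, \sigma), \]
where, introducing $F(a) = \mathbb{P}(W \le a)$, $a \in \mathbb{R}$, the correction term $r$ is
\[ r(m, \sigma) = \frac{2 \sqrt{2}}{3} \Bigl[ F \Bigl( - \frac{\sqrt{2} - m}{\sigma} \Bigr) - F \Bigl( - \frac{\sqrt{2} + m}{\sigma} \Bigr) \Bigr] - \bigl( m - m^3/6 \bigr) \Bigl[ F \Bigl( - \frac{\sqrt{2} + m}{\sigma} \Bigr) + F \Bigl( - \frac{\sqrt{2} - m}{\sigma} \Bigr) \Bigr] \]
\[ + \frac{\sigma \bigl( 1 - m^2/2 \bigr)}{\sqrt{2 \pi}} \Bigl[ \exp \Bigl( - \frac{(\sqrt{2} + m)^2}{2 \sigma^2} \Bigr) - \exp \Bigl( - \frac{(\sqrt{2} - m)^2}{2 \sigma^2} \Bigr) \Bigr] \]
\[ + \frac{m \sigma^2}{2} \Bigl\{ F \Bigl( - \frac{\sqrt{2} + m}{\sigma} \Bigr) + F \Bigl( - \frac{\sqrt{2} - m}{\sigma} \Bigr) + \frac{1}{\sqrt{2 \pi}} \Bigl[ \frac{\sqrt{2} + m}{\sigma} \exp \Bigl( - \frac{(\sqrt{2} + m)^2}{2 \sigma^2} \Bigr) + \frac{\sqrt{2} - m}{\sigma} \exp \Bigl( - \frac{(\sqrt{2} - m)^2}{2 \sigma^2} \Bigr) \Bigr] \Bigr\} \]
\[ + \frac{\sigma^3}{6 \sqrt{2 \pi}} \Bigl\{ \Bigl[ \frac{(\sqrt{2} - m)^2}{\sigma^2} + 2 \Bigr] \exp \Bigl( - \frac{(\sqrt{2} - m)^2}{2 \sigma^2} \Bigr) - \Bigl[ \frac{(\sqrt{2} + m)^2}{\sigma^2} + 2 \Bigr] \exp \Bigl( - \frac{(\sqrt{2} + m)^2}{2 \sigma^2} \Bigr) \Bigr\}. \]
Remark that the correction term is small when $\lvert m \rvert$ is small and $\sigma$ is small, since
\[ 1 - F(t) \le \min \Bigl\{ \frac{1}{t \sqrt{2 \pi}}, \frac{1}{2} \Bigr\} \exp \Bigl( - \frac{t^2}{2} \Bigr), \quad t \in \mathbb{R}_+. \]

Proof. The proof of this lemma is a simple computation based on the expression
\[ \psi(t) = \Bigl( t - \frac{t^3}{6} \Bigr) \Bigl[ 1 - \mathbb{1}(t \ge \sqrt{2}) - \mathbb{1}(t \le - \sqrt{2}) \Bigr] + \frac{2 \sqrt{2}}{3} \Bigl[ \mathbb{1}(t \ge \sqrt{2}) - \mathbb{1}(t \le - \sqrt{2}) \Bigr], \quad t \in \mathbb{R}, \]
on the identities
\[ \mathbb{E} \bigl[ \mathbb{1}(W \ge a) \bigr] = F(-a), \qquad \mathbb{E} \bigl[ \mathbb{1}(W \ge a) W \bigr] = \frac{1}{\sqrt{2 \pi}} \exp \Bigl( - \frac{a^2}{2} \Bigr), \]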

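\[ \mathbb{E} \bigl[ \mathbb{1}(W \ge a) W^2 \bigr] = F(-a) + \frac{a}{\sqrt{2 \pi}} \exp \Bigl( - \frac{a^2}{2} \Bigr), \qquad \mathbb{E} \bigl[ \mathbb{1}(W \ge a) W^3 \bigr] = \frac{2 + a^2}{\sqrt{2 \pi}} \exp \Bigl( - \frac{a^2}{2} \Bigr), \]
and on the fact that $F(-t) = 1 - F(t)$.

Accordingly, the estimator $\mathcal{E}$ can be computed as
\[ \mathcal{E}(\theta) = \frac{1}{n \lambda} \sum_{i=1}^n \varphi \bigl( \lambda \langle \theta, X_i \rangle, \lambda \beta^{-1/2} \lVert X_i \rVert \bigr) \]
\[ = \frac{1}{n} \sum_{i=1}^n \Bigl[ \langle \theta, X_i \rangle \Bigl( 1 - \frac{\lambda^2 \lVert X_i \rVert^2}{2 \beta} \Bigr) - \frac{\lambda^2 \langle \theta, X_i \rangle^3}{6} \Bigr] + \frac{1}{n \lambda} \sum_{i=1}^n r \bigl( \lambda \langle \theta, X_i \rangle, \lambda \beta^{-1/2} \lVert X_i \rVert \bigr). \]

3.1 Estimation without centering

Proposition 3.3. Assume that
\[ \mathbb{E} \bigl( \lVert X \rVert^2 \bigr) = \operatorname{Tr} \bigl[ \mathbb{E} ( X X^\top ) \bigr] \le T < \infty \quad \text{and} \quad \sup_{\theta \in S} \mathbb{E} \bigl( \langle \theta, X \rangle^2 \bigr) \le v \le T < \infty, \]
where $T$ and $v$ are two known constants and where $S \subset S_d$ is an arbitrary symmetric subset of the unit sphere, meaning that if $\theta \in S$ then $- \theta \in S$. Choose any confidence parameter $\delta \in ]0,1[$ and set the constants $\lambda$ and $\beta$ used in the definition of the estimator $\mathcal{E}$ to
\[ \lambda = \sqrt{\frac{2 \log(\delta^{-1})}{n v}}, \qquad \beta = \sqrt{n T} \, \lambda = \sqrt{\frac{2 T \log(\delta^{-1})}{v}}. \]
Non asymptotic confidence region: With probability at least $1 - \delta$,
\[ \sup_{\theta \in S} \bigl\lvert \mathcal{E}(\theta) - \langle \theta, \mathbb{E}(X) \rangle \bigr\rvert \le \sqrt{\frac{T}{n}} + \sqrt{\frac{2 v \log(\delta^{-1})}{n}}. \]
Consider an estimator $\widehat{m} \in \mathbb{R}^d$ of $\mathbb{E}(X)$ satisfying
\[ \sup_{\theta \in S} \bigl\lvert \mathcal{E}(\theta) - \langle \theta, \widehat{m} \rangle \bigr\rvert \le \sqrt{\frac{T}{n}} + \sqrt{\frac{2 v \log(\delta^{-1})}{n}}. \]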

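With probability at least $1 - \delta$, such a vector exists and
\[ \sup_{\theta \in S} \bigl\lvert \langle \theta, \widehat{m} - \mathbb{E}(X) \rangle \bigr\rvert \le \sup_{\theta \in S} \bigl\lvert \mathcal{E}(\theta) - \langle \theta, \widehat{m} \rangle \bigr\rvert + \sqrt{\frac{T}{n}} + \sqrt{\frac{2 v \log(\delta^{-1})}{n}} \le 2 \Biggl( \sqrt{\frac{T}{n}} + \sqrt{\frac{2 v \log(\delta^{-1})}{n}} \Biggr). \]

Remark 3.1. In particular, in the case when $S = S_d$ is the whole unit sphere, we obtain with probability at least $1 - \delta$ the bound
\[ \lVert \widehat{m} - \mathbb{E}(X) \rVert = \sup_{\theta \in S_d} \langle \theta, \widehat{m} - \mathbb{E}(X) \rangle \le 2 \Biggl( \sqrt{\frac{T}{n}} + \sqrt{\frac{2 v \log(\delta^{-1})}{n}} \Biggr). \]
By choosing $\widehat{m}$ as the middle of a diameter of the confidence region, we could do a little better and replace the factor $2$ in this bound by a factor $\sqrt{3}$.

Proof. According to the PAC-Bayesian inequality of Proposition 2.1, with probability at least $1 - \delta$, for any $\theta \in S$,
\[ \mathcal{E}(\theta) \le \frac{1}{\lambda} \int \log \bigl[ \mathbb{E} \exp \psi \bigl( \lambda \langle \theta', X \rangle \bigr) \bigr] \, d\rho_\theta(\theta') + \frac{K(\rho_\theta, \rho_0) + \log(\delta^{-1})}{n \lambda}. \]
We can then use the polynomial approximation of $\exp(\psi(t))$ given by Lemma 3.1, remarking that $K(\rho_\theta, \rho_0) = \beta / 2$ and that $\log(1 + z) \le z$, to deduce that
\[ \mathcal{E}(\theta) \le \mathbb{E} \langle \theta, X \rangle + \frac{\lambda}{2} \int \mathbb{E} \bigl( \langle \theta', X \rangle^2 \bigr) \, d\rho_\theta(\theta') + \frac{\beta + 2 \log(\delta^{-1})}{2 n \lambda} \]
\[ = \mathbb{E} \langle \theta, X \rangle + \frac{\lambda}{2} \Bigl[ \mathbb{E} \bigl( \langle \theta, X \rangle^2 \bigr) + \frac{\mathbb{E} \bigl( \lVert X \rVert^2 \bigr)}{\beta} \Bigr] + \frac{\beta + 2 \log(\delta^{-1})}{2 n \lambda} \]
\[ \le \mathbb{E} \langle \theta, X \rangle + \frac{\lambda}{2} \bigl( v + T / \beta \bigr) + \frac{\beta + 2 \log(\delta^{-1})}{2 n \lambda} = \langle \theta, \mathbb{E}(X) \rangle + \sqrt{\frac{T}{n}} + \sqrt{\frac{2 v \log(\delta^{-1})}{n}}. \]
We conclude by considering both $\theta \in S$ and $- \theta \in S$ to get the reverse inequality, using the assumption that $S$ is symmetric and remarking that $\mathcal{E}(- \theta) = - \mathcal{E}(\theta)$. The existence with probability $1 - \delta$ of $\widehat{m}$ satisfying the required inequality is granted by the fact that, on the event defined by the above PAC-Bayesian inequality, the expectation $\mathbb{E}(X)$ belongs to the confidence region, which as a result cannot be empty.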
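The estimator of Proposition 3.3 can be sketched in Python from the closed form of Lemma 3.2. This is only an illustrative sketch: the function names `psi`, `phi` and `directional_estimator` are hypothetical (they do not come from the paper), and `v` and `T` are assumed to be known upper bounds, as in the proposition.

```python
import math
import numpy as np

SQRT2 = math.sqrt(2.0)

def psi(t):
    """Symmetric influence function (1): t - t^3/6, frozen outside [-sqrt(2), sqrt(2)]."""
    t = np.clip(t, -SQRT2, SQRT2)
    return t - t**3 / 6.0

def F(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / SQRT2)

def G(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def phi(m, s):
    """E[psi(m + s W)] for W ~ N(0, 1), using the correction term r of Lemma 3.2."""
    if s == 0.0:
        return float(psi(m))
    a, b = (SQRT2 - m) / s, (SQRT2 + m) / s
    r = ((2.0 * SQRT2 / 3.0) * (F(-a) - F(-b))
         - (m - m**3 / 6.0) * (F(-a) + F(-b))
         + s * (1.0 - m * m / 2.0) * (G(b) - G(a))
         + (m * s * s / 2.0) * (F(-a) + F(-b) + a * G(a) + b * G(b))
         + (s**3 / 6.0) * ((a * a + 2.0) * G(a) - (b * b + 2.0) * G(b)))
    return m * (1.0 - s * s / 2.0) - m**3 / 6.0 + r

def directional_estimator(theta, X, v, T, delta):
    """Estimate <theta, E(X)> from the rows of X with lambda, beta set as in Proposition 3.3."""
    n = X.shape[0]
    lam = math.sqrt(2.0 * math.log(1.0 / delta) / (n * v))
    beta = math.sqrt(n * T) * lam
    means = lam * (X @ theta)                                 # lambda <theta, X_i>
    sds = lam * np.linalg.norm(X, axis=1) / math.sqrt(beta)   # lambda beta^{-1/2} ||X_i||
    return sum(phi(float(m), float(s)) for m, s in zip(means, sds)) / (n * lam)
```

The sketch only evaluates the directional estimate for a given unit vector `theta`; building a vector estimator from these directional values requires solving the min-max problem stated in the proposition, which is not attempted here.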

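3.2 Centered estimate

The bounds in the previous section are simple, but they are stated in terms of uncentered moments of order two, where we would have expected a variance. In this section, we explain how to deduce centered bounds from the uncentered bounds of the previous section, through the use of a sample splitting scheme.

Assume that
\[ \mathbb{E} \bigl( \lVert X - \mathbb{E}(X) \rVert^2 \bigr) \le T < \infty \quad \text{and} \quad \sup_{\theta \in S_d} \mathbb{E} \bigl( \langle \theta, X - \mathbb{E}(X) \rangle^2 \bigr) \le v \le T < \infty, \]
where $v$ and $T$ are known constants. Remark that when these bounds hold, the bounds
\[ v' = v + \lVert \mathbb{E}(X) \rVert^2 \quad \text{and} \quad T' = T + \lVert \mathbb{E}(X) \rVert^2 \tag{2} \]
hold for the uncentered moments of the previous section. Assume that we know also some bound $b$ such that $\lVert \mathbb{E}(X) \rVert^2 \le b$.

Split the sample in two parts $(X_1, \dots, X_k)$ and $(X_{k+1}, \dots, X_n)$. Use the first part to construct an estimator $\widetilde{m}$ of $\mathbb{E}(X)$ as described in Proposition 3.3, choosing $S = S_d$. According to this proposition and by equation (2), with probability at least $1 - \delta$,
\[ \lVert \widetilde{m} - \mathbb{E}(X) \rVert^2 \le 4 \Biggl( \sqrt{\frac{T + b}{k}} + \sqrt{\frac{2 (v + b) \log(\delta^{-1})}{k}} \Biggr)^2 = \frac{A}{k}, \]
where we have put
\[ A = 4 \Bigl( \sqrt{T + b} + \sqrt{2 (v + b) \log(\delta^{-1})} \Bigr)^2. \]
We then construct an estimator $\mathcal{E}(\theta)$ of $\langle \theta, \mathbb{E}(X) \rangle$, $\theta \in S_d$, built as described in Proposition 3.3, based on the sample $(X_{k+1} - \widetilde{m}, \dots, X_n - \widetilde{m})$ and on the constants $T + A/k$ and $v + A/k$. With probability at least $1 - 2 \delta$,
\[ \sup_{\theta \in S_d} \bigl\lvert \mathcal{E}(\theta) - \langle \theta, \mathbb{E}(X) \rangle \bigr\rvert \le B_{n,k} = \sqrt{\frac{T + A/k}{n - k}} + \sqrt{\frac{2 (v + A/k) \log(\delta^{-1})}{n - k}}, \]
and we can if needed deduce from $\mathcal{E}(\theta)$ an estimator $\widehat{m}$ such that with probability at least $1 - 2 \delta$,
\[ \lVert \widehat{m} - \mathbb{E}(X) \rVert \le 2 B_{n,k}. \]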
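To get a feeling for the size of the correction term $A/k$, the constants of this sample splitting scheme can be tabulated numerically. The sketch below is illustrative only: the function names and the numerical values of `T`, `v`, `b` and `delta` are arbitrary assumptions, not values from the paper. It compares $B_{n,k}$, for a first part of size $k$ close to $\sqrt{n}$, with the uncorrected bound $\sqrt{T/n} + \sqrt{2 v \log(\delta^{-1})/n}$.

```python
import math

def splitting_constants(n, k, T, v, b, delta):
    """Return (A, B_nk) for the sample splitting scheme of Section 3.2."""
    log_inv = math.log(1.0 / delta)
    A = 4.0 * (math.sqrt(T + b) + math.sqrt(2.0 * (v + b) * log_inv)) ** 2
    B = (math.sqrt((T + A / k) / (n - k))
         + math.sqrt(2.0 * (v + A / k) * log_inv / (n - k)))
    return A, B

def uncorrected_bound(n, T, v, delta):
    """sqrt(T/n) + sqrt(2 v log(1/delta) / n), the bound without the A/k correction."""
    return math.sqrt(T / n) + math.sqrt(2.0 * v * math.log(1.0 / delta) / n)

def ratio(n, T=10.0, v=1.0, b=1.0, delta=0.01):
    """B_nk divided by the uncorrected bound, with k the integer part of sqrt(n)."""
    k = math.isqrt(n)
    _, B = splitting_constants(n, k, T, v, b, delta)
    return B / uncorrected_bound(n, T, v, delta)

if __name__ == "__main__":
    for n in (10**4, 10**6, 10**8, 10**10):
        print(n, ratio(n))
```

The ratio is always larger than one, since both the inflated constants and the reduced sample size $n - k$ work against the second stage.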

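If we want the correction term $A/k$ to behave as a second order term when $n$ tends to $\infty$, we can for example take $k = \sqrt{n}$, in which case $n - k$ is equivalent to $n$ at infinity, so that $B_{n,k}$ is equivalent to
\[ \sqrt{\frac{T}{n}} + \sqrt{\frac{2 v \log(\delta^{-1})}{n}}. \]
Let us also mention that a simpler estimator, obtained by shrinking the norm of $X_i$, is also possible. It comes with a sub-Gaussian deviation bound under the slightly stronger hypothesis that $\mathbb{E} \bigl( \lVert X \rVert^p \bigr) < \infty$ for some non necessarily integer exponent $p > 2$, and is described in Catoni and Giulini.

4 Mean matrix estimate

Let $M \in \mathbb{R}^{p \times q}$ be a random matrix and let $(M_1, \dots, M_n)$ be $n$ independent copies of $M$. In this section, we will provide an estimator for $\mathbb{E}(M)$. From the previous section, we already have an estimator $\widehat{m}$ of $\mathbb{E}(M)$ with a bounded Hilbert-Schmidt norm $\lVert \widehat{m} - \mathbb{E}(M) \rVert_{HS}$, since from the point of view of the Hilbert-Schmidt norm, $M$ is nothing but a random vector of size $p q$. Here, we will be interested in another natural norm, the operator norm. Indeed, recalling that
\[ \lVert M \rVert_\infty = \lVert M^\top \rVert_\infty = \sup_{\theta \in S_q} \lVert M \theta \rVert = \sup_{\theta \in S_q, \, \xi \in S_p} \langle \xi, M \theta \rangle = \sup_{\xi \in S_p} \lVert M^\top \xi \rVert = \sup_{\theta \in S_q, \, \xi \in S_p} \operatorname{Tr} \bigl( \theta \xi^\top M \bigr), \]
we see that we can deduce results from the previous section on vectors, considering the scalar product between matrices
\[ \langle M, N \rangle_{HS} = \operatorname{Tr} \bigl( M^\top N \bigr), \quad M, N \in \mathbb{R}^{p \times q}, \]
and the part of the unit sphere defined as
\[ S = \bigl\{ \xi \theta^\top : \xi \in S_p, \ \theta \in S_q \bigr\}. \]
Doing so, we obtain in the uncentered case a bound of the form
\[ \lVert \widehat{m} - \mathbb{E}(M) \rVert_\infty \le 2 \Biggl( \sqrt{\frac{\mathbb{E} \bigl( \lVert M \rVert_{HS}^2 \bigr)}{n}} + \sqrt{\frac{2 \sup_{\xi \in S_p, \, \theta \in S_q} \mathbb{E} \bigl( \langle \xi, M \theta \rangle^2 \bigr) \log(\delta^{-1})}{n}} \Biggr). \]
We will show in the next section that the second $\delta$-dependent term is satisfactory, whereas the first $\delta$-independent term can be improved.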
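The two identities used above, namely the variational formula for the operator norm and the trace form of the scalar product, can be checked numerically. The following NumPy sketch is illustrative only (the matrix sizes and helper names are arbitrary assumptions): the supremum of $\langle \xi, M \theta \rangle$ over unit vectors is attained at the leading singular vectors, and $\langle M, N \rangle_{HS}$ is the sum of entrywise products.

```python
import numpy as np

def hs_inner(M, N):
    """Hilbert-Schmidt scalar product Tr(M^T N)."""
    return float(np.trace(M.T @ N))

def operator_norm_via_svd(M):
    """Return (xi, theta, value) with value = <xi, M theta> at the leading singular vectors."""
    U, s, Vt = np.linalg.svd(M)
    xi, theta = U[:, 0], Vt[0]
    return xi, theta, float(xi @ M @ theta)

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 6))
N = rng.normal(size=(4, 6))

xi, theta, value = operator_norm_via_svd(M)
op_norm = float(np.linalg.norm(M, 2))  # largest singular value

# <xi, M theta> coincides with Tr(theta xi^T M), the HS product of xi theta^T with M.
rank_one = np.outer(xi, theta)
print(op_norm, value, hs_inner(rank_one, M))
```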

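4.1 Estimation without centering

Consider the influence function $\psi$ defined by equation (1). For any $\xi \in \mathbb{R}^p$, let $\nu_\xi = \mathcal{N}(\xi, \beta^{-1} I_p)$, where $I_p$ is the identity matrix of size $p \times p$. In the same way, let $\rho_\theta = \mathcal{N}(\theta, \gamma^{-1} I_q)$, $\theta \in \mathbb{R}^q$. Consider the estimator of $\langle \xi, \mathbb{E}(M) \theta \rangle$ defined as
\[ \mathcal{E}(\xi, \theta) = \frac{1}{n \lambda} \sum_{i=1}^n \int \psi \bigl( \lambda \langle \xi', M_i \theta' \rangle \bigr) \, d\nu_\xi(\xi') \, d\rho_\theta(\theta'), \quad \xi \in \mathbb{R}^p, \ \theta \in \mathbb{R}^q. \]

Proposition 4.1. For any parameters $\delta \in ]0,1[$, $\lambda, \beta, \gamma \in ]0, \infty[$, with probability at least $1 - \delta$, for any $\xi \in \mathbb{R}^p$ and any $\theta \in \mathbb{R}^q$,
\[ \mathcal{E}(\xi, \theta) - \mathbb{E} \langle \xi, M \theta \rangle \le \frac{\lambda}{2} \Bigl[ \mathbb{E} \bigl( \langle \xi, M \theta \rangle^2 \bigr) + \frac{\mathbb{E} \bigl( \lVert M \theta \rVert^2 \bigr)}{\beta} + \frac{\mathbb{E} \bigl( \lVert M^\top \xi \rVert^2 \bigr)}{\gamma} + \frac{\mathbb{E} \bigl( \lVert M \rVert_{HS}^2 \bigr)}{\beta \gamma} \Bigr] + \frac{\beta + \gamma + 2 \log(\delta^{-1})}{2 n \lambda}. \]

Proof. The PAC-Bayesian inequality of Proposition 2.1 tells us that with probability at least $1 - \delta$, for any $\xi \in \mathbb{R}^p$ and any $\theta \in \mathbb{R}^q$,
\[ \mathcal{E}(\xi, \theta) \le \lambda^{-1} \int \log \bigl\{ \mathbb{E} \bigl[ \exp \psi \bigl( \lambda \langle \xi', M \theta' \rangle \bigr) \bigr] \bigr\} \, d\nu_\xi(\xi') \, d\rho_\theta(\theta') + \frac{K(\nu_\xi, \nu_0)}{n \lambda} + \frac{K(\rho_\theta, \rho_0)}{n \lambda} + \frac{\log(\delta^{-1})}{n \lambda}. \]
Using the properties of $\psi$ (Lemma 3.1) and Fubini's lemma, we get
\[ \mathcal{E}(\xi, \theta) \le \langle \xi, \mathbb{E}(M) \theta \rangle + \frac{\lambda}{2} \int \mathbb{E} \bigl( \langle \xi', M \theta' \rangle^2 \bigr) \, d\nu_\xi(\xi') \, d\rho_\theta(\theta') + \frac{\beta + \gamma + 2 \log(\delta^{-1})}{2 n \lambda}. \]
As
\[ \int \langle \xi', M \theta' \rangle^2 \, d\nu_\xi(\xi') \, d\rho_\theta(\theta') = \langle \xi, M \theta \rangle^2 + \frac{\lVert M \theta \rVert^2}{\beta} + \frac{\lVert M^\top \xi \rVert^2}{\gamma} + \frac{\lVert M \rVert_{HS}^2}{\beta \gamma}, \]
this concludes the proof.

Let us now discuss the question of computing $\mathcal{E}(\xi, \theta)$. Remark that according to Lemma 3.2, for any $x \in \mathbb{R}^p$,
\[ \int \psi \bigl( \langle \xi', x \rangle \bigr) \, d\nu_\xi(\xi') = \varphi \bigl( \langle \xi, x \rangle, \beta^{-1/2} \lVert x \rVert \bigr) \]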

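\[ = \langle \xi, x \rangle - \frac{\langle \xi, x \rangle \lVert x \rVert^2}{2 \beta} - \frac{\langle \xi, x \rangle^3}{6} + r \bigl( \langle \xi, x \rangle, \beta^{-1/2} \lVert x \rVert \bigr). \]
It is also easy to check that
\[ \int \lVert M_i \theta' \rVert^2 \, d\rho_\theta(\theta') = \lVert M_i \theta \rVert^2 + \frac{\lVert M_i \rVert_{HS}^2}{\gamma}, \]
\[ \int \langle \xi, M_i \theta' \rangle \lVert M_i \theta' \rVert^2 \, d\rho_\theta(\theta') = \langle \xi, M_i \theta \rangle \lVert M_i \theta \rVert^2 + \frac{1}{\gamma} \langle \xi, M_i \theta \rangle \lVert M_i \rVert_{HS}^2 + \frac{2}{\gamma} \langle \xi, M_i M_i^\top M_i \theta \rangle, \]
and
\[ \int \langle \xi, M_i \theta' \rangle^3 \, d\rho_\theta(\theta') = \langle \xi, M_i \theta \rangle^3 + \frac{3}{\gamma} \langle \xi, M_i \theta \rangle \lVert M_i^\top \xi \rVert^2. \]
Consider a standard random vector $W_q \sim \mathcal{N}(0, I_q)$. We obtain that
\[ \int \psi \bigl( \lambda \langle \xi', M_i \theta' \rangle \bigr) \, d\nu_\xi(\xi') \, d\rho_\theta(\theta') = \int \Bigl[ \lambda \langle \xi, M_i \theta' \rangle - \frac{\lambda^3}{2 \beta} \langle \xi, M_i \theta' \rangle \lVert M_i \theta' \rVert^2 - \frac{\lambda^3}{6} \langle \xi, M_i \theta' \rangle^3 + r \bigl( \lambda \langle \xi, M_i \theta' \rangle, \lambda \beta^{-1/2} \lVert M_i \theta' \rVert \bigr) \Bigr] d\rho_\theta(\theta'), \]
so that
\[ \mathcal{E}(\xi, \theta) = \frac{1}{n} \sum_{i=1}^n \Bigl[ \langle \xi, M_i \theta \rangle - \frac{\lambda^2}{6} \langle \xi, M_i \theta \rangle^3 - \frac{\lambda^2}{2 \beta} \langle \xi, M_i \theta \rangle \lVert M_i \theta \rVert^2 - \frac{\lambda^2}{2 \gamma} \langle \xi, M_i \theta \rangle \lVert M_i^\top \xi \rVert^2 - \frac{\lambda^2}{2 \beta \gamma} \langle \xi, M_i \theta \rangle \lVert M_i \rVert_{HS}^2 - \frac{\lambda^2}{\beta \gamma} \langle \xi, M_i M_i^\top M_i \theta \rangle \Bigr] \]
\[ + \frac{1}{n \lambda} \sum_{i=1}^n \mathbb{E} \Bigl[ r \Bigl( \lambda \bigl\langle M_i^\top \xi, \theta + \gamma^{-1/2} W_q \bigr\rangle, \ \lambda \beta^{-1/2} \bigl\lVert M_i \bigl( \theta + \gamma^{-1/2} W_q \bigr) \bigr\rVert \Bigr) \Bigr]. \]
The last term is not explicit, since it contains an expectation, but it should be most of the time a small remainder, and it can be evaluated using a Monte-Carlo numerical scheme. This gives a more explicit and efficient method than evaluating directly $\mathcal{E}(\xi, \theta)$ using a Monte-Carlo simulation for the couple of random variables $(\xi', \theta') \sim \nu_\xi \otimes \rho_\theta$.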

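Proposition 4.2. Assume that the following finite bounds are known:
\[ v \ge \sup_{\xi \in S_p, \, \theta \in S_q} \mathbb{E} \bigl( \langle \xi, M \theta \rangle^2 \bigr) = \sup_{\xi \in S_p, \, \theta \in S_q} \bigl\langle \xi \otimes \xi, \mathbb{E} ( M \otimes M ) \, \theta \otimes \theta \bigr\rangle, \]
\[ t \ge \sup_{\theta \in S_q} \mathbb{E} \bigl( \lVert M \theta \rVert^2 \bigr) = \sup_{\theta \in S_q} \bigl\langle \theta, \mathbb{E} ( M^\top M ) \theta \bigr\rangle = \bigl\lVert \mathbb{E} ( M^\top M ) \bigr\rVert_\infty, \]
\[ u \ge \sup_{\xi \in S_p} \mathbb{E} \bigl( \lVert M^\top \xi \rVert^2 \bigr) = \sup_{\xi \in S_p} \bigl\langle \xi, \mathbb{E} ( M M^\top ) \xi \bigr\rangle = \bigl\lVert \mathbb{E} ( M M^\top ) \bigr\rVert_\infty, \]
\[ T \ge \mathbb{E} \bigl( \lVert M \rVert_{HS}^2 \bigr), \]
and choose
\[ \lambda = \sqrt{\frac{\beta + \gamma + 2 \log(\delta^{-1})}{n \bigl( v + t / \beta + u / \gamma + T / (\beta \gamma) \bigr)}}. \]
For any values of $\delta \in ]0,1[$, $\beta, \gamma \in ]0, \infty[$, with probability at least $1 - \delta$, for any $\xi \in S_p$, any $\theta \in S_q$,
\[ \bigl\lvert \mathcal{E}(\xi, \theta) - \langle \xi, \mathbb{E}(M) \theta \rangle \bigr\rvert \le B = \sqrt{\Bigl( v + \frac{t}{\beta} + \frac{u}{\gamma} + \frac{T}{\beta \gamma} \Bigr) \frac{\beta + \gamma + 2 \log(\delta^{-1})}{n}}. \]
Consider now any estimator $\widehat{m}$ of $\mathbb{E}(M)$. With probability at least $1 - \delta$,
\[ \lVert \widehat{m} - \mathbb{E}(M) \rVert_\infty \le \sup_{\xi \in S_p, \, \theta \in S_q} \bigl\lvert \mathcal{E}(\xi, \theta) - \langle \xi, \widehat{m} \theta \rangle \bigr\rvert + B. \]
In particular, if we choose $\widehat{m}$ such that
\[ \sup_{\xi \in S_p, \, \theta \in S_q} \bigl\lvert \mathcal{E}(\xi, \theta) - \langle \xi, \widehat{m} \theta \rangle \bigr\rvert \le B, \]
with probability at least $1 - \delta$ this choice is possible and
\[ \lVert \widehat{m} - \mathbb{E}(M) \rVert_\infty \le 2 B. \]
In particular, choosing
\[ \beta = \gamma = 2 \max \Bigl\{ \frac{t + u}{v}, \sqrt{\frac{T}{v}} \Bigr\}, \]
we get
\[ B \le \sqrt{\frac{2 v}{n} \Bigl( 2 \log(\delta^{-1}) + 4 \max \Bigl\{ \frac{t + u}{v}, \sqrt{\frac{T}{v}} \Bigr\} \Bigr)}. \]

Remark 4.1. The bound $B$ is of the type
\[ \sqrt{\frac{2 v}{n} \bigl[ C + \log(\delta^{-1}) \bigr]}, \]
with a complexity (or dimension) term $C$ equal to
\[ C = 4 \max \Bigl\{ \frac{t + u}{v}, \sqrt{\frac{T}{v}} \Bigr\} + \log(\delta^{-1}). \]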

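Remark 4.2. Let us envision a simple case to compare the precision of the bounds in a setting where dimension-free and dimension-dependent bounds coincide. Assume more specifically that the entries of the matrix $M$, $(M_{i,j}, 1 \le i \le p, 1 \le j \le q)$, are centered and i.i.d. Assume that $\sigma^2 = \mathbb{E} ( M_{i,j}^2 )$ is known and take
\[ v = \sup_{\xi \in S_p, \, \theta \in S_q} \mathbb{E} \bigl( \langle \xi, M \theta \rangle^2 \bigr) = \sigma^2, \qquad t = \sup_{\theta \in S_q} \mathbb{E} \bigl( \lVert M \theta \rVert^2 \bigr) = p \sigma^2, \]
\[ u = \sup_{\xi \in S_p} \mathbb{E} \bigl( \lVert M^\top \xi \rVert^2 \bigr) = q \sigma^2, \qquad T = \mathbb{E} \bigl( \lVert M \rVert_{HS}^2 \bigr) = p q \sigma^2. \]
Choosing $\beta = \gamma = 2 (p + q)$, we get a complexity term equal to
\[ C = 4 (p + q) + \log(\delta^{-1}), \]
whereas the bound of the previous section, made for vectors, has a complexity factor equal to $p q$.

4.2 Controlling both the operator norm error and the Hilbert-Schmidt error

There are situations where it is desirable to control both $\lVert \widehat{m} - \mathbb{E}(M) \rVert_\infty$ and $\lVert \widehat{m} - \mathbb{E}(M) \rVert_{HS}$. To do so, we can very easily combine Proposition 3.3 and Proposition 4.2, since these two propositions are based on the construction of confidence regions.

More precisely, first consider $M \in \mathbb{R}^{p \times q}$ as a vector and use the scalar product $\langle \theta, M \rangle_{HS} = \operatorname{Tr} ( \theta^\top M )$, $\theta \in \mathbb{R}^{p \times q}$. Applying Proposition 3.3, we can build an estimator $\mathcal{E}_{HS}(\theta)$ such that with probability at least $1 - \delta$,
\[ \sup_{\theta \in \mathbb{R}^{p \times q}, \, \lVert \theta \rVert_{HS} = 1} \bigl\lvert \mathcal{E}_{HS}(\theta) - \operatorname{Tr} \bigl( \theta^\top \mathbb{E}(M) \bigr) \bigr\rvert \le A = \sqrt{\frac{T}{n}} + \sqrt{\frac{2 v \log(\delta^{-1})}{n}}. \]
On the other hand, we can also apply Proposition 4.2 and build an estimator $\mathcal{E}(\xi, \theta)$, $\xi \in S_p$, $\theta \in S_q$, such that with probability at least $1 - \delta$,
\[ \sup_{\xi \in S_p, \, \theta \in S_q} \bigl\lvert \mathcal{E}(\xi, \theta) - \langle \xi, \mathbb{E}(M) \theta \rangle \bigr\rvert \le B = \sqrt{\frac{2 v}{n} \Bigl( 2 \log(\delta^{-1}) + 4 \max \Bigl\{ \frac{t + u}{v}, \sqrt{\frac{T}{v}} \Bigr\} \Bigr)}. \]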
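The bound $B$ of Proposition 4.2 and the complexity term of Remark 4.1 are simple closed-form expressions, so they can be evaluated directly. The sketch below is illustrative only (the function names are hypothetical and the numerical values are arbitrary). It reproduces the i.i.d.-entries example of Remark 4.2, where the complexity term grows like $p + q$ instead of $p q$.

```python
import math

def bound_B(v, t, u, T, n, delta, beta, gamma):
    """General bound B of Proposition 4.2 for given beta and gamma."""
    variance = v + t / beta + u / gamma + T / (beta * gamma)
    return math.sqrt(variance * (beta + gamma + 2.0 * math.log(1.0 / delta)) / n)

def default_beta(v, t, u, T):
    """The choice beta = gamma = 2 max{(t + u)/v, sqrt(T/v)}."""
    return 2.0 * max((t + u) / v, math.sqrt(T / v))

def simplified_B(v, t, u, T, n, delta):
    """Upper bound on B obtained with the default choice of beta = gamma."""
    c = max((t + u) / v, math.sqrt(T / v))
    return math.sqrt(2.0 * v / n * (2.0 * math.log(1.0 / delta) + 4.0 * c))

def complexity(v, t, u, T, delta):
    """Complexity term C of Remark 4.1."""
    return 4.0 * max((t + u) / v, math.sqrt(T / v)) + math.log(1.0 / delta)

if __name__ == "__main__":
    # Remark 4.2: centered i.i.d. entries with variance sigma2 (values are arbitrary).
    p, q, sigma2, n, delta = 50, 80, 1.0, 10_000, 0.01
    v, t, u, T = sigma2, p * sigma2, q * sigma2, p * q * sigma2
    bg = default_beta(v, t, u, T)
    print(bound_B(v, t, u, T, n, delta, bg, bg), simplified_B(v, t, u, T, n, delta))
    print(complexity(v, t, u, T, delta), p * q)
```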

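Proposition 4.3. Consider a matrix $\widehat{m}$ such that
\[ \sup_{\theta \in \mathbb{R}^{p \times q}, \, \lVert \theta \rVert_{HS} = 1} \bigl\lvert \mathcal{E}_{HS}(\theta) - \operatorname{Tr} \bigl( \theta^\top \widehat{m} \bigr) \bigr\rvert \le A \quad \text{and} \quad \sup_{\xi \in S_p, \, \theta \in S_q} \bigl\lvert \mathcal{E}(\xi, \theta) - \langle \xi, \widehat{m} \theta \rangle \bigr\rvert \le B. \]
Combining Propositions 3.3 and 4.2 shows that with probability at least $1 - 2 \delta$ such a matrix $\widehat{m}$ exists and satisfies both
\[ \lVert \widehat{m} - \mathbb{E}(M) \rVert_{HS} \le 2 A \quad \text{and} \quad \lVert \widehat{m} - \mathbb{E}(M) \rVert_\infty \le 2 B. \]
Remark that $B$ is typically smaller than $A$, as expected, in interesting large dimension situations.

4.3 Centered estimator

As already done in the case of the estimation of the mean of a random vector, we deduce in this section centered bounds from the uncentered bounds of the previous sections, using sample splitting.

Put $m = \mathbb{E}(M)$ and $\overline{M} = M - m$. Assume that we know finite constants $v$, $t$, $u$, $T$ such that
\[ \sup_{\xi \in S_p, \, \theta \in S_q} \mathbb{E} \bigl( \langle \xi, \overline{M} \theta \rangle^2 \bigr) \le v < \infty, \qquad \sup_{\theta \in S_q} \mathbb{E} \bigl( \lVert \overline{M} \theta \rVert^2 \bigr) \le t < \infty, \]
\[ \sup_{\xi \in S_p} \mathbb{E} \bigl( \lVert \overline{M}^\top \xi \rVert^2 \bigr) \le u < \infty, \qquad \mathbb{E} \bigl( \lVert \overline{M} \rVert_{HS}^2 \bigr) \le T < \infty. \]
When this is true, we can take for the previous uncentered constants
\[ v' = v + \lVert m \rVert_\infty^2, \quad t' = t + \lVert m \rVert_\infty^2, \quad u' = u + \lVert m \rVert_\infty^2, \quad T' = T + \lVert m \rVert_{HS}^2. \]
In view of this, it is suitable to assume that we also know some finite constants $b$ and $c$ such that
\[ \lVert m \rVert_\infty^2 \le b \quad \text{and} \quad \lVert m \rVert_{HS}^2 \le c. \]
As we see that the Hilbert-Schmidt norm $\lVert m \rVert_{HS}$ comes into play, we will use the combined preliminary estimate provided by Proposition 4.3. Given an i.i.d. matrix sample $(M_1, \dots, M_n)$, first use $M_1, \dots, M_k$ to build a preliminary estimator $\widetilde{m}$ as described in Proposition 4.3. With probability at least $1 - \delta/2$,
\[ \lVert \widetilde{m} - m \rVert_{HS} \le \sqrt{\frac{A}{k}} \quad \text{and} \quad \lVert \widetilde{m} - m \rVert_\infty \le \sqrt{\frac{B}{k}}, \]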

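where
\[ A = 4 \Bigl( \sqrt{2 (v + b) \log(4/\delta)} + \sqrt{T + c} \Bigr)^2 \]
and
\[ B = 8 (v + b) \Bigl( 2 \log(4/\delta) + 4 \max \Bigl\{ \frac{t + u + 2 b}{v + b}, \Bigl( \frac{T + c}{v + b} \Bigr)^{1/2} \Bigr\} \Bigr). \]
Then use the sample $(M_{k+1} - \widetilde{m}, \dots, M_n - \widetilde{m})$ to build an estimator $\mathcal{E}(\xi, \theta)$, $\xi \in S_p$, $\theta \in S_q$, based on the construction described in Proposition 4.2, at confidence level $1 - \delta/2$. It is such that with probability at least $1 - \delta$,
\[ \bigl\lvert \mathcal{E}(\xi, \theta) - \langle \xi, m \theta \rangle \bigr\rvert \le C_{n,k} = \sqrt{\frac{2 (v + B/k)}{n - k} \Bigl( 2 \log(2/\delta) + 4 \max \Bigl\{ \frac{t + u + 2 B/k}{v + B/k}, \Bigl( \frac{T + A/k}{v + B/k} \Bigr)^{1/2} \Bigr\} \Bigr)}. \]
If we choose for instance $k = \sqrt{n}$, we obtain that
\[ C_{n,k} \underset{n \to \infty}{\sim} \sqrt{\frac{2 v}{n} \Bigl( 2 \log(2/\delta) + 4 \max \Bigl\{ \frac{t + u}{v}, \Bigl( \frac{T}{v} \Bigr)^{1/2} \Bigr\} \Bigr)}. \]

5 Adaptive estimators

The results presented in the previous sections assume that there exist known upper bounds for some quantities, as $\mathbb{E} \bigl( \lVert X \rVert^2 \bigr)$ in the case of a mean vector estimate or $\mathbb{E} \bigl( \lVert M \rVert_{HS}^2 \bigr)$ in the matrix case. Here, we would like to adapt to these quantities in the case when those bounds are not known. To do so, we will use an asymmetric influence function $\psi : \mathbb{R}_+ \to \mathbb{R}_+$, defined on the positive real line only as
\[ \psi(t) = \begin{cases} t - t^2/2, & 0 \le t \le 1, \\ 1/2, & 1 \le t. \end{cases} \tag{3} \]

Lemma 5.1. For any $t \in \mathbb{R}_+$,
\[ - \log \bigl( 1 - t + t^2 \bigr) \le \psi(t) \le \log(1 + t). \]

Proof. Let us put $f(t) = - \log(1 - t + t^2)$ and $g(t) = \log(1 + t)$. Remark that $f(0) = g(0) = \psi(0) = 0$. Remark also that for any $t \in [0,1]$,
\[ f'(t) = \frac{1 - 2 t}{1 - t + t^2}, \qquad \psi'(t) - f'(t) = \frac{t^2 (2 - t)}{1 - t + t^2} \ge 0, \]
and
\[ g'(t) = \frac{1}{1 + t} \ge \psi'(t) = 1 - t. \]
As on the interval $[1, \infty[$, $f$ is decreasing, $g$ is increasing and $\psi$ is constant, this proves the lemma.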
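Lemma 5.1 can be checked numerically on a grid. The short sketch below is illustrative only: the names `psi_plus` and `lemma_5_1_holds` are hypothetical, and a grid check is of course not a proof. It evaluates the asymmetric influence function (3) together with its two logarithmic envelopes.

```python
import numpy as np

def psi_plus(t):
    """Asymmetric influence function (3) on the positive half line: t - t^2/2, capped at 1/2."""
    t = np.minimum(np.asarray(t, dtype=float), 1.0)
    return t - t**2 / 2.0

def lemma_5_1_holds(t_max=50.0, num=200001, tol=1e-12):
    """Check -log(1 - t + t^2) <= psi(t) <= log(1 + t) on a grid of [0, t_max]."""
    t = np.linspace(0.0, t_max, num)
    lower = -np.log(1.0 - t + t**2)   # 1 - t + t^2 >= 3/4, so the logarithm is defined
    upper = np.log1p(t)
    p = psi_plus(t)
    return bool(np.all(lower <= p + tol) and np.all(p <= upper + tol))

if __name__ == "__main__":
    print(lemma_5_1_holds())
```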

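Similarly to the previous case, considering a standard Gaussian real valued random variable $W \sim \mathcal{N}(0,1)$, we can introduce the function
\[ \varphi(m, \sigma) = \mathbb{E} \bigl\{ \psi \bigl[ ( m + \sigma W )_+ \bigr] \bigr\}, \]
where $t_+ = \max \{ t, 0 \}$, and explicitly compute $\varphi$ as
\[ \varphi(m, \sigma) = \Bigl[ m \Bigl( 1 - \frac{m}{2} \Bigr) - \frac{\sigma^2}{2} \Bigr] F \Bigl( \frac{m}{\sigma} \Bigr) + \frac{\sigma ( 1 - m/2 )}{\sqrt{2 \pi}} \exp \Bigl( - \frac{m^2}{2 \sigma^2} \Bigr) + \frac{(1 - m)^2 + \sigma^2}{2} F \Bigl( - \frac{1 - m}{\sigma} \Bigr) - \frac{\sigma (1 - m)}{2 \sqrt{2 \pi}} \exp \Bigl( - \frac{(1 - m)^2}{2 \sigma^2} \Bigr), \]
using the expression
\[ \psi(t_+) = \bigl( t - t^2/2 \bigr) \bigl[ \mathbb{1}(t \le 1) - \mathbb{1}(t \le 0) \bigr] + \frac{1}{2} \bigl[ 1 - \mathbb{1}(t \le 1) \bigr], \quad t \in \mathbb{R}. \]

5.1 Estimation of the mean of a random vector

Consider a discrete set $\Lambda$ of values of $\lambda$ and a probability measure $\mu$ on $\Lambda$, to be chosen more precisely later on. Let $\beta$ be some positive parameter that we will also choose later, and put as previously $\rho_\theta = \mathcal{N}(\theta, \beta^{-1} I_d)$. Define for any $\theta \in S_d$
\[ \mathcal{E}_+(\theta) = \sup_{\lambda \in \Lambda} \Bigl\{ \frac{1}{n \lambda} \sum_{i=1}^n \int \psi \bigl( \lambda \langle \theta', X_i \rangle_+ \bigr) \, d\rho_\theta(\theta') - \frac{\beta + 2 \log \bigl( \delta^{-1} \mu(\lambda)^{-1} \bigr)}{2 n \lambda} \Bigr\}, \]
\[ \mathcal{E}_-(\theta) = \sup_{\lambda \in \Lambda} \Bigl\{ \frac{1}{n \lambda} \sum_{i=1}^n \int \psi \bigl( \lambda \langle \theta', X_i \rangle_- \bigr) \, d\rho_\theta(\theta') - \frac{\beta + 2 \log \bigl( \delta^{-1} \mu(\lambda)^{-1} \bigr)}{2 n \lambda} \Bigr\}, \]
and
\[ \mathcal{E}(\theta) = \mathcal{E}_+(\theta) - \mathcal{E}_-(\theta). \]
Thoughtful readers may wonder why we introduce $\lambda$ in this way and do not use instead $\rho_{\lambda \theta}$ to get a uniform result in $\lambda \theta$ in one shot, without introducing the discrete set $\Lambda$. It is because this option would produce the entropy factor $\lambda \beta / (2 n)$ instead of $\beta / (2 n \lambda)$, requiring a value of $\beta$ depending on unknown moments of the distribution of $X$.

According to the PAC-Bayesian inequality of Proposition 2.1, with probability at least $1 - 2 \delta$,
\[ \int \mathbb{E} \bigl( \langle \theta', X \rangle_+ \bigr) \, d\rho_\theta(\theta') - \inf_{\lambda \in \Lambda} \Bigl\{ \lambda \int \mathbb{E} \bigl( \langle \theta', X \rangle_+^2 \bigr) \, d\rho_\theta(\theta') + \frac{\beta + 2 \log \bigl( \delta^{-1} \mu(\lambda)^{-1} \bigr)}{n \lambda} \Bigr\} \]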

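\[ \le \mathcal{E}_+(\theta) \le \int \mathbb{E} \bigl( \langle \theta', X \rangle_+ \bigr) \, d\rho_\theta(\theta'). \]
More precisely, to obtain the above inequalities, we have used a union bound with respect to $\lambda \in \Lambda$, starting from the fact that when we replace the infimum in $\lambda$ in the previous equation with a fixed value of $\lambda \in \Lambda$, it holds with probability at least $1 - 2 \mu(\lambda) \delta$. Since
\[ \int f(- \theta') \, d\rho_\theta(\theta') = \int f(\theta') \, d\rho_{- \theta}(\theta'), \]
this implies also that
\[ \int \mathbb{E} \bigl( \langle \theta', X \rangle_- \bigr) \, d\rho_\theta(\theta') - \inf_{\lambda \in \Lambda} \Bigl\{ \lambda \int \mathbb{E} \bigl( \langle \theta', X \rangle_-^2 \bigr) \, d\rho_\theta(\theta') + \frac{\beta + 2 \log \bigl( \delta^{-1} \mu(\lambda)^{-1} \bigr)}{n \lambda} \Bigr\} \le \mathcal{E}_-(\theta) \le \int \mathbb{E} \bigl( \langle \theta', X \rangle_- \bigr) \, d\rho_\theta(\theta'). \]
Therefore, with probability at least $1 - 2 \delta$,
\[ - B_-(\theta) = - \inf_{\lambda \in \Lambda} \Bigl\{ \lambda \int \mathbb{E} \bigl( \langle \theta', X \rangle_+^2 \bigr) \, d\rho_\theta(\theta') + \frac{\beta + 2 \log \bigl( \delta^{-1} \mu(\lambda)^{-1} \bigr)}{n \lambda} \Bigr\} \]
\[ \le \mathcal{E}(\theta) - \langle \theta, \mathbb{E}(X) \rangle \le B_+(\theta) = \inf_{\lambda \in \Lambda} \Bigl\{ \lambda \int \mathbb{E} \bigl( \langle \theta', X \rangle_-^2 \bigr) \, d\rho_\theta(\theta') + \frac{\beta + 2 \log \bigl( \delta^{-1} \mu(\lambda)^{-1} \bigr)}{n \lambda} \Bigr\}. \]
This defines for $\langle \theta, \mathbb{E}(X) \rangle$ a confidence interval of length no greater than
\[ B(\theta) = \inf_{\lambda \in \Lambda} \Bigl\{ \lambda \int \mathbb{E} \bigl( \langle \theta', X \rangle^2 \bigr) \, d\rho_\theta(\theta') + \frac{2 \beta + 4 \log \bigl( \delta^{-1} \mu(\lambda)^{-1} \bigr)}{n \lambda} \Bigr\}. \]
Unfortunately, neither $B_+(\theta)$, $B_-(\theta)$, nor $B(\theta)$ are observable. But nevertheless, we can build an estimator $\widehat{m}$ such that
\[ \sup_{\theta \in S_d} \bigl\{ \langle \theta, \widehat{m} \rangle - \mathcal{E}(\theta) \bigr\} = \inf_{m \in \mathbb{R}^d} \sup_{\theta \in S_d} \bigl\{ \langle \theta, m \rangle - \mathcal{E}(\theta) \bigr\}. \]
It satisfies with probability at least $1 - 2 \delta$
\[ \lVert \widehat{m} - \mathbb{E}(X) \rVert = \sup_{\theta \in S_d} \langle \theta, \widehat{m} - \mathbb{E}(X) \rangle \]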

18 { } { } { } sup θ m Eθ + sup Eθ θex 2 sup Eθ θex θ S d θ S d θ S d 2 sup B + θ sup θ θ S d = sup θ S d if 2λ λ Λ if 2λ λ Λ E θ X 2 dρ θ θ + 2β + 4 log δ 1 µλ 1 λ E θ X 2 + E X 2 + 2β + 4 log δ 1 µλ 1 β λ Lemma 52 Let us choose β = 2 log δ 1 ad put v = sup θ Sd E θ X 2 ad T = E X 2 With probability at least 1 2δ m EX if Bλ λ Λ where Bλ = { 2λ v + T 2 logδ logδ log µλ 1 To tur this lemma ito a explicit boud we eed ow to choose Λ ad µ M 1 + Λ Cosider for some real parameters > 0 ad α > 1 { } α k Λ = : k Z αk For ay λ k = Λ put ad remark that Put also 1 µλ = 2 k + 1 k 0 k + 2 1/2 k = 0 µλ k λ = 1 2 k k Z 4 logδ 1 T v + 2 logδ 1 The boud Bλ appearig i the previous lemma ca be writte as 2 Bλ = 4 2v logδ 1 + T [ λ cosh log + log µλ 1 ] λ λ 4 logδ 1 λ Sice logλ k = k logα log log/2 there exists k Z such that log λ k /λ logα/2 18 λ }

19 so that Therefore k log λ / logα + 1/2 if Bλ Bλ k 4C λ Λ where the costat C is equal to [ logα α C = cosh logδ 1 log 2 2v logδ 1 + T 1 2 logα log 2v logδ 1 + T 8 2 logδ ] 2 We see that the costat 2 ca be iterpreted as our best guess of the ratio 2v logδ 1 + T 8 log δ 1 2 However this guess may be very loose without harmig the costat C too much Ideed to give a example if we choose α = e ad we assume that we made a error of magitude 10 6 o the choice of 2 compared to the optimal guess we get C cosh1/2 + exp1/2 2 logδ 1 log [ 1 2 log ] logδ 1 so that if we work at the cofidece level correspodig to δ = 1/100 we obtai that C 16 I brief the message is that C is typically betwee oe ad two 52 Adaptive estimatio of the mea of a radom matrix We cosider here the same framework as i Sectio 3 o page 3 Let M R p q be a radom matrix ad M 1 M be a sample made of idepedet copies of M Usig the asymmetric ifluece fuctio ϕ defied by equatio 3 o page 15 give ξ S p θ S q we defie the estimators { 1 E + ξθ = sup λ Λ λ { 1 E ξθ = E + ξθ = sup λ Λ λ ψ [ λ ξ M i θ + ] dνξ ξ dρ θ θ β + γ + 2 log δ 1 µλ 1 } 2λ ψ [ λ ξ M i θ ] dνξ ξ dρ θ θ β + γ + 2 log δ 1 µλ 1 } 2λ ad Eξθ = E + ξθ E ξθ 19

20 Lemma 53 With probability at least 1 2δ for ay ξ S p θ S q { if λ λ Λ so that ad E ξ Mθ 2 + dνξ ξ dρ θ θ + β + γ + 2 log δ 1 µλ 1 } λ E + ξθ E ξ Mθ + dνξ ξ dρ θ θ 0 B + ξθ Eξθ ξem θ B + ξθ { = if λ E ξ Mθ 2 dνξ ξ dρ θ θ + β + γ + 2 log δ 1 µλ 1 } λ Λ λ { if λ [E ξ Mθ 2 + E Mθ 2 + E M ξ 2 λ Λ β γ + E M HS 2 ] βγ + β + γ + 2 log δ 1 µλ 1 } λ Choose β = γ = 2 χ logδ 1 with χ > 0 Let Λ = as i the previous sectio Put v = E ξ Mθ 2 { λ k = αk : k Z } µλ k 1 2 k sup E ξ Mθ 2 = v ξ S p θ S q t = E Mθ 2 sup θ S q E Mθ 2 = t u = E M ξ 2 sup ξ S p E M ξ 2 = u T = E M 2 HS l = logδ 1 λ χl 2 = lv + t + u χ + T l χ 2 Remark that i a similar way to the case of a vector treated i the previous sectio 20

21 B + ξθ = if λ v + t + u λ Λ χl + if λ Λ χ lv + t + u χ + T { l χ 2 cosh T χ 2 l χ + 1l + 2 logµλ 1 λ [ log λ λ ] + λ log µλ 1 2λ1 + χl Replacig λ by its value choosig λ = λ k such that logλ/λ logα/2 ad remarkig that k log λ + 1 logα 2 we obtai Propositio 54 With probability at least 1 2δ for ay ξ S p ay θ S q Eξθ ξemθ Bξθ = 2C } 21 + χ v logδ 1 + t + u χ + T χ 2 logδ 1 where usig the abbreviatio l = logδ 1 logα C = cosh 2 α χl log 1 2 logα log Let us ow cosider a estimator m such that lv + t + u χ + T l χ χl sup Eξθ ξ m θ ξ S p θ S q With probability at least 1 2δ if m R p q m EM 2 sup Bξθ ξ S p θ S q sup Eξθ ξm θ ξ S p θ S q Remark that we ca boud sup ξ Sp θ S q Bξθ by the explicit expressio for Bξθ where v t ad u are replaced by their upper bouds v t ad u with respect to ξ S p ad θ S q Remark also that we ca weake the ifluece of T by choosig χ > 1 but that we ca reach the optimal boud for m EM oly if we kow a upper boud 21

22 for the ratio T/v Ideed if we kow T/v or a upper boud of the same order of magitude up to a costat we ca choose { } 1 T χ = max logδ 1 1 v I this case with probability at least 1 2δ m EM 8C v logδ 1 + t + u + v T Most likely we do ot kow T v = E M HS 2 sup ξ Sp θ S q E ξ Mθ 2 but we ca still choose χ greater tha oe to lower the ifluece of T = E M HS 2 i the boud 6 Adaptive Gram matrix estimate We devote a sectio to the adaptive estimatio of a Gram matrix sice it is a importat subject for applicatios to pricipal compoet aalysis ad to least squares regressio We recall that give a radom vector X R d the Gram matrix of X is defied as G = E X X R d d The geeral approach of the previous sectio uses a estimator that caot be computed explicitly without recourse to a Mote Carlo samplig algorithm I the special case of the Gram matrix we will produce a estimator that does ot suffer from this drawback Cosequeces of what is proved i this sectio regardig robust pricipal compoet aalysis ca easily be draw from the method exposed i Giulii 2017b We refer to this paper for further details Cosequeces regardig least squares regressio are discussed at the ed of this paper I this sectio we will use the asymmetric ifluece fuctio defied by equatio 3 o page 15 The explicit computatio of our estimator however will use the modified auxiliary fuctio ϕ 2 m = E [ ψ m + W 2] m R R + where W N01 is a stadard Gaussia radom variable Observe that it is possible to explicitly compute the fuctio ϕ 2 i terms of the Gaussia distributio fuctio Fa = PW a 22

23 Lemma 61 For ay m R ad R + ϕ 2 m = m m 4 + 6m r 2 m where r 2 m = 1 [ m m ] [ 1 m 1 + m ] F + F [ 2 3 5m 1 + m1 m 2] 1 + m2 exp 2π π [ m 1 m1 + m 2] 1 m2 exp 2 2 Proof The proof is based o the expressio ad o the idetities ψt = t t 2 /2 + 1t 11 t 2 /2 t R + E [ 1 W a ] [ ] = Fa = 1 E 1 W a E [ W1 W a ] 1 = exp a2 = E [ W1 W a ] 2π 2 E [ W 2 1 W a ] a = exp a2 + Fa = 1 E [ W 2 1 W a ] 2π 2 E [ W 3 1 W a ] a = exp a2 = E [ W 3 1 W a ] 2π 2 E [ W 4 1 W a ] a 3 + 3a = exp a2 + 3Fa = 3 E [ W 4 1 W a ] 2π 2 Let us put Gt = 1 exp t2 2π 2 E { ψ [ m + W 2] } [ = E m + W 2 1 m + W 4 ] + r2 m 2 where { [m 2r 2 m = E + W 4 2m + W ] [ 1 W 1 m W 1 m ]}

24 { [ = E m mm 2 1W + 6m W 2 + 4m 3 W W 4] [ 1 W 1 m + 1 W 1 m ]} = m 2 1 [ 2 1 m 1 + m ] F + F [ 1 m 1 m ] + 4mm 2 1 G + G [ 1 + m 1 m + 6m G + 1 m 1 m G 1 m 1 + m ] + F + F [ 1 + m 2 ] 1 m [ 1 m 2 ] 1 + m } + 4m { G G {[ 1 + m m ] 1 m [ 1 m 3 31 m ] 1 m + G + + G [ 1 m 1 + m ]} + 3 F + F so that r 2 m = 1 [ m m ] [ 1 m 1 + m ] F + F [ 2 3 5m 1 + m1 m 2] 1 m G [ m 1 m1 + m 2] 1 m G Observe ow that whe θ is distributed accordig to ρ θ = Nθ β 1 I d the real valued radom variable θ x is Gaussia with mea θ x ad stadard deviatio x / β Thus we ca state the followig Lemma 62 For ay θ x R d ψ θ x 2 dρ θ θ = ϕ 2 θ x x β Itroduce A λ β θ x = ϕ 2 λ 1/2 θ x x β log x 2 β

25 where λ R + is a costat modifyig the orm of θ Next propositio provides some upper ad lower bouds Propositio 63 With probability at least 1 δ for ay θ R d ay λ R + 1 λ A λ β θ X i β θ 2 2 logδ 1 λ E θ X 2 + E X 4 λ β 2 Moreover with probability at least 1 δ for ay θ R d ay λ R + 1 λ A λ β θ X i + β θ logδ 1 λ E θ X 2 λe θ X 4 6E X 2 θ X 2 β 3E X 4 λ β 2 Proof Accordig to Propositio 21 o page 2 with probability at least 1 δ for ay θ R d ad ay λ R + 1 λ [ ψ θ X i 2 dρ λ 1/2 θ θ log 1 + X i 2 ] β θ 2 β 2 1 { [ log E exp ψ θ X 2 X 2 log 1 + λ β + logδ 1 λ ]} dρ λ 1/2 θ θ Accordig to Lemma 51 o page t ψt log1 + u log 1 + u = log 1 u + t + u2 1 + u log1 + t u + u 2 tu R + Thus the right-had side of the previous iequality is ot greater tha 1 λ E θ X 2 dρ λ 1/2 θ θ E X 2 λ β + E X 4 λ β 2 I the same time due to Lemma 62 its left-had side is equal to = E θ X 2 + E X 4 λ β 2 1 λ A λ β θ X i β θ logδ 1 λ

26 This achieves the proof for the upper boud Let us ow come to the lower boud As a cosequece of Lemma 51 o page 15 for ay t [01] ad ay y R + ψt + log1 + y log 1 t + t 2 + log1 + y = log 1 t + t t + t 2 y log 1 t + t 2 + y Whe t [1 [ the same iequality is also obviously true: ψt + log1 + y log1 + y log1 t + t 2 + y As a cosequece for ay x R d ψ θ x 2 dρ θ θ + log 1 + x 2 β log 1 θ x 2 + θ x 4 + x 2 β dρ θ θ Thus accordig to the PAC-Bayesia iequality stated i Propositio 21 o page 2 with probability al least 1 δ for ay θ R d ad ay λ R + 1 λ [ ψ θ X i 2 dρ λ 1/2 θ θ + log 1 + X i 2 ] β θ 2 logδ 1 β 2 λ 1 { [ log E exp ψ θ X 2 X ]} 2 + log 1 + dρ λ β λ 1/2 θ θ 1 E θ X 2 + θ X 4 X 2 + dρ λ β λ 1/2 θ θ To coclude the proof it is eough to use the explicit expressio of the momets of a Gaussia radom variable rememberig that whe θ is distributed accordig to ρ λ 1/2 θ the distributio of θ X is equal to N λ 1/2 θ X X 2 /β The ext propositio defies a estimator of the quadratic form E θ X 2 Note that sice we itroduced a parameter λ that takes care of the orm of θ we will assume i the followig without loss of geerality that θ S d the uit sphere of R d Propositio 64 Let us assume that E X 4 T < 26

27 for a kow costat T For ay θ S d cosider the estimator of E θ X 2 defied as Eθ = sup λ R d 1 λ With probability at least 1 δ for ay θ S d A λ β θ X i β 2 logδ 1 λ Eθ E θ X 2 T λ β 2 Moreover with probability at least 1 δ for ay θ S d E θ X 2 Eθ + 2 2E θ X 4 2T β 2 + logδ 1 + 6E X 2 θ X 2 β + β Remark 61 Itroducig α = 2T we ca also express the previous boud as β2 E θ X 2 2 Eθ + 2 E θ X 4 [ α + logδ 1 ] 2α + 3 T E θ X 2 X 2 2T + α [ Eθ + 2 2E θ X T E θ X 2 X 2 ] α 2T 2 + α + 2 E θ X 4 logδ 1 2α Eθ + 5 E θ X 4 2T 2 + α + 2 E θ X 4 logδ 1 where the last iequality is a cosequece of the Cauchy-Schwarz iequality E θ X 2 X 2 E θ X 4 E X 4 T E θ X 4 Proof Propositio 64 follows from Propositio 63 ad the defiitio of the estimator E To get the secod iequality observe that the value of λ miimizig λe θ X 4 + 6E X 2 θ X 2 β T λ β 2 + β + 2 logδ 1 λ

28 is give by λ = 2E θ X 4 1 2T β 2 + logδ 1 I the followig propositio we make the estimator adaptive i α as well as i λ ad we itroduce our estimator Ĝ of the Gram matrix G Propositio 65 Let us assume that E X 4 T < where T is a kow costat Cosider the estimator 1 Ẽθ = sup sup λ R + k N λ log 1 + [ ϕ 2 λ θ Xi expk 10T X i 2 With probability at least 1 δ for ay θ S d 1/4 expk X i 10T ] 1 Ẽθ E θ X 2 With probability at least 1 δ for ay θ S d where E θ X 2 Ẽθ + Bθ 5T 2 expk expk 10λ log[ k + 1k + 2/δ ] θ S d λ Bθ = 2 E θ X 4 33 T E θ X 4 1/4 Cosider a estimator Ĝ R d d such that ad log 2 log T E θ X if θ S d θĝ θ Ẽθ logδ 1

29 { sup θĝ θ Ẽθ = if sup θ M θ Ẽθ : M Rd d θ S d θ S d With probability at least 1 2δ Remark 62 kurtosis Ĝ G sup θ S d Bθ } M = M 0 if θ M θ Ẽθ θ S d It is iterestig to rephrase this result i terms of the directioal E θ X 4 κθ = E θ X 2 2 E θ X 2 > 0 1 otherwise We obtai with probability at least 1 2δ 1/4 κθ 1 2 T 33 κθe θ X log 2 log T κθe θ X logδ Ẽθ E θ X 2 1 with the appropriate covetio that r/0 = + whe r > 0 ad 0/0 = 1 This iequality shows uder which circumstaces it is possible to estimate the order of magitude of E θ X 2 ad cosequetly the eigevalues of the Gram matrix G Ideed itroducig κ = sup θ Sd κθ we deduce with probability at least 1 2δ a boud of the form 1 f κ E θ X 2 Ẽθ E θ X 2 1 where the fuctio Fκ = 1 f κ / is o-decreasig Let us write G = E X X as d G = i e i e i where e 1 e d is a orthoormal basis of eigevectors ad where 1 2 d are the eigevalues of G couted with their multiplicities ad sorted i 29

30 decreasig order Itroducig L i the set of all liear subspaces of R d of dimesio i it is well kow that i = sup { if { θgθ θ L S d } L Li } A proof ca for istace be foud i Kato 1982 page 62 Based o this formula we ca itroduce the estimator It is such that i = sup { if { Ẽθ θ L S d } L Li } F κ i = F κ sup { if { θgθ θ L S d } L Li } = sup { if { F κ θgθ θ L S d } L Li } provig that with probability at least 1 2δ i sup { if { θgθ θ L S d } L Li } = i 1 f κ i i i 1 1 i d Proof of Propositio 65 o page 28 The optimal value of α i the last boud give i Remark 61 o page 27 is give by α = 1 T 5 E θ X 4 1 E X 4 5 E θ X Accordig to the simplified iequality stated at the ed of Remark 61 with probability at least 1 δ for ay θ S d E θ X 2 Eθ + 2 = Eθ [ TE θ X 4 ] 1/4 α/α + α /α E θ X 4 logδ 1 [ TE θ X 4 ] 1/4 1 cosh 2 logα/α E θ X 4 logδ 1 We will take a weighted uio boud o all values of α belogig to { expk/5 : k N } To perform this we have to modify accordigly the defiitio of the estimator ad cosider the estimator Ẽ defied i the propositio I this 30

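31

change of definition, we have replaced $\beta$ with $\sqrt{\exp(k) / (10\,T)}$ and $\delta$ with $\dfrac{\delta}{(k+1)(k+2)}$, and we have taken the supremum in $k \in \mathbb{N}$ as well as in $\lambda \in \mathbb{R}_+$. As
\[ \sum_{k \in \mathbb{N}} \frac{\delta}{(k+1)(k+2)} = \delta, \]
we get from Proposition 6.3 on page 25 that, with probability at least $1 - \delta$, for any $\theta \in \mathbb{S}_d$,
\[ \tilde{E}(\theta) \le \mathbb{E}\langle \theta, X \rangle^2. \]
Recalling that $\alpha = 2 T \beta^2 = \exp(k)/5$, we get, with probability at least $1 - \delta$, for any $\theta \in \mathbb{S}_d$,
\[ \mathbb{E}\langle \theta, X \rangle^2 \le \tilde{E}(\theta) + \inf_{k \in \mathbb{N}} \Biggl\{ \sqrt{\frac{20}{n}} \bigl[ T \, \mathbb{E}\langle \theta, X \rangle^4 \bigr]^{1/4} \cosh \biggl( \frac{1}{2} \log \Bigl( \frac{\exp(k)}{5 \alpha_*} \Bigr) \biggr) + \sqrt{\frac{2 \, \mathbb{E}\langle \theta, X \rangle^4 \log \bigl[ (k+1)(k+2) / \delta \bigr]}{n}} \Biggr\}. \]
We can take the infimum in $k$ because the inequality holds with probability $1 - \delta$ for all values of $k \in \mathbb{N}$ simultaneously. We can now choose $k$ to be the closest integer to $\log(5 \alpha_*)$, which is known to be a non-negative quantity. It is such that
\[ \Bigl| \log \Bigl( \frac{\exp(k)}{5 \alpha_*} \Bigr) \Bigr| \le \frac{1}{2}, \]
and therefore
\[ k + 2 \le \log(5 \alpha_*) + \frac{5}{2} = \frac{1}{2} \log \Bigl( \frac{T}{\mathbb{E}\langle \theta, X \rangle^4} \Bigr) + \frac{5}{2}. \]
Remarking that $\sqrt{10} \cosh(1/4) \le 3.3$ ends the proof.

7 Linear least squares regression

Consider a couple of random variables $(X, Y) \in \mathbb{R}^d \times \mathbb{R}$ whose distribution is assumed to be unknown. Let $(X_1, Y_1), \dots, (X_n, Y_n)$ be an observed sample made of independent copies of $(X, Y)$. In this section, we consider the question of estimating
\[ \inf_{\theta \in \mathbb{R}^d} \mathbb{E}\bigl[ ( \langle \theta, X \rangle - Y )^2 \bigr]. \]
Introduce the Gram matrix
\[ G = \mathbb{E}\bigl( X X^{\top} \bigr) \in \mathbb{R}^{d \times d}, \]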

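32

the vector
\[ V = \mathbb{E}( Y X ) \in \mathbb{R}^d \]
and the risk function
\[ R(\theta) = \theta^{\top} G \theta - 2 \, \theta^{\top} V, \qquad \theta \in \mathbb{R}^d. \]
Remark that
\[ \mathbb{E}\bigl[ ( \langle \theta, X \rangle - Y )^2 \bigr] = \mathbb{E}( Y^2 ) + R(\theta), \]
so that minimizing the quadratic loss is equivalent to minimizing $R$. We have seen in the previous sections various methods to estimate $G$ and $V$. As a straightforward consequence, we state a first result concerning the minimization over a bounded domain.

Proposition 7.1. Assume that $\hat{G} \in \mathbb{R}^{d \times d}$ and $\hat{V} \in \mathbb{R}^d$ are such that
\[ \| \hat{G} - G \|_{\infty} \le \varepsilon \quad \text{and} \quad \| \hat{V} - V \| \le \eta. \tag{4} \]
Assume also that $\hat{G}$ is a symmetric positive semi-definite matrix. Let $\Theta$ be a closed bounded set in $\mathbb{R}^d$ and let $B = \sup_{\theta \in \Theta} \| \theta \|$. Consider the estimated risk
\[ \hat{R}(\theta) = \theta^{\top} \hat{G} \theta - 2 \, \theta^{\top} \hat{V}, \qquad \theta \in \mathbb{R}^d, \]
and an estimator $\hat{\theta} \in \arg\min_{\Theta} \hat{R}$. It is such that
\[ R(\hat{\theta}) - \inf_{\Theta} R \le 2 B ( \varepsilon B + 2 \eta ). \]

Proof. Since $| R(\theta) - \hat{R}(\theta) | \le B^2 \varepsilon + 2 B \eta$ for any $\theta \in \Theta$, remark that
\[ R(\hat{\theta}) \le \hat{R}(\hat{\theta}) + B^2 \varepsilon + 2 B \eta = \inf_{\theta \in \Theta} \hat{R}(\theta) + B^2 \varepsilon + 2 B \eta \le \inf_{\theta \in \Theta} R(\theta) + 2 B^2 \varepsilon + 4 B \eta. \]

Corollary 7.2. Assume that we know constants $v$, $T$, $v'$ and $T'$ such that
\[ \sup_{\theta \in \mathbb{S}_d} \mathbb{E}\langle \theta, X \rangle^4 \le v < \infty, \]
\[ \mathbb{E}\bigl( \| X \|^4 \bigr) \le T < \infty, \]
\[ \sup_{\theta \in \mathbb{S}_d} \mathbb{E}\bigl( Y^2 \langle \theta, X \rangle^2 \bigr) \le v' < \infty, \]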

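33

\[ \mathbb{E}\bigl( Y^2 \| X \|^2 \bigr) \le T' < \infty. \]
Using Propositions 3.3 on page 6 and 4.2 on page 11, we can define estimators $\hat{G}$ and $\hat{V}$ such that, with probability at least $1 - 2\delta$,
\[ \| \hat{G} - G \|_{\infty} \le \varepsilon = [\ldots] \quad \text{and} \quad \| \hat{V} - V \| \le \eta = [\ldots]. \]
Consequently, the estimator $\hat{\theta}$ of the previous proposition based on $\hat{G}$ and $\hat{V}$ is such that, with probability at least $1 - 2\delta$,
\[ R(\hat{\theta}) - \inf_{\Theta} R \le O \Biggl( \sqrt{\frac{\log(\delta^{-1})}{n}} \Biggr), \]
where the constant hidden behind the notation $O$ depends only on $v$, $T$, $v'$, $T'$ and $\sup_{\theta \in \Theta} \| \theta \|$.

Remark 7.1. We get only a slow speed of order $n^{-1/2}$ and not $n^{-1}$, but we think it is the price to pay to have a dimension-free bound under such hypotheses.

In the following, we will release the constraint that $\theta$ belongs to a bounded domain. We will also propose conditions under which a fast rate of order $O\bigl( \log(\delta^{-1}) / n \bigr)$ is possible. We will be interested first in defining some non-asymptotic confidence region for
\[ \theta_* \in \arg\min_{\theta \in \mathbb{R}^d} R(\theta). \]
We will broaden our analysis to the estimation of the ridge regression
\[ \theta_{\lambda} \in \arg\min_{\theta \in \mathbb{R}^d} R(\theta) + \lambda \| \theta \|^2, \]
since this extension is quite natural in this context. Indeed, the ridge regression problem consists in minimizing $R$ on a ball centered at the origin, and ridge regressors, as we will see, will anyhow play a role in the definition of a robust estimator.

Proposition 7.3. Make the same assumptions as at the beginning of Proposition 7.1 on the preceding page and consider some parameter $\lambda \in \mathbb{R}_+$. Introduce the ridge regression loss function and its empirical counterpart
\[ R_{\lambda}(\theta) = R(\theta) + \lambda \| \theta \|^2 = \theta^{\top} ( G + \lambda I ) \theta - 2 \, \theta^{\top} V, \]
\[ \hat{R}_{\lambda}(\theta) = \hat{R}(\theta) + \lambda \| \theta \|^2 = \theta^{\top} ( \hat{G} + \lambda I ) \theta - 2 \, \theta^{\top} \hat{V}. \]
Let $\theta_{\lambda} \in \arg\min_{\theta \in \mathbb{R}^d} R_{\lambda}(\theta)$ and $\hat{\theta}_{\lambda} \in \arg\min_{\theta \in \mathbb{R}^d} \hat{R}_{\lambda}(\theta)$. Define the confidence region
\[ \hat{\Theta}_{\lambda} = \bigl\{ \theta \in \mathbb{R}^d \, : \, \| ( \hat{G} + \lambda I ) ( \theta - \hat{\theta}_{\lambda} ) \| \le \varepsilon \| \theta \| + \eta \bigr\}. \]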

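34

On the event defined by equation (4) on page 32, $\theta_{\lambda} \in \hat{\Theta}_{\lambda}$. Moreover, for any estimator $\bar{\theta} \notin \hat{\Theta}_{\lambda}$, the improved pick
\[ \bar{\theta}' \in \arg\min_{\theta \in \mathbb{R}^d} \Bigl\{ \hat{R}_{\lambda}(\theta) - \hat{R}_{\lambda}(\bar{\theta}) + \varepsilon \| \theta - \bar{\theta} \|^2 + 2 \| \theta - \bar{\theta} \| \bigl( \varepsilon \| \bar{\theta} \| + \eta \bigr) \Bigr\} \]
is such that
\[ R_{\lambda}(\bar{\theta}') < R_{\lambda}(\bar{\theta}), \]
and more precisely such that
\[ R_{\lambda}(\bar{\theta}') - R_{\lambda}(\bar{\theta}) \le \hat{R}_{\lambda}(\bar{\theta}') - \hat{R}_{\lambda}(\bar{\theta}) + \| \bar{\theta}' - \bar{\theta} \| \bigl( \varepsilon \| \bar{\theta}' + \bar{\theta} \| + 2 \eta \bigr) < 0. \]

Proof. Note that, for any $\theta, \xi \in \mathbb{R}^d$,
\[ R_{\lambda}(\xi) - R_{\lambda}(\theta) = ( \xi - \theta )^{\top} ( G + \lambda I ) ( \theta + \xi ) - 2 ( \xi - \theta )^{\top} V \]
\[ \le \hat{R}_{\lambda}(\xi) - \hat{R}_{\lambda}(\theta) + \| \xi - \theta \| \bigl( \varepsilon \| \xi + \theta \| + 2 \eta \bigr) \]
\[ \le \hat{R}_{\lambda}(\xi) - \hat{R}_{\lambda}(\theta) + \varepsilon \| \xi - \theta \|^2 + 2 \| \xi - \theta \| \bigl( \varepsilon \| \theta \| + \eta \bigr) \overset{\text{def}}{=} \gamma_{\lambda}(\theta, \xi). \]
As $\xi \mapsto \gamma_{\lambda}(\theta, \xi)$ is strictly convex,
\[ \inf_{\xi \in \mathbb{R}^d} \gamma_{\lambda}(\theta, \xi) = 0 = \gamma_{\lambda}(\theta, \theta) \]
if and only if its subdifferential satisfies
\[ 0 \in \partial_{\xi} \big|_{\xi = \theta} \, \gamma_{\lambda}(\theta, \xi) = 2 ( \hat{G} + \lambda I ) \theta - 2 \hat{V} + 2 \, \mathbb{B}_d \bigl( \varepsilon \| \theta \| + \eta \bigr), \]
where $\mathbb{B}_d$ is the unit ball of $\mathbb{R}^d$. Remarking that $\hat{V} = ( \hat{G} + \lambda I ) \hat{\theta}_{\lambda}$, we see that this is equivalent to
\[ \| ( \hat{G} + \lambda I ) ( \theta - \hat{\theta}_{\lambda} ) \| \le \varepsilon \| \theta \| + \eta. \]
To complete the proof, it is enough to remark that, due to its definition,
\[ 0 \ge \inf_{\xi \in \mathbb{R}^d} \gamma_{\lambda}(\theta_{\lambda}, \xi) \ge \inf_{\xi \in \mathbb{R}^d} R_{\lambda}(\xi) - R_{\lambda}(\theta_{\lambda}) = 0, \]
so that $\theta_{\lambda} \in \hat{\Theta}_{\lambda}$.

Note that $\bar{\theta}'$ is the solution of a strictly convex minimization problem. It is characterized by the equation
\[ ( \hat{G} + \lambda I ) \bar{\theta}' - \hat{V} + \varepsilon ( \bar{\theta}' - \bar{\theta} ) + \frac{\bar{\theta}' - \bar{\theta}}{\| \bar{\theta}' - \bar{\theta} \|} \bigl( \varepsilon \| \bar{\theta} \| + \eta \bigr) = 0. \]
In view of the shape of the confidence region, it is natural to consider the estimator
\[ \tilde{\theta}_{\lambda} \in \arg\min_{\theta \in \hat{\Theta}_{\lambda}} \| \theta \|. \]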
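As a rough numerical illustration of the confidence region, the following Python sketch computes the empirical ridge estimator from given estimates of $G$ and $V$ and tests whether a candidate parameter satisfies the defining inequality of the region. It is restricted to $d = 2$ so that it needs only the standard library. The function names, the Cramer's-rule solver and all numerical values are choices made for this sketch and are not taken from the paper.

```python
import math


def shifted(G_hat, lam):
    """Return G_hat + lam * I for a 2x2 matrix given as nested lists."""
    return [[G_hat[0][0] + lam, G_hat[0][1]],
            [G_hat[1][0], G_hat[1][1] + lam]]


def matvec(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(x * x for x in v))


def ridge(G_hat, V_hat, lam):
    """Empirical ridge estimator: solve (G_hat + lam I) theta = V_hat.

    Uses Cramer's rule, so it assumes the 2x2 matrix is invertible.
    """
    A = shifted(G_hat, lam)
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(V_hat[0] * A[1][1] - A[0][1] * V_hat[1]) / det,
            (A[0][0] * V_hat[1] - A[1][0] * V_hat[0]) / det]


def in_confidence_region(theta, G_hat, V_hat, lam, eps, eta):
    """Check ||(G_hat + lam I)(theta - theta_hat)|| <= eps * ||theta|| + eta.

    Since (G_hat + lam I) theta_hat = V_hat, the left-hand side equals
    ||(G_hat + lam I) theta - V_hat||, so theta_hat itself is never needed.
    """
    residual = [a - b for a, b in zip(matvec(shifted(G_hat, lam), theta), V_hat)]
    return norm(residual) <= eps * norm(theta) + eta


# Illustrative numbers only: G = diag(2, 1), V = (1, 1), lam = 0.5,
# with estimates within eps = 0.05 and eta = 0.03 of the true values.
G_hat = [[2.05, 0.0], [0.0, 0.97]]
V_hat = [1.02, 0.99]
theta_lam = [1.0 / 2.5, 1.0 / 1.5]  # true ridge parameter (G + lam I)^{-1} V
print(in_confidence_region(theta_lam, G_hat, V_hat, 0.5, 0.05, 0.03))
print(in_confidence_region([1.0, 1.0], G_hat, V_hat, 0.5, 0.05, 0.03))
```

With these values the true ridge parameter lies inside the region (the first line printed is `True`), while the distant point $(1, 1)$ does not (the second is `False`).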

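35

Proposition 7.4. Let $\xi \in \hat{\Theta}_{\lambda}$ be any parameter value within the above defined confidence region. Under the event defined by equation (4) on page 32, it is such that
\[ \| ( G + \lambda I ) ( \xi - \theta_{\lambda} ) \| \le 2 \bigl( \varepsilon \| \xi \| + \eta \bigr). \]
In particular, since $\theta_{\lambda} \in \hat{\Theta}_{\lambda}$, we see from the definition of $\tilde{\theta}_{\lambda}$ that $\| \tilde{\theta}_{\lambda} \| \le \| \theta_{\lambda} \|$, and therefore that
\[ \| ( G + \lambda I ) ( \tilde{\theta}_{\lambda} - \theta_{\lambda} ) \|^2 \le 4 \bigl( \varepsilon \| \theta_{\lambda} \| + \eta \bigr)^2. \]
Thus, when $\varepsilon = O\bigl( \sqrt{\log(\delta^{-1}) / n} \bigr)$ and $\eta = O\bigl( \sqrt{\log(\delta^{-1}) / n} \bigr)$, we get a convergence speed of order $O\bigl( \log(\delta^{-1}) / n \bigr)$, but for a modified definition of the loss function. Using a basis $(e_i)_{1 \le i \le d}$ of eigenvectors of $G$ with corresponding eigenvalues $\mu_1 \ge \mu_2 \ge \dots \ge \mu_d \ge 0$, we see more precisely that, for any $\theta \in \mathbb{R}^d$,
\[ R_{\lambda}(\theta) - R_{\lambda}(\theta_{\lambda}) = \| ( G + \lambda I )^{1/2} ( \theta - \theta_{\lambda} ) \|^2 = \sum_{i=1}^{d} ( \mu_i + \lambda ) \langle \theta - \theta_{\lambda}, e_i \rangle^2, \]
whereas
\[ \| ( G + \lambda I ) ( \theta - \theta_{\lambda} ) \|^2 = \sum_{i=1}^{d} ( \mu_i + \lambda )^2 \langle \theta - \theta_{\lambda}, e_i \rangle^2 = \frac{1}{4} \| \nabla R_{\lambda}(\theta) \|^2. \]
The relation between the two risks is that
\[ ( \mu_d + \lambda ) \bigl[ R_{\lambda}(\theta) - R_{\lambda}(\theta_{\lambda}) \bigr] \le \| ( G + \lambda I ) ( \theta - \theta_{\lambda} ) \|^2 \le ( \mu_1 + \lambda ) \bigl[ R_{\lambda}(\theta) - R_{\lambda}(\theta_{\lambda}) \bigr]. \]
Consequently,
\[ R_{\lambda}(\tilde{\theta}_{\lambda}) - R_{\lambda}(\theta_{\lambda}) \le \frac{4 \bigl( \varepsilon \| \tilde{\theta}_{\lambda} \| + \eta \bigr)^2}{\mu_d + \lambda} \le \frac{4 \bigl( \varepsilon \| \theta_{\lambda} \| + \eta \bigr)^2}{\mu_d + \lambda}. \]

Proof. For any $\xi \in \hat{\Theta}_{\lambda}$,
\[ \| ( G + \lambda I ) ( \xi - \theta_{\lambda} ) \| = \| ( G + \lambda I ) \xi - V \| \le \| ( \hat{G} + \lambda I ) \xi - \hat{V} \| + \varepsilon \| \xi \| + \eta \le 2 \bigl( \varepsilon \| \xi \| + \eta \bigr), \]
from which the other statements made in the proposition are straightforward consequences.

From this proposition, we conclude that we have a dimension-free bound for $\| ( G + \lambda I ) ( \tilde{\theta}_{\lambda} - \theta_{\lambda} ) \|^2$, whereas the bound we obtain for $R_{\lambda}(\tilde{\theta}_{\lambda}) - R_{\lambda}(\theta_{\lambda})$ depends on the dimension through $\mu_d + \lambda$, so that it is dimension-free only for large enough values of $\lambda$.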
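The relation between the two losses can be checked numerically. The Python sketch below works in an eigenbasis of $G$, so that $G$ is represented by its list of eigenvalues (called `mu` here, a name chosen for the sketch), and compares the excess ridge risk with the squared norm of $(G + \lambda I)(\theta - \theta_{\lambda})$. It assumes that every eigenvalue plus `lam` is strictly positive, and all numerical values are illustrative.

```python
def ridge_risk(theta, mu, V, lam):
    """R_lam(theta) = sum_i (mu_i + lam) theta_i^2 - 2 sum_i theta_i V_i,
    with theta and V expressed in an eigenbasis of G (eigenvalues mu)."""
    return sum((m + lam) * t * t - 2.0 * t * v for t, m, v in zip(theta, mu, V))


def ridge_minimizer(mu, V, lam):
    """theta_lam = (G + lam I)^{-1} V, computed coordinate-wise."""
    return [v / (m + lam) for m, v in zip(mu, V)]


def excess_risk(theta, mu, V, lam):
    """R_lam(theta) - R_lam(theta_lam)."""
    theta_lam = ridge_minimizer(mu, V, lam)
    return ridge_risk(theta, mu, V, lam) - ridge_risk(theta_lam, mu, V, lam)


def weighted_sq_norm(theta, mu, V, lam, power):
    """sum_i (mu_i + lam)^power * (theta_i - theta_lam_i)^2."""
    theta_lam = ridge_minimizer(mu, V, lam)
    return sum((m + lam) ** power * (t - s) ** 2
               for t, s, m in zip(theta, theta_lam, mu))


mu = [4.0, 1.0, 0.01]   # eigenvalues, sorted in decreasing order
V = [1.0, -2.0, 0.5]
lam = 0.1
theta = [0.3, -1.0, 2.0]

excess = excess_risk(theta, mu, V, lam)
quad = weighted_sq_norm(theta, mu, V, lam, 1)      # ||(G + lam I)^{1/2}(theta - theta_lam)||^2
grad_sq = weighted_sq_norm(theta, mu, V, lam, 2)   # ||(G + lam I)(theta - theta_lam)||^2

# The excess risk equals the quadratic form, up to rounding.
print(abs(excess - quad) < 1e-9)
# Sandwich between the smallest and largest shifted eigenvalues.
print((mu[-1] + lam) * excess <= grad_sq <= (mu[0] + lam) * excess)
```

Both checks print `True`. The lower constant `mu[-1] + lam` is what makes the excess-risk bound degrade when the smallest eigenvalue and `lam` are both small.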

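36

For small values of $\lambda$ depending on $n$, we can obtain a dimension-free slow rate in the following way. Remark that, since $\mu_i \le \dfrac{( \mu_i + \lambda )^2}{4 \lambda}$,
\[ R_0(\tilde{\theta}_{\lambda}) - R_0(\theta_0) = \sum_{i=1}^{d} \mu_i \langle \tilde{\theta}_{\lambda} - \theta_0, e_i \rangle^2 \le \sum_{i=1}^{d} \frac{( \mu_i + \lambda )^2}{4 \lambda} \langle \tilde{\theta}_{\lambda} - \theta_0, e_i \rangle^2 = \frac{1}{4 \lambda} \| ( G + \lambda I ) ( \tilde{\theta}_{\lambda} - \theta_0 ) \|^2. \]
Since $V = G \theta_0 = ( G + \lambda I ) \theta_{\lambda}$,
\[ ( G + \lambda I ) ( \tilde{\theta}_{\lambda} - \theta_0 ) = ( G + \lambda I ) ( \tilde{\theta}_{\lambda} - \theta_{\lambda} ) - \lambda \theta_0. \]
Moreover, $\| \theta_{\lambda} \| \le \| \theta_0 \|$; indeed,
\[ R_{\lambda}(\theta_{\lambda}) = R_0(\theta_{\lambda}) + \lambda \| \theta_{\lambda} \|^2 \le R_0(\theta_0) + \lambda \| \theta_0 \|^2 \le R_0(\theta_{\lambda}) + \lambda \| \theta_0 \|^2. \]
Therefore,
\[ \| ( G + \lambda I ) ( \tilde{\theta}_{\lambda} - \theta_0 ) \| \le \| ( G + \lambda I ) ( \tilde{\theta}_{\lambda} - \theta_{\lambda} ) \| + \lambda \| \theta_0 \| \le 2 \bigl( \varepsilon \| \theta_{\lambda} \| + \eta \bigr) + \lambda \| \theta_0 \| \le 2 \bigl[ ( \varepsilon + \lambda / 2 ) \| \theta_0 \| + \eta \bigr], \]
and, coming back to $R_0$,
\[ R_0(\tilde{\theta}_{\lambda}) - R_0(\theta_0) \le \frac{1}{\lambda} \bigl[ ( \varepsilon + \lambda / 2 ) \| \theta_0 \| + \eta \bigr]^2. \]
Choose $\lambda = 2 ( \varepsilon + \eta )$ to obtain
\[ R_0\bigl( \tilde{\theta}_{2 ( \varepsilon + \eta )} \bigr) - R_0(\theta_0) \le \bigl[ \| \theta_0 \| + 1/2 \bigr] \bigl[ ( 2 \varepsilon + \eta ) \| \theta_0 \| + \eta \bigr]. \]
This is a dimension-free bound for $R_0(\tilde{\theta}_{\lambda}) - R_0(\theta_0)$, but it is of order $O\bigl( \sqrt{\log(\delta^{-1}) / n} \bigr)$ instead of $O\bigl( \log(\delta^{-1}) / n \bigr)$. Notice that it is adaptive in $\| \theta_0 \|$, though.

To get faster dimension-free rates for $R_0(\hat{\theta})$, we need to introduce some restrictions. First of all, let us notice that the previous results hold uniformly in any linear subspace of $\mathbb{R}^d$.

Proposition 7.5. Let us make the same assumptions as in Proposition 7.1 on page 32. For any linear subspace $L$ of $\mathbb{R}^d$, define
\[ \theta^{L}_{\lambda} \in \arg\min_{\xi \in L} R_{\lambda}(\xi) \]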

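37

and
\[ \hat{\theta}^{L}_{\lambda} \in \arg\min_{\xi \in L} \hat{R}_{\lambda}(\xi). \]
Let
\[ \pi_L(\theta) = \arg\min_{\xi \in L} \| \theta - \xi \| \]
be the orthogonal projection on $L$, and let
\[ \hat{\Theta}^{L}_{\lambda} = \bigl\{ \xi \in L \, : \, \| \pi_L ( \hat{G} + \lambda I ) ( \xi - \hat{\theta}^{L}_{\lambda} ) \| \le \varepsilon \| \xi \| + \eta \bigr\} \]
and
\[ \tilde{\theta}^{L}_{\lambda} \in \arg\min_{\xi \in \hat{\Theta}^{L}_{\lambda}} \| \xi \|. \]
Finally, introduce the least eigenvalue of $\pi_L G \pi_L$ restricted to $L$,
\[ \mu_L = \inf \bigl\{ \xi^{\top} G \xi \, : \, \xi \in L, \; \| \xi \| = 1 \bigr\}. \]
Whenever equation (4) on page 32 is satisfied, for any linear subspace $L$ of $\mathbb{R}^d$ and any parameter $\lambda \in \mathbb{R}_+$,
\[ \| \pi_L ( G + \lambda I ) ( \tilde{\theta}^{L}_{\lambda} - \theta^{L}_{\lambda} ) \|^2 \le 4 \bigl( \varepsilon \| \tilde{\theta}^{L}_{\lambda} \| + \eta \bigr)^2 \le 4 \bigl( \varepsilon \| \theta^{L}_{\lambda} \| + \eta \bigr)^2 \]
and
\[ R_{\lambda}(\tilde{\theta}^{L}_{\lambda}) - R_{\lambda}(\theta^{L}_{\lambda}) \le \frac{4 \bigl( \varepsilon \| \tilde{\theta}^{L}_{\lambda} \| + \eta \bigr)^2}{\mu_L + \lambda} \le \frac{4 \bigl( \varepsilon \| \theta^{L}_{\lambda} \| + \eta \bigr)^2}{\mu_L + \lambda}. \]
Remark that we can estimate $\mu_L$ by
\[ \hat{\mu}_L = \inf \bigl\{ \xi^{\top} \hat{G} \xi \, : \, \xi \in L, \; \| \xi \| = 1 \bigr\}. \]
It is such that, for any linear subspace $L$,
\[ \hat{\mu}_L - \varepsilon \le \mu_L \le \hat{\mu}_L + \varepsilon. \]

Obtaining a fast convergence rate for the minimization of $R_{\lambda}(\theta)$ when $\lambda$ is small or null and $\mu_d$ is small is possible in a sparse recovery framework.

Proposition 7.6. Consider a family $\mathcal{L}$ of linear subspaces of $\mathbb{R}^d$. Assume that $\theta_{\lambda} \in L_* \in \mathcal{L}$ and that $\| \theta_{\lambda} \| \le A$, a known constant. Consider the confidence region
\[ \hat{\Theta}_{\lambda} = \bigl\{ \xi \in \mathbb{R}^d \, : \, \| ( \hat{G} + \lambda I ) ( \xi - \hat{\theta}_{\lambda} ) \| \le \varepsilon \| \xi \| + \eta, \; \| \xi \| \le A \bigr\}. \]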

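38

Define the model selector
\[ \hat{\mathcal{L}} = \bigl\{ L \in \mathcal{L} \, : \, \hat{\Theta}_{\lambda} \cap L \ne \emptyset \bigr\}, \qquad \hat{L} \in \arg\max \bigl\{ \hat{\mu}_L \, : \, L \in \hat{\mathcal{L}} \bigr\}, \]
and the estimator
\[ \hat{\theta} \in \arg\min \bigl\{ \| \xi \| \, : \, \xi \in \hat{\Theta}_{\lambda} \cap \hat{L} \bigr\}. \]
Define
\[ \mu_* = \inf \bigl\{ \mu_{L + \mathbb{R} \theta_{\lambda}} \, : \, L \in \mathcal{L}, \; \mu_L \ge \mu_{L_*} - 2 \varepsilon \bigr\}. \]
Under the event described by equation (4) on page 32,
\[ ( \mu_* + \lambda ) \| \hat{\theta} - \theta_{\lambda} \| \le \| ( G + \lambda I ) ( \hat{\theta} - \theta_{\lambda} ) \| \le 2 \bigl( \varepsilon \| \hat{\theta} \| + \eta \bigr) \le 2 \bigl( \varepsilon A + \eta \bigr) \]
and
\[ R_{\lambda}(\hat{\theta}) - R_{\lambda}(\theta_{\lambda}) \le \frac{4 \bigl( \varepsilon \| \hat{\theta} \| + \eta \bigr)^2}{\mu_* + \lambda} \le \frac{4 \bigl( \varepsilon A + \eta \bigr)^2}{\mu_* + \lambda}. \]

Proof. Since $\hat{\theta} \in \hat{\Theta}_{\lambda}$,
\[ \| ( G + \lambda I ) ( \hat{\theta} - \theta_{\lambda} ) \| \le 2 \bigl( \varepsilon \| \hat{\theta} \| + \eta \bigr) \le 2 \bigl( \varepsilon A + \eta \bigr). \]
On the other hand,
\[ \| ( G + \lambda I ) ( \hat{\theta} - \theta_{\lambda} ) \| \ge \| \pi_{\hat{L} + \mathbb{R} \theta_{\lambda}} ( G + \lambda I ) ( \hat{\theta} - \theta_{\lambda} ) \| \ge \bigl( \mu_{\hat{L} + \mathbb{R} \theta_{\lambda}} + \lambda \bigr) \| \hat{\theta} - \theta_{\lambda} \|. \]
Moreover, $L_* \in \hat{\mathcal{L}}$, since $\theta_{\lambda} \in \hat{\Theta}_{\lambda} \cap L_*$. Thus
\[ \mu_{\hat{L}} \ge \hat{\mu}_{\hat{L}} - \varepsilon \ge \hat{\mu}_{L_*} - \varepsilon \ge \mu_{L_*} - 2 \varepsilon, \]
so that $\mu_{\hat{L} + \mathbb{R} \theta_{\lambda}} \ge \mu_*$, according to the definition of $\mu_*$, implying that
\[ ( \mu_* + \lambda ) \| \hat{\theta} - \theta_{\lambda} \| \le \| ( G + \lambda I ) ( \hat{\theta} - \theta_{\lambda} ) \| \le 2 \bigl( \varepsilon \| \hat{\theta} \| + \eta \bigr), \]
and consequently that
\[ R_{\lambda}(\hat{\theta}) - R_{\lambda}(\theta_{\lambda}) \le \| \hat{\theta} - \theta_{\lambda} \| \, \| ( G + \lambda I ) ( \hat{\theta} - \theta_{\lambda} ) \| \le \frac{4 \bigl( \varepsilon \| \hat{\theta} \| + \eta \bigr)^2}{\mu_* + \lambda}. \]

Remark that the constant $\mu_*$ is defined in terms of restricted eigenvalues of the Gram matrix, a concept that has been used by other authors, for example in Bickel, Ritov and Tsybakov (2009), to set the conditions of sparse recovery. In the case of nested models, we can replace the constant $\mu_*$ with a simpler one, as in the following proposition.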

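39

Proposition 7.7. Consider a nested family of linear subspaces of $\mathbb{R}^d$,
\[ \mathcal{L} = \bigl\{ L_1 \subset L_2 \subset \dots \subset L_K \bigr\}. \]
Assume that $\theta_{\lambda} \in L_* \in \mathcal{L}$, where $L_*$ is unknown, and that $\| \theta_{\lambda} \| \le A$, where $A$ is known. Consider the confidence region
\[ \hat{\Theta}_{\lambda} = \bigl\{ \xi \in \mathbb{R}^d \, : \, \| ( \hat{G} + \lambda I ) ( \xi - \hat{\theta}_{\lambda} ) \| \le \varepsilon \| \xi \| + \eta, \; \| \xi \| \le A \bigr\}. \]
Define the model selector
\[ \hat{k} = \arg\min \bigl\{ j \, : \, \hat{\Theta}_{\lambda} \cap L_j \ne \emptyset \bigr\}, \qquad \hat{L} = L_{\hat{k}}, \]
and the estimator
\[ \hat{\theta} \in \arg\min \bigl\{ \| \xi \| \, : \, \xi \in \hat{\Theta}_{\lambda} \cap \hat{L} \bigr\}. \]
Under the event described by equation (4) on page 32,
\[ ( \mu_{L_*} + \lambda ) \| \hat{\theta} - \theta_{\lambda} \| \le \| ( G + \lambda I ) ( \hat{\theta} - \theta_{\lambda} ) \| \le 2 \bigl( \varepsilon \| \hat{\theta} \| + \eta \bigr) \le 2 \bigl( \varepsilon A + \eta \bigr) \]
and
\[ R_{\lambda}(\hat{\theta}) - R_{\lambda}(\theta_{\lambda}) \le \frac{4 \bigl( \varepsilon \| \hat{\theta} \| + \eta \bigr)^2}{\lambda + \mu_{L_*}} \le \frac{4 \bigl( \varepsilon A + \eta \bigr)^2}{\lambda + \mu_{L_*}}. \]

Proof. As in the previous proposition, $\hat{\theta} \in \hat{\Theta}_{\lambda}$, so that
\[ \| ( G + \lambda I ) ( \hat{\theta} - \theta_{\lambda} ) \| \le 2 \bigl( \varepsilon \| \hat{\theta} \| + \eta \bigr). \]
Moreover, $L_* \cap \hat{\Theta}_{\lambda} \ne \emptyset$, so that $\hat{L} \subset L_*$, implying that
\[ ( \mu_{L_*} + \lambda ) \| \hat{\theta} - \theta_{\lambda} \| \le \| \pi_{L_*} ( G + \lambda I ) ( \hat{\theta} - \theta_{\lambda} ) \| \le \| ( G + \lambda I ) ( \hat{\theta} - \theta_{\lambda} ) \| \]
and that
\[ R_{\lambda}(\hat{\theta}) - R_{\lambda}(\theta_{\lambda}) \le \frac{4 \bigl( \varepsilon \| \hat{\theta} \| + \eta \bigr)^2}{\mu_{L_*} + \lambda}. \]

References

Bickel, P. J., Ritov, Y. and Tsybakov, A. (2009). Simultaneous analysis of Lasso and Dantzig selector. Annals of Statistics.
Catoni, O. (2004). Statistical Learning Theory and Stochastic Optimization. Lectures on Probability Theory and Statistics, École d'Été de Probabilités de Saint-Flour XXXI, 2001. Lecture Notes in Mathematics 1851, Springer.
Catoni, O. (2012). Challenging the empirical mean and empirical variance: a deviation study. Ann. Inst. Henri Poincaré.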

40

Catoni, O. (2016). PAC-Bayesian bounds for the Gram matrix and least squares regression with a random design. Preprint on arXiv.
Catoni, O. and Giulini, I. (2017). Dimension free PAC-Bayesian bounds for the estimation of the mean of a random vector. In NIPS 2017, to appear.
Giulini, I. (2017a). Robust dimension-free Gram operator estimates. Bernoulli, to appear.
Giulini, I. (2017b). Robust PCA and pairs of projections in a Hilbert space. Electron. J. Statist.
Joly, E., Lugosi, G. and Oliveira, R. I. (2017). On the estimation of the mean of a random vector. Electronic Journal of Statistics.
Kato, T. (1982). A Short Introduction to Perturbation Theory for Linear Operators. Springer-Verlag, New York.
Lugosi, G. and Mendelson, S. (2017). Sub-Gaussian estimators of the mean of a random vector. Annals of Statistics, to appear.
Minsker, S. (2015). Geometric Median and Robust Estimation in Banach Spaces. Bernoulli.
Minsker, S. (2016). Sub-Gaussian estimators of the mean of a random matrix with heavy-tailed entries. Annals of Statistics, to appear.

Homework for 1/27 Due 2/5

Homework for 1/27 Due 2/5 Name: ID: Homework for /7 Due /5. [ 8-3] I Example D of Sectio 8.4, the pdf of the populatio distributio is + αx x f(x α) =, α, otherwise ad the method of momets estimate was foud to be ˆα = 3X (where

Διαβάστε περισσότερα

Lecture 17: Minimum Variance Unbiased (MVUB) Estimators

Lecture 17: Minimum Variance Unbiased (MVUB) Estimators ECE 830 Fall 2011 Statistical Sigal Processig istructor: R. Nowak, scribe: Iseok Heo Lecture 17: Miimum Variace Ubiased (MVUB Estimators Ultimately, we would like to be able to argue that a give estimator

Διαβάστε περισσότερα

Solutions: Homework 3

Solutions: Homework 3 Solutios: Homework 3 Suppose that the radom variables Y,, Y satisfy Y i = βx i + ε i : i,, where x,, x R are fixed values ad ε,, ε Normal0, σ ) with σ R + kow Fid ˆβ = MLEβ) IND Solutio: Observe that Y

Διαβάστε περισσότερα

Last Lecture. Biostatistics Statistical Inference Lecture 19 Likelihood Ratio Test. Example of Hypothesis Testing.

Last Lecture. Biostatistics Statistical Inference Lecture 19 Likelihood Ratio Test. Example of Hypothesis Testing. Last Lecture Biostatistics 602 - Statistical Iferece Lecture 19 Likelihood Ratio Test Hyu Mi Kag March 26th, 2013 Describe the followig cocepts i your ow words Hypothesis Null Hypothesis Alterative Hypothesis

Διαβάστε περισσότερα

1. For each of the following power series, find the interval of convergence and the radius of convergence:

1. For each of the following power series, find the interval of convergence and the radius of convergence: Math 6 Practice Problems Solutios Power Series ad Taylor Series 1. For each of the followig power series, fid the iterval of covergece ad the radius of covergece: (a ( 1 x Notice that = ( 1 +1 ( x +1.

Διαβάστε περισσότερα

Introduction of Numerical Analysis #03 TAGAMI, Daisuke (IMI, Kyushu University)

Introduction of Numerical Analysis #03 TAGAMI, Daisuke (IMI, Kyushu University) Itroductio of Numerical Aalysis #03 TAGAMI, Daisuke (IMI, Kyushu Uiversity) web page of the lecture: http://www2.imi.kyushu-u.ac.jp/~tagami/lec/ Strategy of Numerical Simulatios Pheomea Error modelize

Διαβάστε περισσότερα

p n r.01.05.10.15.20.25.30.35.40.45.50.55.60.65.70.75.80.85.90.95

p n r.01.05.10.15.20.25.30.35.40.45.50.55.60.65.70.75.80.85.90.95 r r Table 4 Biomial Probability Distributio C, r p q This table shows the probability of r successes i idepedet trials, each with probability of success p. p r.01.05.10.15.0.5.30.35.40.45.50.55.60.65.70.75.80.85.90.95

Διαβάστε περισσότερα

Other Test Constructions: Likelihood Ratio & Bayes Tests

Other Test Constructions: Likelihood Ratio & Bayes Tests Other Test Constructions: Likelihood Ratio & Bayes Tests Side-Note: So far we have seen a few approaches for creating tests such as Neyman-Pearson Lemma ( most powerful tests of H 0 : θ = θ 0 vs H 1 :

Διαβάστε περισσότερα

n r f ( n-r ) () x g () r () x (1.1) = Σ g() x = Σ n f < -n+ r> g () r -n + r dx r dx n + ( -n,m) dx -n n+1 1 -n -1 + ( -n,n+1)

n r f ( n-r ) () x g () r () x (1.1) = Σ g() x = Σ n f < -n+ r> g () r -n + r dx r dx n + ( -n,m) dx -n n+1 1 -n -1 + ( -n,n+1) 8 Higher Derivative of the Product of Two Fuctios 8. Leibiz Rule about the Higher Order Differetiatio Theorem 8.. (Leibiz) Whe fuctios f ad g f g are times differetiable, the followig epressio holds. r

Διαβάστε περισσότερα

SUPERPOSITION, MEASUREMENT, NORMALIZATION, EXPECTATION VALUES. Reading: QM course packet Ch 5 up to 5.6

SUPERPOSITION, MEASUREMENT, NORMALIZATION, EXPECTATION VALUES. Reading: QM course packet Ch 5 up to 5.6 SUPERPOSITION, MEASUREMENT, NORMALIZATION, EXPECTATION VALUES Readig: QM course packet Ch 5 up to 5. 1 ϕ (x) = E = π m( a) =1,,3,4,5 for xa (x) = πx si L L * = πx L si L.5 ϕ' -.5 z 1 (x) = L si

Διαβάστε περισσότερα

The Heisenberg Uncertainty Principle

The Heisenberg Uncertainty Principle Chemistry 460 Sprig 015 Dr. Jea M. Stadard March, 015 The Heiseberg Ucertaity Priciple A policema pulls Werer Heiseberg over o the Autobah for speedig. Policema: Sir, do you kow how fast you were goig?

Διαβάστε περισσότερα

LAD Estimation for Time Series Models With Finite and Infinite Variance

LAD Estimation for Time Series Models With Finite and Infinite Variance LAD Estimatio for Time Series Moels With Fiite a Ifiite Variace Richar A. Davis Colorao State Uiversity William Dusmuir Uiversity of New South Wales 1 LAD Estimatio for ARMA Moels fiite variace ifiite

Διαβάστε περισσότερα

Degenerate Perturbation Theory

Degenerate Perturbation Theory R.G. Griffi BioNMR School page 1 Degeerate Perturbatio Theory 1.1 Geeral Whe cosiderig the CROSS EFFECT it is ecessary to deal with degeerate eergy levels ad therefore degeerate perturbatio theory. The

Διαβάστε περισσότερα

L.K.Gupta (Mathematic Classes) www.pioeermathematics.com MOBILE: 985577, 4677 + {JEE Mai 04} Sept 0 Name: Batch (Day) Phoe No. IT IS NOT ENOUGH TO HAVE A GOOD MIND, THE MAIN THING IS TO USE IT WELL Marks:

Διαβάστε περισσότερα

Adaptive Covariance Estimation with model selection

Adaptive Covariance Estimation with model selection Adaptive Covariace Estimatio with model selectio Rolado Biscay, Hélèe Lescorel ad Jea-Michel Loubes arxiv:03007v [mathst Mar 0 Abstract We provide i this paper a fully adaptive pealized procedure to select

Διαβάστε περισσότερα

Ψηφιακή Επεξεργασία Εικόνας

Ψηφιακή Επεξεργασία Εικόνας ΠΑΝΕΠΙΣΤΗΜΙΟ ΙΩΑΝΝΙΝΩΝ ΑΝΟΙΚΤΑ ΑΚΑΔΗΜΑΪΚΑ ΜΑΘΗΜΑΤΑ Ψηφιακή Επεξεργασία Εικόνας Φιλτράρισμα στο πεδίο των συχνοτήτων Διδάσκων : Αναπληρωτής Καθηγητής Νίκου Χριστόφορος Άδειες Χρήσης Το παρόν εκπαιδευτικό

Διαβάστε περισσότερα

Μια εισαγωγή στα Μαθηματικά για Οικονομολόγους

Μια εισαγωγή στα Μαθηματικά για Οικονομολόγους Μια εισαγωγή στα Μαθηματικά για Οικονομολόγους Μαθηματικά Ικανές και αναγκαίες συνθήκες Έστω δυο προτάσεις Α και Β «Α είναι αναγκαία συνθήκη για την Β» «Α είναι ικανή συνθήκη για την Β» Α is ecessary for

Διαβάστε περισσότερα

Biorthogonal Wavelets and Filter Banks via PFFS. Multiresolution Analysis (MRA) subspaces V j, and wavelet subspaces W j. f X n f, τ n φ τ n φ.

Biorthogonal Wavelets and Filter Banks via PFFS. Multiresolution Analysis (MRA) subspaces V j, and wavelet subspaces W j. f X n f, τ n φ τ n φ. Chapter 3. Biorthogoal Wavelets ad Filter Baks via PFFS 3.0 PFFS applied to shift-ivariat subspaces Defiitio: X is a shift-ivariat subspace if h X h( ) τ h X. Ex: Multiresolutio Aalysis (MRA) subspaces

Διαβάστε περισσότερα

On Generating Relations of Some Triple. Hypergeometric Functions

On Generating Relations of Some Triple. Hypergeometric Functions It. Joural of Math. Aalysis, Vol. 5,, o., 5 - O Geeratig Relatios of Some Triple Hypergeometric Fuctios Fadhle B. F. Mohse ad Gamal A. Qashash Departmet of Mathematics, Faculty of Educatio Zigibar Ade

Διαβάστε περισσότερα

Statistical Inference I Locally most powerful tests

Statistical Inference I Locally most powerful tests Statistical Inference I Locally most powerful tests Shirsendu Mukherjee Department of Statistics, Asutosh College, Kolkata, India. shirsendu st@yahoo.co.in So far we have treated the testing of one-sided

Διαβάστε περισσότερα

Three Classical Tests; Wald, LM(Score), and LR tests

Three Classical Tests; Wald, LM(Score), and LR tests Eco 60 Three Classical Tests; Wald, MScore, ad R tests Suppose that we have the desity l y; θ of a model with the ull hypothesis of the form H 0 ; θ θ 0. et θ be the lo-likelihood fuctio of the model ad

Διαβάστε περισσότερα

IIT JEE (2013) (Trigonomtery 1) Solutions

IIT JEE (2013) (Trigonomtery 1) Solutions L.K. Gupta (Mathematic Classes) www.pioeermathematics.com MOBILE: 985577, 677 (+) PAPER B IIT JEE (0) (Trigoomtery ) Solutios TOWARDS IIT JEE IS NOT A JOURNEY, IT S A BATTLE, ONLY THE TOUGHEST WILL SURVIVE

Διαβάστε περισσότερα

INTEGRATION OF THE NORMAL DISTRIBUTION CURVE

INTEGRATION OF THE NORMAL DISTRIBUTION CURVE INTEGRATION OF THE NORMAL DISTRIBUTION CURVE By Tom Irvie Email: tomirvie@aol.com March 3, 999 Itroductio May processes have a ormal probability distributio. Broadbad radom vibratio is a example. The purpose

Διαβάστε περισσότερα

ST5224: Advanced Statistical Theory II

ST5224: Advanced Statistical Theory II ST5224: Advanced Statistical Theory II 2014/2015: Semester II Tutorial 7 1. Let X be a sample from a population P and consider testing hypotheses H 0 : P = P 0 versus H 1 : P = P 1, where P j is a known

Διαβάστε περισσότερα

Bessel function for complex variable

Bessel function for complex variable Besse fuctio for compex variabe Kauhito Miuyama May 4, 7 Besse fuctio The Besse fuctio Z ν () is the fuctio wich satisfies + ) ( + ν Z ν () =. () Three kids of the soutios of this equatio are give by {

Διαβάστε περισσότερα

Diane Hu LDA for Audio Music April 12, 2010

Diane Hu LDA for Audio Music April 12, 2010 Diae Hu LDA for Audio Music April, 00 Terms Model Terms (per sog: Variatioal Terms: p( α Γ( i α i i Γ(α i p( p(, β p(c, A j Σ i α i i i ( V / ep β (i j ij (3 q( γ Γ( i γ i i Γ(γ i q( φ q( ω { } (c A T

Διαβάστε περισσότερα

Econ 2110: Fall 2008 Suggested Solutions to Problem Set 8 questions or comments to Dan Fetter 1

Econ 2110: Fall 2008 Suggested Solutions to Problem Set 8  questions or comments to Dan Fetter 1 Eon : Fall 8 Suggested Solutions to Problem Set 8 Email questions or omments to Dan Fetter Problem. Let X be a salar with density f(x, θ) (θx + θ) [ x ] with θ. (a) Find the most powerful level α test

Διαβάστε περισσότερα

Partial Differential Equations in Biology The boundary element method. March 26, 2013

Partial Differential Equations in Biology The boundary element method. March 26, 2013 The boundary element method March 26, 203 Introduction and notation The problem: u = f in D R d u = ϕ in Γ D u n = g on Γ N, where D = Γ D Γ N, Γ D Γ N = (possibly, Γ D = [Neumann problem] or Γ N = [Dirichlet

Διαβάστε περισσότερα

Homework 4.1 Solutions Math 5110/6830

Homework 4.1 Solutions Math 5110/6830 Homework 4. Solutios Math 5/683. a) For p + = αp γ α)p γ α)p + γ b) Let Equilibria poits satisfy: p = p = OR = γ α)p ) γ α)p + γ = α γ α)p ) γ α)p + γ α = p ) p + = p ) = The, we have equilibria poits

Διαβάστε περισσότερα

Proof of Lemmas Lemma 1 Consider ξ nt = r

Proof of Lemmas Lemma 1 Consider ξ nt = r Supplemetary Material to "GMM Estimatio of Spatial Pael Data Models with Commo Factors ad Geeral Space-Time Filter" (Not for publicatio) Wei Wag & Lug-fei Lee April 207 Proof of Lemmas Lemma Cosider =

Διαβάστε περισσότερα

The Equivalence Theorem in Optimal Design

The Equivalence Theorem in Optimal Design he Equivalece heorem i Optimal Desig Raier Schwabe & homas Schmelter, Otto vo Guericke Uiversity agdeburg Bayer Scherig Pharma, Berli rschwabe@ovgu.de PODE 007 ay 4, 007 Outlie Prologue: Simple eamples.

Διαβάστε περισσότερα

Concrete Mathematics Exercises from 30 September 2016

Concrete Mathematics Exercises from 30 September 2016 Concrete Mathematics Exercises from 30 September 2016 Silvio Capobianco Exercise 1.7 Let H(n) = J(n + 1) J(n). Equation (1.8) tells us that H(2n) = 2, and H(2n+1) = J(2n+2) J(2n+1) = (2J(n+1) 1) (2J(n)+1)

Διαβάστε περισσότερα

2 Composition. Invertible Mappings

2 Composition. Invertible Mappings Arkansas Tech University MATH 4033: Elementary Modern Algebra Dr. Marcel B. Finan Composition. Invertible Mappings In this section we discuss two procedures for creating new mappings from old ones, namely,

Διαβάστε περισσότερα

Every set of first-order formulas is equivalent to an independent set

Every set of first-order formulas is equivalent to an independent set Every set of first-order formulas is equivalent to an independent set May 6, 2008 Abstract A set of first-order formulas, whatever the cardinality of the set of symbols, is equivalent to an independent

Διαβάστε περισσότερα

4.6 Autoregressive Moving Average Model ARMA(1,1)

4.6 Autoregressive Moving Average Model ARMA(1,1) 84 CHAPTER 4. STATIONARY TS MODELS 4.6 Autoregressive Moving Average Model ARMA(,) This section is an introduction to a wide class of models ARMA(p,q) which we will consider in more detail later in this

Διαβάστε περισσότερα

Lecture 3: Asymptotic Normality of M-estimators

Lecture 3: Asymptotic Normality of M-estimators Lecture 3: Asymptotic Istructor: Departmet of Ecoomics Staford Uiversity Prepared by Webo Zhou, Remi Uiversity Refereces Takeshi Amemiya, 1985, Advaced Ecoometrics, Harvard Uiversity Press Newey ad McFadde,

Διαβάστε περισσότερα

Supplement to A theoretical framework for Bayesian nonparametric regression: random series and rates of contraction

Supplement to A theoretical framework for Bayesian nonparametric regression: random series and rates of contraction Supplemet to A theoretical framework for Bayesia oparametric regressio: radom series ad rates of cotractio A Proof of Theorem 31 Proof of Theorem 31 First defie the followig quatity: ɛ = 3 t α, δ = α α

Διαβάστε περισσότερα

α β

α β 6. Eerg, Mometum coefficiets for differet velocit distributios Rehbock obtaied ) For Liear Velocit Distributio α + ε Vmax { } Vmax ε β +, i which ε v V o Give: α + ε > ε ( α ) Liear velocit distributio

Διαβάστε περισσότερα

Presentation of complex number in Cartesian and polar coordinate system

Presentation of complex number in Cartesian and polar coordinate system 1 a + bi, aεr, bεr i = 1 z = a + bi a = Re(z), b = Im(z) give z = a + bi & w = c + di, a + bi = c + di a = c & b = d The complex cojugate of z = a + bi is z = a bi The sum of complex cojugates is real:

Διαβάστε περισσότερα

derivation of the Laplacian from rectangular to spherical coordinates

derivation of the Laplacian from rectangular to spherical coordinates derivation of the Laplacian from rectangular to spherical coordinates swapnizzle 03-03- :5:43 We begin by recognizing the familiar conversion from rectangular to spherical coordinates (note that φ is used

Διαβάστε περισσότερα

On Certain Subclass of λ-bazilevič Functions of Type α + iµ

On Certain Subclass of λ-bazilevič Functions of Type α + iµ Tamsui Oxford Joural of Mathematical Scieces 23(2 (27 141-153 Aletheia Uiversity O Certai Subclass of λ-bailevič Fuctios of Type α + iµ Zhi-Gag Wag, Chu-Yi Gao, ad Shao-Mou Yua College of Mathematics ad

Διαβάστε περισσότερα

C.S. 430 Assignment 6, Sample Solutions

C.S. 430 Assignment 6, Sample Solutions C.S. 430 Assignment 6, Sample Solutions Paul Liu November 15, 2007 Note that these are sample solutions only; in many cases there were many acceptable answers. 1 Reynolds Problem 10.1 1.1 Normal-order

Διαβάστε περισσότερα

CHAPTER 25 SOLVING EQUATIONS BY ITERATIVE METHODS

CHAPTER 25 SOLVING EQUATIONS BY ITERATIVE METHODS CHAPTER 5 SOLVING EQUATIONS BY ITERATIVE METHODS EXERCISE 104 Page 8 1. Find the positive root of the equation x + 3x 5 = 0, correct to 3 significant figures, using the method of bisection. Let f(x) =

Διαβάστε περισσότερα

Research Article Finite-Step Relaxed Hybrid Steepest-Descent Methods for Variational Inequalities

Research Article Finite-Step Relaxed Hybrid Steepest-Descent Methods for Variational Inequalities Hidawi Publishig Corporatio Joural of Iequalities ad Applicatios Volume 2008, Article ID 598632, 13 pages doi:10.1155/2008/598632 Research Article Fiite-Step Relaxed Hybrid Steepest-Descet Methods for

Διαβάστε περισσότερα

Estimation for ARMA Processes with Stable Noise. Matt Calder & Richard A. Davis Colorado State University

Estimation for ARMA Processes with Stable Noise. Matt Calder & Richard A. Davis Colorado State University Estimation for ARMA Processes with Stable Noise Matt Calder & Richard A. Davis Colorado State University rdavis@stat.colostate.edu 1 ARMA processes with stable noise Review of M-estimation Examples of

Διαβάστε περισσότερα

FREE VIBRATION OF A SINGLE-DEGREE-OF-FREEDOM SYSTEM Revision B

FREE VIBRATION OF A SINGLE-DEGREE-OF-FREEDOM SYSTEM Revision B FREE VIBRATION OF A SINGLE-DEGREE-OF-FREEDOM SYSTEM Revisio B By Tom Irvie Email: tomirvie@aol.com February, 005 Derivatio of the Equatio of Motio Cosier a sigle-egree-of-freeom system. m x k c where m

Διαβάστε περισσότερα

1. Matrix Algebra and Linear Economic Models

1. Matrix Algebra and Linear Economic Models Matrix Algebra ad Liear Ecoomic Models Refereces Ch 3 (Turkigto); Ch 4 5 (Klei) [] Motivatio Oe market equilibrium Model Assume perfectly competitive market: Both buyers ad sellers are price-takers Demad:

Διαβάστε περισσότερα

Uniform Convergence of Fourier Series Michael Taylor

Uniform Convergence of Fourier Series Michael Taylor Uniform Convergence of Fourier Series Michael Taylor Given f L 1 T 1 ), we consider the partial sums of the Fourier series of f: N 1) S N fθ) = ˆfk)e ikθ. k= N A calculation gives the Dirichlet formula

Διαβάστε περισσότερα

true value θ. Fisher information is meaningful for families of distribution which are regular: W (x) f(x θ)dx

true value θ. Fisher information is meaningful for families of distribution which are regular: W (x) f(x θ)dx Fisher Iformatio April 6, 26 Debdeep Pati Fisher Iformatio Assume X fx θ pdf or pmf with θ Θ R. Defie I X θ E θ [ θ log fx θ 2 ] where θ log fx θ is the derivative of the log-likelihood fuctio evaluated

Διαβάστε περισσότερα

HOMEWORK 4 = G. In order to plot the stress versus the stretch we define a normalized stretch:

HOMEWORK 4 = G. In order to plot the stress versus the stretch we define a normalized stretch: HOMEWORK 4 Problem a For the fast loading case, we want to derive the relationship between P zz and λ z. We know that the nominal stress is expressed as: P zz = ψ λ z where λ z = λ λ z. Therefore, applying

Διαβάστε περισσότερα

Tridiagonal matrices. Gérard MEURANT. October, 2008

Tridiagonal matrices. Gérard MEURANT. October, 2008 Tridiagonal matrices Gérard MEURANT October, 2008 1 Similarity 2 Cholesy factorizations 3 Eigenvalues 4 Inverse Similarity Let α 1 ω 1 β 1 α 2 ω 2 T =......... β 2 α 1 ω 1 β 1 α and β i ω i, i = 1,...,

Διαβάστε περισσότερα

Outline. Detection Theory. Background. Background (Cont.)

Outline. Detection Theory. Background. Background (Cont.) Outlie etectio heory Chapter7. etermiistic Sigals with Ukow Parameters afiseh S. Mazloum ov. 3th Backgroud Importace of sigal iformatio Ukow amplitude Ukow arrival time Siusoidal detectio Classical liear

Διαβάστε περισσότερα

SCHOOL OF MATHEMATICAL SCIENCES G11LMA Linear Mathematics Examination Solutions

SCHOOL OF MATHEMATICAL SCIENCES G11LMA Linear Mathematics Examination Solutions SCHOOL OF MATHEMATICAL SCIENCES GLMA Linear Mathematics 00- Examination Solutions. (a) i. ( + 5i)( i) = (6 + 5) + (5 )i = + i. Real part is, imaginary part is. (b) ii. + 5i i ( + 5i)( + i) = ( i)( + i)

Διαβάστε περισσότερα

Binet Type Formula For The Sequence of Tetranacci Numbers by Alternate Methods

Binet Type Formula For The Sequence of Tetranacci Numbers by Alternate Methods DOI: 545/mjis764 Biet Type Formula For The Sequece of Tetraacci Numbers by Alterate Methods GAUTAMS HATHIWALA AND DEVBHADRA V SHAH CK Pithawala College of Eigeerig & Techology, Surat Departmet of Mathematics,

Διαβάστε περισσότερα

Solution Series 9. i=1 x i and i=1 x i.

Solution Series 9. i=1 x i and i=1 x i. Lecturer: Prof. Dr. Mete SONER Coordinator: Yilin WANG Solution Series 9 Q1. Let α, β >, the p.d.f. of a beta distribution with parameters α and β is { Γ(α+β) Γ(α)Γ(β) f(x α, β) xα 1 (1 x) β 1 for < x

Διαβάστε περισσότερα

Ordinal Arithmetic: Addition, Multiplication, Exponentiation and Limit

Ordinal Arithmetic: Addition, Multiplication, Exponentiation and Limit Ordinal Arithmetic: Addition, Multiplication, Exponentiation and Limit Ting Zhang Stanford May 11, 2001 Stanford, 5/11/2001 1 Outline Ordinal Classification Ordinal Addition Ordinal Multiplication Ordinal

Διαβάστε περισσότερα

CHAPTER 103 EVEN AND ODD FUNCTIONS AND HALF-RANGE FOURIER SERIES
