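Supplementary materials for Statistical Estimation and Testing via the Sorted $\ell_1$ Norm

Małgorzata Bogdan*, Ewout van den Berg†, Weijie Su‡, Emmanuel J. Candès§

October 2013

*Departments of Mathematics and Computer Science, Wrocław University of Technology and Jan Długosz University, Poland
†IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, U.S.A.
‡Department of Statistics, Stanford University, Stanford, CA 94305
§Departments of Statistics and of Mathematics, Stanford University, Stanford, CA 94305

Abstract

In this note we give a proof showing that even though the number of false discoveries and the total number of discoveries are not continuous functions of the parameters, the formulas we obtain for the false discovery proportion (FDP) and the power, namely, (B.3) and (B.4) in the paper Statistical Estimation and Testing via the Sorted $\ell_1$ Norm, are mathematically valid. We recall that these formulas are derived from [1, Theorem 1.5].

Consider the linear model

$$y = X\beta + z,$$

where $X \in \mathbb{R}^{n \times p}$ is the design matrix, $\beta \in \mathbb{R}^p$ is the unknown parameter of interest and $z \in \mathbb{R}^n$ is a vector of i.i.d. standard Gaussian variables independent of $X$ and $\beta$. The lasso estimate $\hat\beta$ with penalty $\lambda > 0$ is the solution to

$$\min_{b \in \mathbb{R}^p} \ \tfrac{1}{2}\|y - Xb\|_{\ell_2}^2 + \lambda \|b\|_{\ell_1}. \tag{1}$$

As in the main paper, we let $\varphi_V(x, y) = \mathbf{1}(x \neq 0)\mathbf{1}(y = 0)$ and $\varphi_R(x, y) = \mathbf{1}(x \neq 0)$, so that the number $V$ of false discoveries is equal to $V = \sum_{i=1}^p \varphi_V(\hat\beta_i, \beta_i)$, and the number $R$ of discoveries is equal to $R = \sum_{i=1}^p \varphi_R(\hat\beta_i, \beta_i)$. We can now state the main result of this note, which generalizes [1, Theorem 1.5].

Theorem 1. Suppose that $X$ is an $n \times p$ Gaussian design matrix with i.i.d. $N(0, 1/n)$ entries, and that the $\beta_i$'s are non-degenerate i.i.d. random variables with bounded second moment, independent of $X$. Below, $\Theta$ is a variable with the same distribution as $\beta_i$. As $p \to \infty$ and $n/p \to \delta > 0$, the lasso solution $\hat\beta$ obeys

$$\mathrm{FDP} := \frac{\sum_{i=1}^p \varphi_V(\hat\beta_i, \beta_i)}{\max\{\sum_{i=1}^p \varphi_R(\hat\beta_i, \beta_i),\, 1\}} \ \to\ \frac{2\,\mathbb{P}(\Theta = 0)\,\Phi(-\alpha)}{\mathbb{P}(|\Theta + \tau Z| > \alpha\tau)},$$

where $\tau > 0$ and $\alpha > \alpha_{\min}$ are the unique solutions to

$$\tau^2 = 1 + \frac{1}{\delta}\,\mathbb{E}\big(\eta_{\alpha\tau}(\Theta + \tau Z) - \Theta\big)^2,$$

$$\lambda = \Big(1 - \frac{1}{\delta}\,\mathbb{P}(|\Theta + \tau Z| > \alpha\tau)\Big)\,\alpha\tau.$$

Here $\eta$ is the soft-thresholding operator defined as $\eta_t(x) = \mathrm{sgn}(x)(|x| - t)_+$, and $Z$ is a standard Gaussian variable independent of $\Theta$.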
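To make the limit in the theorem concrete, the fixed-point equations can be solved numerically for a specific prior. The following is a minimal sketch only. It assumes a two-point prior, $\Theta = M$ with probability $\gamma$ and $\Theta = 0$ otherwise (an illustrative choice, not one made in this note), treats $\alpha$ as the free parameter and recovers $\lambda$ from it, approximates Gaussian expectations with a trapezoid rule, and iterates the first equation in $\tau^2$ starting from 1. The names `rhs_tau2`, `lasso_limits`, `gamma` and `M` are hypothetical, and convergence of the iteration is not claimed beyond the parameter values tried here.

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def eta(x, t):
    """Soft-thresholding operator eta_t(x) = sgn(x) * (|x| - t)_+."""
    return math.copysign(max(abs(x) - t, 0.0), x)

def gauss_mean(f, lim=8.0, m=2000):
    """Approximate E f(Z), Z ~ N(0, 1), by the trapezoid rule on [-lim, lim]."""
    step = 2.0 * lim / m
    total = 0.0
    for k in range(m + 1):
        z = -lim + k * step
        weight = 0.5 if k in (0, m) else 1.0
        total += weight * f(z) * math.exp(-0.5 * z * z)
    return total * step / math.sqrt(2.0 * math.pi)

def rhs_tau2(tau2, alpha, delta, gamma, M):
    """Right-hand side 1 + E(eta_{alpha*tau}(Theta + tau*Z) - Theta)^2 / delta
    for the assumed two-point prior P(Theta = M) = gamma, P(Theta = 0) = 1 - gamma."""
    tau = math.sqrt(tau2)
    t = alpha * tau
    mse_null = gauss_mean(lambda z: eta(tau * z, t) ** 2)
    mse_signal = gauss_mean(lambda z: (eta(M + tau * z, t) - M) ** 2)
    return 1.0 + ((1.0 - gamma) * mse_null + gamma * mse_signal) / delta

def lasso_limits(alpha, delta, gamma, M, iters=100):
    """Return (tau, lambda, limiting FDP, limiting power) for a given alpha."""
    tau2 = 1.0
    for _ in range(iters):  # plain fixed-point iteration on tau^2
        tau2 = rhs_tau2(tau2, alpha, delta, gamma, M)
    tau = math.sqrt(tau2)
    # P(|M + tau*Z| > alpha*tau): selection probability of a true signal
    power = Phi(M / tau - alpha) + Phi(-M / tau - alpha)
    # P(|Theta + tau*Z| > alpha*tau): overall selection probability
    p_select = 2.0 * (1.0 - gamma) * Phi(-alpha) + gamma * power
    lam = alpha * tau * (1.0 - p_select / delta)
    fdp = 2.0 * (1.0 - gamma) * Phi(-alpha) / p_select
    return tau, lam, fdp, power

tau, lam, fdp, power = lasso_limits(alpha=1.5, delta=1.0, gamma=0.1, M=3.0)
print(f"tau={tau:.4f} lambda={lam:.4f} FDP limit={fdp:.4f} power limit={power:.4f}")
```

Parametrizing by $\alpha$ rather than $\lambda$ avoids a two-dimensional root search: $\tau$ follows from the first equation alone, and $\lambda$ is then read off from the second.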
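Before presenting the proof, we give three lemmas.

Lemma 1. Suppose

$$\Sigma = \begin{pmatrix} \Sigma_{1,1} & \Sigma_{1,2} \\ \Sigma_{2,1} & \Sigma_{2,2} \end{pmatrix}$$

is a positive definite matrix with all eigenvalues larger than or equal to 1, where $\Sigma_{1,1}$ is a scalar. Then we have

$$\Sigma_{1,1} - \Sigma_{1,2}\Sigma_{2,2}^{-1}\Sigma_{2,1} \ge 1.$$

Proof. The condition $\Sigma \succeq I$, where $I$ is the identity matrix with the same size as $\Sigma$, is equivalent to $0 \prec \Sigma^{-1} \preceq I$. So by the Schur complement property, the $(1,1)$ entry of $\Sigma^{-1}$ satisfies

$$0 < \Sigma^{-1}(1,1) = \big(\Sigma_{1,1} - \Sigma_{1,2}\Sigma_{2,2}^{-1}\Sigma_{2,1}\big)^{-1} \le 1,$$

which gives $\Sigma_{1,1} - \Sigma_{1,2}\Sigma_{2,2}^{-1}\Sigma_{2,1} \ge 1$ as desired.

Lemma 2. Suppose $Y$ is a $p$-dimensional vector distributed as $N(\mu, \Sigma)$, where $\Sigma \succeq \sigma^2 I$. For any $\epsilon \in (0, 1)$, there exists a constant $c = c(\epsilon) > 0$ such that for any $h > 0$,

$$\mathbb{P}(\text{at least } \epsilon p \text{ components of } Y \text{ are in } (-h, h)) \le \min\big(1, (c\,h^{\epsilon}\sigma^{-\epsilon})^{p}\big).$$

Proof. By Lemma 1, the variance of $Y_i \mid Y_{-i}$, where the subscript $-i$ denotes all the components except the $i$th, is larger than or equal to $\sigma^2$. So we have

$$\mathbb{P}(Y_i \in (-h, h) \mid Y_{-i}) \le \min\Big(1, \frac{2h}{\sqrt{2\pi}\,\sigma}\Big)$$

almost surely, since the normal density function $\phi$ is bounded by $1/\sqrt{2\pi}$. Denote by $\xi_1, \ldots, \xi_p$ i.i.d. Bernoulli variables independent of $Y$ with $\mathbb{P}(\xi_i = 1) = \min\big(1, \frac{2h}{\sqrt{2\pi}\,\sigma}\big)$, and denote $\zeta_i = \mathbf{1}(Y_i \in (-h, h))$. Since

$$\mathbb{P}(\zeta_1 = 1 \mid \zeta_2, \ldots, \zeta_p) \le \mathbb{P}(\xi_1 = 1 \mid \zeta_2, \ldots, \zeta_p) \quad \text{a.s.},$$

we have

$$\mathbb{P}(\zeta_1 + \zeta_2 + \cdots + \zeta_p \ge \epsilon p) \le \mathbb{P}(\xi_1 + \zeta_2 + \cdots + \zeta_p \ge \epsilon p).$$

Using similar arguments we reach the conclusion

$$\mathbb{P}(\xi_1 + \cdots + \xi_k + \zeta_{k+1} + \cdots + \zeta_p \ge \epsilon p) \le \mathbb{P}(\xi_1 + \cdots + \xi_k + \xi_{k+1} + \zeta_{k+2} + \cdots + \zeta_p \ge \epsilon p),$$

which holds for $k = 1, 2, \ldots, p - 1$. Therefore,

$$\begin{aligned}
\mathbb{P}(\text{at least } \epsilon p \text{ components of } Y \text{ are in } (-h, h)) &= \mathbb{P}(\zeta_1 + \zeta_2 + \cdots + \zeta_p \ge \epsilon p) \\
&\le \mathbb{P}(\xi_1 + \zeta_2 + \cdots + \zeta_p \ge \epsilon p) \\
&\le \mathbb{P}(\xi_1 + \xi_2 + \zeta_3 + \cdots + \zeta_p \ge \epsilon p) \\
&\le \cdots \le \mathbb{P}(\xi_1 + \xi_2 + \cdots + \xi_p \ge \epsilon p).
\end{aligned}$$

Hence, writing $q = \min\big(1, \frac{2h}{\sqrt{2\pi}\,\sigma}\big)$,

$$\mathbb{P}\Big(\sum_{i=1}^p \xi_i \ge \epsilon p\Big) \le \sum_{k \ge \epsilon p} \binom{p}{k} q^{k} \le 2^{p} q^{\epsilon p} \le 2^{(1+\epsilon/2)p}\,\pi^{-\epsilon p/2}\,h^{\epsilon p}\,\sigma^{-\epsilon p},$$

so the claim holds with $c = 2^{1+\epsilon/2}\pi^{-\epsilon/2}$.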
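The last step of the proof above replaces a binomial tail by a closed-form bound, which can be checked numerically. The sketch below computes the exact tail $\mathbb{P}(\xi_1 + \cdots + \xi_p \ge \epsilon p)$ for i.i.d. Bernoulli variables and compares it with $\min(1, 2^{(1+\epsilon/2)p}\pi^{-\epsilon p/2}h^{\epsilon p}\sigma^{-\epsilon p})$. The function names and the parameter values ($p = 40$, $\epsilon = 0.5$, $\sigma = 1$) are illustrative assumptions, not taken from the note.

```python
import math

def bernoulli_prob(h, sigma):
    """q = min(1, 2h / (sqrt(2*pi) * sigma)), the bound on P(Y_i in (-h, h) | Y_{-i})."""
    return min(1.0, 2.0 * h / (math.sqrt(2.0 * math.pi) * sigma))

def binomial_tail(p, q, k_min):
    """Exact P(xi_1 + ... + xi_p >= k_min) for i.i.d. Bernoulli(q) variables."""
    return sum(math.comb(p, k) * q ** k * (1.0 - q) ** (p - k)
               for k in range(k_min, p + 1))

def lemma_bound(p, h, sigma, eps):
    """min(1, 2^{(1+eps/2)p} * pi^{-eps*p/2} * (h/sigma)^{eps*p}), computed via logs."""
    log_b = p * ((1.0 + eps / 2.0) * math.log(2.0)
                 - (eps / 2.0) * math.log(math.pi)
                 + eps * math.log(h / sigma))
    return 1.0 if log_b >= 0.0 else math.exp(log_b)

p, eps, sigma = 40, 0.5, 1.0  # illustrative values only
for h in (0.001, 0.01, 0.1, 1.0):
    q = bernoulli_prob(h, sigma)
    tail = binomial_tail(p, q, math.ceil(eps * p))
    bound = lemma_bound(p, h, sigma, eps)
    print(f"h={h:<6} exact tail={tail:.3e}  bound={bound:.3e}")
```

The bound is loose by construction (it discards the factors $(1-q)^{p-k}$ and replaces the partial sum of binomial coefficients by $2^p$), but what matters for the argument is only that it decays geometrically in $p$ once $h$ is small.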
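Lemma 3. In the same setting as Theorem 1, denote by $A \subset \{1, \ldots, p\}$ the active set of the lasso solution $\hat\beta$ of (1). Then there exists a constant $\rho > 0$ such that

$$\mathbb{P}(|A| < \rho p) \to 0.$$

Proof. Define $\varphi(x, y) = \min(|x|, 1)$, which is pseudo-Lipschitz. It follows from [1, Theorem 1.5] that

$$\frac{1}{p}\sum_{i=1}^p \min(|\hat\beta_i|, 1) \to \mathbb{E}\min\big(|\eta_{\alpha\tau}(\Theta + \tau Z)|, 1\big) > 0$$

in probability. Note that

$$|A| \ge \sum_{i=1}^p \min(|\hat\beta_i|, 1).$$

So for any $0 < \rho < \mathbb{E}\min(|\eta_{\alpha\tau}(\Theta + \tau Z)|, 1)$ we have $\mathbb{P}(|A| < \rho p) \to 0$.

Proof of Theorem 1. To get around the discontinuity of $\varphi_V$, we define a family of pseudo-Lipschitz continuous functions

$$\varphi_{V,h}(x, y) = \big(1 - Q(x/h)\big)\,Q(y/h), \qquad h > 0,$$

where $Q(x) = \max(1 - |x|, 0)$. By [1, Theorem 1.5],

$$\frac{1}{p}\sum_{i=1}^p \varphi_{V,h}(\hat\beta_i, \beta_i) \to \mathbb{E}\,\varphi_{V,h}\big(\eta_{\alpha\tau}(\Theta + \tau Z), \Theta\big) \tag{2}$$

in probability. Since

$$|\varphi_{V,h}(x, y) - \varphi_V(x, y)| \le \mathbf{1}(0 < |x| < h) + \mathbf{1}(0 < |y| < h), \tag{3}$$

we have, for any $\epsilon > 0$,

$$\begin{aligned}
\lim_{h \to 0}\limsup_{p \to \infty}\ \mathbb{P}\Big(\Big|\frac{1}{p}\sum_{i=1}^p \varphi_V(\hat\beta_i, \beta_i) - \frac{1}{p}\sum_{i=1}^p \varphi_{V,h}(\hat\beta_i, \beta_i)\Big| > \epsilon\Big)
&\le \lim_{h \to 0}\limsup_{p \to \infty}\ \mathbb{P}\Big(\frac{1}{p}\sum_{i=1}^p \mathbf{1}(0 < |\hat\beta_i| < h) > \frac{\epsilon}{2}\Big) \\
&\quad + \lim_{h \to 0}\limsup_{p \to \infty}\ \mathbb{P}\Big(\frac{1}{p}\sum_{i=1}^p \mathbf{1}(0 < |\beta_i| < h) > \frac{\epsilon}{2}\Big).
\end{aligned}$$

By the weak law of large numbers we have

$$\lim_{h \to 0}\limsup_{p \to \infty}\ \mathbb{P}\Big(\frac{1}{p}\sum_{i=1}^p \mathbf{1}(0 < |\beta_i| < h) > \frac{\epsilon}{2}\Big) = 0.$$

So if we additionally have

$$\lim_{h \to 0}\limsup_{p \to \infty}\ \mathbb{P}\Big(\frac{1}{p}\sum_{i=1}^p \mathbf{1}(0 < |\hat\beta_i| < h) > \epsilon\Big) = 0 \tag{4}$$

for any $\epsilon > 0$, we would obtain

$$\begin{aligned}
\lim_{p \to \infty}\frac{1}{p}\sum_{i=1}^p \varphi_V(\hat\beta_i, \beta_i)
&= \lim_{h \to 0}\lim_{p \to \infty}\frac{1}{p}\sum_{i=1}^p \varphi_{V,h}(\hat\beta_i, \beta_i) \\
&= \lim_{h \to 0}\mathbb{E}\,\varphi_{V,h}\big(\eta_{\alpha\tau}(\Theta + \tau Z), \Theta\big) \\
&= \mathbb{E}\,\varphi_V\big(\eta_{\alpha\tau}(\Theta + \tau Z), \Theta\big),
\end{aligned} \tag{5}$$

where the last equality comes from applying the dominated convergence theorem to $\lim_{h \to 0}\varphi_{V,h} = \varphi_V$.

It remains to prove (4), which completes the proof of the theorem. Denote the active set of the lasso solution by $A \subset \{1, \ldots, p\}$. Then the partial KKT conditions give

$$X_A^T(y - X_A\hat\beta_A) = \lambda_A,$$

where $\lambda_A$ is a vector with each component being $\lambda$ or $-\lambda$ depending on the signs of $\hat\beta_A$. Note that $X_A^T X_A$ is invertible with probability one because $|A| \le n$. So we may write the solution as

$$\hat\beta_A = (X_A^T X_A)^{-1}(X_A^T y - \lambda_A) = (X_A^T X_A)^{-1}(X_A^T X\beta - \lambda_A) + (X_A^T X_A)^{-1}X_A^T z.$$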
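The stationarity condition on the active set is easy to check numerically on a small instance. The sketch below is an illustration only: it solves the lasso problem (1) with a plain cyclic coordinate-descent routine (an assumed solver, not something used in the note) and then measures how far $X_j^T(y - X\hat\beta)$ is from $\lambda\,\mathrm{sgn}(\hat\beta_j)$ on the active coordinates. The names `lasso_cd` and `kkt_gap`, the problem size, the seed and the coefficient vector are all hypothetical choices.

```python
import random

def soft_threshold(x, t):
    """eta_t(x) = sgn(x) * max(|x| - t, 0)."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def lasso_cd(X, y, lam, sweeps=2000):
    """Cyclic coordinate descent for 0.5*||y - Xb||^2 + lam*||b||_1.
    X is a list of n rows of length p. Returns (b, residual y - Xb)."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    r = list(y)  # residual for b = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            # inner product of column j with the residual that excludes coordinate j
            rho = sum(X[i][j] * r[i] for i in range(n)) + col_sq[j] * b[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != b[j]:
                step = new - b[j]
                for i in range(n):
                    r[i] -= X[i][j] * step
                b[j] = new
    return b, r

def kkt_gap(X, b, r, lam):
    """Largest |X_j^T r - lam * sgn(b_j)| over the active set {j : b_j != 0}."""
    gap = 0.0
    for j, bj in enumerate(b):
        if bj != 0.0:
            grad = sum(X[i][j] * r[i] for i in range(len(X)))
            gap = max(gap, abs(grad - lam * (1.0 if bj > 0 else -1.0)))
    return gap

rng = random.Random(0)
n, p, lam = 30, 8, 0.5
X = [[rng.gauss(0.0, 1.0) / n ** 0.5 for _ in range(p)] for _ in range(n)]  # N(0, 1/n) entries
beta = [5.0, -4.0] + [0.0] * (p - 2)
y = [sum(X[i][j] * beta[j] for j in range(p)) + rng.gauss(0.0, 1.0) for i in range(n)]

b_hat, resid = lasso_cd(X, y, lam)
print("active set:", [j for j, bj in enumerate(b_hat) if bj != 0.0])
print("KKT gap on active set:", kkt_gap(X, b_hat, resid, lam))
```

On the inactive coordinates the corresponding condition is the inequality $|X_j^T(y - X\hat\beta)| \le \lambda$, which is why only the active block yields the closed-form expression for $\hat\beta_A$.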
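Now for any subset $D \subset \{1, \ldots, p\}$ with $|D| \le n$, and any vector $\tilde\lambda$ of length $|D|$ with each component being $\pm\lambda$, define

$$\tilde\beta_D = (X_D^T X_D)^{-1}(X_D^T X\beta - \tilde\lambda) + (X_D^T X_D)^{-1}X_D^T z.$$

If $D = A$ and $\tilde\lambda = \lambda_A$, then $\tilde\beta_D$ coincides with $\hat\beta_A$. By Lemma 3, there is a constant $\rho > 0$ such that $\mathbb{P}(|A| < \rho p) = o(1)$. For any $\epsilon > 0$, by the union bound we have

$$\begin{aligned}
\mathbb{P}\Big(\frac{1}{p}\sum_{i=1}^p \mathbf{1}(0 < |\hat\beta_i| < h) > \epsilon\Big)
&\le \sum_{\rho p \le |D| \le \min(p, n)}\ \sum_{\tilde\lambda}\ \mathbb{P}\Big(\frac{1}{p}\sum_{i \in D}\mathbf{1}(|\tilde\beta_D(i)| < h) > \epsilon,\ \sigma_{\max}(X) < \delta^{-1/2} + 1 + \epsilon\Big) \\
&\quad + \mathbb{P}\big(\sigma_{\max}(X) \ge \delta^{-1/2} + 1 + \epsilon\big) + \mathbb{P}(|A| < \rho p).
\end{aligned} \tag{6}$$

By [2], we have

$$\mathbb{P}\big(\sigma_{\max}(X) \ge \delta^{-1/2} + 1 + \epsilon\big) = o(1). \tag{7}$$

Now we estimate

$$\mathbb{P}\Big(\frac{1}{p}\sum_{i \in D}\mathbf{1}(|\tilde\beta_D(i)| < h) > \epsilon,\ \sigma_{\max}(X) < \delta^{-1/2} + 1 + \epsilon\Big).$$

Conditionally on $X$ and $\beta$, $\tilde\beta_D$ is Gaussian with mean $(X_D^T X_D)^{-1}(X_D^T X\beta - \tilde\lambda)$ and covariance $(X_D^T X_D)^{-1}$. On the event $\sigma_{\max}(X) < \delta^{-1/2} + 1 + \epsilon$, all the eigenvalues of $(X_D^T X_D)^{-1}$ are larger than $\sigma^2 := (\delta^{-1/2} + 1 + \epsilon)^{-2}$. Moreover, since $|D| \le p$, the event inside the probability forces at least $\epsilon|D|$ components of $\tilde\beta_D$ to lie in $(-h, h)$. So by Lemma 2,

$$\begin{aligned}
&\mathbb{P}\Big(\frac{1}{p}\sum_{i \in D}\mathbf{1}(|\tilde\beta_D(i)| < h) > \epsilon,\ \sigma_{\max}(X) < \delta^{-1/2} + 1 + \epsilon \,\Big|\, X, \beta\Big) \\
&\qquad \le \mathbb{P}\big(\text{at least } \epsilon|D| \text{ components of } \tilde\beta_D \text{ are in } (-h, h),\ \sigma_{\max}(X) < \delta^{-1/2} + 1 + \epsilon \,\big|\, X, \beta\big) \\
&\qquad \le \min\big(1, (c\,h^{\epsilon}\sigma^{-\epsilon})^{|D|}\big) \le \min\big(1, (c\,h^{\epsilon}\sigma^{-\epsilon})^{\rho p}\big) \le (c\,h^{\epsilon}\sigma^{-\epsilon})^{\rho p}.
\end{aligned}$$

Together with (6) and (7), and since there are at most $2^p$ choices of $D$ and at most $2^p$ choices of $\tilde\lambda$ for each $D$, this gives

$$\begin{aligned}
\mathbb{P}\Big(\frac{1}{p}\sum_{i=1}^p \mathbf{1}(0 < |\hat\beta_i| < h) > \epsilon\Big)
&\le \sum_{\rho p \le |D| \le \min(p, n)}\ \sum_{\tilde\lambda}\ (c\,h^{\epsilon}\sigma^{-\epsilon})^{\rho p} + o(1) + o(1) \\
&\le 4^{p}\,(c\,h^{\epsilon}\sigma^{-\epsilon})^{\rho p} + o(1) \\
&= \big(4c^{\rho}h^{\rho\epsilon}\sigma^{-\rho\epsilon}\big)^{p} + o(1),
\end{aligned} \tag{8}$$

which proves (4) by choosing $h$ sufficiently small that $4c^{\rho}h^{\rho\epsilon}\sigma^{-\rho\epsilon} < 1$. Consequently (5) is also proved. This gives

$$\frac{V}{p} \to \mathbb{E}\,\varphi_V\big(\eta_{\alpha\tau}(\Theta + \tau Z), \Theta\big). \tag{9}$$

Similarly, defining $\varphi_{R,h}(x, y) = 1 - Q(x/h)$, we can also establish

$$\frac{R}{p} \to \mathbb{E}\,\varphi_R\big(\eta_{\alpha\tau}(\Theta + \tau Z), \Theta\big). \tag{10}$$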
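For reference, both limiting expectations can be written out explicitly. The short derivation below uses only the definitions $\varphi_V(x, y) = \mathbf{1}(x \neq 0)\mathbf{1}(y = 0)$ and $\varphi_R(x, y) = \mathbf{1}(x \neq 0)$, the fact that $\eta_{\alpha\tau}(u) \neq 0$ exactly when $|u| > \alpha\tau$, and the independence of $Z$ and $\Theta$.

```latex
\begin{aligned}
\mathbb{E}\,\varphi_V\big(\eta_{\alpha\tau}(\Theta + \tau Z), \Theta\big)
  &= \mathbb{P}\big(\Theta = 0,\ |\tau Z| > \alpha\tau\big)
   = \mathbb{P}(\Theta = 0)\,\mathbb{P}(|Z| > \alpha)
   = 2\,\mathbb{P}(\Theta = 0)\,\Phi(-\alpha), \\
\mathbb{E}\,\varphi_R\big(\eta_{\alpha\tau}(\Theta + \tau Z), \Theta\big)
  &= \mathbb{P}\big(|\Theta + \tau Z| > \alpha\tau\big).
\end{aligned}
```

The second quantity is strictly positive because $Z$ has full support, so the ratio of the two limits is well defined.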
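Combining (9) and (10) gives

$$\mathrm{FDP} = \frac{V}{\max(R, 1)} \to \frac{\mathbb{E}\,\varphi_V\big(\eta_{\alpha\tau}(\Theta + \tau Z), \Theta\big)}{\mathbb{E}\,\varphi_R\big(\eta_{\alpha\tau}(\Theta + \tau Z), \Theta\big)} = \frac{2\,\mathbb{P}(\Theta = 0)\,\Phi(-\alpha)}{\mathbb{P}(|\Theta + \tau Z| > \alpha\tau)}.$$

As a byproduct of the proof of Theorem 1, we have the following.

Corollary 1. In the same setting as Theorem 1, the empirical power of the lasso solution $\hat\beta$ of (1) obeys

$$\mathrm{Power} := \frac{\#\{i : \hat\beta_i \neq 0,\ \beta_i \neq 0\}}{\|\beta\|_{\ell_0}} \to \mathbb{P}\big(|\Theta + \tau Z| > \alpha\tau \,\big|\, \Theta \neq 0\big).$$

Proof of Corollary 1. We have

$$\begin{aligned}
\mathrm{Power} = \frac{R - V}{\#\{i : \beta_i \neq 0\}} = \frac{(R - V)/p}{\#\{i : \beta_i \neq 0\}/p}
&\to \frac{\mathbb{E}\,\varphi_R\big(\eta_{\alpha\tau}(\Theta + \tau Z), \Theta\big) - \mathbb{E}\,\varphi_V\big(\eta_{\alpha\tau}(\Theta + \tau Z), \Theta\big)}{\mathbb{P}(\Theta \neq 0)} \\
&= \frac{\mathbb{P}\big(\Theta \neq 0,\ |\Theta + \tau Z| > \alpha\tau\big)}{\mathbb{P}(\Theta \neq 0)} \\
&= \mathbb{P}\big(|\Theta + \tau Z| > \alpha\tau \,\big|\, \Theta \neq 0\big).
\end{aligned}$$

References

[1] M. Bayati and A. Montanari. The LASSO risk for Gaussian matrices. IEEE Transactions on Information Theory, 58(4):1997–2017, 2012.

[2] S. Geman. A limit theorem for the norm of random matrices. The Annals of Probability, 8(2):252–261, 1980.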