Duals of the QCQP and SDP Sparse SVM. Antoni B. Chan, Nuno Vasconcelos, and Gert R. G. Lanckriet

Duals of the QCQP and SDP Sparse SVM Anton B. Chan, Nuno Vasconcelos, and Gert R. G. Lanckret SVCL-TR 007-0 v Aprl 007

Duals of the QCQP and SDP Sparse SVM Anton B. Chan, Nuno Vasconcelos, and Gert R. G. Lanckret Statstcal Vsual Computng Lab Department of Electrcal and Computer Engneerng Unversty of Calforna, San Dego Aprl 007 Abstract Ths s the techncal report that accompanes the ICML 007 paper Drect Convex Relaxatons of Sparse SVM (Chan et al., 007). In ths report, we derve the dual problems for the SDP and QCQP relaxatons of the sparse SVM.

Author emal: abchan@ucsd.edu c Unversty of Calforna San Dego, 007 Ths work may not be coped or reproduced n whole or n part for any commercal purpose. Permsson to copy n whole or n part wthout payment of fee s granted for nonproft educatonal and research purposes provded that all such whole or partal copes nclude the followng: a notce that such copyng s by permsson of the Statstcal Vsual Computng Laboratory of the Unversty of Calforna, San Dego; an acknowledgment of the authors and ndvdual contrbutors to the work; and all applcable portons of the copyrght notce. Copyng, reproducng, or republshng for any other purpose shall requre a lcense wth payment of fee to the Unversty of Calforna, San Dego. All rghts reserved. SVCL Techncal reports are avalable on the SVCL s web page at http://www.svcl.ucsd.edu Unversty of Calforna, San Dego Statstcal Vsual Computng Laboratory 9500 Glman Drve, Mal code 0407 EBU, Room 55 La Jolla, CA 9093-0407

Dual of the QCQP Sparse SVM The prmal problem for the QCQP relaxaton of the sparse SVM s Problem mn w,ξ,b,t N t + C ξ s.t. y (x T w + b) ξ, =,..., N, ξ 0, w rt, w t. The QCQP prmal (Problem ) can be transformed nto an equvalent problem by settng t = s and w = u v, for u, v R d +, Problem mn u,v,ξ,b,s s + C ξ s.t. y (x T (u v) + b) ξ, =,..., N, ξ 0, T (u + v) rs, u v s, u 0, v 0. Note that there s an mplct constrant that s 0. Introducng the Lagrangan multplers (α, γ) R N +, (µ, η) R +, and (λ, θ) R d +, the Langrangan s L(u, v, ξ, b, s, α, µ, γ, η, λ, θ) = s + C ξ + α ( ξ y x T (u v) y b ) γ ξ + µ ( T u + T v rs ) + η ( (u v) T (u v) s ) λ T u θ T v = η s rµs + ξ (C α γ ) + α b α y α y x T (u v) + µ ( T u + T v ) + η (u v)t (u v) λ T u θ T v. To mnmze L wth respect to (u, v, ξ, b, s), we take the partal dervatves and set them to zero. For s we have s = ( η)s µ r = 0 s = µ r η, η

DUAL OF THE QCQP SPARSE SVM where the constrant on η follows from the mplct non-negatve constrant on s. For b and ξ we have b = α y = 0 α y = 0 and ξ = C α γ = 0 α + γ = C α C. The partal dervatves wth respect to the hyperplane parameters u and v are u v = α y x + µ + η(u v) λ = 0, = α y x + µ η(u v) θ = 0. Addng the two partal dervatves gves a constrant on λ and θ, u + v = µ λ θ = 0 λ + θ = µ λ µ θ µ, and subtractng the two partal dervatves gves the equaton for the hyperplane w = u v, u v = α y x + η(u v) λ + θ = 0 ( (u v) = η u = η q + v α y x + (λ θ) ) = η q where q = α y x + (λ θ).

3 Substtutng these condtons back nto L, we obtan the g objectve functon of the dual g(α, µ, γ, η, λ, θ) = = mn v rµ = = = = mn L(u, v, ξ, b, s, α, µ, γ, η, λ, θ) u,v,ξ,b,s α α y x T q η ( η) rµ η + ( ) ) η + µ ( T q + v + T v rµ ( η) + α η ( ) η q + v θ T v + η η qt q λ T ( q ) T (λ θ) q + µ η T q + η qt q η λt q + mn(µ T v λ T v θ T v) v rµ ( η) + α η qt q + η λ η θ + µ η T q η λt q + mn(µ λ θ) T v v rµ ( η) + rµ ( η) + α η qt q + η α η qt q, (µ λ θ) + mn (µ λ v θ)t v wth the constrants α y = 0, λ + θ = µ, 0 η, 0 α C, 0 θ µ, 0 λ µ, 0 µ. The second constrant and the non-negatve constrant on θ and λ mples that µ (λ θ) µ. Hence, we defne ν = (λ θ) whch leads to the dual problem of the QCQP-SSVM, Problem 3 max α,η,µ,ν α η α y x + ν rµ ( η)

4 DUAL OF THE QCQP SPARSE SVM s.t. α y = 0, 0 α C, =,..., N, µ ν j µ, j =,..., d, µ 0, 0 η.. Dual of QCQP-SSVM n RQC form The dual can be rewrtten as a rotated-quadratc-cone (RQC) program by boundng each of the quadratc terms, t α η y x + ν tη α y x + ν and s s( η) µ µ ( η) Hence, Problem 4 mn α,η,µ,ν,s,t s.t. t + r N s α α y = 0, 0 α C, =,..., N, tη α y x + ν s( η) µ µ ν j µ, j =,..., d, µ 0, 0 η.. Dual of QCQP-SSVM n SDP form Problem 3 can be rewrtten nto an SDP by boundng each of the two quadratc terms and usng the Schur complement lemma: ( t N ) T ( N ) α y x + ν α y x + ν 0 η

5 [ ηi q q T t ] 0 where q = α y x + ν, and s µ ( η) [ ] ( η) µ µ s 0 0 The dual QCQP-SSVM can then be wrtten as an SDP, Problem 5 mn α,η,µ,ν,s,t,q s.t. t + r N s α α y x + ν, α y = 0, q = [ ] [ ] ηi q ( η) µ q T 0, 0, t µ s 0 α C, =,..., N, µ ν j µ, j =,..., d, µ 0, 0 η. Dual of the SDP sparse SVM The prmal problem for the SDP relaxaton of the sparse SVM s Problem 6 mn W,w,b,ξ N tr(w ) + C s.t. y (x T w + b) ξ, =,..., N, ξ 0, ξ T W rtr(w ), W ww T 0 By ntroducng two matrces (U, V ) R d d + (d d matrces wth non-negatve entres) such that W = U V, the SDP-SSVM prmal (Problem 6) s equvalent to Problem 7 mn U,V,w,b,ξ N tr(u V ) + C ξ

6 DUAL OF THE SDP SPARSE SVM s.t. y (x T w + b) ξ, =,..., N, ξ 0, T (U + V ) r tr(u V ), (U V ) wwt 0 U j,k 0, V j,k 0, j, k =,..., d. Introducng the Lagrangan multplers (α, γ) R N +, µ R +, the matrx Λ R d d 0, and the matrces (η, θ) R d d +, the Lagrangan s L(U, V, w, b, ξ, α, γ, η, θ, µ, Λ) = tr(u V ) + C ξ + α ( ξ y x T w y b) γ ξ tr(ηu) tr(θv ) tr(λ(u V wwt )) + µ ( T (U + V ) rtr(u V ) ) = tr ((I Λ µri)(u V )) tr(ηu) tr(θv ) + µ tr((u + V ) T ) + ξ (C α γ ) + α α y x T w + wt Λw α y b The Lagrangan s mnmzed wth respect to (U, V, w, b, ξ) by takng the partal dervatves and settng them to zero. For b, ξ, and w we have b = α y = 0 α y = 0 ξ = C α γ = 0 α + γ = C α C w = α y x + Λw = 0 w = Λ α y x. The partal dervatves wth respect to the matrces U and V are U = (I Λ µri) η + µ T = 0 V = (I Λ µri) θ + µ T = 0

7 Addng the partal dervatves yelds a constrant on θ and η, U + V = η θ + µ T = 0 η + θ = µ T Subtractng the partal dervatves yelds a formula for the weght matrx, U V = (I Λ µri) η + θ = 0 Λ = ( µr)i η + θ The dual objectve functon s obtaned by substtutng these condtons nto L, g(α, γ, η, θ, µ, Λ) = mn L(U, V, w, b, ξ, α, γ, η, θ, µ, Λ) U,V,w,b,ξ = mn U,V tr ((η θ)(u V )) tr(ηu) tr(θv ) + tr((u + V )(η + θ)) + α α α j y y j x T Λ x j + α α j y y j x T Λ x j,j,j = α α α j y y j x T Λ x j,j + mn U,V tr (ηu θu ηv + θv ηu θv ) + tr((u + V )(η + θ)) = α α α j y y j x T Λ x j,j wth constrants α y = 0, η + θ = µ T, Λ = ( µr)i η + θ 0, 0 α C, =,..., N, 0 µ, 0 η j,k, 0 θ j,k, j, k =,..., d. The second constrant and the non-negatve constrants on η and θ mples that µ θ j,k η j,k µ, j, k =,..., d. Fnally, defnng ν = θ η, the dual problem of the SDP-SSVM s Problem 8 max α,λ,ν,µ α α α j y y j x T Λ x j j=

8 REFERENCES s.t. α y = 0, Λ = ( µr)i + ν 0, 0 α C, =,..., N, µ ν j,k µ, j, k =,..., d, µ 0.. Dual n SDP form The dual can be wrtten nto SDP form by boundng the quadratc term by t, and usng the Schur complement lemma (Boyd & Vandenberghe, 004). t,j α α j y y j x T Λ x j ( ) T ( ) t α y x Λ α y x [ ] Λ q q T t 0 0 where q = α y x. The dual can then be wrtten as an SDP, Problem 9 mn α,µ,λ,ν,q s.t. N t α α y = 0, q = α y x, Λ = ( µr)i + ν, [ ] Λ q q T 0, t 0 α C, =,..., N, µ ν j,k µ, j, k =,..., d, µ 0. References Boyd, S., & Vandenberghe, L. (004). Convex optmzaton. Cambrdge Unversty Press. Chan, A. B., Vasconcelos, N., & Lanckret, G. R. G. (007). Drect convex relaxatons of sparse SVM. Internatonal Conference on Machne Learnng.

SVCL-TR 007-0 v Aprl 007 Duals of the QCQP and SDP Sparse SVM Anton B. Chan