A Sequential Experimental Design based on Bayesian Statistics for Online Automatic Tuning

Reiji SUDA
Dept. of Computer Science, Grad. School of Information Science and Technology, the University of Tokyo; CREST/JST

This paper proposes to use Bayesian statistics for software automatic tuning, together with a sequential experimental design based on Bayesian statistics for online automatic tuning, a setting that is equivalent to the multi-armed bandit problem. From theoretical considerations and from experiments, we show that the proposed method successfully combines information from an imprecise model with information from empirical data subject to variance, and realizes efficient automatic tuning.

Automatically tuned software includes ATLAS 7) and FFTW 3).
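A Bayesian method for online code selection was presented at iWAPT2007 5). Vuduc et al. 6) applied statistical models to empirical search-based performance tuning.

Let $y$ be an observed quantity with mean $\mu$, and suppose that prior information says $\mu$ is about $\mu_0 \pm \tau$. This is expressed by the normal prior $\mu \sim N(\mu_0, \tau^2)$:

$$p(\mu) = \frac{1}{\sqrt{2\pi\tau^2}} \exp\left(-\frac{(\mu - \mu_0)^2}{2\tau^2}\right)$$

Assume that each observation has variance $\sigma^2$ and follows $N(\mu, \sigma^2)$. For $n$ observations $y = (y_1, y_2, \ldots, y_n)$ the likelihood is

$$p(y \mid \mu) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left(-\frac{\sum_{i=1}^{n} (y_i - \mu)^2}{2\sigma^2}\right)$$

By Bayes' theorem, $p(\mu \mid y)\,p(y) = p(y, \mu) = p(y \mid \mu)\,p(\mu)$, so

$$p(\mu \mid y) \propto \exp\left(-\frac{\sum_{i=1}^{n} (y_i - \mu)^2}{2\sigma^2} - \frac{(\mu - \mu_0)^2}{2\tau^2}\right) \propto \exp\left(-\frac{(\mu - \mu_n)^2}{2\tau_n^2}\right)$$

Factors that depend only on $\sigma$, $\tau$ and $y$, such as $p(y)$, are absorbed into the proportionality constant. Here

$$\tau_n^2 = \frac{\sigma^2}{\kappa_n}, \quad \kappa_n = \frac{\sigma^2}{\tau^2} + n, \quad \mu_n = \frac{\kappa_0 \mu_0 + n\bar{y}}{\kappa_0 + n}, \quad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i \qquad (1)$$

with $\kappa_0 = \sigma^2/\tau^2$. $p(\mu)$ is the prior distribution and $p(\mu \mid y)$ is the posterior distribution given the observations $y$. The predictive distribution of the next observation $y_{n+1}$ is

$$p(y_{n+1} \mid y) = \int p(y_{n+1} \mid \mu)\,p(\mu \mid y)\,d\mu$$

which gives $y_{n+1} \mid y \sim N(\mu_n, \sigma_n^2)$ with $\sigma_n^2 = \sigma^2 \kappa_{n+1}/\kappa_n$.

When $y \sim N(\mu, \sigma^2)$ and both $\mu$ and $\sigma^2$ are unknown, the prior of $\sigma^2$ is the scaled Inv-$\chi^2$ distribution

$$p(\sigma^2) = \frac{(\nu_0/2)^{\nu_0/2}}{\Gamma(\nu_0/2)}\,(\sigma_0^2)^{\nu_0/2}\,(\sigma^2)^{-(\nu_0/2 + 1)} \exp\left(-\frac{\nu_0 \sigma_0^2}{2\sigma^2}\right)$$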
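The conjugate update for the known-variance case (posterior mean $\mu_n$, $\kappa_n$, posterior variance $\tau_n^2$ and predictive variance $\sigma_n^2$) can be sketched in a few lines. This is a minimal illustration only: the function name `posterior_update` and its argument names are hypothetical and do not come from the paper, and $\sigma^2$ is assumed to be known.

```python
def posterior_update(mu0, kappa0, sigma2, ys):
    """Normal-mean update with known variance sigma2 (hypothetical helper).

    mu0, kappa0 : prior mean and prior weight (kappa0 = sigma2 / tau^2)
    ys          : list of observations y_1 .. y_n
    Returns (mu_n, kappa_n, tau_n2, pred_var).
    """
    n = len(ys)
    kappa_n = kappa0 + n
    ybar = sum(ys) / n if n else 0.0
    # Posterior mean: weighted average of prior mean and sample mean.
    mu_n = (kappa0 * mu0 + n * ybar) / kappa_n
    # Posterior variance of mu.
    tau_n2 = sigma2 / kappa_n
    # Predictive variance of y_{n+1}: sigma^2 * kappa_{n+1} / kappa_n.
    pred_var = sigma2 * (kappa_n + 1) / kappa_n
    return mu_n, kappa_n, tau_n2, pred_var


if __name__ == "__main__":
    print(posterior_update(mu0=0.0, kappa0=1.0, sigma2=1.0, ys=[2.0, 4.0]))
```

With a prior weight of one observation and two data points averaging 3, the posterior mean moves two thirds of the way from the prior mean to the sample mean.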
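Given $\sigma^2$, the prior of the mean is $\mu \mid \sigma^2 \sim N(\mu_0, \sigma^2/\kappa_0)$. The predictive distribution of $y_{n+1}$ is then a $t$ distribution with mean $\mu_n$ 2).

Online automatic tuning based on Bayesian statistics is equivalent to the multi-armed bandit problem 1).

Let $\mu_i^{(k)}$ be the posterior mean of the cost of candidate $i$ when $k$ executions remain. For the last execution the expected cost of candidate $i$ is $\mu_i^{(1)}$, so the candidate attaining $\min_j\{\mu_j^{(1)}\}$ is chosen. With two executions remaining, let $w_i^{(2)}$ be the estimated total cost of executing candidate $i$ first. Executing $i$ yields an observation $y$, which updates the mean $\mu_i^{(2)}$ (based on $n$ observations) to

$$\mu_i^{(1)} = \frac{\kappa_n \mu_i^{(2)} + y}{\kappa_{n+1}}, \quad \kappa_{n+1} = \kappa_n + 1,$$

and the last execution then costs $\min_j\{\mu_j^{(1)}\}$. Hence

$$w_i^{(2)} = \mu_i^{(2)} + E(\min_j\{\mu_j^{(1)}\})$$

where $E(x)$ is the expectation of $x$. With $w_i^{(1)} = \mu_i^{(1)}$, the same argument for $k$ remaining executions gives

$$w_i^{(k)} = \mu_i^{(k)} + E(\min_j\{w_j^{(k-1)}\}) \qquad (2)$$

Evaluating (2) exactly requires nesting the expectation $E$ over $w^{(k-1)}$ down to $k = 1$, which is expensive for large $k$.

2.3

For $k$ remaining executions we use the approximation

$$w_i^{(k)} = \mu_i^{(k)} + (k-1)\,E(\min\{\mu_i^{(k-1)}, \mu_{\min}^{(k)}\}) \qquad (3)$$

where $\mu_{\min}^{(k)} = \min_{j \ne i} \mu_j^{(k)}$. That is, after candidate $i$ is executed once, the remaining $k-1$ executions are assumed to use the better of candidate $i$ (with its updated mean) and the best of the other candidates. Writing the predictive density of $y$ for candidate $i$ as $p_i(y)$, (3) becomes

$$w_i^{(k)} = \mu_i^{(k)} + (k-1)\int_{-\infty}^{\delta} \mu_i^{(k-1)}\,p_i(y)\,dy + (k-1)\int_{\delta}^{\infty} \mu_{\min}^{(k)}\,p_i(y)\,dy$$

where $\delta$ is the value $y = \delta$ at which $\mu_i^{(k-1)} = \mu_{\min}^{(k)}$:

$$\delta = \kappa_{n+1}\,\mu_{\min}^{(k)} - \kappa_n\,\mu_i^{(k)}$$

Here $p_i$ is a $t$ distribution.

2.4

Write the predictive density as $p_i(y) = \pi_i(y - \mu_i^{(k)})$.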
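The approximate estimated cost of equation (3) can be illustrated with a short sketch. It makes one simplifying assumption that differs from the paper: the predictive distribution is taken to be normal (known $\sigma^2$) rather than the $t$ distribution used in the text, so that the expectation $E(\min\{\cdot,\cdot\})$ has a closed form. The function names `expected_min_normal`, `estimated_cost` and `choose` are hypothetical, and at least two candidates are assumed.

```python
import math


def expected_min_normal(mean, sd, m):
    """E[min(X, m)] for X ~ N(mean, sd^2)."""
    if sd == 0.0:
        return min(mean, m)
    z = (m - mean) / sd
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return mean * cdf + m * (1.0 - cdf) - sd * pdf


def estimated_cost(i, mus, kappas, sigma2, k):
    """w_i^(k) = mu_i + (k-1) * E[min(updated mu_i, mu_min)], as in (3).

    mus[i], kappas[i]: current posterior mean and weight kappa_n of candidate i.
    Assumption (not from the paper): normal predictive with known sigma2.
    """
    mu_i, kappa_n = mus[i], kappas[i]
    mu_min = min(m for j, m in enumerate(mus) if j != i)
    # Predictive sd of the next observation y of candidate i.
    pred_sd = math.sqrt(sigma2 * (kappa_n + 1.0) / kappa_n)
    # Updated mean (kappa_n*mu_i + y)/(kappa_n + 1) has mean mu_i and this sd.
    sd_updated = pred_sd / (kappa_n + 1.0)
    return mu_i + (k - 1) * expected_min_normal(mu_i, sd_updated, mu_min)


def choose(mus, kappas, sigma2, k):
    """Index of the candidate with the smallest estimated cost."""
    costs = [estimated_cost(i, mus, kappas, sigma2, k) for i in range(len(mus))]
    return min(range(len(mus)), key=costs.__getitem__)


if __name__ == "__main__":
    mus, kappas = [0.0, 0.1], [100.0, 0.5]
    print(choose(mus, kappas, sigma2=1.0, k=1))     # last execution
    print(choose(mus, kappas, sigma2=1.0, k=1000))  # many executions remain
```

In this example candidate 0 is well known and slightly better on average, while candidate 1 is poorly known. With one execution left the sketch exploits candidate 0; with many executions left the possible gain from learning about candidate 1 outweighs its higher mean, so it is tried first.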
The updated mean is written $\mu_i^{(k-1)} = \alpha_i y + (1 - \alpha_i)\,\mu_i^{(k)}$ with weight $\alpha_i$. Under conditions that compare the weights $\alpha$ and the lower tails $\int_{-\infty}^{a} \pi(\eta)\,d\eta$ ($a < 0$) of two candidates, including the case where both candidates have the same $\pi$ and the same $\alpha$, order relations between their estimated costs $w^{(k)}$ follow from the order of their means $\mu^{(k)}$. Proofs are given in 8).

To illustrate the behavior of $w^{(k)}$, consider two candidates 0 and 1 with fixed $\sigma$ and $\kappa_0$ and with $\mu_0 = 0$. The estimated costs $w_0$ and $w_1$ are plotted against the mean of candidate 1 for $n = 3, 2, 1, 0$ observations.

Fig. 1 Estimated costs $w_0^{(k)}$ and $w_1^{(k)}$; left: $k = 2$, right: $k = 3$. (Horizontal axis: mu, 0 to 0.8; vertical axis: w0/w1; curves for n = 3, 2, 1, 0.)

[...]

3.
3.1

The tuning target is the matrix multiplication $A = BC$ of $N \times N$ matrices $B$ and $C$, described with ABCLibScript 4). The tuning parameters are $u_1$ and $u_2$, with candidate values of $(u_1, u_2)$

[, 8] [, ] [, 64] [, ] [, 3] [, 4] [, 6] [, 8] [, 8] [, 6] [, 4] [, 3] [, ] [, 64] [, ] [, 8]

[...] 576 [...]

3.2

The cost model is built from $N$, $u$, $N/u$ and $N \bmod u$, following the code generated by ABCLibScript. It has 15 terms:

- 1:
- N/u:
- (N/u)(N/u):
- (N/u)(N/u)u u: A
- (N/u)(N/u)N:
- (N/u)(N/u)Nu: B
- (N/u)(N/u)Nu: C
- (N/u)(N/u)Nu u:
- (N/u)(N mod u):
- (N/u)(N mod u)u: A
- (N/u)(N mod u)N: , C
- (N/u)(N mod u)Nu: B
- (N mod u):
- (N mod u)N: A
- (N mod u)N^2: B C

[...]

3.3

[...]
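The hyperparameters are expressed as functions of the matrix size $N$: $\log \sigma^2$ against $\log N$; $\log(\nu_0 - 4)$ against $N$ for $\nu_0$; and, with $\kappa_0 = \sigma^2/\tau^2$ obtained from $\tau$ and $\sigma$, $\log \kappa_0$ against $\log N$. [...]

3.4

Four methods are compared.

M1: [...] 576 [...]
M2: [...] 7 [...]
M3A, M3B: [...] 7 [...] based on equation (3) [...]

[...]

Two measures are used. The relative regret is

$$rr = (T_M - T_{opt})/T_{opt}$$

where $T_M$ is the time taken with method $M$ and $T_{opt}$ is the time taken with the optimal choice; $rr = 0$ means that the method matches the optimum. The relative loss is

$$rl = (t_M - t_{opt})/t_{opt}$$

where $t_M$ and $t_{opt}$ are the corresponding times for method $M$ and for the optimum; $rl \ge 0$, and $rl = 0$ means that the method matches the optimum.

3.5

Figs. 3 and 4 show the results after 1000 iterations for each CPU (the number after each CPU name is its clock frequency in GHz).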
Fig. 2 Executions by M1, M2 and M3B. (Three panels; horizontal axis: iteration, 0 to 1000; vertical axis: selection (relative performance).)

Fig. 3 Relative regrets of the methods; 1000 iterations. (Horizontal axis: CPU; vertical axis: relative regret, logarithmic; series: M1, M2, M3A, M3B.)

Fig. 4 Relative losses of the methods; 1000 iterations. (Horizontal axis: CPU; vertical axis: relative loss, logarithmic; series: M1, M2, M3A, M3B.)

[...]

Table 1 Evaluations with different numbers of iterations on CoreDuo.3GHz

      iterations   M1       M2      M3A    M3B
rr    100          -        .405    .      .
      1000         .006     .358    0.40   0.38
      10000        0.0      .353    0.9    0.3
rl    100          -        .35     0.83   0.80
      1000         0.0008   .353    0.6    0.
      10000        0.0005   .353    0.7    0.06

Tables 1 and 2 show the evaluations for 100, 1000 and 10000 iterations on CoreDuo.3GHz and Pentium4 3.4GHz. [...]
Table 2 Evaluations with different numbers of iterations on Pentium4 3.4GHz

      iterations   M1       M2      M3A    M3B
rr    100          -        .8      .8     .8
      1000         7.96     0.6     0.53   0.
      10000        0.80     0.      0.04   0.05
rl    100          -        0.09    0.04   0.03
      1000         0.007    0.09    0.0    0.0
      10000        0.006    0.09    0.0    0.00

Table 3 Computational times of the control methods per iteration

            M1       M2       M3A       M3B
CoreDuo     .67e-6   .e-4     4.69e-4   .90e-3
Pentium4    .0e-6    .36e-4   5.73e-4   .47e-3

Table 4 Matrix sizes where M3A or M3B is advantageous over M2

            M2 vs M3A         M2 vs M3B
            1000    10000     1000    10000
CoreDuo     65      60        0       00
Pentium4    30      45        30      70

The matrix sizes in Table 4 weigh the smaller relative loss of M3A or M3B against their larger control time per iteration in Table 3. For example, on Pentium4 3.4GHz with 1000 iterations, the difference in relative loss between M2 and M3A is compared with the difference 5.73e-4 - .36e-4 in control time [...].

4.

[...]

This work was supported in part by the CREST project ULP-HPC.

References

1) Berry, D. A. and Fristedt, B.: Bandit Problems, Chapman and Hall (1985).
2) Carlin, B. P. and Louis, T. A.: Bayes and Empirical Bayes Methods for Data Analysis, 2nd ed., Chapman and Hall (2000).
3) Frigo, M. and Johnson, S. G.: FFTW: an adaptive software architecture for the FFT, Proc. ICASSP 98, Vol. 3, pp. 1381-1384 (1998).
4) Katagiri, T., et al.: ABCLibScript: A directive to support specification of an auto-tuning facility for numerical software, Parallel Computing, Vol. 32, No. 1, pp. 92-112 (2006).
5) Suda, R.: A Bayesian Method for Online Code Selection: Toward Efficient and Robust Methods of Automatic Tuning, Proc. 2nd Int'l Workshop on Automatic Performance Tuning (iWAPT2007), pp. 3 3 (2007).
6) Vuduc, R., Demmel, J. W., and Bilmes, J. A.: Statistical models for empirical search-based performance tuning, Int'l J. High Perf. Comp. Appl., Vol. 18, No. 1, pp. 65-94 (2004).
7) Whaley, R. C. and Dongarra, J. J.: Automatically Tuned Linear Algebra Software, Proc. SC98 (CD-ROM), (1998).
8) http://olab.is.s.u-tokyo.ac.jp/~reiji/proof.pdf