An improvement in preconditioned algorithm of BiCGStab method

Shoji Itoh,1 Takahiro Katagiri,1 Takao Sakurai,2 Mitsuyoshi Igai,3 Satoshi Ohshima,1 Hisayasu Kuroda4 and Ken Naono2

An improved preconditioned BiCGStab algorithm (improved PBiCGStab) is proposed. A rational preconditioned algorithm of CGS has been constructed by applying the derivation procedure of CGS to the preconditioned BiCG. In order to extend this approach to BiCGStab, the minimum residual part of BiCGStab must be treated logically. The proposed algorithm is also mathematically more rational than the conventional, typical PBiCGStab. Numerical results show the advantages of the improved PBiCGStab.

1 Information Technology Center, The University of Tokyo
2 Central Research Laboratory, Hitachi, Ltd.
3 Hitachi ULSI Systems Co., Ltd.
4 Graduate School of Science and Engineering, Ehime University

1. Introduction

This paper deals with the linear system
$$Ax = b \tag{1}$$
and its solution by the preconditioned BiCGStab method (PBiCGStab) 17) (cf. 6), 8)). BiCGStab, like CGS 14), is derived from BiCG 2),10). For CGS, a rational PCGS has been constructed by applying the derivation procedure of CGS to PBiCG 9). This paper extends that approach to PBiCGStab.
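The derivation of CGS from BiCG carries over to the preconditioned algorithms 5),6). BiCGStab, however, contains an MR (Minimum Residual) part that CGS does not have, and this part must also be handled when the approach is applied to BiCGStab. Section 2 reviews PBiCG and PCGS and their coefficients α and β. Section 3 derives BiCGStab and the improved PBiCGStab from PBiCG. Section 4 reports numerical experiments with PBiCGStab, and Section 5 concludes.

2. BiCG and CGS

For the linear system (1), consider also the shadow system
$$A^{T} x^{\sharp} = b^{\sharp}. \tag{2}$$
BiCG 2),10) solves (1) and (2) simultaneously. CGS 14) is derived from BiCG, and Bi-CGSTAB 17) is obtained by introducing an MR part. The coefficients α and β of CGS are derived from those of BiCG; the same relation is examined below between PBiCG and PCGS, and then for BiCGStab.

2.1 Preconditioned system

With a preconditioner $K = K_L K_R$ ($K \approx A$), the system (1) is converted into
$$\tilde{A}\tilde{x} = \tilde{b}, \qquad \tilde{A} = K_L^{-1} A K_R^{-1}, \quad \tilde{x} = K_R x, \quad \tilde{b} = K_L^{-1} b. \tag{3}$$
System (3) is equivalent to (1). Left preconditioning and right preconditioning of (3) correspond to
$$K_L = K, \quad K_R = I, \tag{4}$$
$$K_L = I, \quad K_R = K, \tag{5}$$
respectively, where $I$ is the identity matrix. That is, $\tilde{A} = K^{-1}A$, $\tilde{x} = x$, $\tilde{b} = K^{-1}b$ for (4), and $\tilde{A} = AK^{-1}$, $\tilde{x} = Kx$, $\tilde{b} = b$ for (5) 1),3),18).

2.2 BiCG

BiCG 2),10) is applied to (1) and (2), the latter involving $A^{T}$. With the initial residuals $r_0 = b - Ax_0$ and $r^{\sharp}_0 = b^{\sharp} - A^{T}x^{\sharp}_0$, the residual vectors are expressed with the residual polynomial $R_k$ as
$$r_k = R_k(A)\, r_0, \tag{6}$$
$$r^{\sharp}_k = R_k(A^{T})\, r^{\sharp}_0. \tag{7}$$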
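The search direction vectors are expressed with the polynomial $P_k$ as
$$p_k = P_k(A)\, r_0, \tag{8}$$
$$p^{\sharp}_k = P_k(A^{T})\, r^{\sharp}_0. \tag{9}$$
BiCG is based on the biorthogonality and biconjugacy conditions
$$\langle r^{\sharp}_i, r_j \rangle = 0 \quad (i \ne j), \tag{10}$$
$$\langle p^{\sharp}_i, A p_j \rangle = 0 \quad (i \ne j). \tag{11}$$
The polynomials in (6)-(9) satisfy
$$R_k(z) = R_{k-1}(z) - \alpha_{k-1}\, z\, P_{k-1}(z), \quad R_0(z) = 1,$$
$$P_k(z) = R_k(z) + \beta_{k-1} P_{k-1}(z), \quad P_0(z) = 1$$
3),10),14),15), which give the recurrences
$$p_k = r_k + \beta_{k-1} p_{k-1}, \qquad p^{\sharp}_k = r^{\sharp}_k + \beta_{k-1} p^{\sharp}_{k-1},$$
$$r_{k+1} = r_k - \alpha_k A p_k, \qquad r^{\sharp}_{k+1} = r^{\sharp}_k - \alpha_k A^{T} p^{\sharp}_k.$$
In (10) and (11), $\langle u, v \rangle$ denotes the bilinear form 4),15), as distinguished from the inner product $(u, v)$. The coefficients α and β of BiCG are
$$\alpha_k = \frac{\langle r^{\sharp}_k, r_k \rangle}{\langle p^{\sharp}_k, A p_k \rangle}, \tag{12}$$
$$\beta_k = \frac{\langle r^{\sharp}_{k+1}, r_{k+1} \rangle}{\langle r^{\sharp}_k, r_k \rangle}. \tag{13}$$

BiCG for (1) and (2) can be regarded as a CG-type method 13),18) for the combined system
$$\breve{A}\breve{x} = \breve{b}, \qquad \breve{A} = \begin{bmatrix} A & O \\ O & A^{T} \end{bmatrix}, \quad \breve{x} = \begin{bmatrix} x \\ x^{\sharp} \end{bmatrix}, \quad \breve{b} = \begin{bmatrix} b \\ b^{\sharp} \end{bmatrix}.$$
Preconditioning this system in the manner of (3) gives
$$\begin{bmatrix} K_L^{-1} A K_R^{-1} & O \\ O & K_R^{-T} A^{T} K_L^{-T} \end{bmatrix} \begin{bmatrix} K_R x \\ K_L^{T} x^{\sharp} \end{bmatrix} = \begin{bmatrix} K_L^{-1} b \\ K_R^{-T} b^{\sharp} \end{bmatrix}, \tag{14}$$
with
$$\begin{bmatrix} \tilde{p} \\ \tilde{p}^{\sharp} \end{bmatrix} = \begin{bmatrix} K_R\, p \\ K_L^{T} p^{\sharp} \end{bmatrix}, \qquad \begin{bmatrix} \tilde{r} \\ \tilde{r}^{\sharp} \end{bmatrix} = \begin{bmatrix} K_L^{-1} r \\ K_R^{-T} r^{\sharp} \end{bmatrix}.$$
Applying BiCG to (14) yields PBiCG. The residuals of the two preconditioned systems are
$$\tilde{r}_k = K_L^{-1} r_k = K_L^{-1} b - \left( K_L^{-1} A K_R^{-1} \right) \left( K_R x_k \right), \tag{15}$$
$$\tilde{r}^{\sharp}_k = K_R^{-T} r^{\sharp}_k = K_R^{-T} b^{\sharp} - \left( K_R^{-T} A^{T} K_L^{-T} \right) \left( K_L^{T} x^{\sharp}_k \right), \tag{16}$$
which reduce to *1
$$r_k = b - A x_k, \tag{17}$$
$$r^{\sharp}_k = b^{\sharp} - A^{T} x^{\sharp}_k. \tag{18}$$
The conditions (10) and (11) become
$$\langle \tilde{r}^{\sharp}_i, \tilde{r}_j \rangle = \langle K_R^{-T} r^{\sharp}_i, K_L^{-1} r_j \rangle = \langle r^{\sharp}_i, K^{-1} r_j \rangle = 0 \quad (i \ne j), \tag{19}$$
$$\langle \tilde{p}^{\sharp}_i, \tilde{A}\tilde{p}_j \rangle = \langle K_L^{T} p^{\sharp}_i, \left( K_L^{-1} A K_R^{-1} \right) \left( K_R p_j \right) \rangle = \langle p^{\sharp}_i, A p_j \rangle = 0 \quad (i \ne j). \tag{20}$$
From (12) and (13), the coefficients of PBiCG are
$$\alpha^{PBiCG}_k = \frac{\langle \tilde{r}^{\sharp}_k, \tilde{r}_k \rangle}{\langle \tilde{p}^{\sharp}_k, \tilde{A}\tilde{p}_k \rangle} = \frac{\langle r^{\sharp}_k, K^{-1} r_k \rangle}{\langle p^{\sharp}_k, A p_k \rangle}, \tag{21}$$
$$\beta^{PBiCG}_k = \frac{\langle \tilde{r}^{\sharp}_{k+1}, \tilde{r}_{k+1} \rangle}{\langle \tilde{r}^{\sharp}_k, \tilde{r}_k \rangle} = \frac{\langle r^{\sharp}_{k+1}, K^{-1} r_{k+1} \rangle}{\langle r^{\sharp}_k, K^{-1} r_k \rangle}. \tag{22}$$
By (21) and (22), $\alpha^{PBiCG}_k$ and $\beta^{PBiCG}_k$ take the same form under either conversion (4) or (5).

2.3 CGS

CGS 14) is derived from BiCG by rewriting the BiCG coefficients α and β for (1) and (2) with (6)-(9), so that $A^{T}$ is not needed.

*1 For (1), the preconditioned residual $K^{-1} r_k = K^{-1}(b - Ax_k)$ may also be used 5)-7).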
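The bilinear forms in (12) and (13) can be rewritten with (6)-(9) as
$$\langle r^{\sharp}_k, r_k \rangle = \langle R_k(A^{T}) r^{\sharp}_0, R_k(A) r_0 \rangle = \langle r^{\sharp}_0, R_k^{2}(A) r_0 \rangle, \tag{23}$$
$$\langle p^{\sharp}_k, A p_k \rangle = \langle P_k(A^{T}) r^{\sharp}_0, A P_k(A) r_0 \rangle = \langle r^{\sharp}_0, A P_k^{2}(A) r_0 \rangle. \tag{24}$$
Defining $r^{CGS}_k \equiv R_k^{2}(A) r_0$ and $p^{CGS}_k \equiv P_k^{2}(A) r_0$ gives
$$\alpha^{BiCG}_k = \frac{\langle r^{\sharp}_k, r_k \rangle}{\langle p^{\sharp}_k, A p_k \rangle} = \frac{\langle r^{\sharp}_0, r^{CGS}_k \rangle}{\langle r^{\sharp}_0, A p^{CGS}_k \rangle} \equiv \alpha^{CGS}_k, \tag{25}$$
$$\beta^{BiCG}_k = \frac{\langle r^{\sharp}_{k+1}, r_{k+1} \rangle}{\langle r^{\sharp}_k, r_k \rangle} = \frac{\langle r^{\sharp}_0, r^{CGS}_{k+1} \rangle}{\langle r^{\sharp}_0, r^{CGS}_k \rangle} \equiv \beta^{CGS}_k. \tag{26}$$
Thus α and β of BiCG and of CGS are equivalent. The recurrences of CGS follow from the squares of the polynomials in (6) and (8).

2.3.1 Conventional PCGS

The conventional PCGS applies the following conversions to CGS:
$$\tilde{A} = K_L^{-1} A K_R^{-1}, \quad \tilde{x} = K_R x, \quad \tilde{b} = K_L^{-1} b, \quad \tilde{p}^{CGS} = K_L^{-1} p^{CGS}, \quad \tilde{r}^{CGS} = K_L^{-1} r^{CGS}, \quad \tilde{r}^{\sharp}_0 = K_L^{T} r^{\sharp}_0. \tag{27}$$
For the CGS coefficient α, this gives
$$\frac{\langle \tilde{r}^{\sharp}_0, \tilde{r}^{CGS}_k \rangle}{\langle \tilde{r}^{\sharp}_0, \tilde{A}\tilde{p}^{CGS}_k \rangle} = \frac{\langle K_L^{T} r^{\sharp}_0, K_L^{-1} r^{CGS}_k \rangle}{\langle K_L^{T} r^{\sharp}_0, \left( K_L^{-1} A K_R^{-1} \right) \left( K_L^{-1} p^{CGS}_k \right) \rangle} = \frac{\langle r^{\sharp}_0, r^{CGS}_k \rangle}{\langle r^{\sharp}_0, A K^{-1} p^{CGS}_k \rangle}, \tag{28}$$
in which the operation $AK^{-1}p^{CGS}_k$ appears, as in the standard preconditioned algorithms 1),3),17). The CGS coefficient β is treated likewise, and (28) holds for (27) under either (4) or (5).

With $\tilde{r}^{\sharp} = K_L^{T} r^{\sharp}$ of (27), the shadow residual of the preconditioned system satisfies
$$K_L^{T} r^{\sharp}_k = K_R^{-T} b^{\sharp} - \left( K_R^{-T} A^{T} K_L^{-T} \right) \left( K_L^{T} x^{\sharp}_k \right) \iff r^{\sharp}_k = K^{-T} \left( b^{\sharp} - A^{T} x^{\sharp}_k \right), \tag{29}$$
which differs from (18), the shadow residual of the BiCG on which $K$ is applied. To recover (18) with $\tilde{r}^{\sharp} = K_L^{T} r^{\sharp}$, the shadow system would have to take the form
$$K_L^{T} r^{\sharp}_k = K_L^{T} b^{\sharp} - \left( K_L^{T} A^{T} X^{-1} \right) \left( X x^{\sharp}_k \right) \tag{30}$$
with some nonsingular matrix $X$ 9), whose coefficient matrix is not the $\tilde{A}^{T}$ of (14). Consequently, the relations (23) and (24) are not carried over, and the equivalences of (25) and (26) between $\alpha^{PBiCG}$, $\beta^{PBiCG}$ and $\alpha^{PCGS}$, $\beta^{PCGS}$ are not preserved 9).

2.3.2 Improved PCGS

The improved PCGS is derived by applying the derivation of CGS to PBiCG. The conversions are
$$\tilde{A} = K_L^{-1} A K_R^{-1}, \quad \tilde{x} = K_R x, \quad \tilde{b} = K_L^{-1} b, \quad \tilde{p}^{CGS} = K_R\, p^{CGS}, \quad \tilde{r}^{CGS} = K_L^{-1} r^{CGS}, \quad \tilde{r}^{\sharp}_0 = K_R^{-T} r^{\sharp}_0 \tag{31}$$
9). The shadow residual then satisfies $K_R^{-T} r^{\sharp}_k = K_R^{-T} b^{\sharp} - \left( K_R^{-T} A^{T} K_L^{-T} \right) \left( K_L^{T} x^{\sharp}_k \right)$, which reduces to (18) and is consistent with $\tilde{A}$. Applying (31) to (25) and (26) gives
$$\alpha^{PBiCG}_k = \frac{\langle \tilde{r}^{\sharp}_k, \tilde{r}_k \rangle}{\langle \tilde{p}^{\sharp}_k, \tilde{A}\tilde{p}_k \rangle} = \frac{\langle \tilde{r}^{\sharp}_0, \tilde{r}^{CGS}_k \rangle}{\langle \tilde{r}^{\sharp}_0, \tilde{A}\tilde{p}^{CGS}_k \rangle} = \frac{\langle K_R^{-T} r^{\sharp}_0, K_L^{-1} r^{CGS}_k \rangle}{\langle K_R^{-T} r^{\sharp}_0, \left( K_L^{-1} A K_R^{-1} \right) \left( K_R\, p^{CGS}_k \right) \rangle} = \frac{\langle r^{\sharp}_0, K^{-1} r^{CGS}_k \rangle}{\langle r^{\sharp}_0, K^{-1} A p^{CGS}_k \rangle} \equiv \alpha^{PCGS}_k,$$
which corresponds to (21). Likewise, $\beta^{PBiCG}_k \equiv \beta^{PCGS}_k$.

3. BiCGStab and PBiCGStab

This section derives PBiCGStab from PBiCG in the same manner as PCGS.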
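As in the derivation of PCGS from PBiCG under the conversions (3)-(5) 5),6), the coefficients α and β of PBiCGStab are obtained from those of PBiCG. In addition, PBiCGStab has the parameter ω, which must also be treated.

3.1 BiCGStab

BiCGStab introduces the parameters ω through the polynomial
$$S_k(z) = (1 - \omega_{k-1} z)(1 - \omega_{k-2} z) \cdots (1 - \omega_0 z) \tag{32}$$
and uses $s^{\sharp}_k \equiv S_k(A^{T}) r^{\sharp}_0$ in place of $r^{\sharp}_k$ of BiCG. By (32), $s^{\sharp}_k$ can be expanded as
$$s^{\sharp}_k = \frac{lc(S_k)}{lc(R_k)}\, r^{\sharp}_k + d_{k-1} r^{\sharp}_{k-1} + \cdots + d_1 r^{\sharp}_1 + d_0 r^{\sharp}_0, \tag{33}$$
where lc denotes the leading coefficient of a polynomial, with $lc(R_{k+1}) = -\alpha_k\, lc(R_k)$ and $lc(S_{k+1}) = -\omega_k\, lc(S_k)$. Writing $c_k \equiv lc(S_k)/lc(R_k)$ and applying (10) to (33),
$$\langle s^{\sharp}_k, r_k \rangle = c_k \langle r^{\sharp}_k, r_k \rangle + d_{k-1} \langle r^{\sharp}_{k-1}, r_k \rangle + \cdots + d_0 \langle r^{\sharp}_0, r_k \rangle = c_k \langle r^{\sharp}_k, r_k \rangle,$$
so that
$$\langle r^{\sharp}_k, r_k \rangle = \frac{1}{c_k} \langle s^{\sharp}_k, r_k \rangle = \frac{1}{c_k} \langle S_k(A^{T}) r^{\sharp}_0, R_k(A) r_0 \rangle = \frac{1}{c_k} \langle r^{\sharp}_0, S_k(A) R_k(A) r_0 \rangle. \tag{34}$$
For the search directions, $p^{\sharp}_k = r^{\sharp}_k + \beta_{k-1} p^{\sharp}_{k-1}$ of BiCG and (11) give
$$\langle p^{\sharp}_k, A p_k \rangle = \langle r^{\sharp}_k, A p_k \rangle + \beta_{k-1} \langle p^{\sharp}_{k-1}, A p_k \rangle = \langle r^{\sharp}_k, A p_k \rangle, \tag{35}$$
and $\langle r^{\sharp}_i, A p_j \rangle = \langle p^{\sharp}_i - \beta_{i-1} p^{\sharp}_{i-1}, A p_j \rangle = 0$ for $i$ smaller than $j$. Hence, in $\langle s^{\sharp}_k, A p_k \rangle$ with (33), the terms $d_i r^{\sharp}_i$ $(i = k-1, \ldots, 0)$ vanish and only the term $\left( lc(S_k)/lc(R_k) \right) r^{\sharp}_k$ remains by (10) and (11):
$$\langle p^{\sharp}_k, A p_k \rangle = \frac{1}{c_k} \langle s^{\sharp}_k, A p_k \rangle = \frac{1}{c_k} \langle S_k(A^{T}) r^{\sharp}_0, A P_k(A) r_0 \rangle = \frac{1}{c_k} \langle r^{\sharp}_0, A S_k(A) P_k(A) r_0 \rangle. \tag{36}$$
Defining
$$r^{STAB}_k \equiv S_k(A) R_k(A) r_0, \tag{37}$$
$$p^{STAB}_k \equiv S_k(A) P_k(A) r_0, \tag{38}$$
(34) and (36) give
$$\alpha^{BiCG}_k = \frac{\langle r^{\sharp}_k, r_k \rangle}{\langle p^{\sharp}_k, A p_k \rangle} = \frac{\langle r^{\sharp}_0, r^{STAB}_k \rangle}{\langle r^{\sharp}_0, A p^{STAB}_k \rangle} \equiv \alpha^{STAB}_k, \tag{39}$$
$$\beta^{BiCG}_k = \frac{\langle r^{\sharp}_{k+1}, r_{k+1} \rangle}{\langle r^{\sharp}_k, r_k \rangle} = \frac{\alpha_k}{\omega_k} \cdot \frac{\langle r^{\sharp}_0, r^{STAB}_{k+1} \rangle}{\langle r^{\sharp}_0, r^{STAB}_k \rangle} \equiv \beta^{STAB}_k. \tag{40}$$
Thus α and β of BiCG and of BiCGStab are equivalent 15), as in the case of CGS. The parameter ω in (40) is
$$\omega_k = \frac{(A s_k, s_k)}{(A s_k, A s_k)}, \tag{41}$$
where $(u, v)$ is the inner product, as distinguished from $\langle u, v \rangle$ used for BiCG and CGS. The value (41) minimizes the 2-norm of the BiCGStab residual $r^{STAB}_{k+1}$ (the MR part):
$$r^{STAB}_{k+1} = s_k - \omega_k A s_k. \tag{42}$$

The conventional preconditioning of BiCGStab applies the following conversions to (37) and (38) 3),17):
$$\tilde{A} = K_L^{-1} A K_R^{-1}, \quad \tilde{x} = K_R x, \quad \tilde{b} = K_L^{-1} b, \quad \tilde{p}^{STAB} = K_L^{-1} p^{STAB}, \quad \tilde{s} = K_L^{-1} s, \quad \tilde{r}^{STAB} = K_L^{-1} r^{STAB}, \quad \tilde{r}^{\sharp}_0 = K_L^{T} r^{\sharp}_0. \tag{43}$$
For α, this gives
$$\frac{\langle \tilde{r}^{\sharp}_0, \tilde{r}^{STAB}_k \rangle}{\langle \tilde{r}^{\sharp}_0, \tilde{A}\tilde{p}^{STAB}_k \rangle} = \frac{\langle K_L^{T} r^{\sharp}_0, K_L^{-1} r^{STAB}_k \rangle}{\langle K_L^{T} r^{\sharp}_0, \left( K_L^{-1} A K_R^{-1} \right) \left( K_L^{-1} p^{STAB}_k \right) \rangle} = \frac{\langle r^{\sharp}_0, r^{STAB}_k \rangle}{\langle r^{\sharp}_0, A K^{-1} p^{STAB}_k \rangle} \equiv \alpha^{PSTAB}_k, \tag{44}$$
in which $AK^{-1}p^{STAB}_k$ appears as in the conventional PCGS. The parameter ω of the preconditioned system is
$$\tilde{\omega}_k = \frac{(\tilde{A}\tilde{s}_k, \tilde{s}_k)}{(\tilde{A}\tilde{s}_k, \tilde{A}\tilde{s}_k)}, \tag{45}$$
which is examined next for the two-sided conversion (3) and the one-sided conversions (4) and (5).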
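For the preconditioned system, (42) becomes
$$\tilde{r}^{STAB}_{k+1} = \tilde{s}_k - \tilde{\omega}_k \tilde{A} \tilde{s}_k. \tag{46}$$
For two-sided (b), left (l) and right (r) preconditioning, (46) reads
$$K_L^{-1} r^{STAB}_{k+1} = K_L^{-1} s_k - \omega^{b}_k \left( K_L^{-1} A K_R^{-1} \right) \left( K_L^{-1} s_k \right), \tag{47}$$
$$K^{-1} r^{STAB}_{k+1} = K^{-1} s_k - \omega^{l}_k \left( K^{-1} A \right) \left( K^{-1} s_k \right), \tag{48}$$
$$r^{STAB}_{k+1} = s_k - \omega^{r}_k \left( A K^{-1} \right) s_k, \tag{49}$$
with
$$\omega^{b}_k = \frac{\left( K_L^{-1} A K^{-1} s_k, K_L^{-1} s_k \right)}{\left( K_L^{-1} A K^{-1} s_k, K_L^{-1} A K^{-1} s_k \right)}, \quad \omega^{l}_k = \frac{\left( K^{-1} A K^{-1} s_k, K^{-1} s_k \right)}{\left( K^{-1} A K^{-1} s_k, K^{-1} A K^{-1} s_k \right)}, \quad \omega^{r}_k = \frac{\left( A K^{-1} s_k, s_k \right)}{\left( A K^{-1} s_k, A K^{-1} s_k \right)}.$$
The values $\omega^{b}_k$, $\omega^{l}_k$ and $\omega^{r}_k$ obtained from (45) for (47)-(49) differ in general, although for a given ω all of (47)-(49) reduce to
$$r^{STAB}_{k+1} = s_k - \omega_k \left( A K^{-1} \right) s_k. \tag{50}$$
Since $\beta^{PSTAB}$ contains ω, the behavior of PBiCGStab depends on which of $\omega^{b}$, $\omega^{l}$ and $\omega^{r}$ is used in the MR part.

3.2 Conventional PBiCGStab

Among the ω of (47)-(49), $\omega^{r}$ minimizes the 2-norm of $r^{STAB}_{k+1}$ itself. This corresponds to the right preconditioning (5) 18), in which (1) is converted into
$$\left( A K^{-1} \right) (K x) = b. \tag{51}$$
The conventional PBiCGStab 1),3),17),18) follows from (43) with (5), that is,
$$\tilde{p}^{STAB} = p^{STAB}, \quad \tilde{s} = s, \quad \tilde{r}^{STAB} = r^{STAB}, \quad \tilde{r}^{\sharp}_0 = r^{\sharp}_0. \tag{52}$$
The conventional PBiCGStab is shown in Algorithm 1 3), where the superscript STAB is omitted.

Algorithm 1. Conventional PBiCGStab:
  $x_0$ is an initial guess, $r_0 = b - A x_0$, set $r^{\sharp}_0$ such that $\langle r^{\sharp}_0, r_0 \rangle \ne 0$, e.g., $r^{\sharp}_0 = r_0$, $\beta_{-1} = 0$,
  For $k = 0, 1, 2, \ldots$, until convergence, Do:
    $p_k = r_k + \beta_{k-1} \left( p_{k-1} - \omega_{k-1} A K^{-1} p_{k-1} \right)$,
    $\alpha_k = \langle r^{\sharp}_0, r_k \rangle / \langle r^{\sharp}_0, A K^{-1} p_k \rangle$,
    $s_k = r_k - \alpha_k A K^{-1} p_k$,
    $\omega_k = \left( A K^{-1} s_k, s_k \right) / \left( A K^{-1} s_k, A K^{-1} s_k \right)$,
    $x_{k+1} = x_k + \alpha_k K^{-1} p_k + \omega_k K^{-1} s_k$,
    $r_{k+1} = s_k - \omega_k A K^{-1} s_k$,
    $\beta_k = (\alpha_k / \omega_k) \cdot \langle r^{\sharp}_0, r_{k+1} \rangle / \langle r^{\sharp}_0, r_k \rangle$,
  End Do

Alg. 1 requires two preconditioning operations per iteration, $K^{-1} p_k$ and $K^{-1} s_k$.

3.3 Improved PBiCGStab

For the systems (1) and (2), the preconditioned system (14)-(16) of BiCG under the right preconditioning (5) gives the shadow system
$$\left( K^{-T} A^{T} \right) x^{\sharp} = K^{-T} b^{\sharp}, \tag{53}$$
and PBiCG is based on (53).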
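Table 2 Preconditioned residual vectors, coefficient matrices and residuals of PBiCGStab under right preconditioning.

  Equations     Preconditioned residual      Coefficient matrix    Residual
  (17)          $\tilde{r} = r$                      $A K^{-1}$            $r = b - A x$
  (54), (55)    $\tilde{r}^{\sharp} = K^{-T} r^{\sharp}$     $K^{-T} A^{T}$        $r^{\sharp} = b^{\sharp} - A^{T} x^{\sharp}$
  (29), (56)    $\tilde{r}^{\sharp} = r^{\sharp}$            $K^{-T} A^{T}$        $r^{\sharp} = K^{-T} (b^{\sharp} - A^{T} x^{\sharp})$
  (18), (56)    $\tilde{r}^{\sharp} = r^{\sharp}$            $A^{T} K^{-T}$        $r^{\sharp} = b^{\sharp} - A^{T} x^{\sharp}$

The shadow residual of (53) is
$$K^{-T} r^{\sharp}_k = K^{-T} b^{\sharp} - \left( K^{-T} A^{T} \right) x^{\sharp}_k, \tag{54}$$
that is,
$$\tilde{r}^{\sharp}_k = K^{-T} r^{\sharp}_k, \tag{55}$$
which is consistent with (18). In contrast, (52) of the conventional algorithm gives
$$\tilde{r}^{\sharp}_k = r^{\sharp}_k, \tag{56}$$
which corresponds to (29). Table 2 compares (55) and (56) together with the coefficient matrices involving $A^{T}$ and $K^{-T}$. The relations (34) and (36), which involve $S_k(A^{T})$ and therefore the polynomial $S_k(z)$ of ω, hold for the preconditioned system when (55) is used. Then $\alpha^{PSTAB}$ and $\beta^{PSTAB}$ are derived from $\alpha^{PBiCG}$ and $\beta^{PBiCG}$, in the same way as α and β of the improved PCGS 6),9).

For (51), the improved PBiCGStab uses the conversions
$$\tilde{p}^{STAB} = K p^{STAB}, \quad \tilde{s} = s, \quad \tilde{r}^{STAB} = r^{STAB}, \quad \tilde{r}^{\sharp}_0 = K^{-T} r^{\sharp}_0. \tag{57}$$
Applying (57) to (39) and (40) gives
$$\alpha^{PBiCG}_k = \frac{\langle \tilde{r}^{\sharp}_k, \tilde{r}_k \rangle}{\langle \tilde{p}^{\sharp}_k, \tilde{A}\tilde{p}_k \rangle} = \frac{\langle \tilde{r}^{\sharp}_0, \tilde{r}^{STAB}_k \rangle}{\langle \tilde{r}^{\sharp}_0, \tilde{A}\tilde{p}^{STAB}_k \rangle} = \frac{\langle K^{-T} r^{\sharp}_0, r^{STAB}_k \rangle}{\langle K^{-T} r^{\sharp}_0, \left( A K^{-1} \right) \left( K p^{STAB}_k \right) \rangle} = \frac{\langle r^{\sharp}_0, K^{-1} r^{STAB}_k \rangle}{\langle r^{\sharp}_0, K^{-1} A p^{STAB}_k \rangle} \equiv \alpha^{PSTAB}_k, \tag{58}$$
$$\beta^{PBiCG}_k = \frac{\langle \tilde{r}^{\sharp}_{k+1}, \tilde{r}_{k+1} \rangle}{\langle \tilde{r}^{\sharp}_k, \tilde{r}_k \rangle} = \frac{\alpha_k}{\omega_k} \cdot \frac{\langle \tilde{r}^{\sharp}_0, \tilde{r}^{STAB}_{k+1} \rangle}{\langle \tilde{r}^{\sharp}_0, \tilde{r}^{STAB}_k \rangle} = \frac{\alpha_k}{\omega_k} \cdot \frac{\langle r^{\sharp}_0, K^{-1} r^{STAB}_{k+1} \rangle}{\langle r^{\sharp}_0, K^{-1} r^{STAB}_k \rangle} \equiv \beta^{PSTAB}_k. \tag{59}$$
The coefficients α and β of the improved PBiCGStab are thus equivalent to those of PBiCG, (21) and (22). The improved PBiCGStab is shown in Algorithm 2, where the superscript STAB is omitted.

Algorithm 2. Improved PBiCGStab:
  $x_0$ is an initial guess, $r_0 = b - A x_0$, set $r^{\sharp}_0$ such that $\langle r^{\sharp}_0, K^{-1} r_0 \rangle \ne 0$, e.g., $r^{\sharp}_0 = K^{-1} r_0$, $\beta_{-1} = 0$,
  For $k = 0, 1, 2, \ldots$, until convergence, Do:
    $p_k = K^{-1} r_k + \beta_{k-1} \left( p_{k-1} - \omega_{k-1} K^{-1} A p_{k-1} \right)$,
    $\alpha_k = \langle r^{\sharp}_0, K^{-1} r_k \rangle / \langle r^{\sharp}_0, K^{-1} A p_k \rangle$,   (60)
    $s_k = r_k - \alpha_k A p_k$,  $K^{-1} s_k = K^{-1} r_k - \alpha_k K^{-1} A p_k$,   (61)
    $\omega_k = \left( A K^{-1} s_k, s_k \right) / \left( A K^{-1} s_k, A K^{-1} s_k \right)$,
    $x_{k+1} = x_k + \alpha_k p_k + \omega_k K^{-1} s_k$,
    $r_{k+1} = s_k - \omega_k A K^{-1} s_k$,
    $\beta_k = (\alpha_k / \omega_k) \cdot \langle r^{\sharp}_0, K^{-1} r_{k+1} \rangle / \langle r^{\sharp}_0, K^{-1} r_k \rangle$,   (62)
  End Do

Alg. 2 contains three preconditioned vectors, $K^{-1} s_k$, $K^{-1} r_k$ and $K^{-1} A p_k$. However, $K^{-1} s_k$ is obtained by the vector update (61) from $K^{-1} r_k$ and $K^{-1} A p_k$, and $K^{-1} r_{k+1}$ computed for (62) is reused as $K^{-1} r_k$ in (60) of the next iteration; only for $k = 0$ is $K^{-1} r_0$ computed in advance. Alg. 2 therefore also requires two preconditioning operations per iteration.

4. Numerical experiments

The improved PBiCGStab was evaluated on test matrices taken from Tim Davis's collection 16) and Matrix Market 12). The right-hand side $b$ of (1) was generated so that all elements of the exact solution are 1.0.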
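The algorithms were implemented with Lis (Library of Iterative Solvers for Linear Systems) 11), version 1.1.2, and compiled with the options of the Lis Makefile. The initial guess was $x_0 = 0$. The initial shadow residual, which must satisfy the nonzero condition stated in each algorithm, was $r^{\sharp}_0 = r_0$ for Alg. 1 and $r^{\sharp}_0 = K^{-1} r_0$ for Alg. 2. The stopping criterion was $\|r_{k+1}\|_2 / \|b\|_2 \le 1.0 \times 10^{-12}$.

The experiments were run on a DELL Precision T7400 (Intel Xeon E5420, 2.5 GHz, MEM: 16 GB) with CentOS (Kernel 2.6.18) and Intel icc 10.1. The preconditioner was ILU(0).

Table 3 shows the results of BiCGStab with ILU(0). N is the dimension of the matrix and NNZ is the number of nonzero elements. For Conventional (Alg. 1) and Improved (Alg. 2), the table lists the number of iterations (Iter.), the log10 of the relative residual 2-norm of the obtained solution, and the computation time in seconds (Time).

For jpwh_991, Alg. 1 broke down whereas Alg. 2 converged (cf. the results of ILU(0)-BiCGStab in 8), 9)). For cryg10000 and olm5000, Alg. 1 did not converge whereas Alg. 2 converged. For cryg2500, both algorithms converged, and Alg. 2 required fewer iterations (Iter.).

Fig. 2 Convergence history (cryg2500). Vertical axis: algorithm's relative residual 2-norm (log scale); horizontal axis: iteration number; curves: Conventional, Improved.

Fig. 3 Convergence history (cryg10000). Vertical axis: algorithm's relative residual 2-norm (log scale); horizontal axis: iteration number; curves: Conventional, Improved.

Fig. 4 Convergence history (fs_760_3). Vertical axis: algorithm's relative residual 2-norm (log scale); horizontal axis: iteration number; curves: Conventional, Improved.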
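The two right-preconditioned algorithms compared in this section (Alg. 1 and Alg. 2) can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the Lis implementation used for the experiments: the function name `pbicgstab`, the callback `K_solve`, the small tridiagonal test matrix and the Jacobi preconditioner (used here in place of ILU(0)) are all hypothetical choices made to keep the example self-contained. No breakdown handling is included.

```python
import numpy as np

def pbicgstab(A, b, K_solve, improved, tol=1e-12, maxiter=500):
    """Right-preconditioned BiCGStab: Alg. 1 (improved=False) or Alg. 2 (improved=True).
    K_solve(v) is assumed to return K^{-1} v. Returns (x, number of iterations)."""
    x = np.zeros_like(b)
    r = b - A @ x
    Kr = K_solve(r)                            # K^{-1} r_0 (only Alg. 2 uses it)
    rs = Kr.copy() if improved else r.copy()   # shadow residual: K^{-1} r_0 or r_0
    p = np.zeros_like(b)                       # p_{-1}
    v = np.zeros_like(b)                       # A K^{-1} p_{-1} or K^{-1} A p_{-1}
    beta, omega = 0.0, 0.0
    rho = rs @ (Kr if improved else r)
    bnorm = np.linalg.norm(b)
    for k in range(maxiter):
        if improved:
            p = Kr + beta * (p - omega * v)
            Ap = A @ p
            v = K_solve(Ap)                    # K^{-1} A p_k
            alpha = rho / (rs @ v)             # (60)
            s = r - alpha * Ap
            Ks = Kr - alpha * v                # (61): vector update, no extra solve
            step = p
        else:
            p = r + beta * (p - omega * v)
            Kp = K_solve(p)
            v = A @ Kp                         # A K^{-1} p_k
            alpha = rho / (rs @ v)
            s = r - alpha * v
            Ks = K_solve(s)
            step = Kp
        t = A @ Ks                             # A K^{-1} s_k
        omega = (t @ s) / (t @ t)              # MR step on the residual itself
        x = x + alpha * step + omega * Ks
        r = s - omega * t
        if np.linalg.norm(r) / bnorm <= tol:
            return x, k + 1
        if improved:
            Kr = K_solve(r)                    # reused as K^{-1} r_k in the next sweep
        rho_new = rs @ (Kr if improved else r)
        beta = (alpha / omega) * rho_new / rho # (62) for Alg. 2
        rho = rho_new
    return x, maxiter

# Hypothetical test problem: diagonally dominant nonsymmetric tridiagonal matrix,
# exact solution of all ones, Jacobi preconditioner K = diag(A).
n = 50
A = (np.diag(4.0 + np.arange(n) % 3)
     + np.diag(-1.0 * np.ones(n - 1), 1)
     + np.diag(-2.0 * np.ones(n - 1), -1))
b = A @ np.ones(n)
d = np.diag(A).copy()
jacobi = lambda vec: vec / d

x1, it1 = pbicgstab(A, b, jacobi, improved=False)
x2, it2 = pbicgstab(A, b, jacobi, improved=True)
print("Alg. 1:", it1, "iterations, log10 true rel. res. =",
      np.log10(np.linalg.norm(b - A @ x1) / np.linalg.norm(b)))
print("Alg. 2:", it2, "iterations, log10 true rel. res. =",
      np.log10(np.linalg.norm(b - A @ x2) / np.linalg.norm(b)))
```

Both branches perform two preconditioner solves per iteration; they differ in the initial shadow residual, in where $K^{-1}$ enters α and β, and in whether the search direction is stored before or after preconditioning. On such an easy problem the two variants are expected to behave similarly; the differences reported in this section appear on harder matrices.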
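Table 3 Numerical results of BiCGStab with ILU(0) preconditioning.

                                 Conventional (Alg. 1)             Improved (Alg. 2)
  Matrix      N       NNZ        Iter.   log10(res.)  Time [sec]   Iter.   log10(res.)  Time [sec]
  cryg2500    2500    12349      314     -7.88        7.46e-2      119     -10.62       3.02e-2
  cryg10000   10000   49699      No convergence                    524     -9.55        5.43e-1
  fs_760_2    760     5739       102     -12.07       9.60e-3      149     -12.32       1.40e-2
  fs_760_3    760     5816       1938    -12.77       1.71e-1      1080    -12.23       9.67e-2
  jpwh_991    991     6027       Breakdown                         18      -13.35       3.04e-3
  memplus     17758   99147      376     -12.21       1.00e0       342     -12.00       9.10e-1
  olm5000     5000    19996      No convergence                    27      -12.07       1.21e-2
  raefsky3    21200   1488768    120     -12.29       4.83e0       92      -12.35       3.83e0
  watt_2      1856    11550      144     -12.38       3.05e-2      139     -12.01       3.01e-2

Fig. 5 Convergence history (olm5000). Vertical axis: algorithm's relative residual 2-norm (log scale); horizontal axis: iteration number; curves: Conventional, Improved.

5. Conclusion

An improved PBiCGStab has been proposed. In the same way that the improved PCGS is obtained by applying the derivation of CGS from BiCG to PBiCG, the coefficients α and β of the improved PBiCGStab are derived from those of PBiCG and PCGS, and the parameter ω of the MR part is treated consistently with that derivation. Numerical experiments showed the advantages of the improved PBiCGStab over the conventional one.

Future work includes applying this approach to PGPBiCG and PBiCGStab(l), and the use of the BiCGStab of Alg. 1 and Alg. 2 in Xabclib 19).

Acknowledgments  This work was supported by Grants-in-Aid for Scientific Research (B) 21300007 and 21300017.

References

1) Barrett, R., et al., Templates for the solution of linear systems: Building Blocks for Iterative Methods, SIAM, (1994). Japanese translation: Templates (1996).
2) Fletcher, R., Conjugate Gradient Methods for Indefinite Systems, Numerical Analysis Dundee 1975, ed. by Watson, G., Lecture Notes in Mathematics, 506, Springer-Verlag, pp.73-89 (1976).
3) (1996).
4) II (1997).
5) (2009).
6) Vol.15, pp.171-174 (2010).
7)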
(ACS), Vol.3, No.2, pp.9-19 (2010).
8) Itoh, S. and Sugihara, M., Systematic performance evaluation of linear solvers using quality control techniques, Software Automatic Tuning: From Concepts to State-of-the-Art Results (eds. Naono, K., Teranishi, K., Cavazos, J. and Suda, R.), pp.135-152, Springer, (2010).
9) CGS, Vol.2011-HPC-130, No.46, pp.1-10 (2011).
10) Lanczos, C., Solution of Systems of Linear Equations by Minimized Iterations, J. of Res. Nat. Bur. of Standards, 49, pp.33-53 (1952).
11) http://www.ssisc.org/lis/
12) http://math.nist.gov/MatrixMarket/
13) BCG, CGS, 613, pp.135-143 (1987).
14) Sonneveld, P., CGS, A fast Lanczos-type solver for nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 10(1), pp.36-52 (1989).
15) (2009).
16) http://www.cise.ufl.edu/research/sparse/matrices/
17) Van der Vorst, Henk A., BI-CGSTAB: A fast and smoothly converging variant of BI-CG for the solution of nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 13(2), pp.631-644 (1992).
18) Van der Vorst, Henk A., Iterative Krylov Methods for Large Linear Systems, Cambridge University Press, (2003).
19) Xabclib project: http://www.abc-lib.org/xabclib