Vol13, No2 2002 Journal of Software 1000-9825/2002/13(02)0250-08,,, (, 110004) E-mail: {wanggr,yuge}@mailneueducn http://wwwneueducn :,,, : ; ; ; : TP311 : A,,,,, :(1),(2),,, [1],, [2],, : ; ;,, [3] 4 : (SP) (SE) (RD) (FP) (SP)SP,,, : 2001-04-20; : 2001-09-24 : ; ; : (1976),,,, ; (1975),,,, ; (1966),,,,,,,Web ; (1962),,,,,,,
: 251 (SE)SE,,, (RD)RD,, SP (FP)FP SE RD,,, 1 2 3 4 5 1 [4] 11 20 60, Michigan John Holland Darwin Mendel,,,,, ;,,,, :(1), (2), (3) ( ), (4) 3, 12 8 :SGA =(C, E, P 0, M, Φ, Γ, Ψ, T), C,E ;P 0 ;M ;Φ ;Γ ;Ψ ;T 13 (1), : (2), (3), :
252 Journal of Software 2002,13(2) (4) p c (),,, (5) p m,,, 2 4 21,,, 0, 0 1 5, 4 Steinbrunn, 0, Fig1 C5 C2C3 C1 C4 ( 0, 0, 5, 2, 0, 3, 0, 1, 4 ) Join tree corresponds to non-zero integer 1 N, (2N-1), (0,0,1,2,3,4,5,0,0), 22, : N, C i, S, AvailMem, SS Fcost(N,C i,s,availmem,ss), 23, 3 [5] 3 (1) 1 [5],,, (2),,
: 253,,, ( ) : } Function CrossOver( ) {, do{ Psize --, }while(psize>0) (p i,p i+1 ), do{ /*Psize */ NUM (NUM<2N-1), HasSubTreeFlag1=findsubtree(p i,num,subtree1), HasSubTreeFlag2=findsubtree(p i+1,num,subtree2), /*HasSubTreeFlag1 HashSubTreeFlag2 */ }while(!hassubtreeflag1!hassubtreeflag2), /* NUM subtree1,subtree2*/ exchange(p i,subtree1,p i+1,subtree2), (3),,, : Function mutation( ) { do { Msize --; p i, do{ /*Msize */ num1=rand( ); num2=rand( ); }while(iszero(p i,num1) iszero(p i,num2) num1==num2 ); /* */ exchange(p i,num1,num2); }while( Msize>0 ); } 24,,: L, pop_size, p c, p m, T, (1) L N,L=2N 1 (2) pop_size pop_size,,,pop_size, 20-100 (3) p c, p c p c =04 (4) p m,,
254 Journal of Software 2002,13(2),p m, 00001~01,, 01 (5) T :(a) ;(b), 3,,,, 4 (SP,SE,RD,FP),,, 2 Begin Randomly generate pop_size individuals as initial population Generate y new individuals by probabilistically selecting via genetic operators Generate 4*(pop_size+y) new individuals (v, x)(x ---scheduling) by using four kinds of heurists Compute the fitness for each individual in P(t), generate P(t+1) by selecting pop_size smallest fitness individuals no end yes exit Fig2 Flow chart of parallel schedule based on genetic algorithm 2 4,, :(1),(2) 4
: 255 41 P c P m pop_size, 3 4 (12345101520 ) 3 0406 08, P c 04,,P c 08,,,P c 06,, 04 08,P c,,p c,,,,p c 06 4 pop_size pop_size p c, pop_size 60 80, 40, 60 80,pop_size,,,,pop_size,,, Cost 790 785 780 775 770 Pc=04 Pc=06 Pc=08 765 1 2 3 4 5 10 15 20 Time(s) 765 Cost 825 815 805 795 785 775 1 2 3 4 5 10 15 20 Time(s) Pop_size=40 Pop_size=60 Pop_size=80 Fig3 Performance for different P c Fig 4 Performance for different pop_size 3 4 42 4, 8 5 8 6 7 1 4 5 2 3 1 2 3 4 5 8 6 7 2 1 4 36 57 8 Fig5 Left-Deep tree,right-deep tree and bushy tree 5 6~8,x,y 6 4 ( ),,,,,,,,,
256 Journal of Software 2002,13(2) Time 2500 2000 1500 1000 500 7 4, 3, 3, 3, 0, 16 32 48 64 100 120 Processors Fig6 Performance for bushy tree, 6,, 2 Sequance Sync Seg Fp 8,, 2,,,, Time 2500 2000 Sequance 1500 Sync 40000 1000 Seg 30000 Fp 20000 500 10000 0 16 32 48 64 100 120 0 Processors Time 70000 60000 50000 Sequance Sync Seg Fp 16 32 48 64 100 120 Processors Fig7 Performance of left-deep tree Fig8 Performance of right-deep tree 7 8 5, References: [1] Chen, MS, Yu, PS, Wu, KL Optimization of parallel execution for multi-join queries IEEE Transactions on Knowledge and Data Engineering, 1996,8(3):416428 [2] Steinbrunn, M, Moerkotte, G, Kemper, A Heuristic and randomized optimization for the join ordering problem VLDB Journal, 1997,6(3):191208 [3] Wilschut, AN, Flokstra, J, Apers, PMG Parallel evaluation of multi-join queries In: Michael, JC, Donovan, AS, eds Proceedings of the ACM-SIGMOD 95 San Jose, CA: Academic Press, 1995 115126 [4] Chen, MS, Yu, PS, Wu, KL Scheduling and processor allocation for parallel execution of multi-join queries In: Proceedings of the 8th International Conference on Data Engineering Arizona: ICS Press, 1992 5867 [5] Zhou Ming, Sun Shu-dong The Principle and Application of Genetic Algorithm Beijing: National Defense Industry Press, 1999 (in Chinese)
: 257 : [5], :, 1999 Parallel Query Optimization Techniques for Multi-Join Expressions Based on Genetic Algorithm CAO Yang, FANG Qiang, WANG Guo-ren, YU Ge (School of Information Science and Engineering, Northeastern University, Shenyang 110004, China) E-mail: {wanggr,yuge}@mailneueducn http://wwwneueducn Abstract: The parallel query optimization for multi-join expressions is one of the key factors to improve the performance of database systems In this paper, an approach to solve the problems of the parallel query optimization for multi-join expressions by adopting GA algorithms is proposed To improve the execution efficiency of the query processors, the authors exploit heuristics to seek the optimum parallel scheduling execution plan for multi-join expressions The detailed testing results and performance analysis are presented The experiment results show that the GA algorithm with heuristic knowledge is effective for parallel query processing of multi-joins, and plays an important role in improving the performance of database systems Key words: genetic algorithm; multi-join expression; query optimization; parallel scheduling Received April 20, 2001; accepted September 24, 2001 Supported by the Foundation for University Key Teacher; the Teaching and Research Award Program for Outstanding Young Teachers in Higher Education Institution; the Cross Century Excellent Young Teacher Foundation of the Ministry of Education of China