ISSN 1000-9825, CODEN RUXUEW E-mail: os@iscasaccn Journal of Software, Vol17, No8, August 2006, pp1688 1697 http://wwwosorgcn DOI: 101360/os171688 Tel/Fax: +86-10-62562563 2006 by Journal of Software All rights reserved Messy GA 1,2+, 1,2, 1,3, 1, 1 1 (,100080) 2 (,100049) 3 (,100080) An Automated Approach for Structural Test Data Generation Based on Messy GA XUE Yun-Zhi 1,2+, CHEN Wei 1,2, WANG Yong-Ji 1,3, ZHAO Chen 1, WANG Qing 1 1 (Laboratory for Internet Software Technologies, Institute of Software, The Chinese Academy of Sciences, Beiing 100080, China) 2 (Graduate School, The Chinese Academy of Sciences, Beiing 100049, China) 3 (Laboratory of Computer Science, Institute of Software, The Chinese Academy of Sciences, Beiing 100080, China) + Corresponding author: Phn: +86-10-62565272, E-mail: yunzhi@inteciscasaccn, http://wwwiscasaccn Xue YZ, Chen W, Wang YJ, Zhao C, Wang Q An automated approach for structural test data generation based on Messy GA Journal of Software, 2006,17(8):1688 1697 http://wwwosorgcn/1000-9825/17/1688htm Abstract: Structural testing is one of the basic approaches for identifying test cases Because of complexity of programming languages and variety of applications, an efficient approach to automated generation of structural test data is to breed search iteratively by profiling of program execution Based on Messy GA, an automated approach for generating such data is proposed in this paper by solving F(X), a test coverage function of test data set It utilizes Messy GA s prominent feature that can optimize complicated problems without prior knowledge about schema arrangement in chromosomes, so that it can improve concurrency level of searching and test coverage Compared with other approaches based on GA, the experimental results for several typical programs and real-world applications show that it can generate higher quality test data more efficiently, and can be applied to larger applications Key words: : structural test test data test case automatic generation genetic algorithm sizable chromosome Messy GA,,, Messy GA, X F(X), Messy GA F(X),,, Supported by the National Hi-Tech Research and Development Program of China under Grant No2004AA1Z2100 ( (863)) the Program of the Chinese Academy of Sciences for One Hundred Talents under Grant NoBCK35873 ( ) Received 2005-07-07 Accepted 2005-12-01
: Messy GA 1689,, : Messy GA : TP311 : A [1 3], [4],,,,, [4],,, [5 7] [8,9], [10], [7] [8],, :,, [3,4,10 13],,, [4],,,, [3],, [13], [14],,, [15],, [4,13], [16],,, [17] (formal concept analysis), [18],,,,, [2],, F(X), F(X), F(X),,F(X), ( ), F(X), F(X), F(X), [2] F(X), Messy GA F(X),Messy GA F(X),, X,,,
1690 Journal of Software Vol17, No8, August 2006, 1 Messy GA 2 Messy GA, 3, 4 1 Messy GA Messy GA [16],, [19],Messy GA,,,, [20], 11,, Messy GA {(i 1,v 1 ) (i 2,v 2 ) (i k,v k ) (i n,v n )},,n,i k,v k,i k,,, {(1,0) (2,0) (4,0) (4,1)}, 1 0, 2 0, 3, 4 0 1, 00*0 00*1,,, [19], 1, : ( ), Messy GA, ( ) Messy GA, (cut operator) (splice operator) [19] : :,, :, 12 Messy GA : (1) µ, M, k, T, k P(0) (2) (primordial phase) P(0), (3) (uxtapositional phase) a) P(t), b) c) d), Messy GA :, ( ), ( [19] )
: Messy GA 1691,,, 2 Messy GA X, S, S(X), δ, Σ, Σ Σ=F(X,S,S(X),δ), S δ,,, Messy GA, [20], F(X,S,S(X),δ) TriType 3, 3, Java, 15 17 21 Messy GA, ( ),, (, ), C C= V 1,V 2,,V k, :k V i (, ), I I= C 1,C 2,,C n = V 11,V 12,,V 1k, V 21,V 22,,V 2k,, V n1,v n2,,v nk,,n, V 1,V 2, V n 1 TriType : 23,78,34,5, 87,, 2, 46,456,345,40,, 34,76,320,83,94, Arg A, 23,78,34,5, 23 87, a 78 34 34,456,320 5 87 :1), ( ),, [4] Chromosome 2 Chromosome 1 Chromosome 3, 2) Fig1 Individuals expression based on Messy GA 1 Messy GA, 22 476 156 23 Arg B 2 46 456 345 40 10 87 83 Arg C 34 76 320 83 94 0 67 610 Test input, QUEST/Ada [21], 1
1692 Journal of Software Vol17, No8, August 2006, ( ) TRUE FALSE,,, -, 1 -,,, Table 1 A possible coverage table as test criteria is C-D coverage 1 -, No Type TRUE FALSE 1 Condition 2 Condition 3 Decision 4 Condition : -, c, d, m, TRUE a, FALSE b, f a+ b+ m f = 2 c+ d ( ),, [4,21], [4] 23,, ( ) 2 Fig2 Employ cut operator & splice operator to produce new individual for TriType 2 TriType,, :,, α I i = C i1,c i2,,c in = V i11,v i12,,v i1k, V i21,v i22,,v i2k,, V in1,v in2,,v ink
: Messy GA 1693 p,1 p k, p α β : α β = p,, α I i : V i11,v i12,,v i1p, V i21,v i22,,v i2p,, V in1,v in2,,v inp, β I i : V i1(p+1),v i1(p+2),,v i1k, V i2(p+1),v i2(p+2),,v i2k,, V in(p+1),v in(p+2),,v ink, α β, α β α β I, k m, p,, q, I, I, α β α β I = p,, I, i α β I = q, I, I i : V i11,v i12,,v i1p,v 1(q+1),,V 1m, V i21,v i22,,v i2p,v 2(q+1),,V 2m,, V in1,v in2,,v inp,v n(q+1),,v nm, I : V 11,V 12,,V 1q,V i1(p+1),,v i1k, V 21,V 22,,V 2q,V i2(p+1),,v i2k,, V n1,v n2,,v nq,v in(p+1),,v ink ( ), ( ),,,,, 24 : 1, A-op1 2, A-op2 A-op1,,,, C= V 1,V 2,,V k, p, 0989, C = V 1,V 2,,V p 0989,,V k A-op2, ( ), A-op1,,A-op2 I= V 11,V 12,,V 1k, V 21,V 22,,V 2k,, V n1,v n2,,v nk A-op2, I = V 11,V 12,,V 1k,V 1(k+1), V 21,V 22,,V 2k,V 2(k+1),, V n1,v n2,,v nk,v n(k+1) ( ),, :, ( ), ( ),, 25 Messy GA, Messy GA,, : (1), [22]
1694 Journal of Software Vol17, No8, August 2006 (2), (3), (4), (5), P (6) P, (3) (7), :, (8), SUB (9) SUB, (10) SUB A-op1, (11) A-op2, (12) P, (6),, ( ) ( ),,,, 3 31, :,, Web 3,, 100 : 100, 10, 1000, 0001 -, 5, [4] 2( [4]) Table 2 Comparison of C-D coverage through data generated by various methods 2 C-D Algorithm Random (%) Simple GA (%) GADGET [4] (%) Messy GA (%) Binary search 80 70 100 100 Computing the median 100 100 100 100 Number of days between two dates 875 100 100 100 Insertion sort 100 929 100 100 Euclidean greatest common denominator 100 100 100 100 Angle classification 434 867 933 100 Sample program 1 443 726 906 981 Sample program 2 401 662 892 941 Sample program 3 541 729 965 100,, C-D, Messy GA, [4] GADGET, 32,, Messy GA TriType,
: Messy GA 1695-0001, 10, 1 000 10~200, 10, 5, 5 3(a) - 3(a) : ( 80), 100% ( 140), ([80,140]), 100%, ( [10,130] ),, (40,50,60,70,80), 933%, (554,468,424,326,261),, 100, 00001 0002, 00001, 5, 3(b) - 3(b), ( 00009), ( 00014), ([00009,00013]), : ([00001,00013]),, C-D coverage (%) 100 600 100 400 98 500 350 96 95 94 300 400 92 250 90 300 90 200 88 200 150 85 86 100 100 84 82 50 80 0 80 0 10 30 50 70 90 110 130150170190 01 03 05 07 09 11 13 15 17 19 Fig3 (a) Population size 33 (a) Best solution generation C-D coverage (%) (b) Aberrance ratio (10 3 ) (b) (10 3 ) The influence of population size and aberrance ratio on performance of the algorithm 3 TriType, 3, 31, [4] 4 C-D coverage (%) 100 90 80 70 60 50 Messy GA Simple GA GADGET Random Best solution generation Fig4 40 0 200 400 600 Generation For TriType, comparison of generation needed by Messy GA, GADGET, simple GA and Random 800 4 TriType,Messy GA,GADGET, 1000
1696 Journal of Software Vol17, No8, August 2006 4, C-D :,,(434%) (867%),, 5, 4 1000 GADGET (933%),, Messy GA, 4 Messy GA,, Messy GA,,,, X,,,,,,,,, : 1), 2),,,, 3) :, References: [1] Jorgenson PC Software Testing: A Craftsman s Approach 2nd ed Beiing: Machine Press, 2003 5 7 (in Chinese) [2] Pargas RP, Harrold MJ Test data generation using genetic algorithms Journal of Software Testing, Verification and Reliability, 1999,9(4):263 282 [3] Korel B Automated software test data generation IEEE Trans on Software Engineering, 1990,16(8):870 879 [4] Michael CC, McGraw GE, Schatz MA Generating software test data by evolution IEEE Trans on Software Engineering, 2001,27(12):1085 1110 [5] Sy NT, Deville Y Consistency techniques for interprocedural test data generation ACM SIGSOFT Software Engineering Notes, 2003,28(5):108 117 [6] Chen T, Tse T, Zhouz Z Semi-Proving: An integrated method based on global symbolic evaluation and metamorphic testing ACM SIGSOFT Software Engineering Notes, 2002,27(4):191 195 [7] 10Offutt A, Jin Z, Pan J The dynamic domain reduction procedure for test data generation Software Practice and Experience, 1999, 29(2):167 193 [8] Zhang J, Xu C, Wang X Path-Oriented test data generation using symbolic execution and constraint solving techniques In: He JF, Yang FQ, eds Proc of the Int l Conf on Software Engineering and Formal Methods Beiing: IEEE Computer Society Press, 2004 242 250 [9] DeMillo RA, Offutt AJ Constraint-Based automatic test data generation IEEE Trans on Software Engineering, 1991,17(12): 900 910
: Messy GA 1697 [10] Gupta N, Mathur AP, Soffa ML Automated test data generation using an iterative relaxation method ACM SIGSOFT Software Engineering Notes,1998,23(6):231 244 [11] Korel B Automated test data generation for programs with procedures ACM SIGSOFT Software Engineering Notes, 1996,21(3):209 215 [12] Gallagher MJ Narasimhau VL ADTEST: A test data generation suite for Ada software systems IEEE Trans on Software Engineering, 1997,23(8):473 484 [13] Michael C, McGraw G, Schatz MA Opportunism and diversity in automated software test data generation Technical Report RSTR-003-97-13, RST Corporation, 1997 1 17 [14] Melanie M An Introduction to Genetic Algorithms Boston: MIT Press, 1998 121 124 [15] Shi ZZ Knowledge Discovery Beiing: Tsinghua University Press, 2002 266 286 (in Chinese) [16] Berndt D, Fisher J, Joshson L Breeding software test cases with genetic algorithms In: Sprague RH, ed Proc of the Int l Conf on System Sciences Big Island: IEEE Computer Society Press, 2003 338a [17] Khor S, Grogono P Using a genetic algorithm and formal concept analysis to generate branch coverage test data automatically In: Grünbacher P, Wiels V, Stirewalt K, eds Proc of the Int l Conf on Automated Software Engineering Linz: IEEE Computer Society Press, 2004 346 349 [18] Berndt DJ, Watkins A Investigating the performance of genetic algorithm-based software test case generation In: Ramamoorthy CV, ed Proc of the Int l Symp on High Assurance Systems Engineering Tampa Florida: IEEE Computer Society Press, 2004 261 262 [19] Kargupta H The gene expression messy genetic algorithm In: Proc of the Int l Conf on Evolutionary Computation Nagoya: IEEE Computer Society Press, 1996 814 819 [20] Zaritsky A, Sipper M The preservation of favoured building blocks in the struggle for fitness: The puzzle algorithm IEEE Trans on Evolutionary Computation, 2004,8(5):443 455 [21] Deason WH, Brown DB, Chang KH, Cross II JH A rule-based software test data generator IEEE Trans on Knowledge and Data Engineering, 1991,3(1):108 117 [22] Harman M, Hu L, Hieros R, Wegener J, Sthamer H, Baresel A, Roper M Testability transformation IEEE Trans on Software Engineering, 2004,30(1):3 16 : [1] Jorgensen PC 2 :,20035 7 [15] :,2002266 286 (1979 ),,,,, (1967 ),,,,,CCF,, (1977 ),,,, (1964 ),,,,,CCF, (1962 ),,,,,CCF,,,,,,