Control and Decision, Vol. 24, No. 5, May 2009
Article ID: 1001-0920(2009)05-0738-05

Quick algorithm for computing core attribute

GE Hao 1a, LI Long-shu 2, YANG Chuan-jian 1b

(1a. Department of Electronic and Information Engineering, 1b. Department of Computer Science, Chuzhou University, Chuzhou 239012, China; 2. School of Computer Science, Anhui University, Hefei 230039, China. Correspondent: GE Hao, E-mail: togehao@126.com)

Abstract: Existing algorithms for computing the core have the following shortcomings: the core they produce is not the core based on the positive region, and their time and space complexities are unsatisfactory. A new approach for computing the core is therefore presented, and the resulting core is proved to be equivalent to the core based on the positive region. Partitioning the universe into equivalence classes is the key step in computing the core. An efficient algorithm for computing the equivalence classes is designed, based on radix sort with distribution counting. On this foundation, a quick algorithm for computing the core is designed. Experimental results show that the algorithm is correct and efficient.

Key words: Rough set; Equivalence class; Positive region; Core attribute

CLC number: TP181. Document code: A.

Received: 2008-05-12; revised: 2008-07-09.
Funding: projects 050420204 and KJ2008B117.
Author biographies: first author, born 1976; second author, born 1956.

1 Introduction

Rough set theory was introduced by Pawlak in 1982 [1]. Hu [2] computes the core from the discernibility matrix of Skowron [3], with time complexity $O(|C||U|^2)$. Reference [4] revises Hu's method with a new discernibility matrix, again with time complexity $O(|C||U|^2)$. Reference [5] gives a further analysis of Hu's method. The algorithm of [6] has time complexity $O(|C|^2|U|\log|U|)$ and space complexity $O(|U|)$. Several later algorithms are based on the discernibility matrix [7-10]: the algorithm of [7] has time complexity $O(|C||U|^2)$; that of [8] has time complexity $\max\{O(|C||U|\log|U|),\ O(|C||U||POS_C(D)|)\}$ and space complexity $O(|C||U||POS_C(D)|)$; that of [9] has time complexity $\max\{O(|C||U/C|^2),\ O(|C||U|)\}$ and space complexity $\max\{O(|U|),\ O(|C||U/C|^2)\}$.
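In this paper, an algorithm for computing $U/C$ is designed whose time complexity is $O(|C||U|)$. On this basis, the core is computed with time complexity $O(|C|^2|U|)$ and space complexity $O(|U|)$.

2 Basic concepts

Definition 1 [11] A decision table is $S = (U, A, V, f)$, where $U = \{x_1, x_2, \ldots, x_n\}$ is a non-empty finite set of objects; $A = \{a_1, a_2, \ldots, a_m, d\}$ is the attribute set, with $A = C \cup D$ and $C \cap D = \emptyset$, where $C$ is the set of condition attributes and $D = \{d\}$ is the set of decision attributes; $V = \{V_{a_1}, V_{a_2}, \ldots, V_{a_m}, V_d\}$ is the set of attribute value domains; and $f: U \times A \to V$ is the information function, with $f(x, a) \in V_a$ for all $a \in A$, $x \in U$.

Definition 2 [11] Let $S = (U, A, V, f)$ and $R \subseteq A$. Then
$$ind(R) = \{(x_i, x_j) \mid f(x_i, b) = f(x_j, b),\ \forall b \in R\}$$
is the indiscernibility relation of $S$ with respect to $R$. The equivalence class containing $x$ is written $[x]_{ind(R)}$, abbreviated $[x]_R$. The partition of $U$ induced by $R$ is written $U/ind(R)$, abbreviated $U/R$.

Definition 3 [11] Let $S = (U, A, V, f)$, $R \subseteq A$ and $X \subseteq U$. Then $R_-X = \{x \in U \mid [x]_{ind(R)} \subseteq X\}$ is the $R$-lower approximation of $X$; $R^-X = \{x \in U \mid [x]_{ind(R)} \cap X \neq \emptyset\}$ is the $R$-upper approximation of $X$; and $POS_R(X) = R_-X$ is the $R$-positive region of $X$.

Definition 4 [11] Let $S = (U, C \cup D, V, f)$ and $P \subseteq C$. The $P$-positive region of $D$ is
$$POS_P(D) = \bigcup_{X \in U/D} P_-(X).$$

Definition 5 [11] Let $S = (U, C \cup D, V, f)$ and $a \in C$. If $POS_C(D) = POS_{C-\{a\}}(D)$, then $a$ is dispensable in $C$ with respect to $D$; otherwise $a$ is indispensable in $C$ with respect to $D$. The set of all attributes of $C$ that are indispensable with respect to $D$ is the core, written $Core(C)$.

Definition 6 [11] Let $S = (U, C \cup D, V, f)$ and $x_i, x_j \in U$, $i \neq j$. If $f(x_i, C) = f(x_j, C)$ and $f(x_i, D) \neq f(x_j, D)$, then $x_i$ and $x_j$ are inconsistent (conflicting) objects; otherwise they are consistent.

3 New approach for computing the core

The core obtained by Hu's method is not always the core based on the positive region. A new definition of the core is therefore given.

Definition 7 Let $S = (U, C \cup D, V, f)$. Define
$$GCore(C) = \{\, a \in C : |ConSet(C - \{a\})| > |ConSet(C)| \,\},$$
where
$$ConSet(C) = \{\, x_i \in U : \exists x_j \in U,\ f(x_i, C) = f(x_j, C) \wedge f(x_i, D) \neq f(x_j, D) \,\}.$$
$ConSet(C)$ is the set of inconsistent objects with respect to $C$, and $|ConSet(C)|$ is its cardinality. By Definition 7, if removing $a_i$ increases the number of inconsistent objects, then $a_i$ is a core attribute; otherwise $a_i$ is not a core attribute.

Theorem 1 Let $S = (U, C \cup D, V, f)$. Then $Core(C) = GCore(C)$.

Proof Let $U/C = \{X_1, X_2, \ldots, X_n\}$ and $U/D = \{Y_1, Y_2, \ldots, Y_m\}$. Note that $ConSet(C) \subseteq ConSet(C - \{a\})$ always holds, since objects that agree on $C$ also agree on $C - \{a\}$.

1) $GCore(C) \subseteq Core(C)$. For any $a \in GCore(C)$, by Definition 7 removing $a$ enlarges $ConSet$. Hence there exists $x \notin ConSet(C)$ with $x \in ConSet(C - \{a\})$, and there exists $y \notin [x]_C$, $y \in [x]_{C-\{a\}}$, with $f(x, D) \neq f(y, D)$. Let $x \in Y_q$ and $y \in Y_p$, so $y \notin Y_q$. Since $y \in [x]_{C-\{a\}}$ and $y \notin Y_q$, we have $x \notin (C - \{a\})_- Y_q$, so $x \notin POS_{C-\{a\}}(D)$. Since $x$ is a consistent object of $S$ and $x \in Y_q$, we have $x \in C_- Y_q$, so $x \in POS_C(D)$. Therefore $POS_C(D) \neq POS_{C-\{a\}}(D)$ and $a \in Core(C)$. Hence $GCore(C) \subseteq Core(C)$.

2) $Core(C) \subseteq GCore(C)$. For any $a \in Core(C)$, $POS_C(D) \neq POS_{C-\{a\}}(D)$, so there exists $x$ with $[x]_C \subseteq Y_k$ and $[x]_{C-\{a\}} \not\subseteq Y_k$. Hence there exists $y \in [x]_{C-\{a\}}$, $y \notin [x]_C$, with $f(x, C - \{a\}) = f(y, C - \{a\})$ and $f(x, D) \neq f(y, D)$. So after $a$ is removed, $x$ is an inconsistent object. It remains to show that $x$ is a consistent object of $S$. Suppose not; then there is an object $z$ inconsistent with $x$, so $z \in [x]_C \subseteq Y_k$ and $f(x, D) \neq f(z, D)$. Then $x$ and $z$ belong to different classes of $U/D$, which contradicts $z \in [x]_C \subseteq Y_k$. So $x$ is a consistent object of $S$. Therefore, after $a$ is removed, $|ConSet(C - \{a\})| > |ConSet(C)|$, and $a \in GCore(C)$. Hence $Core(C) \subseteq GCore(C)$.

By 1) and 2), $Core(C) = GCore(C)$.

4 Algorithms

4.1 Computing the equivalence classes $U/C$

By Theorem 1, computing the partition of $U$ by $C$ is the key step in computing the core. The direct method of comparing objects pairwise has time complexity $O(|C||U|^2)$.

Property 1 Let $S = (U, A, V, f)$. Objects $x_i, x_j \in U$ belong to the same equivalence class with respect to $C$ if and only if $f(x_i, a) = f(x_j, a)$ for all $a \in C$.

Property 2 By Property 1, once $S$ is sorted on $C$, the objects of each equivalence class are adjacent in $S$.

The sort used in [6] takes $O(|C||U|\log|U|)$ time; that of [9] takes $O(|C||U|)$. Here $S$ is radix-sorted on $C$, with time complexity $O(|C||U|)$ and space complexity $O(|U|)$.

Data structure: let $S = (U, C \cup D)$, $C = \{a_i \mid i = 1, \ldots, m\}$, $D = \{d\}$. The table is stored as $S = \{S_i \mid i = 1, \ldots, n\}$, where each $S_i$ is a tuple of length $m + 2$, $S_i = (x_i, a_1, a_2, \ldots, a_m, d)$. $S_i.x_i$ is the object identifier, $S_i.a_j$ is the value of $S_i$ on $a_j$, and $S_i.d$ is the decision value of $S_i$.

$S$ is radix-sorted on $C$, one attribute $a_i$ per pass. In each pass the values of $a_i$ are mapped to $[1..e]$, $0 < e \leq |U|$. A counting array countPos[0..e] first records the number of objects in each class of $ind(a_i)$ and is then converted into the position of each class in the sorted sequence, which gives the position of each $S_i$. Two auxiliary arrays are needed: countPos and the output array sortedS.

Algorithm 1 Computing $U/C$.
Input: $S = (U, C \cup D, V, f)$, $U = \{x_i \mid i = 1, \ldots, n\}$, $C = \{a_i \mid i = 1, \ldots, m\}$.
Output: $U/C$.
Step1: for i = 1 to |C| do {    // one distribution-counting pass per attribute
  Step1.1: initialise countPos: countPos[0..e] = 0.
  Step1.2: count the objects in each class of ind(a_i) and accumulate the counts, so that countPos holds the position of each class.
  Step1.3: for j = |U| downto 1 do {
    Step1.3.1: read S_j.a_i and take the position pos of S_j from countPos; decrement pos;
    Step1.3.2: store S_j in sortedS at position pos;
    Step1.3.3: update the countPos entry of the class [S_j.a_i]_{ind(a_i)}; }
} // end_for_i
Step2: s = 1, E_1 = {x_1};    // s is the number of equivalence classes
Step3: for i = 2 to |U| do
  { if (x_i and x_{i-1} have equal values on C) then E_s = E_s ∪ {x_i}; else s = s + 1, E_s = {x_i}; }
Step4: output E (that is, U/C) and s.

In Algorithm 1, one pass of Step1 takes $O(e) + O(|U|) + O(|U|) = O(|U|)$. The outer loop runs $|C|$ times, so Step1 takes $O(|C||U|)$; Step3 takes $O(|U|)$. In Step1, countPos needs at most $|U| + 1$ entries, and the storage for sortedS is at most $2|U|$, so the space needed by the two arrays is $O(|U|)$. The time complexity of Algorithm 1 is therefore $O(|C||U|) + O(|U|) = O(|C||U|)$, and its space complexity is $O(|U|)$.

4.2 Algorithm for computing the core

From Definition 7 and Theorem 1, the core can be computed as follows.

Algorithm 2 Computing the core.
Input: $S = (U, A, V, f)$, where $U$ is the universe; $A$ is the attribute set, $A = C \cup D$ and $C \cap D = \emptyset$; $C$ is the condition attribute set; $D$ is the decision attribute set. $E$ and $s$ are as in Algorithm 1.
Output: $Core(C)$.
Step1: Core(C) = ∅, ConSet(C) = ∅.
Step2: call Algorithm 1 to sort S on C and obtain E and s.
Step3: reset countPos.
Step4: for j = 1 to s do
  if (∃ x_t, x_k ∈ E_j with x_t.d ≠ x_k.d) then ConSet(C) = ConSet(C) ∪ E_j.
Step5: for i = 1 to |C| do {
  Step5.1: call Algorithm 1 to sort S on C - {a_i} and obtain E and s;
  Step5.2: reset countPos;
  Step5.3: ConSet(C - {a_i}) = ∅;
  Step5.4: for j = 1 to s do
    if (∃ x_t, x_k ∈ E_j with x_t.d ≠ x_k.d) then ConSet(C - {a_i}) = ConSet(C - {a_i}) ∪ E_j;
  Step5.5: if (|ConSet(C - {a_i})| > |ConSet(C)|) then Core(C) = Core(C) ∪ {a_i};
} // end_for_i
Step6: output Core(C).
Step7: end.

In Algorithm 2, Step2 takes $O(|C||U|)$, and Step3 and Step4 take $O(|U|)$. Step5.1 takes $O(|C||U|)$, and Step5.2 to Step5.4 take $O(|U|)$, so one iteration of Step5 takes $O(|C||U|) + 2O(|U|) = O(|C||U|)$. Step5 iterates $|C|$ times, so it takes $O(|C|^2|U|)$. The time complexity of Algorithm 2 is therefore $O(|C||U|) + 2O(|U|) + O(|C|^2|U|) = O(|C|^2|U|)$, and its space complexity is $O(|U|)$.

4.3 Complexity comparison

Hu's method takes $O(|C||U|^2)$ time. The algorithm of [6] takes $O(|C|^2|U|\log|U|)$ time and $O(|U|)$ space. The algorithm of [8] takes $\max\{O(|C||U|\log|U|),\ O(|C||U||POS_C(D)|)\}$ time and $O(|C||U||POS_C(D)|)$ space; that of [9] takes $\max\{O(|C||U/C|^2),\ O(|C||U|)\}$ time and $\max\{O(|U|),\ O(|C||U/C|^2)\}$ space. In practice, when $|U|$ is large, $|U/C|$ and $|POS_C(D)|$ are much larger than $|C|$. For example, the UCI Car evaluation data set has 1 728 objects with $|C| = 6$, $|U/C| = 972$ and $|POS_C(D)| = 1\,196$. The $O(|C|^2|U|)$ time and $O(|U|)$ space of Algorithm 2 therefore compare favourably with these algorithms.

5 Examples and experiments

5.1 Examples

The two examples of [4] are used to illustrate Algorithm 2.

Example 1 Table 1 is a decision table with 5 objects, where $C = \{a, b, c\}$ and $D$ is the decision attribute.

Table 1 Decision table S1
U    a  b  c  D
x1   1  0  1  1
x2   1  0  1  0
x3   0  0  1  1
x4   0  0  1  0
x5   1  1  1  1

By Algorithm 2, in Table 1 $x_1$ and $x_2$ are inconsistent, and $x_3$ and $x_4$ are inconsistent. Hence $ConSet(C) = \{x_1, x_2, x_3, x_4\}$; $ConSet(C - \{a\}) = \{x_1, x_2, x_3, x_4\}$, $ConSet(C - \{b\}) = \{x_1, x_2, x_3, x_4, x_5\}$, $ConSet(C - \{c\}) = \{x_1, x_2, x_3, x_4\}$. Since $|ConSet(C - \{b\})| > |ConSet(C)|$, $Core(C) = \{b\}$.

Example 2 Table 2 is a decision table with 4 objects, where $C = \{a, b, c\}$ and $D$ is the decision attribute.

Table 2 Decision table S2
U    a  b  c  D
x1   1  0  1  0
x2   0  0  1  1
x3   0  0  1  0
x4   1  1  1  1

By Algorithm 2, $ConSet(C) = \{x_2, x_3\}$; $ConSet(C - \{a\}) = \{x_1, x_2, x_3\}$, $ConSet(C - \{b\}) = \{x_1, x_2, x_3, x_4\}$, $ConSet(C - \{c\}) = \{x_2, x_3\}$. Since $|ConSet(C - \{a\})| > |ConSet(C)|$ and $|ConSet(C - \{b\})| > |ConSet(C)|$, $Core(C) = \{a, b\}$.

5.2 Experimental results

The example of [6] and 6 UCI data sets were tested on a Pentium 4 2.8 GHz machine with 512 MB of RAM, using EAB KF; the results are given in Table 3. The compared algorithms are ALG1 and ALG2 (EAB KF), ALG3, ALG4,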
and ALG5 (Algorithm 2 of this paper).

Table 3 Comparison of the 5 algorithms (running time in ms)
Data set             |C|   |U|    |U/C|  |POS_C(D)|  ALG1        ALG2      ALG3      ALG4      ALG5
Example of [6]       4     14     14     14          0.022       0.018     0.018     0.019     0.016
Patient data         8     90     66     66          0.895       0.831     0.833     0.834     0.471
Flare data           12    323    174    289         11.951      8.437     9.912     9.657     5.301
Balance data         4     625    248    625         32.671      8.516     26.790    21.046    0.607
Monkey data          17    556    432    556         40.316      24.513    31.819    19.546    8.962
Car evaluation data  6     1 728  972    1 196       292.195     9.650     19.381    16.054    2.682
Led17 data           26    2 000  1 998  2 000       12595.141   297.028   943.281   873.651   57.173

Table 3 shows that ALG5 is faster than ALG1 to ALG4 on every data set, and that the advantage of ALG5 grows with the size of the data set.

6 Conclusion

This paper addresses the shortcomings of Hu's method and of the algorithms derived from it. A new approach for computing the core is given and proved equivalent to the core based on the positive region. An algorithm for computing $U/C$ based on radix sort is designed, with time complexity $O(|C||U|)$ and space complexity $O(|U|)$. On this basis, an algorithm for computing the core is designed, with time complexity $O(|C|^2|U|)$ and space complexity $O(|U|)$. The examples and experiments show that the algorithm is correct and efficient.

References

[1] Pawlak Z. Rough sets [J]. Int J of Computer and Information Science, 1982, 11(5): 341-356.
[2] Hu X H, Cercone N. Learning in relational databases: A rough set approach [J]. Computational Intelligence, 1995, 11(2): 323-337.
[3] Skowron A, Rauszer C. The discernibility matrices and functions in information systems [C]. Intelligent Decision Support - Handbook of Applications and Advances of the Rough Sets Theory. Dordrecht: Kluwer Academic Publisher, 1991: 331-362.
[4] Ye D Y, Chen Z J. A new discernibility matrix and the computation of a core [J]. Acta Electronica Sinica, 2002, 30(7): 1086-1088.
[5] Wang G Y. Calculation methods for core attributes of decision table [J]. Chinese J of Computers, 2003, 26(5): 611-615.
[6] Zhao J, Wang G Y, Wu Z F, et al. An efficient approach to compute feature core [J]. Mini-Micro Systems, 2003, 24(11): 1950-1953.
[7] Yan D Q, Liu F F. Discernibility matrix and approximate quality in attribute reduction [J]. Mini-Micro Systems, 2005, 26(11): 1975-1977.
[8] Yang M, Sun Z H. Improvement of discernibility matrix and the computation of a core [J]. J of Fudan University, 2004, 43(5): 865-868.
[9] Xu Z Y, Yang B R, Song W, et al. Quick computing core algorithm based on discernibility matrix [J]. Computer Engineering and Applications, 2006, 42(6): 4-6.
[10] Yang M, Yang P. Fast updating algorithm of computation of a core based on discernibility matrix [J]. Control and Decision, 2007, 21(8): 857-862.
[11] Zhang W X, Wu W Z, Liang J Y, et al. Theory and method of rough set [M]. Beijing: Science Press, 2001.
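
As an illustration of Algorithms 1 and 2, the following Python sketch partitions a decision table with stable distribution-counting passes and then applies the ConSet test of Definition 7. It is a minimal sketch rather than the authors' implementation: the function names (`counting_pass`, `partition`, `con_set`, `core`), the row layout (a tuple of condition values plus a decision value) and the use of a dictionary to map attribute values to dense codes are all assumptions made here for readability.

```python
from typing import Hashable, List, Sequence, Set, Tuple

# Assumed row layout: (condition values, decision value).
Row = Tuple[Tuple[Hashable, ...], Hashable]


def counting_pass(order: List[int], rows: Sequence[Row], attr: int) -> List[int]:
    """One stable distribution-counting pass on condition attribute `attr`."""
    # Map the values of this attribute to dense codes 0..e-1 (assumption:
    # a dict is used; any fixed one-to-one coding of values would do).
    codes = {}
    for idx in order:
        codes.setdefault(rows[idx][0][attr], len(codes))
    # count_pos[k] becomes the exclusive end position of code k.
    count_pos = [0] * len(codes)
    for idx in order:
        count_pos[codes[rows[idx][0][attr]]] += 1
    for k in range(1, len(count_pos)):
        count_pos[k] += count_pos[k - 1]
    # Scan backwards so that objects with equal values keep their order.
    sorted_s = [0] * len(order)
    for idx in reversed(order):
        c = codes[rows[idx][0][attr]]
        count_pos[c] -= 1
        sorted_s[count_pos[c]] = idx
    return sorted_s


def partition(rows: Sequence[Row], attrs: Sequence[int]) -> List[List[int]]:
    """Equivalence classes of the row indices with respect to `attrs`."""
    order = list(range(len(rows)))
    for attr in attrs:  # one pass per attribute, as in Step1
        order = counting_pass(order, rows, attr)
    classes: List[List[int]] = []
    prev_key = None
    for idx in order:  # adjacent equal rows form one class, as in Step3
        key = tuple(rows[idx][0][a] for a in attrs)
        if classes and key == prev_key:
            classes[-1].append(idx)
        else:
            classes.append([idx])
        prev_key = key
    return classes


def con_set(rows: Sequence[Row], attrs: Sequence[int]) -> Set[int]:
    """Indices of the inconsistent objects with respect to `attrs`."""
    inconsistent: Set[int] = set()
    for cls in partition(rows, attrs):
        if len({rows[i][1] for i in cls}) > 1:  # more than one decision value
            inconsistent.update(cls)
    return inconsistent


def core(rows: Sequence[Row], n_attrs: int) -> List[int]:
    """Attributes whose removal enlarges the set of inconsistent objects."""
    all_attrs = list(range(n_attrs))
    base = len(con_set(rows, all_attrs))
    return [a for a in all_attrs
            if len(con_set(rows, [b for b in all_attrs if b != a])) > base]


# Tables 1 and 2 of the paper, with attributes a, b, c as indices 0, 1, 2.
S1 = [((1, 0, 1), 1), ((1, 0, 1), 0), ((0, 0, 1), 1), ((0, 0, 1), 0), ((1, 1, 1), 1)]
S2 = [((1, 0, 1), 0), ((0, 0, 1), 1), ((0, 0, 1), 0), ((1, 1, 1), 1)]
core_s1 = core(S1, 3)  # [1], i.e. {b}
core_s2 = core(S2, 3)  # [0, 1], i.e. {a, b}
```

On the two example tables the sketch reproduces the cores stated in Section 5.1. The final grouping loop compares full keys, which costs $O(|C|)$ per object; this keeps the sketch short and does not change the overall $O(|C||U|)$ bound for one partition.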