1000-9825/2002/13(06)1090-07 2002 Journal of Software Vol13, No6 1,2 1, 1 (,100080); 2 (, 475001) E-mail: {hxtan,lxzhou}@math08mathaccn http://wwwamssaccn :,,,, : ;OLAP(online analytical processing); ; : TP311 : A OLAP(online analytical processing) OLAP, OLAP,,,,,,,,,,,,, 1 FPUS(frequency per unit space) 2 FPUS 3 4 5 1 1: [1], :, (sales) 1 ALL 1( [2] ) : 2000-08-23; : 2001-01-15 : (2008100) : (1969 ),,,,,, ;(1938 ),,,,,
: 1091, (1) : u v,u v v u v u ; (2) DBDB DB, (P,S) Dimension Product : ProductId Catagory ALL Dimension Location : StoreId Area ALL,,,, Fig1 The dimension hierarchies 1 1 2, P,S,C,R,A ProductId,StoreId,Category,Area ALL,DB=, v v [3] 2( ) MD q MD q d d :{(l 1,R 1 ),(l 2,R 2 ),,(l d,r d )}, d MD,l i d i,r i l i l i, (l i,r i )l i l i =ALL, i C, A q q, β q q={(l 1,R 1 ),(l 2, R 2 ),,(l d,r d )}, β q ={l 1,l 2,,l d } Fig2 The lattice of multi-dimensional data in Example 1 : p q, q 2 1 p, p q, p q, β p β q, p q, P 1,P 2 A 1 q 1 ={(ProductId,{P 1,P 2 }),(Area,{A 1 })} A 1 3 S 1,S 2 S 3 3 P 1,P 2 q 2 ={(ProductId,{P 1,P 2 }),(StoreId,{S 1,S 2,S 3 })} q 1 q 2, Q q f (q) 3( ) Space, NP-, Q,, f (q)/ q f (q), q = β q d i=1 qr i / d i=1 ql i, d, ql i l i, qr i R i 1 (FPUS) : Space, Q; : V V:= ; WHILE Space>0 DO v:=q f (q)/ q q; IF Space v THEN V:=V {v};q:=q {v}; P, A P, R P, S C, R A, A C, S A, R A, S
1092 Journal of Software 2002,13(6) Space:=Space v ; ELSE Space:=0; ENDWHILE RETURN V FPUS O(nlog 2 n),fpus,, f (q)/ q 2 FPUS FPUS,,,,, B opt FPUS,, 3, Space=10,FPUS {v}v B(v)=400( 8) u, u B(u)=100+90k k,b(u) B(v) u(10,1) DB(100,1) v(10,4) B opt,b(u) B(v), u,, FPUS,FPUS (5,1) k (5,1), FPUS Of the digit pair of each view, the former indicates the view size and 4( ) v, the later indicates the query frequency v v,, v, Q(v) Fig3 Query set breaking the sub-view restriction 3, u Q(v), u,u v w u w w < v u Q(v), v=q 1 (u),,, 5( ) v, v v, F(v), F(v)= f (v)+ u Q(v) f (u) 6( ) FPUS V, / w V z, z :F(z)/ z f (w)/ w, FPUS, 7( ) V,v V, V, v : B(v,V )= Q 1 (v) f (v)+ u Q(v) ( Q 1 (u) u ) f (u) 8( ) v B(v) v DB : B(v)= DB f (v)+( DB v ) u Q(v) f (u) v v DB, v v Q(v)
: 1093 DB, v, [4], FPUS, [2] BPUS BPUS v, B(v,V )/ v, V, B opt,v, f= v /Space, BPUS B opt 1 1/e f, e 1 FPUS BPUS B F B B,, B F B B /2 : FPUS BPUS F B, BF=B F F B, v Q F (v) Q B (v), v, v BF,Q F (v) Q B (v) F BF,F 1,F 2 F 3, F 1 BF,F 2 B F,F 3 =F BF F 1 F 2 B F v B B (v) B F (v) S, S, B B (S)= v S B B (v),b F (S)= v S B F (v),bf B F B,BF,F 1,F 2 F 3 F, B B (B)=B B (BF)+ B B (B F), B F (F)=B F (BF)+B F (F 1 )+B F (F 2 )+B F (F 3 ), F 1,F 2,F 3 : v, V BF : v, B F (v)= DB f (v)+( DB v ) u Q(v) f (u) DB f (v) B F (F i )= v Fi B F (v) DB v Fi f (v), i {1,2,3} (1) Q B (v) F 1 Q F (v), u QB(v) F1 f (u) u QF(v) f (u), B B (v)= DB f (v)+( DB v ) u QB(v)f (u) = DB f (v)+( DB v ) u QB(v) F1f (u)+( DB v ) u QB(v) F1f (u) DB f (v)+( DB v ) u QF(v)f (u)+ DB Σ u QB(v) F1f (u) =B F (v)+ DB u QB(v) F1f (u), B B (BF)= v BF B B (v) v BF B F (v)+ DB v BF u QB(v) F1f (u)=b F (BF)+ DB v F1 f (v) (1) : B F: v, (2) (3) : B B (BF) B F (BF)+B F (F 1 ) (2) B B (v)= DB f (v)+( DB v ) u QB(v)f (u) DB f (v)+ DB u QB(v)f (u) = DB f (v)+ DB u QB(v) F2f (u)+ DB u QB(v) F2f (u) = DB F(v)+ DB u QB(v) F2f (u) B B (B F)= v B F B B (v) v B F F(v)+ DB v B F u QB(v) F2f (u) = v B F F(v)+ DB v F2 f (u) v B F F(v)+B F (F 2 ) (3)
1094 Journal of Software 2002,13(6) B B (B) B F (F)=B B (BF)+B B (B F) B F (BF) B F (F 1 ) B F (F 2 ) B F (F 3 ) DB v B F F(v) B F (F 3 ) DB v B F F(v) w F f (w) / w, B F v, F(v)/ v f (w)/ w, F(v) f (w)/ w v,, B B (B) B F (F) DB Σ v B F F(v) DB f (w)/ w v B F v (4) F v, f (v)/ v f (w)/ w, f (v) f (w)/ w v, B F (v) DB f (v) DB f (w)/ w v, B F (F)= v F B F (v) DB v F f (v) DB v F f (v)/ w v F v (5) BPUS FPUS Space, (4)~ (6) :B F (F) B B (B) B F (F), B F (F) B B (B)/2 v F v = v B v v B F v (6) 1,FPUS (1 1/e f )/2, e, f B B 1 1/e f [2], FPUS BPUS, PBUS, FPUS, PBUS 3,,,,, ( ),, 2 : q, Space, V;q V S:=Space; // V remove := ; // search:=true; // WHILE S< q AND search DO v:=v f (v)/ v ; IF f (v)/ v < f (q)/ q THEN S:=S+ v ; V remove :=V remove {v}; V:=V {v}; ELSE search:=false; ENDWHILE IF S q THEN V remove ;Space:=S; q; Space:=Space q ; V:=V {q}; ELSE V:=V V remove ;, q : (1), (V remove = ), ; (2),, q q, q
: 1095 4 FPUS, BPUS : (1) [5] : 4 D 0,D 1,D 2 D 3, 4,3,4,3, 1 Table 1 Count of the members of multi-dimensional data in each level 1 15M Levels Dimensions D 0 D 1 D 2 D 3 3 1 1 2 25 1 5 1 1 50 25 25 10 0 100 50 50 50, DB 500K,, CUBE (2),,, 90% 10%,, 80% 20% 10 000 10,,,, CUBE 0%~20% 4 Query costs (KB) 600 500 400 300 200 100 0 FPUS BPUS 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Space of views/space of cube (%) Fig4, /Cube Query cost based on different evaluation algorithms 4 5,,,,,,, [2], O(kn 2 ) BPUS [6] B- ; [7] O(nlog 2 n) PBS; [8],
1096 Journal of Software 2002,13(6),,,,, References: [1] Agrawal, R, Gupta, A, Sarawagi, S Modeling multidimensional databases In: Gray, A, Larson, Per-Åke, eds ICDE 97, Proceedings of the 13th International Conference on Data Engineering Birmingham, UK: IEEE Computer Society Press, 1997 232~243 [2] Harinarayan, V, Rajaraman, A, Ullman, JD Implementing data cubes efficiently In: Jagadish, HV, Mumick, IS, eds SIGMOD 96, Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data Montreal: ACM Press 1996 205~216 [3] Shukla, A, Deshpande, P, Naughton, JF, et al Storage estimation for multi-dimensional aggregates in the presence of hierarchies In: Vijayaraman, TM, Buchmann, AP, Mohan, C, eds VLDB 96, Proceedings of the 22nd International Conference on Very Large Data Bases Bombay: Morgan Kaufmann Publishers, Inc, 1996 522~531 [4] Qi, Wen-wen, Xu, Bin, Tan, Hong-xing Selecting materialized views within data cubes Journal of He nan University (Natural Science edition), 2001,31(1):20~25 (in Chinese) [5] Deshpande, PM, Ramasamy, K, Shukla, A, et al Caching multidimensional queries using chunks In: Haas, LM, Tiwary, A, eds SIGMOD 98, Proceedings of the ACM SIGMOD International Conference on Management of Data Seattle: ACM Press, 1998 259~270 [6] Gupta, H, Harinarayan, V, Rajaraman, A, et al Index Selection for OLAP In: Gray, A, Larson, Per-Åke, eds ICDE 97, Proceedings of the 13th International Conference on Data Engineering Birmingham, UK: IEEE Computer Society Press, 1997 208~219 [7] Shukla, A, Deshpande, P, Naughton, JF Materialized view selection for multidimensional datasets In: Gupta, A, Shmueli, O, Widom, J, eds VLDB 98, Proceedings of the 24th International Conference on Very Large Data Bases New York: Morgan Kaufmann Publishers, Inc, 1998 488~499 [8] Baralis, E, Paraboschi, S, Teniente, E Materialized view selection in a multidimensional database In: Jarke, M, Carey, MJ, Dittrich, KR, et al, eds VLDB 97, Proceedings of the 23rd International Conference on Very Large Data Bases Athens: Morgan Kaufmann Publishers, Inc, 1997 156~165 : [4],, ( ),2001,31(1):20~25 Dynamic Selection of Materialized Views of Multi-Dimensional Data TAN Hong-xing 1,2, ZHOU Long-xiang 1 1 (Institute of Mathematics, Academy of Mathematics and System Sciences, The Chinese Academy of Sciences, Beijing 100080, China); 2 (School of Computer Science, He nan University, Kaifeng 475001, China) E-mail: {hxtan,lxzhou}@math08mathaccn http://wwwamssaccn Abstract: A novel method is proposed to select materialized views of multi-dimensional data called dynamic selection The idea of dynamic selection is that the system is in charged of collecting the queries to obtain their distribution The set of materialized views is adjusted dynamically according to the queries distribution The method is given in detail including the algorithm of single-step selection and the instant adjusting method It is also proved that under certain constraints, the performance of the single-step algorithm is guaranteed to be no worse than that of the optimal one by a certain bound The experimental results show that the dynamic selection is more effective than other solutions Key words: materialized view; OLAP (online analytical processing); multi-dimensional data; data warehousing Received August 23, 2000; accepted January 15, 2001 Supported by the National Natural Science Foundation of China under Grant No2008100