A Survey of Recent Clustering Methods for Data Mining (part 2)

Σχετικά έγγραφα
A Survey of Recent Clustering Methods for Data Mining (part 1)

ER-Tree (Extended R*-Tree)


Anomaly Detection with Neighborhood Preservation Principle


substructure similarity search using features in graph databases


Gemini, FastMap, Applications. Εαρινό Εξάμηνο Τμήμα Μηχανικών Η/Υ και Πληροϕορικής Πολυτεχνική Σχολή, Πανεπιστήμιο Πατρών

Text Mining using Linguistic Information

Αποθήκες Δεδομένων και Εξόρυξη Δεδομένων:

Research on model of early2warning of enterprise crisis based on entropy

Quick algorithm f or computing core attribute

Αποθήκες Δεδομένων και Εξόρυξη Δεδομένων


MIDI [8] MIDI. [9] Hsu [1], [2] [10] Salamon [11] [5] Song [6] Sony, Minato, Tokyo , Japan a) b)

Buried Markov Model Pairwise

Clustering. Αλγόριθµοι Οµαδοποίησης Αντικειµένων

2016 IEEE/ACM International Conference on Mobile Software Engineering and Systems

[1] DNA ATM [2] c 2013 Information Processing Society of Japan. Gait motion descriptors. Osaka University 2. Drexel University a)

The Algorithm to Extract Characteristic Chord Progression Extended the Sequential Pattern Mining

[4] 1.2 [5] Bayesian Approach min-max min-max [6] UCB(Upper Confidence Bound ) UCT [7] [1] ( ) Amazons[8] Lines of Action(LOA)[4] Winands [4] 1

Twitter 6. DEIM Forum 2014 A Twitter,,, Wikipedia, Explicit Semantic Analysis,

User Behavior Analysis for a Large2scale Search Engine

3: A convolution-pooling layer in PS-CNN 1: Partially Shared Deep Neural Network 2.2 Partially Shared Convolutional Neural Network 2: A hidden layer o

: Monte Carlo EM 313, Louis (1982) EM, EM Newton-Raphson, /. EM, 2 Monte Carlo EM Newton-Raphson, Monte Carlo EM, Monte Carlo EM, /. 3, Monte Carlo EM

Στοιχεία εισηγητή Ημερομηνία: 10/10/2017


Newman Modularity Newman [4], [5] Newman Q Q Q greedy algorithm[6] Newman Newman Q 1 Tabu Search[7] Newman Newman Newman Q Newman 1 2 Newman 3

HOSVD. Higher Order Data Classification Method with Autocorrelation Matrix Correcting on HOSVD. Junichi MORIGAKI and Kaoru KATAYAMA

1 n-gram n-gram n-gram [11], [15] n-best [16] n-gram. n-gram. 1,a) Graham Neubig 1,b) Sakriani Sakti 1,c) 1,d) 1,e)

Optimization, PSO) DE [1, 2, 3, 4] PSO [5, 6, 7, 8, 9, 10, 11] (P)

Big Data/Business Intelligence

Ανακάλυψη κανόνων συσχέτισης από εκπαιδευτικά δεδομένα

BIRCH: : An Efficient Data Clustering Method for Very Large Databases

ΔΙΠΛΩΜΑΤΙΚΕΣ ΕΡΓΑΣΙΕΣ ΠΜΣ «ΠΛΗΡΟΦΟΡΙΚΗ & ΕΠΙΚΟΙΝΩΝΙΕς» OSWINDS RESEARCH GROUP

Vol. 31,No JOURNAL OF CHINA UNIVERSITY OF SCIENCE AND TECHNOLOGY Feb


Bayesian Discriminant Feature Selection

Probabilistic Approach to Robust Optimization

CSJ. Speaker clustering based on non-negative matrix factorization using i-vector-based speaker similarity

Efficient and Effective Clustering Methods for Spatial Data Mining (Αποδοτικές και αποτελεσματικές μέθοδοι ομαδοποίησης για εξόρυξη χωρικών δεδομένων)

ΔΙΠΛΩΜΑΤΙΚΕΣ ΕΡΓΑΣΙΕΣ

An Efficient Calculation of Set Expansion using Zero-Suppressed Binary Decision Diagrams

443020,,., 61, / : +7 (846)

Διδάσκοντες: Μαρία Χαλκίδη

ΠΤΥΧΙΑΚΗ ΕΡΓΑΣΙΑ. ΤΕΧΝΟΛΟΓΙΚΟ ΕΚΠΑΙΔΕΥΤΙΚΟ ΙΔΡΥΜΑ ΣΕΡΡΩΝ ΣΧΟΛΗ ΤΕΧΝΟΛΟΓΙΚΩΝ ΕΦΑΡΜΟΓΩΝ Τμήμα Πληροφορικής και Επικοινωνιών

Area Location and Recognition of Video Text Based on Depth Learning Method

High order interpolation function for surface contact problem

Nov Journal of Zhengzhou University Engineering Science Vol. 36 No FCM. A doi /j. issn

ΔΙΠΛΩΜΑΤΙΚΕΣ ΕΡΓΑΣΙΕΣ ΠΜΣ «ΠΛΗΡΟΦΟΡΙΚΗ & ΕΠΙΚΟΙΝΩΝΙΕΣ» OSWINDS RESEARCH GROUP

Reading Order Detection for Text Layout Excluded by Image

A Method for Creating Shortcut Links by Considering Popularity of Contents in Structured P2P Networks

An Automatic Modulation Classifier using a Frequency Discriminator for Intelligent Software Defined Radio

No. 7 Modular Machine Tool & Automatic Manufacturing Technique. Jul TH166 TG659 A

Automatic extraction of bibliography with machine learning

Stabilization of stock price prediction by cross entropy optimization

J. of Math. (PRC) Banach, , X = N(T ) R(T + ), Y = R(T ) N(T + ). Vol. 37 ( 2017 ) No. 5

ΕΥΡΕΣΗ ΤΟΥ ΔΙΑΝΥΣΜΑΤΟΣ ΘΕΣΗΣ ΚΙΝΟΥΜΕΝΟΥ ΡΟΜΠΟΤ ΜΕ ΜΟΝΟΦΘΑΛΜΟ ΣΥΣΤΗΜΑ ΟΡΑΣΗΣ

{takasu, Conditional Random Field

Kernel Methods and their Application for Image Understanding

Shortness Ambiguity TEAM Ungrammaticality

y = f(x)+ffl x 2.2 x 2X f(x) x x p T (x) = 1 Z T exp( f(x)=t ) (2) x 1 exp Z T Z T = X x2x exp( f(x)=t ) (3) Z T T > 0 T 0 x p T (x) x f(x) (MAP = Max

Ερευνητική+Ομάδα+Τεχνολογιών+ Διαδικτύου+

(Υπογραϕή) (Υπογραϕή) (Υπογραϕή)

ΕΘΝΙΚΟ ΜΕΤΣΟΒΙΟ ΠΟΛΥΤΕΧΝΕΙΟ

Εξόρυξη Γνώσης από εδοµένα (Data Mining) Συσταδοποίηση

Κβαντική Επεξεργασία Πληροφορίας

n 1 n 3 choice node (shelf) choice node (rough group) choice node (representative candidate)

Web. Web p OutDegree(p) log 7 1/OutDegree(p) A New Difinition of Subjective Distance between Web Pages

Ανάκτηση Εικόνας βάσει Υφής με χρήση Eye Tracker

Architecture for Visualization Using Teacher Information based on SOM

J. of Math. (PRC) 6 n (nt ) + n V = 0, (1.1) n t + div. div(n T ) = n τ (T L(x) T ), (1.2) n)xx (nt ) x + nv x = J 0, (1.4) n. 6 n

MBR Ελάχιστο Περιβάλλον Ορθογώνιο (Minimum Bounding Rectangle) Το µικρότερο ορθογώνιο που περιβάλλει πλήρως το αντικείµενο 7 Παραδείγµατα MBR 8 6.

Fourier transform, STFT 5. Continuous wavelet transform, CWT STFT STFT STFT STFT [1] CWT CWT CWT STFT [2 5] CWT STFT STFT CWT CWT. Griffin [8] CWT CWT

ΕΙΣΑΓΩΓΗ. 1.1 Εισαγωγή

Toward a SPARQL Query Execution Mechanism using Dynamic Mapping Adaptation -A Preliminary Report- Takuya Adachi 1 Naoki Fukuta 2.

Research of Han Character Internal Codes Recognition Algorithm in the Multi2lingual Environment

Research on real-time inverse kinematics algorithms for 6R robots

IPSJ SIG Technical Report Vol.2014-CE-127 No /12/6 CS Activity 1,a) CS Computer Science Activity Activity Actvity Activity Dining Eight-He

EM Baum-Welch. Step by Step the Baum-Welch Algorithm and its Application 2. HMM Baum-Welch. Baum-Welch. Baum-Welch Baum-Welch.

Wavelet based matrix compression for boundary integral equations on complex geometries

ΕΘΝΙΚΟ ΜΕΤΣΟΒΙΟ ΠΟΛΥΤΕΧΝΕΙΟ ΣΧΟΛΗ ΗΛΕΚΤΡΟΛΟΓΩΝ ΜΗΧΑΝΙΚΩΝ ΚΑΙ ΜΗΧΑΝΙΚΩΝ ΥΠΟΛΟΓΙΣΤΩΝ

Customized Pricing Recommender System Simple Implementation and Preliminary Experiments

ΜΕΘΟΔΟΙ ΥΠΟΛΟΓΙΣΤΙΚΗΣ ΝΟΗΜΟΣΥΝΗΣ ΓΙΑ ΤΗΝ ΠΡΟΒΛΕΨΗ ΧΡΟΝΟΛΟΓΙΚΩΝ ΣΕΙΡΩΝ

SVM. Research on ERPs feature extraction and classification

From Secure e-computing to Trusted u-computing. Dimitris Gritzalis

Detection and Recognition of Traffic Signal Using Machine Learning

Ανάλυση σχημάτων βασισμένη σε μεθόδους αναζήτησης ομοιότητας υποακολουθιών (C589)

Comparison of Discriminant Analysis in Ear Recognition

Bayesian statistics. DS GA 1002 Probability and Statistics for Data Science.

Simplex Crossover for Real-coded Genetic Algolithms

SocialDict. A reading support tool with prediction capability and its extension to readability measurement

ΠΑΝΕΠΙΣΤΗΜΙΟ ΜΑΚΕΔΟΝΙΑΣ

ΕΞΟΡΥΞΗ ΔΕΔΟΜΕΝΩΝ. Διαδικαστικά

476,,. : 4. 7, MML. 4 6,.,. : ; Wishart ; MML Wishart ; CEM 2 ; ;,. 2. EM 2.1 Y = Y 1,, Y d T d, y = y 1,, y d T Y. k : p(y θ) = k α m p(y θ m ), (2.1

Τοποθέτηση τοπωνυµίων και άλλων στοιχείων ονοµατολογίας στους χάρτες

Εφαρμογές σε Χωρικά Δίκτυα

ΤΕΧΝΟΛΟΓΙΚΟ ΕΚΠΑΙΔΕΥΤΙΚΟ ΙΔΡΥΜΑ ΑΓΡΟΤΙΚΕΣ ΣΤΑΤΙΣΤΙΚΕΣ ΜΕ ΕΡΓΑΛΕΙΑ ΓΕΩΠΛΗΡΟΦΟΡΙΚΗΣ

Schedulability Analysis Algorithm for Timing Constraint Workflow Models

Applying Markov Decision Processes to Role-playing Game

Transcript:

170 18 2 2003 3 2 A Survey of Recent Clustering Methods for Data Mining part 2) Challenges to Conquer Giga Data Sets and The Curse of Dimensionality Toshihiro Kamishima National Institue of Advanced Industrial Science and Technology AIST) mail@kamishima.net, http://www.kamishima.net/ keywords: Clustering, Unsupvervised Learning, Survey, Data Mining 1 1 8. ON 2 ) k-means ONk) ON) 1 CLARANS Ng CLARANS Clustering Large Applications based on RANdomized Search) [Ng 94] CLARANS PAM CLARA PAM Partition Around Medoids k-means medoid medoid medoid medoid 1 x i : object N : X={x 1,...,x N } : d : x i=x i1,...,x id ) : x i D 1,...,D d : k : C 1,...,C k : n i : C i c i : C i Dx i,x j ) : x i x j k-means k medoid medoid medoid PAM ON 2 k) CLARA Clustering LARge Applications) N 40 + 2k PAM medoid medoid CLARA Ok 3 + Nk) CLARANS CLARA medoid medoid kn k)/80 250 CLARANS CLARA 2 BIRCH Zhang BIRCH Balanced Iterative Reducing and Clustering using Hierarchies) [Zhang 96, Zhang 97] ON) Clustering Feature Tree CF-tree CF-tree CF-tree N ON) CF-tree Clustering Feature CF

2 171 Phase1: Phase2option): Density Phase3: Phase4option): a) =ξ b) ξ 2 DENCLUE [Hinneburg 98] 1 BIRCH [Zhang 97] B L T B CF L CF B L T CF C xi,xj C,xi xj x i x j ) x i x j ) nn 1) CF-tree T N CF 2 CF=n, x i, x i x i ) x i C x i C CF C i C j CF C i C j CF Zhang CF D0 D4 D2C i,c j ) = x i C i x j C j x i x j ) x i x j ) n i n j CF BIRCH CF-tree 1 Phase1 4 1 3 2 4 Phase1 BIRCH CF-tree B+-tree CF-tree D0 D4 CF T CF CF CF L B CF-tree T T CF-tree Phase2 CF-tree Phase CF CF Phase3 D0 D4 D2 D4 Phase4 Phase3 2 JOIN I/O 3 L L d L L L d N ON) Hinneburg DENCLUE DENsity-based CLUstEring) [Hinneburg 98] 2a)

172 18 2 2003 3 ξ 2b) 2σ G p ξ/2d G sp 4σ g G sp G p G r G sp G p N x g g 1 G r x 4σ g 1 nearx) x x nearx 1) exp Dx,x1 ) ) ) 2 2σ 2 BIRCH ξ σ Xu DBCLASD Distribution Based Clustering of LArge Spatial Databases [Xu 98] C x C min xi C,x x i Dx,x i ) seed Wavelet Sheikholeslami WaveCluster [Sheikholeslami 98] Wang STINGSTatistical INformation Grid-based method) [Wang 97] 4 VFKM CF-tree CLARANS Domingos VFKM [Domingos 01] CLARANS k-means k-means C c i N C N c N i Yi Yj H 3 x a x b [Faloutsos 95] xi G xi' F xb E xa xj xj' 4 x a x b H [Faloutsos 95] LC N,C )= k Dc N i,c i ) i ϵ δ Pr[LC N,C ) ϵ] δ k-means N k- means k-means 3 9. 1 FastMap 4 ON 2 ) N

2 173 5 ξ τ i A i 4 A 1 <8) 2 A 2 <5)) 2 A 1 <6) 4 A 2 <8)) 5 CLIQUE [Agrawal 98] Faloutsos FastMap 1 [Faloutsos 95] ON) 1 3 2 x a x b x a x b x i x i Y i 3 x i x a x b E x a E Y i = Dxa,x i ) ) 2 + Dxa,x b ) ) 2 Dxb,x i ) ) 2 2Dx a,x b ) x i 1 2 4 x a x b d 1 H H x a x b N 1 d 1 H H x i x j H D x i,x j ) x i x j G D x i,x j ) ) 2 = Dxi,x j ) ) 2 Y i Y j) 2 FastMap HyperMap [ 02] Fisher [ 02] 2 CLIQUE Apriori [Agrawal 94] Agrawal Apriori CLIQUE CLustering In QUEst [Agrawal 98] CLIQUE 1 http://www.cs.cmu.edu/~christos/software.html 5 2 2 A 2 A 1 C D CLIQUE Apriori d d 1 1 d K OK d + N d) d d 3 ORCLUS Aggarwal ORCLUS arbitrarily ORiented projected CLUSter generation [Aggarwal 00] ORCLUS d β α 8 BIRCH CF k 0 Ok 3 0 + k 0 Nd + k 2 0d 3 ) k 0 10. k-means 6

174 18 2 2003 3 6 [Ester 96] xq xp xo 7 DBSCAN[Ester 96] 2 3 1 DBSCAN Ester DBSCAN [Ester 96] Web http://www.dbs.informatik.uni-muenchen.de /Forschung/KDD/Clustering/ DBSCAN Eps MinPts 7 directly densityreachable DDR) x p x q DDR xr 1) x q N Eps x p ) 2) N Eps x p ) MinPts N Eps x p ) = {x q X Dx p,x q ) Eps} 7 x q x p DDR 2 DDR x o x r x o x q x p DDR seed DDR DDR seed 6 MinPts R -tree ON log N) DBSCAN GDBSCAN [Sander 98] Eps OPTICS [Ankerst 00] 2 CURE Guha CURE [Guha 98] k-means α k DBSCAN CURE ON 2 log N) 2 ON 2 ) 3 Zahn [Zahn 71] minimum spanning tree Gabriel relative neighborhood Urquhart [Urquhart 82] k Mizoguchi [Mizoguchi 80] Karypis [Karypis 99] Harel [Harel 01] [Barbará 00] 11. 1 Learning from Cluster Examples; LCE) [Kamishima 03a, 03b] LCE 8

2 175 O1,FO1)) π * 1 O2,FO2)) π * 2 O#EX,FO#EX)) π * #EX 8 OU,FOU)) π^ U WWW http://www.cs.cornell.edu /home/wkiri/cop-kmeans/ 3 [Bensaid 96, Nigam 98, Goldman 00, 01] LCE LCE LCE-MAP LCE-Maximum A Posteriori 3 Ax i ): N Ap ij ): x i x j p ij NN 1)/2 Aπ): Pr[π=π,Aπ),{Ax)},{Ap)}] π=π π π Pr[Aπ) π=π ] Pr[π=π {Ax)},{Ap)}] 2 COP-KMEANS Wagstaff COBWEB [Wagstaff 00] k-means COP-KMEANS [Wagstaff 01] k-means COP- KMEANS 12. 1 2 [Barbará 02] [Ramoni 02, Keogh 98, 02] [Taskar 01] [Ding 01] 11 [Cohn 00] [Aggarwal 00] Aggarwal, C. C. and Yu, P. S.: Finding Generalized Projected Clusters in High Dimensional Spaces, in Proc of the ACM SIGMOD Int l Conf. on Management of Data, pp. 70 81 2000) [Agrawal 94] Agrawal, R. and Srikant, R.: Fast Algorithms for Mining Association Rules, in Proc. of the 20th Very Large Database Conf., pp. 487 499 1994)

176 18 2 2003 3 [Agrawal 98] Agrawal, R., Gehrke, J., Gunopulos, D., and Raghavan, P.: Automatic Subspaace Clustering of High Dimensional Data for Data Mining Applicatiosn, in Proc. of the ACM SIGMOD Int l Conf. on Management of Data, pp. 94 105 1998) [ 02],,,,, HyperMap:, 13 DEWS2002), C2 10 2002) [Ankerst 00] Ankerst, M., Breunig, M. M., and Sander, H.-P. K. J.: OPTICS: Ordering Points To Identify the Clustering Structure, in Proc of the ACM SIGMOD Int l Conf. on Management of Data, pp. 49 60 2000) [Barbará 00] Barbará, D. and Chen, P.: Using the Fractal Dimension to Cluster Datasets, in Proc. of The 6th Int l Conf. on Knowledge Discovery and Data Mining, pp. 260 264 2000) [Barbará 02] Barbará, D.: Requirements for Clustering Data Streams, SIGKDD Explorations, Vol. 3, No. 2, pp. 23 27 2002) [Bensaid 96] Bensaid, A. M., Hall, L. O., Bezdek, J. C., and Clarke, L. P.: Partially Supervised Clustering for Image Segmentation, Pattern Recognition, Vol. 29, No. 5, pp. 859 871 1996) [Cohn 00] Cohn, D., Caruana, R., and McCallum, A.: Semisupervised Clustering with User Feedback, http://www- 2.cs.cmu.edu/ mccallum/papers/semiup-aaai2000s.ps.gz 2000) [Ding 01] Ding, C. H. Q., He, X., Zha, H., Gu, M., and Simon, H. D.: A Min-max Cut Algorithm for Graph Partitioning and Data Clustering, in Proc. of the IEEE Int l Conf. on Data Mining, pp. 107 114 2001) [Domingos 01] Domingos, P. and Hulten, G.: A General Method for Scaling Up Machine Learning Algorithms and its Application to Clustering, in Proc. of the 18th Int l Conf. on Machine Learning, pp. 106 113 2001) [Ester 96] Ester, M., Kriegel, H.-P., Sander, J., and Xu, X.: A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise, in Proc. of The 2nd Int l Conf. on Knowledge Discovery and Data Mining, pp. 226 231 1996) [Faloutsos 95] Faloutsos, C. and Lin, K.-I.: FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets, in Proc. of the ACM SIGMOD Int l Conf. on Management of Data, pp. 163 174 1995) [Goldman 00] Goldman, S. and Zhou, Y.: Enhancing Supervised learning with Unlabeled Data, in Proc. of the 17th Int l Conf. on Machine Learning, pp. 327 334 2000) [Guha 98] Guha, S., Rastogi, R., and Shim, K.: CURE: An Efficient Clustering Algorithm for Large Databases, in Proc. of the ACM SIGMOD Int l Conf. on Management of Data, pp. 73 80 1998) [Harel 01] Harel, D. and Koren, Y.: Clustering Spatial Data Using Random Walks, in Proc. of The 7th Int l Conf. on Knowledge Discovery and Data Mining, pp. 281 286 2001) [Hinneburg 98] Hinneburg, A. and Keim, D. A.: An Efficient Approach to Clustering in Large Multimedia Databases with Noise, in Proc. of The 4th Int l Conf. on Knowledge Discovery and Data Mining, pp. 58 65 1998) [Kamishima 03a] Kamishima, T. and Motoyoshi, F.: Learning from Cluster Examples, Machine Learning 2003), in press) [ 03b],,,, Vol. 18, No. 2, pp. 86 95 2003) [Karypis 99] Karypis, G., E.-H.Han,, and Kumar, V.: Chameleon: Hierarchical Clustering Using Dynamic Modeling, IEEE Computer, Vol. 32, No. 8, pp. 68 75 1999) [Keogh 98] Keogh, E. J. and Pazzani, M. J.: An Enhanced Representation of Time Series Which Allows Fast And Accurate Classification, Clustering and Relevance Feedback, in Proc. of The 4th Int l Conf. on Knowledge Discovery and Data Mining, pp. 239 243 1998) [Mizoguchi 80] Mizoguchi, R. and Shimura, M.: A Nonparametric Algorithm for Detecting Clusters Using Hierarchical Structure, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 2, No. 4, pp. 292 300 1980) [Ng 94] Ng, R. T. and Han, J.: Efficient and Effective Clustering Methods for Spatial Data Mining, in Proc. of the 20th Very Large Database Conf., pp. 144 155 1994) [Nigam 98] Nigam, K., McCallum, A., Thrun, S., and Mitchell, T.: Learning to Classify Text from Labeled and Unlabeled Documents, in Proc. of the 15th National Conf. on Artificial Intelligence, pp. 792 799 1998) [Ramoni 02] Ramoni, M., Sebastiani, P., and Cohen, P.: Bayesian Clustering by Dynamics, Machine Learning, Vol. 47, pp. 91 121 2002) [Sander 98] Sander, J., Ester, M., Kriegel, H.-P., and Xu, X.: Density-Based Clustering in Spatial Databases: The Algorithm GDBSCAN and Its Applications, Journal of Data Mining and Knowledge Discovery, Vol. 2, pp. 169 194 1998) [Sheikholeslami 98] Sheikholeslami, G., Chatterjee, S., and Zhang, A.: WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases, in Proc. of the 24th Very Large Database Conf., pp. 428 439 1998) [ 02],,, D-II, Vol. J85-D-II, No. 5, pp. 785 795 2002) [Taskar 01] Taskar, B., Segal, E., and Koller, D.: Probabilistic Classification and Clustering in Relational Data, in Proc. of the 17th Int l Joint Conf. on Artificial Intelligence, pp. 870 876 2001) [ 01], 4, pp. 89 94 2001) [Urquhart 82] Urquhart, R.: Graph Theoretical Clustering Based on Limited Neghbourhood Sets, Pattern Recognition, Vol. 15, No. 3, pp. 173 187 1982) [Wagstaff 00] Wagstaff, K. and Cardie, C.: Clustering with Instance-level Constraints, in Proc. of the 17th Int l Conf. on Machine Learning, pp. 1103 1110 2000) [Wagstaff 01] Wagstaff, K., Cardie, C., Rogers, S., and Schroedl, S.: Constrained K-means Clustering with Background Knowledge, in Proc. of the 18th Int l Conf. on Machine Learning, pp. 577 584 2001) [Wang 97] Wang, W., Yang, J., and Muntz, R.: STING: A Statistical Information Grid Approach to Spatial Data Mining, in Proc. of the 23rd Very Large Database Conf., pp. 186 195 1997) [Xu 98] Xu, X., Ester, M., Kriegel, H.-P., and Sander, J.: A Distribution-Based Clustering Algorithm for Mining in Large Spatial Databases, in Proc. of the 14th Int l Conf. on Data Engineering, pp. 324 331 1998) [ 02],,, 1 2002), LG 3 [Zahn 71] Zahn, C. T.: Graph-Theoretical Methods for Detecting and Describing Gestalt Clusters, IEEE Trans. on Computers, Vol. 20, No. 1, pp. 68 86 1971) [Zhang 96] Zhang, T., Ramakrishnan, R., and Livny, M.: BIRCH: An Efficient Data Clustering Method for Very Large Databases, in Proc. of the ACM SIGMOD Int l Conf. on Management of Data, pp. 103 114 1996) [Zhang 97] Zhang, T., Ramakrishnan, R., and Livny, M.: BIRCH: A New Data Clustering Algorithm and Its Applications, Journal of Data Mining and Knowledge Discovery, Vol. 1, pp. 141 182 1997)

2 177 Vol.18 No.1 p.65