Learning to Identify Chinese Comparative Sentences




Journal of Chinese Information Processing, Vol. 22, No. 5, Sep. 2008
Article ID: 1003-0077(2008)05-0030-09
CLC number: TP391    Document code: A

Learning to Identify Chinese Comparative Sentences

HUANG Xiao-jiang, WAN Xiao-jun, YANG Jian-wu, XIAO Jian-guo
(Institute of Computer Science and Technology of Peking University, Beijing 100871, China)

Abstract: Comparison is a common kind of expression, and extracting comparative relations between objects is a novel and substantial research problem. Identifying comparative sentences in natural language is an important step in extracting comparative relations. To our knowledge, there is no prior research on identifying Chinese comparative sentences automatically. This paper first defines the problem of Chinese comparative sentence identification, and then proposes to use SVM to classify a Chinese sentence as either comparative or not. Various linguistic and statistical features are explored, such as keywords and sequential patterns. Experimental results demonstrate the effectiveness of the sequential patterns, i.e., the classifier with sequential patterns significantly outperforms the traditional term-based classifier. We also empirically investigate the important factors that affect classification performance.

Key words: computer application; Chinese information processing; Chinese comparative sentences identification; comparative mining; text classification; sequential pattern

Dates: 2008-04-03; 2008-06-27
Grants: 863 (2008AA01Z421); (60703064); (20070001059)
Authors: (1984-) […]; (1979-) […]; (1973-) […], SGML/XML

1 […]

[…] Jindal [1], [2] […] Zhai […] Cross-Collection Mixture Model [3, 4]; Sun [5] […] Luo [6] […] Web […]

[…]; Feldman [7] […] [8-10] […] [11, 12] […] [13] […] Web […] SVM […]: Section 2 […]; Section 3 […]; Section 4 […].

2 […]

2.1 […]

[…] Lerner [14] […] Stassen [15] […] 1898 […] [10] […]

Table 1: […]

2.2 […]

[…] X […] Y […] R […]

[…] Y […] X […] [16] […]

2.3 […]

2.3.1 […]

2.3.2 […] 10cm […]

2.3.3 […] [17] […]

2.3.4 […] [12] […] X […] Y […] (R) […] [17] […] [13] […]

2.3.5 […]

3 […]

3.1 […]

[…] a) […] b) […] f : S → C […] S […] C […]. a) […] C = {[…], […]}; b) […] C = {[…], […], […], […], […]}. […]

3.2 SVM

(Support Vector Machine, SVM) […]

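[…] Boser [18] […]. The separating hyperplane is $w \cdot x + b = 0$, with margin $2 / \|w\|$. Given the training set

$D = \{ (x_i, c_i) \mid x_i \in \mathbb{R}^p,\ c_i \in \{-1, 1\} \}$,

w and b must satisfy

$c_i (x_i \cdot w + b) - 1 \ge 0, \quad \forall i$.

SVM chooses the w and b that minimize $\|w\|^2$. For an input x, the SVM decision function is

$f(x) = \operatorname{sgn}(w \cdot x + b) = \begin{cases} +1, & \text{if } w \cdot x + b > 0 \\ -1, & \text{otherwise} \end{cases}$

[…] SVM […] [19] […] [20, 21] […] [22] […]

3.3 […]

[…] Section 2.3 […] [12] […] Appendix A […] [1] […] SVM […]

3.4 […]

3.4.1 […]

Let $I = \{ i_1, i_2, \ldots, i_n \}$ be a set of items. A sequence s is written $\langle a_1 a_2 \ldots a_r \rangle$, where each $a_i$ is an itemset ($a_i \subseteq I$). A sequence $s_1 = \langle a_1 a_2 \ldots a_r \rangle$ is a subsequence of $s_2 = \langle b_1 b_2 \ldots b_m \rangle$ if there exist integers $1 \le j_1 < j_2 < \ldots < j_{r-1} < j_r \le m$ such that $a_1 \subseteq b_{j_1}, a_2 \subseteq b_{j_2}, \ldots, a_r \subseteq b_{j_r}$; in that case $s_2$ contains $s_1$.

The data set is $D = \{ (s_1, c_1), (s_2, c_2), \ldots, (s_n, c_n) \}$, where $s_i$ is a sequence and $c_i \in C$ is its class. A Class Sequential Rule (CSR) has the form $X \rightarrow c$, where X is a sequence and $c \in C$. A data instance $d = (s_i, c_i)$ in D covers the CSR $X \rightarrow c$ if X is a subsequence of $s_i$; d satisfies the CSR if it covers the CSR and $c = c_i$. The support of a CSR is the fraction of instances in D that satisfy it; its confidence is the fraction of the instances in D covering it that also satisfy it.

3.4.2 […]

[…] Jindal […] CSR […] Jindal […]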
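The cover, satisfy, support and confidence definitions of Section 3.4.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the toy data and the English placeholder item `than/p` are hypothetical (the Chinese keywords of the original examples did not survive extraction), and each sequence element is modelled as a `frozenset` of items.

```python
from typing import FrozenSet, List, Sequence, Tuple

Itemset = FrozenSet[str]
Seq = Sequence[Itemset]


def is_subsequence(s1: Seq, s2: Seq) -> bool:
    """True if every itemset of s1 is a subset of some itemset of s2,
    with the matched positions in s2 strictly increasing."""
    j = 0
    for a in s1:
        # Greedily advance to the earliest itemset of s2 that contains a.
        while j < len(s2) and not a <= s2[j]:
            j += 1
        if j == len(s2):
            return False
        j += 1
    return True


def support_confidence(x: Seq, c: str,
                       data: List[Tuple[Seq, str]]) -> Tuple[float, float]:
    """Support and confidence of the class sequential rule X -> c over data."""
    cover_labels = [label for seq, label in data if is_subsequence(x, seq)]
    satisfied = sum(1 for label in cover_labels if label == c)
    support = satisfied / len(data) if data else 0.0
    confidence = satisfied / len(cover_labels) if cover_labels else 0.0
    return support, confidence


def seq_of(*items: str) -> List[Itemset]:
    """Hypothetical helper: one single-item itemset per argument."""
    return [frozenset({item}) for item in items]


# Hypothetical toy data: "*/a" is a POS-only item, "than/p" a keyword item.
data = [
    (seq_of("*/n", "than/p", "*/a"), "C"),
    (seq_of("than/p", "*/v"), "NC"),
    (seq_of("*/n", "than/p", "*/n", "*/a"), "C"),
    (seq_of("than/p", "*/a"), "NC"),
]
rule_x = seq_of("than/p", "*/a")
support, confidence = support_confidence(rule_x, "C", data)
```

In this toy set three instances cover the rule and two of them satisfy it, so the rule is kept or discarded depending on the minimum support and confidence thresholds chosen.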

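[…]: […]/n […]/a 8848/q […]/m, […]/v […]/p […]/n […]/n […] */a […]/p → C […] Jindal […] 3 […] 7 […]: […]/n […]/n […]/p […]/t 65/m nm/q […]/n […]/n 64/m […]/q […]/n […]/d […]/a […] same as, as as […]

3.4.3 […]

[…] CSR […] GSP [23] […] PrefixSpan [24] […] CSR […] PrefixSpan […] [25] […] Jindal […]. The minimum support of a rule r is

$\sup(r) > \tau \cdot \min(f_i)$,

where $f_i$ is the frequency of the i-th item in r and the coefficient (its symbol is lost in extraction; written $\tau$ here) lies in (0, 1). […] $\tau \cdot \min(f_i) < 1/N$ (N […]) […]:

$\sup(r) > \max(\tau \cdot \min(f_i),\ s)$, with $s \ge 1/N$.

[…] $\tau = 0.1$, $s = 2/N$, […] 0.65 […] Appendix B […]

3.4.4 […]

[…] CSR […] R […] s […]. With $R = \{ r_1, r_2, \ldots, r_m \}$, the features of s are $f_1, f_2, \ldots, f_m$, where

$f_i = \begin{cases} 1, & \text{if } s \text{ contains } r_i \\ 0, & \text{otherwise} \end{cases} \qquad 1 \le i \le m$

[…] SVM […]

$C(sent) = \begin{cases} C, & \text{if } \exists\, seq\ (seq \in S \wedge C(seq) = C) \\ NC, & \text{otherwise} \end{cases}$

where S is the set of sequences of the sentence sent, C […], NC […].

4 […]

4.1 […]

Table 2: […] 1 297 […] 458

4.2 […]

[…] F […] Table 3 […]

Footnote: http://groups.zol.com.cn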
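The threshold rule of Section 3.4.3, the binary pattern features of Section 3.4.4 and the sentence-level decision can be sketched as follows. This is a minimal sketch under assumptions: all function names are hypothetical, sequences are simplified to lists of string items rather than itemsets, the item frequencies are taken to be relative frequencies (the surviving text does not say whether they are counts or fractions), and the coefficient name `tau` is a placeholder for a symbol that did not survive extraction.

```python
from typing import List, Optional, Sequence


def contains(seq: Sequence[str], pattern: Sequence[str]) -> bool:
    """True if pattern occurs in seq as an ordered, possibly gapped subsequence."""
    it = iter(seq)
    # "item in it" consumes the iterator up to the first match, which keeps order.
    return all(item in it for item in pattern)


def pattern_features(seq: Sequence[str],
                     patterns: Sequence[Sequence[str]]) -> List[int]:
    """Binary vector: f_i = 1 if seq contains pattern r_i, else 0."""
    return [1 if contains(seq, r) else 0 for r in patterns]


def min_support(item_freqs: Sequence[float], n_sentences: int,
                tau: float = 0.1, s: Optional[float] = None) -> float:
    """Threshold max(tau * min(f_i), s); s defaults to 2/N as in the text."""
    floor = 2.0 / n_sentences if s is None else s
    return max(tau * min(item_freqs), floor)


def label_sentence(seq_labels: Sequence[str]) -> str:
    """A sentence is C if any of its sequences is classified C, else NC."""
    return "C" if any(label == "C" for label in seq_labels) else "NC"


# Hypothetical usage with placeholder items.
patterns = [["than/p", "*/a"], ["*/v", "than/p"]]
features = pattern_features(["*/n", "than/p", "*/d", "*/a"], patterns)
threshold = min_support([0.5, 0.3], n_sentences=100)
```

The feature vector produced this way is what a linear classifier such as an SVM would consume; the floor `s` keeps rules built from very rare items from passing the support test on a single occurrence.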

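$\text{Accuracy} = \dfrac{tp + tn}{tp + fp + tn + fn}$, $\quad \text{Precision} = \dfrac{tp}{tp + fp}$, $\quad \text{Recall} = \dfrac{tp}{tp + fn}$, $\quad F = \dfrac{2\,tp^2}{2\,tp^2 + tp \cdot fp + tp \cdot fn}$

Table 3: […]
tp    fp
fn    tn

[…] 5 […] 5 […] 4 […] 1 […] 5 […] (WS) […] (Cn, n […]) […] (SS) […] WS […] 4 […] Cn […] SS […] 1 […]

4.3 […]

4.3.1 […]

[…] SVM […] SVMLight […]. Table 4 […]: Baseline […] (Bag-of-words) […]; KW […]; WP […]; KWP […]; CSR […]

Table 4: […]
Method     Accuracy   Precision   Recall   F-measure
Baseline   90.1%      96.7%       64.2%    0.772
KW         89.9%      91.7%       67.5%    0.778
WP         90.5%      98.7%       64.7%    0.781
KWP        91.2%      95.7%       69.9%    0.806
CSR        92.7%      91.4%       79.6%    0.850

[…] KW […] (WP […] Baseline, KWP […] KW) […] CSR […] F […] Baseline […] CSR […] 23.9% […] F-measure […] 10.1% […] 5.5% […]

4.3.2 […]

[…] Section 3.4.2 […] 1 […] Cn […] WS […] 5 […] SS […] Cn […] WS […]

4.3.3 […]

[…]? […]? […] SS […] 2 […]

Footnote: http://svmlight.joachims.org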
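The four measures of Section 4.2 follow directly from the confusion counts tp, fp, fn and tn. The sketch below assumes tp > 0 so that no denominator is zero; the function names are illustrative and the counts in the usage lines are made up, not taken from the paper. The F formula is kept in the form printed in the paper, which equals the more familiar 2PR / (P + R).

```python
from typing import Tuple


def evaluate(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float, float, float]:
    """Accuracy, precision, recall and F from confusion counts (assumes tp > 0)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Same form as in the text; algebraically equal to 2PR / (P + R).
    f = 2 * tp ** 2 / (2 * tp ** 2 + tp * fp + tp * fn)
    return accuracy, precision, recall, f


def f_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Made-up counts, only to exercise the formulas.
accuracy, precision, recall, f = evaluate(tp=8, fp=2, fn=4, tn=6)

# Consistency check against the Baseline row of Table 4 (P = 96.7%, R = 64.2%).
baseline_f = round(f_from_pr(0.967, 0.642), 3)
```

Recomputing F from the reported precision and recall reproduces the Baseline value of 0.772 in Table 4, which supports the reading of the reconstructed column order.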

[…] 1 […] 2 […] 3 […] F […] 3 […] 2 […] 3 […]

4.3.4 […]

[…] 3 […]

1. […] ([…]), X […] Y […] R […] CSR […]:
a) */u […]/d → NC
b) */v […]/d → NC
c) */n […]/d → NC
d) […]/d → NC
e) […]/d */a → C
f) […]/d */v → NC
g) […]/d */n → NC
[…] e) […] ([…])

2. […] X […] R […] Y […] SVM […] CSR […]:
a) */a […]/p → C
b) */v […]/p → NC
c) […]/p */a → NC
d) […]/p */n → NC
[…] a) […]

3. […] ([…]), CSR […]:
a) */m […]/p → NC
b) */r […]/p → NC
c) */a […]/p → C
d) */v […]/p → NC
e) […]/p */v → NC
f) […]/p */n → NC
[…] X […] R […] Y […] R […]

5 […]

[…] SVM […]

References:

[1] N. JINDAL, B. LIU. Identifying Comparative Sentences in Text Documents[C]//Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM: 2006: 244-251.
[2] N. JINDAL, B. LIU. Mining Comparative Sentences and Relations[C]//Proceedings of the 21st National Conference on Artificial Intelligence (AAAI-06). 2006.
[3] C. ZHAI, A. VELIVELLI, B. YU. A Cross-Collection Mixture Model for Comparative Text Mining[C]//Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM: 2004: 743-748.
[4] P. ZANG, C. ZHAI. CTMS: a comparative text mining system[D]. Champaign: University of Illinois at Urbana-Champaign Computer Science Department, 2004.
[5] J.-T. SUN, X. WANG, D. SHEN, H.-J. ZENG, Z. CHEN. CWS: A Comparative Web Search System[C]//Proceedings of the 15th International Conference on World Wide Web. ACM: 2006: 467-476.
[6] G. LUO, C. TANG, Y.-L. TIAN. Answering relationship queries on the web[C]//Proceedings of the 16th international conference on World Wide Web. ACM: 2007: 561-570.
[7] R. FELDMAN, M. FRESKO, J. GOLDENBERG, O. NETZER, L. UNGAR. Extracting Product Comparisons from Discussion Boards[C]//Proceedings of the Seventh IEEE International Conference on Data Mining. 2007: 469-474.
[8] […]. […][M]. […]: […], 1898.
[9] […]. […][M]. […]: […], 1942.
[10] […]. […][M]. […]: […], 2007.
[11] […]. […][M]. […]: […], 1980.
[12] […]. […][J]. […], 2005, 25(3): 60-63.
[13] […]. […][M]. […]: […], 2004.
[14] J.-Y. LERNER, M. PINKAL. Comparatives and Nested Quantifications[M]. Semantics: Critical Concepts in Linguistics. 2004: 70-87.
[15] L. STASSEN. Comparison and Universal Grammar[M]. Basil Blackwell, 1985.
[16] […]. […][M]. […]: […], 1982.
[17] […]. […][C]//[…]. 4. […]: 2004: 1-21.
[18] B. E. BOSER, I. M. GUYON, V. N. VAPNIK. A Training Algorithm for Optimal Margin Classifiers[C]//Proceedings of the fifth annual workshop on Computational learning theory. ACM: 1992: 144-152.
[19] T. JOACHIMS. Text categorization with Support Vector Machines: Learning with many relevant features[C]//Proceedings of the ECML-98, 10th European Conference on Machine Learning. Springer: 1998: 137-142.
[20] […], […], […]. […] SVM […][J]. […], 2004, 18(2): 1-7.
[21] […], […]. […] SVM […][J]. […], 2006, 20(6): 17-24.
[22] […], […], […]. […][J]. […], 2000, 14(3): 37-41.
[23] R. SRIKANT, R. AGRAWAL. Mining Sequential Patterns: Generalizations and Performance Improvements[C]//Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology. Springer-Verlag: 1996: 3-17.
[24] J. PEI, J. HAN, B. MORTAZAVI-ASL, J. WANG, H. PINTO, Q. CHEN, U. DAYAL, M.-C. HSU. Mining Sequential Patterns by Pattern-Growth: The PrefixSpan Approach[J]. IEEE Transactions on Knowledge and Data Engineering, 2004, 16.
[25] B. LIU. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data[M]. Springer, 2006.

Appendix A […]

Appendix B […] ([…])

[…]/p […]/a → C
[…]/a […]/a → C
[…]/p […]/a → C
*/q […]/a → C
[…]/p […]/a → C
[…]/v */a → C
[…]/p […]/a → C
[…]/v */n → C
*/n […]/a → C
*/u […]/v → C
*/r […]/n → C
[…]/p […]/n → C
[…]/p […]/n → C
*/d […]/v → C
*/n […]/a → C
[…]/p */a → C
[…]/p […]/a → C
[…]/d […]/p → C
[…]/d […]/n → C
*/n […]/p → C
*/nt […]/v → C
[…]/p […]/v → C
*/n […]/v → C
[…]/d […]/r → C
[…]/p […]/v → C
[…]/v */nt → C
*/n […]/p → C
[…]/v */n → C
[…]/p */d → C
[…]/p */a → C
[…]/v */a → C
*/n […]/v → C
*/n […]/n → NC
[…]/p */d → C
[…]/p […]/v → C
[…]/v */n → C
[…]/p […]/v → C
[…]/p */a → C
[…]/d */a → C
[…]/p […]/a → C
*/n […]/d → C
*/r […]/a → C
[…]/d → C
[…]/d */a → C
*/a […]/p → C
*/v […]/p → NC

Note 1: […], * […]
Note 2: […]: n, s, t, v, a, q, r, p, d, c, u, f, e, o, i, j, z, y, nt, nr, ns, nz, m, w
Note 3: C: comparative; NC: non-comparative