Efficient Top-k Search for Random Walk with Restart


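DEIM Forum 2011 D3-1

Yasuhiro FUJIWARA, Makoto NAKATSUJI, Makoto ONIZUKA, and Masaru KITSUREGAWA

NTT Cyber Space Laboratories, 1-1 Hikarinooka, Yokosuka, Kanagawa, 230 047 Japan
NTT Cyber Solution Laboratories, 1-1 Hikarinooka, Yokosuka, Kanagawa, 230 047 Japan
Institute of Industrial Science, The University of Tokyo, Komaba 4-6-1, Meguro, Tokyo 263 505 Japan
E-mail: {fujiwara.yasuhiro,nakatsuji.makoto,onizuka.makoto}@lab.ntt.co.jp, kitsure@tkl.iis.u-tokyo.ac.jp

Abstract: Random walk with restart (RWR) is a proximity measure between the nodes of a graph. This paper addresses the search for the K nodes with the highest RWR proximities.

Keywords: Random walk with restart, Top-k

1. Introduction
Measuring the proximity between the nodes of a graph is a basic operation in graph mining [1], [2], [3]. Random walk with restart (RWR) is a widely used proximity measure [4]. In RWR, a random walker starts from a query node q and, at every step, either moves to a neighboring node or returns to q. The proximity of a node u with respect to q is the steady-state probability that the walker is at u. RWR has been used in a range of applications [5], [6], [7], [8]. Faster methods exist [8], [9], but they return approximate proximities for the query node q. This paper addresses the problem of finding, for a query node q, the K nodes with the highest proximities.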

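The rest of this paper is organized as follows. Section 2 reviews related work, Section 3 gives the background on RWR, Section 4 describes the proposed method, Section 5 reports the experiments, and Section 6 concludes.

2. Related work
Pan et al. [5] applied RWR to cross-modal correlation discovery in multimedia data and reported an improvement of about 10% in captioning accuracy. Konstas et al. [6] used RWR for collaborative recommendation over social networks and compared it with collaborative filtering [10]. Sun et al. [8] used RWR for neighborhood formation and anomaly detection in bipartite graphs. To reduce the cost, they run RWR only on the partition that contains the query node and treat the proximity of every node outside that partition as 0. Tong et al. proposed B_LIN and NB_LIN [9], which compute RWR approximately with a low-rank approximation of the adjacency matrix. The methods of Sun et al. and Tong et al. therefore return approximate proximities, and they incur O(n^2) costs.

3. Random walk with restart
Table 1 lists the main symbols. RWR measures the proximity of every node with respect to a query node q [4].

Table 1: Definition of main symbols
q : query node
K : number of answer nodes
n : number of nodes
m : number of edges
c : restart probability
$\mathbf{p}$ : n × 1 proximity vector; $p_u$ is the proximity of node u
$\mathbf{q}$ : n × 1 query vector; its q-th element is 1 and all other elements are 0
$\mathbf{A}$ : column-normalized adjacency matrix; $A_{u,v}$ is the probability of moving from node v to node u

The proximity vector is the solution of

\[ \mathbf{p} = (1-c)\,\mathbf{A}\mathbf{p} + c\,\mathbf{q} \tag{1} \]

The element $p_u$ of the steady-state vector $\mathbf{p}$ is the proximity of node u with respect to q. Iterating Equation (1) until convergence takes O(mt) time, where t is the number of iterations.

4. Proposed method
Section 4.1 gives an overview, Section 4.2 describes the proximity computation, Section 4.3 describes the proximity estimation, and Section 4.4 presents the search algorithm.

4.1 Overview
As described in Section 3, the proximities are defined by Equation (1). Solving it with a precomputed dense inverse matrix takes O(n^2) space, so the proposed method works with the factors of an LU decomposition instead.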
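The iterative evaluation of Equation (1) in Section 3 can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation. The function names `column_normalize` and `rwr_power_iteration`, the dense list-of-lists matrix, the tolerance, and the four-node example graph are all assumptions made for the example. With a dense matrix each iteration costs O(n^2); the O(mt) bound assumes a sparse adjacency structure.

```python
def column_normalize(adj):
    """Return A with A[u][v] = adj[u][v] / (sum of column v).

    adj[u][v] != 0 means the walker can move from node v to node u.
    """
    n = len(adj)
    col_sums = [sum(adj[u][v] for u in range(n)) for v in range(n)]
    return [[adj[u][v] / col_sums[v] if col_sums[v] else 0.0 for v in range(n)]
            for u in range(n)]


def rwr_power_iteration(A, q, c=0.95, tol=1e-12, max_iter=1000):
    """Iterate p <- (1 - c) A p + c e_q (Equation (1)) until the update is below tol."""
    n = len(A)
    e_q = [1.0 if u == q else 0.0 for u in range(n)]  # query vector
    p = e_q[:]
    for _ in range(max_iter):
        nxt = [(1.0 - c) * sum(A[u][v] * p[v] for v in range(n)) + c * e_q[u]
               for u in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical undirected example graph with edges 0-1, 0-2, 1-2, 2-3.
ADJ = [[0, 1, 1, 0],
       [1, 0, 1, 0],
       [1, 1, 0, 1],
       [0, 0, 1, 0]]
A = column_normalize(ADJ)
p = rwr_power_iteration(A, q=0, c=0.95)
print([round(x, 6) for x in p])
```

Because every column of this example matrix sums to 1, the proximities sum to 1, and the query node itself receives the largest proximity.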

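The inverse matrices of the LU factors [11] are precomputed, and during the top-K search the RWR proximity of each node is bounded by an estimate that takes O(1) time to compute.

4.2 Proximity computation

4.2.1 Inverse matrices
Equation (1) can be rewritten as

\[ \mathbf{p} = c\,\{\mathbf{I} - (1-c)\mathbf{A}\}^{-1}\mathbf{q} = c\,\mathbf{W}^{-1}\mathbf{q} \tag{2} \]

where $\mathbf{I}$ is the identity matrix and $\mathbf{W} = \mathbf{I} - (1-c)\mathbf{A}$. Rather than using $\mathbf{W}^{-1}$ directly, $\mathbf{W}$ is LU-decomposed [12] as $\mathbf{W} = \mathbf{L}\mathbf{U}$, which gives

\[ \mathbf{p} = c\,\mathbf{U}^{-1}\mathbf{L}^{-1}\mathbf{q} \tag{3} \]

The inverse matrices $\mathbf{L}^{-1}$ and $\mathbf{U}^{-1}$ are obtained from $\mathbf{L}$ and $\mathbf{U}$ as follows [12]:

\[ L^{-1}_{ij} = \begin{cases} 0 & (i < j) \\ 1/L_{ij} & (i = j) \\ -\dfrac{1}{L_{ii}} \sum_{k=j}^{i-1} L_{ik} L^{-1}_{kj} & (i > j) \end{cases} \tag{4} \]

\[ U^{-1}_{ij} = \begin{cases} 0 & (i > j) \\ 1/U_{ij} & (i = j) \\ -\dfrac{1}{U_{ii}} \sum_{k=i+1}^{j} U_{ik} U^{-1}_{kj} & (i < j) \end{cases} \tag{5} \]

The matrices $\mathbf{L}$ and $\mathbf{U}$ are obtained from $\mathbf{W}$ as follows [12]:

\[ L_{ij} = \begin{cases} 0 & (i < j) \\ 1 & (i = j) \\ \dfrac{1}{U_{jj}} \Bigl( W_{ij} - \sum_{k=1}^{j-1} L_{ik} U_{kj} \Bigr) & (i > j) \end{cases} \tag{6} \]

\[ U_{ij} = \begin{cases} 0 & (i > j) \\ W_{ij} & (i \le j \wedge i = 1) \\ W_{ij} - \sum_{k=1}^{i-1} L_{ik} U_{kj} & (i \le j \wedge i \ne 1) \end{cases} \tag{7} \]

Equations (4), (5), (6) and (7) define the elements of $\mathbf{L}^{-1}$, $\mathbf{U}^{-1}$, $\mathbf{L}$ and $\mathbf{U}$ recursively. An element $L^{-1}_{ij}$ is determined by elements of $\mathbf{L}$ and by elements of $\mathbf{L}^{-1}$ computed earlier, and an element $L_{ij}$ is determined by elements of $\mathbf{W}$, $\mathbf{L}$ and $\mathbf{U}$. Equations (4), (5), (6) and (7) lead to three observations:
(1) Elements $L^{-1}_{ij}$ and $U^{-1}_{ij}$ are 0 when the elements of $\mathbf{L}$ and $\mathbf{U}$ they depend on are 0.
(2) Elements of $\mathbf{L}$ and $\mathbf{U}$ are 0 when the elements of $\mathbf{W}$ they depend on are 0.
(3) Off-diagonal elements of $\mathbf{W}$ are 0 where the elements of $\mathbf{A}$ are 0.
The zero elements of $\mathbf{A}$ therefore determine how sparse $L^{-1}_{ij}$ and $U^{-1}_{ij}$ are, and the nodes of $\mathbf{A}$ are reordered to exploit this.

One reordering uses Newman clustering [13]. The graph is divided into κ clusters by Newman clustering, and a (κ+1)-th group is created for the nodes that have edges across clusters. The nodes are then arranged in $\mathbf{A}$ in the order of clusters 1 to κ, followed by group κ+1.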
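Equations (2) to (7) can be checked numerically on a small graph. The sketch below is a dense, plain-Python illustration under stated assumptions, not the authors' implementation: the function names, the example matrix `A_EXAMPLE`, and the use of dense lists (instead of the sparse storage that the reordering is meant to enable) are all hypothetical. No pivoting is performed, which assumes that every column of A sums to at most 1 and c > 0, so that W is strictly diagonally dominant by columns.

```python
def lu_decompose(W):
    """LU decomposition with a unit diagonal in L, following Equations (6) and (7)."""
    n = len(W)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):        # row i of U, Equation (7)
            U[i][j] = W[i][j] - sum(L[i][k] * U[k][j] for k in range(i))
        for r in range(i + 1, n):    # column i of L, Equation (6)
            L[r][i] = (W[r][i] - sum(L[r][k] * U[k][i] for k in range(i))) / U[i][i]
    return L, U


def invert_lower(L):
    """Inverse of a lower triangular matrix, Equation (4)."""
    n = len(L)
    X = [[0.0] * n for _ in range(n)]
    for j in range(n):
        X[j][j] = 1.0 / L[j][j]
        for i in range(j + 1, n):
            X[i][j] = -sum(L[i][k] * X[k][j] for k in range(j, i)) / L[i][i]
    return X


def invert_upper(U):
    """Inverse of an upper triangular matrix, Equation (5)."""
    n = len(U)
    X = [[0.0] * n for _ in range(n)]
    for j in range(n):
        X[j][j] = 1.0 / U[j][j]
        for i in range(j - 1, -1, -1):
            X[i][j] = -sum(U[i][k] * X[k][j] for k in range(i + 1, j + 1)) / U[i][i]
    return X


def rwr_by_lu(A, q, c):
    """Proximity vector p = c U^-1 L^-1 q, Equation (3)."""
    n = len(A)
    W = [[(1.0 if i == j else 0.0) - (1.0 - c) * A[i][j] for j in range(n)]
         for i in range(n)]
    L, U = lu_decompose(W)
    L_inv, U_inv = invert_lower(L), invert_upper(U)
    y = [L_inv[k][q] for k in range(n)]  # q is one-hot, so L^-1 q is column q of L^-1
    return [c * sum(U_inv[u][k] * y[k] for k in range(n)) for u in range(n)]


# Hypothetical column-normalized matrix of the graph with edges 0-1, 0-2, 1-2, 2-3.
A_EXAMPLE = [[0.0, 0.5, 1 / 3, 0.0],
             [0.5, 0.0, 1 / 3, 0.0],
             [0.5, 0.5, 0.0, 1.0],
             [0.0, 0.0, 1 / 3, 0.0]]
p = rwr_by_lu(A_EXAMPLE, q=0, c=0.95)
print([round(x, 6) for x in p])
```

Once the two inverse matrices are stored, the proximity of a single node u needs only row u of U^-1 and column q of L^-1, which is why their sparsity matters.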

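Taken together, observations (1), (2) and (3) mean that the arrangement of the zero elements of $\mathbf{A}$ controls the sparsity of the inverse matrices, which is what the node reordering aims to exploit.

4.3 Proximity estimation

4.3.1 Notation
The estimation uses the breadth-first search tree rooted at the query node. The query node forms layer 0, the nodes adjacent to it form layer 1, and layer i holds the nodes that are i hops away from the query node. $V$ is the set of all nodes, $V_s$ is the set of selected nodes, that is, nodes whose proximities have already been computed, and $l_u$ is the layer number of node u. $V(l_u)$ is the set of selected nodes in layer $l_u$:

\[ V(l_u) = \{ v : (v \in V_s) \wedge (l_v = l_u) \} \]

$A_{\max}$ is the largest element of $\mathbf{A}$, and $A_{\max}(u)$ is the largest element in column u of $\mathbf{A}$:

\[ A_{\max} = \max\{ A_{ij} : i, j \in V \}, \qquad A_{\max}(u) = \max\{ A_{iu} : i \in V \} \]

4.3.2 Estimation
Definition 1. For a node u that is not the query node q, the estimate $\bar{p}_u$ of the proximity $p_u$ is

\[ \bar{p}_u = \bar{c} \Bigl( \sum_{v \in V(l_u - 1)} p_v A_{\max}(v) + \sum_{v \in V(l_u)} p_v A_{\max}(v) + \bigl( 1 - \sum_{v \in V_s} p_v \bigr) A_{\max} \Bigr) \tag{8} \]

where $\bar{c} = (1-c)/(1 - A_{uu} + cA_{uu})$. If u is the query node, $\bar{p}_u = 1$.

Computed directly from Definition 1, the estimate takes O(n) time because of the sums over $V(l_u - 1)$, $V(l_u)$ and $V_s$. Section 4.3.3 shows how to reduce this from O(n) to O(1).

Lemma 1. For any node u, $\bar{p}_u \ge p_u$.

Lemma 2. For nodes u and v such that $l_u \le l_v$, $\bar{p}_u \ge \bar{p}_v$.

4.3.3 Incremental computation
The estimate of Section 4.3.2 takes O(n) time under Definition 1. It can instead be updated from the values of the node u' that was selected immediately before u. Let $\bar{p}_{u,1}$, $\bar{p}_{u,2}$ and $\bar{p}_{u,3}$ denote the first, second and third terms of Equation (8), so that $\bar{p}_u = \bar{c}\,(\bar{p}_{u,1} + \bar{p}_{u,2} + \bar{p}_{u,3})$.

Definition 2. For a node u selected immediately after node u', the three terms are

\[
\begin{aligned}
\bar{p}_{u,1} &= \begin{cases} \bar{p}_{u',1} & \text{if } l_u = l_{u'} \\ \bar{p}_{u',2} + p_{u'} A_{\max}(u') & \text{otherwise} \end{cases} \\
\bar{p}_{u,2} &= \begin{cases} \bar{p}_{u',2} + p_{u'} A_{\max}(u') & \text{if } l_u = l_{u'} \\ 0 & \text{otherwise} \end{cases} \\
\bar{p}_{u,3} &= \bigl( \bar{p}_{u',3} / A_{\max} - p_{u'} \bigr) A_{\max}
\end{aligned}
\tag{9}
\]

If u' is the query node q, then $\bar{p}_{u,1} = p_q A_{\max}(q)$, $\bar{p}_{u,2} = 0$ and $\bar{p}_{u,3} = (1 - p_q) A_{\max}$.

Lemma 3. With Definition 2, the estimate of node u is computed in O(1) time.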
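The estimate of Equation (8), its incremental update in Equation (9), and their use for pruning a top-K search can be sketched as follows. This is an illustrative reading rather than the authors' code. The names `exact_rwr`, `bfs_order`, `top_k_rwr` and `path_graph` are hypothetical, the exact proximities come from a plain power iteration as a stand-in for the lookup in L^-1 and U^-1, and the example path graph has no self-loops, so the factor c-bar is the same for every node.

```python
from collections import deque


def exact_rwr(A, q, c, iters=200):
    """Stand-in for exact proximities (the paper reads p_u from L^-1 and U^-1)."""
    n = len(A)
    p = [1.0 if u == q else 0.0 for u in range(n)]
    for _ in range(iters):
        p = [(1.0 - c) * sum(A[u][v] * p[v] for v in range(n)) + (c if u == q else 0.0)
             for u in range(n)]
    return p


def bfs_order(A, q):
    """Nodes reachable from q in breadth-first order, plus their layer numbers."""
    layer, order, queue = {q: 0}, [q], deque([q])
    while queue:
        v = queue.popleft()
        for u in range(len(A)):
            if A[u][v] > 0 and u not in layer:  # the walker can move v -> u
                layer[u] = layer[v] + 1
                order.append(u)
                queue.append(u)
    return order, layer


def top_k_rwr(A, q, K, c):
    """Return (top-K nodes, number of exact proximities that were needed)."""
    n = len(A)
    p = exact_rwr(A, q, c)
    a_max_col = [max(A[i][u] for i in range(n)) for u in range(n)]  # A_max(u)
    a_max = max(a_max_col)                                          # A_max
    order, layer = bfs_order(A, q)
    answers = [(0.0, None)] * K   # K dummy nodes with proximity 0
    theta = 0.0                   # K-th highest proximity found so far
    t1, t2, t3 = 0.0, 0.0, a_max  # the three terms of Equation (8)
    prev, computed = None, 0
    for u in order:
        if prev is not None:      # incremental update, Equation (9)
            if layer[u] == layer[prev]:
                t2 += p[prev] * a_max_col[prev]
            else:
                t1, t2 = t2 + p[prev] * a_max_col[prev], 0.0
            t3 -= p[prev] * a_max
        if u == q:
            estimate = 1.0
        else:
            c_bar = (1.0 - c) / (1.0 - A[u][u] + c * A[u][u])
            estimate = c_bar * (t1 + t2 + t3)
        if estimate < theta:      # no remaining node can enter the answer set
            break
        computed += 1             # the exact proximity p[u] is needed only here
        if p[u] > theta:
            answers.remove(min(answers, key=lambda a: a[0]))
            answers.append((p[u], u))
            theta = min(a[0] for a in answers)
        prev = u
    ranked = sorted(answers, key=lambda a: -a[0])
    return [u for _, u in ranked if u is not None], computed


def path_graph(n):
    """Column-normalized adjacency matrix of the hypothetical path 0-1-...-(n-1)."""
    adj = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]
    deg = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [[adj[i][j] / deg[j] for j in range(n)] for i in range(n)]


A_PATH = path_graph(8)
nodes, computed = top_k_rwr(A_PATH, q=0, K=2, c=0.5)
print(nodes, computed)
```

The search stops at the first node whose estimate falls below theta. This is safe because the estimate bounds the exact proximity from above and does not increase along the breadth-first visiting order.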

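Algorithm 1
Input: q, query node; K, number of answer nodes; $\mathbf{L}^{-1}$, inverse of matrix $\mathbf{L}$; $\mathbf{U}^{-1}$, inverse of matrix $\mathbf{U}$
Output: $V_a$, set of answer nodes
θ = 0;
$V_s$ = ∅;
$V_a$ = ∅;
add K dummy nodes to $V_a$;
build the breadth-first search tree rooted at q;
while $V_s \ne V$ do
    u := argmin($l_v$ | v ∈ V \ $V_s$);
    compute the estimate $\bar{p}_u$;
    if $\bar{p}_u$ < θ then
        return $V_a$;
    else
        compute $p_u$ from $\mathbf{L}^{-1}$ and $\mathbf{U}^{-1}$;
        if $p_u$ > θ then
            v := argmin($p_w$ | w ∈ $V_a$);
            remove v from $V_a$;
            add u to $V_a$;
            θ := min($p_w$ | w ∈ $V_a$);
        end if
    end if
    add u to $V_s$;
end while
return $V_a$;

With the incremental update, the estimate of the RWR proximity of each node is obtained in O(1) time.

4.4 Search algorithm
Algorithm 1 shows the top-K search. θ is the K-th highest proximity among the candidate nodes, and $V_a$ is the set of candidate answer nodes. $V_a$ is initialized with K dummy nodes whose proximities are 0, so θ starts at 0. Nodes are visited in ascending order of their layer numbers. By Lemma 1, a node whose estimate is lower than θ cannot be an answer node, and by Lemma 2 neither can any node visited after it, so the search terminates. Otherwise the exact proximity is computed. If it is higher than θ, the node replaces the lowest-proximity node in $V_a$ and θ is updated.

5. Experimental evaluation
The proposed method was compared with NB_LIN by Tong et al. [9]. NB_LIN was used as the baseline rather than the method of Sun et al. [8] or B_LIN by Tong et al. The restart probability c was set to 0.95, as in [9], [14].

Figure 2: Wall clock time [s] of Proposed(5), Proposed(25), Proposed(50), NB_LIN(100) and NB_LIN(1,000) on the Dictionary, Internet and Citation datasets.

The following three datasets were used.
Dictionary (note 1): the word network of FOLDOC (note 2), in which an edge from u to v means that term v is used to describe term u. It has 13,356 nodes and 120,238 edges.
Internet (note 3): the network taken from the Oregon Route Views Project (note 4), built from BGP tables. It has 22,963 nodes and 48,436 edges.
Citation (note 5): the network taken from the Condensed Matter E-Print Archive (note 6). It has 31,163 nodes and 120,029 edges.

Note 1: http://vlado.fmf.uni-lj.si/pub/networks/data/dic/foldoc/foldoc.zip
Note 2: http://foldoc.org/
Note 3: http://www-personal.umich.edu/~mejn/netdata/as-22july06.zip
Note 4: http://routeviews.org/
Note 5: http://www-personal.umich.edu/~mejn/netdata/cond-mat-2003.zip
Note 6: http://arxiv.org/archive/cond-mat

The experiments were run on a Linux server with an Intel Xeon Quad-Core 3.33GHz CPU and 32GB of memory. The programs were compiled with GCC.

5.1 Search time
Figure 2 compares the search time of the proposed method with that of NB_LIN. Proposed(K) denotes the proposed method with K answer nodes. NB_LIN(100) and NB_LIN(1,000) denote NB_LIN with a target rank of 100 and 1,000, respectively.

5.2 Exactness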

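Figure 3: Precision versus target rank of SVD (100, 400, 700, 1000) for NB_LIN and Proposed.

Figure 4: Wall clock time [s] versus target rank of SVD (100, 400, 700, 1000) for NB_LIN and Proposed.

Figure 5: Number of non-zero elements for the Degree, Cluster, Hybrid and Random reorderings on Dictionary, Internet and Citation.

Figure 6: Wall clock time [s] of Proposed and Without pruning on Dictionary, Internet and Citation.

Figures 3 and 4 show the precision and the wall clock time of NB_LIN for different target ranks on the Dictionary dataset. In Figure 3 the precision of the proposed method is 1, because it computes exact proximities, while the precision of NB_LIN depends on the target rank. Figure 4 shows the corresponding search time of NB_LIN against that of the proposed method.

5.3 Effect of each technique

5.3.1 Reordering
Figure 5 shows the number of non-zero elements in the inverse matrices for the Degree, Clustering and Hybrid reorderings and for a Random ordering. The result is discussed in relation to O(m), the number of edges.

5.3.2 Pruning
Figure 6 evaluates the estimation of Section 4.3 by comparing the proposed method with a variant that omits it, labeled Without pruning. The reported difference is up to a factor of 1,020.

5.4 Case study
Table 2 lists the top five nodes returned by the proposed method and by NB_LIN for three operating-system terms in the Dictionary dataset. The target rank of NB_LIN was set to 1,000. For Microsoft Windows, the proposed method returns Microsoft Windows, W2K, Windows/386, Windows 3.0 and Windows 3.11, all of which are Microsoft operating systems. For Mac OS, its results include Macintosh user interface and Macintosh file system, which relate to Apple's operating system and its GUI. For Linux, both methods return Linux and Linux Documentation Project, but the remaining results of NB_LIN differ from those of the proposed method.

6. Conclusions
This paper addressed efficient top-k search for RWR.

References
[1] Y. Koren, S. C. North and C. Volinsky: Measuring and extracting proximity in networks, KDD, pp. 245-255 (2006).
[2] H. Tong, C. Faloutsos and Y. Koren: Fast direction-aware proximity for graph mining, KDD, pp. 747-756 (2007).
[3] D. Lizorkin, P. Velikhov, M. N. Grinev and D. Turdakov: Accuracy estimate and optimization techniques for SimRank computation, PVLDB, 1, 1, pp. 422-433 (2008).
[4] H. Tong and C. Faloutsos: Center-piece subgraphs: problem definition and fast solutions, KDD, pp. 404-413 (2006).
[5] J.-Y. Pan, H.-J. Yang, C. Faloutsos and P. Duygulu: Automatic multimedia cross-modal correlation discovery, KDD, pp. 653-658 (2004).
[6] I. Konstas, V. Stathopoulos and J. M. Jose: On social networks and collaborative recommendation, SIGIR, pp. 195-202 (2009).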

[7] D. Liben-Nowell and J. M. Kleinberg: The link prediction problem for social networks, CIKM, pp. 556-559 (2003).
[8] J. Sun, H. Qu, D. Chakrabarti and C. Faloutsos: Neighborhood formation and anomaly detection in bipartite graphs, ICDM, pp. 418-425 (2005).
[9] H. Tong, C. Faloutsos and J.-Y. Pan: Fast random walk with restart and its applications, ICDM, pp. 613-622 (2006).
[10] J. L. Herlocker, J. A. Konstan, A. Borchers and J. Riedl: An algorithmic framework for performing collaborative filtering, SIGIR, pp. 230-237 (1999).
[11] T. H. Cormen, C. E. Leiserson, R. L. Rivest and C. Stein: Introduction to Algorithms, The MIT Press (2009).
[12] W. H. Press, S. A. Teukolsky, W. T. Vetterling and B. P. Flannery: Numerical Recipes 3rd Edition, Cambridge University Press (2007).
[13] A. Clauset, M. E. J. Newman and C. Moore: Finding community structure in very large networks, Physical Review E, pp. 1-6 (2004).
[14] J. He, M. Li, H. Zhang, H. Tong and C. Zhang: Manifold-ranking based image retrieval, ACM Multimedia, pp. 9-16 (2004).

Table 2: Top five nodes returned by the proposed method and NB_LIN for Microsoft Windows, Mac OS and Linux.

Query | Method | 1 | 2 | 3 | 4 | 5
Microsoft Windows | Proposed | Microsoft Windows | W2K | Windows/386 | Windows 3.0 | Windows 3.11
Microsoft Windows | NB_LIN | Microsoft Windows | Microsoft Networking | Microsoft Network | W2K | Thumb
Mac OS | Proposed | Mac OS | Macintosh user interface | Macintosh file system | multitasking | Macintosh Operating System
Mac OS | NB_LIN | Mac OS | Rhapsody | SORCERER | Macintosh Operating System | PowerOpen Association
Linux | Proposed | Linux | Linux Documentation Project | Unix | lint | Linux Network Administrators Guide
Linux | NB_LIN | Linux | Linux Documentation Project | SL5 | debianize | SLANG