Web. Web p OutDegree(p) log 7 1/OutDegree(p) A New Difinition of Subjective Distance between Web Pages

Σχετικά έγγραφα


ΠΑΝΕΠΙΣΤΗΜΙΟ ΠΑΤΡΩΝ ΠΟΛΥΤΕΧΝΙΚΗ ΣΧΟΛΗ ΤΜΗΜΑ ΜΗΧΑΝΙΚΩΝ Η/Υ & ΠΛΗΡΟΦΟΡΙΚΗΣ. του Γεράσιμου Τουλιάτου ΑΜ: 697

Αναζήτηση στο ιαδίκτυο

Εισαγωγή στην ανάλυση συνδέσμων

Newman Modularity Newman [4], [5] Newman Q Q Q greedy algorithm[6] Newman Newman Q 1 Tabu Search[7] Newman Newman Newman Q Newman 1 2 Newman 3

User Behavior Analysis for a Large2scale Search Engine

Automatic extraction of bibliography with machine learning

IPSJ SIG Technical Report Vol.2014-CE-127 No /12/6 CS Activity 1,a) CS Computer Science Activity Activity Actvity Activity Dining Eight-He

A Method for Creating Shortcut Links by Considering Popularity of Contents in Structured P2P Networks

[4] 1.2 [5] Bayesian Approach min-max min-max [6] UCB(Upper Confidence Bound ) UCT [7] [1] ( ) Amazons[8] Lines of Action(LOA)[4] Winands [4] 1

SocialDict. A reading support tool with prediction capability and its extension to readability measurement

ST5224: Advanced Statistical Theory II

Kenta OKU and Fumio HATTORI

n 1 n 3 choice node (shelf) choice node (rough group) choice node (representative candidate)

Anomaly Detection with Neighborhood Preservation Principle

Ανάκτηση Πληροφορίας

Ερευνητική+Ομάδα+Τεχνολογιών+ Διαδικτύου+

Vol. 31,No JOURNAL OF CHINA UNIVERSITY OF SCIENCE AND TECHNOLOGY Feb

An Automatic Modulation Classifier using a Frequency Discriminator for Intelligent Software Defined Radio


GPU. CUDA GPU GeForce GTX 580 GPU 2.67GHz Intel Core 2 Duo CPU E7300 CUDA. Parallelizing the Number Partitioning Problem for GPUs

Περιεχόμενα. Εξόρυξη γνώσης από δεδομένα στον Παγκόσμιο Ιστό

Web DEIM Forum 2009 A7-1. Web. Web. Web. Web. 4 Wikipedia. Wikipedia. Web.

Πανεπιστήμιο Κρήτης, Τμήμα Επιστήμης Υπολογιστών Άνοιξη HΥ463 - Συστήματα Ανάκτησης Πληροφοριών Information Retrieval (IR) Systems

Web 論 文. Performance Evaluation and Renewal of Department s Official Web Site. Akira TAKAHASHI and Kenji KAMIMURA

Ψηφιακή ανάπτυξη. Course Unit #1 : Κατανοώντας τις βασικές σύγχρονες ψηφιακές αρχές Thematic Unit #1 : Τεχνολογίες Web και CMS

{takasu, Conditional Random Field

Πανεπιστήµιο Πειραιώς Τµήµα Πληροφορικής

Wiki. Wiki. Analysis of user activity of closed Wiki used by small groups

Maxima SCORM. Algebraic Manipulations and Visualizing Graphs in SCORM contents by Maxima and Mashup Approach. Jia Yunpeng, 1 Takayuki Nagai, 2, 1


4.6 Autoregressive Moving Average Model ARMA(1,1)

I. Μητρώο Εξωτερικών Μελών της ημεδαπής για το γνωστικό αντικείμενο «Μη Γραμμικές Ελλειπτικές Διαφορικές Εξισώσεις»

Reading Order Detection for Text Layout Excluded by Image

Schedulability Analysis Algorithm for Timing Constraint Workflow Models

ΑΝΙΧΝΕΥΣΗ ΓΕΓΟΝΟΤΩΝ ΒΗΜΑΤΙΣΜΟΥ ΜΕ ΧΡΗΣΗ ΕΠΙΤΑΧΥΝΣΙΟΜΕΤΡΩΝ ΔΙΠΛΩΜΑΤΙΚΗ ΕΡΓΑΣΙΑ

Research on Economics and Management

Homomorphism in Intuitionistic Fuzzy Automata

Liner Shipping Hub Network Design in a Competitive Environment

Probabilistic Approach to Robust Optimization

ΤΕΧΝΟΛΟΓΙΚΟ ΕΚΠΑΙΔΕΥΤΙΚΟ ΙΔΡΥΜΑ ΚΡΗΤΗΣ. Σχολή Τεχνολογικών Εφαρμογών Τμήμα Εφαρμοσμένης Πληροφορικής & Πολυμέσων

ΔΙΠΛΩΜΑΤΙΚΗ ΕΡΓΑΣΙΑ. «Προστασία ηλεκτροδίων γείωσης από τη διάβρωση»

Other Test Constructions: Likelihood Ratio & Bayes Tests

Toward a SPARQL Query Execution Mechanism using Dynamic Mapping Adaptation -A Preliminary Report- Takuya Adachi 1 Naoki Fukuta 2.

Potential Dividers. 46 minutes. 46 marks. Page 1 of 11

Shortness Ambiguity TEAM Ungrammaticality

ΠΑΝΕΠΙΣΤΗΜΙΟ ΜΑΚΕΔΟΝΙΑΣ

Ανάκτηση Πληροφορίας

Ανάκτηση Πληροφορίας

3: A convolution-pooling layer in PS-CNN 1: Partially Shared Deep Neural Network 2.2 Partially Shared Convolutional Neural Network 2: A hidden layer o

Retrieval of Seismic Data Recorded on Open-reel-type Magnetic Tapes (MT) by Using Existing Devices

Μηχανισμοί πρόβλεψης προσήμων σε προσημασμένα μοντέλα κοινωνικών δικτύων ΔΙΠΛΩΜΑΤΙΚΗ ΕΡΓΑΣΙΑ

MIDI [8] MIDI. [9] Hsu [1], [2] [10] Salamon [11] [5] Song [6] Sony, Minato, Tokyo , Japan a) b)

: Monte Carlo EM 313, Louis (1982) EM, EM Newton-Raphson, /. EM, 2 Monte Carlo EM Newton-Raphson, Monte Carlo EM, Monte Carlo EM, /. 3, Monte Carlo EM

Δημιουργία μιας επιτυχημένης παρουσίας στο διαδίκτυο

Ψηφιακή ανάπτυξη. Course Unit #1 : Κατανοώντας τις βασικές σύγχρονες ψηφιακές αρχές Thematic Unit #1 : Τεχνολογίες Web και CMS

Buried Markov Model Pairwise

ES440/ES911: CFD. Chapter 5. Solution of Linear Equation Systems

Resurvey of Possible Seismic Fissures in the Old-Edo River in Tokyo

HOSVD. Higher Order Data Classification Method with Autocorrelation Matrix Correcting on HOSVD. Junichi MORIGAKI and Kaoru KATAYAMA

Re-Pair n. Re-Pair. Re-Pair. Re-Pair. Re-Pair. (Re-Merge) Re-Merge. Sekine [4, 5, 8] (highly repetitive text) [2] Re-Pair. Blocked-Repair-VF [7]

Fourier transform, STFT 5. Continuous wavelet transform, CWT STFT STFT STFT STFT [1] CWT CWT CWT STFT [2 5] CWT STFT STFT CWT CWT. Griffin [8] CWT CWT

Exhaustive Topic Detection and Query Expansion Support Based on Substance-Oriented Term Clustering

ΚΒΑΝΤΙΚΟΙ ΥΠΟΛΟΓΙΣΤΕΣ

Approximation of distance between locations on earth given by latitude and longitude

ΔΘΝΙΚΗ ΥΟΛΗ ΓΗΜΟΙΑ ΓΙΟΙΚΗΗ ΙΗ ΔΚΠΑΙΓΔΤΣΙΚΗ ΔΙΡΑ

ΠΟΛΥΤΕΧΝΕΙΟ ΚΡΗΤΗΣ ΣΧΟΛΗ ΜΗΧΑΝΙΚΩΝ ΠΕΡΙΒΑΛΛΟΝΤΟΣ

2016 IEEE/ACM International Conference on Mobile Software Engineering and Systems

ΠΑΝΕΠΙΣΤΗΜΙΟ ΠΕΙΡΑΙΩΣ ΤΜΗΜΑ ΝΑΥΤΙΛΙΑΚΩΝ ΣΠΟΥΔΩΝ ΠΡΟΓΡΑΜΜΑ ΜΕΤΑΠΤΥΧΙΑΚΩΝ ΣΠΟΥΔΩΝ ΣΤΗ ΝΑΥΤΙΛΙΑ ΑΤΥΧΗΜΑΤΑ ΚΑΙ ΑΣΦΑΛΕΙΑ ΣΤΗ ΝΑΥΤΙΛΙΑ.

ΓΕΩΜΕΣΡΙΚΗ ΣΕΚΜΗΡΙΩΗ ΣΟΤ ΙΕΡΟΤ ΝΑΟΤ ΣΟΤ ΣΙΜΙΟΤ ΣΑΤΡΟΤ ΣΟ ΠΕΛΕΝΔΡΙ ΣΗ ΚΤΠΡΟΤ ΜΕ ΕΦΑΡΜΟΓΗ ΑΤΣΟΜΑΣΟΠΟΙΗΜΕΝΟΤ ΤΣΗΜΑΣΟ ΨΗΦΙΑΚΗ ΦΩΣΟΓΡΑΜΜΕΣΡΙΑ

Development of the Nursing Program for Rehabilitation of Woman Diagnosed with Breast Cancer

ER-Tree (Extended R*-Tree)

EM Baum-Welch. Step by Step the Baum-Welch Algorithm and its Application 2. HMM Baum-Welch. Baum-Welch. Baum-Welch Baum-Welch.

SCITECH Volume 13, Issue 2 RESEARCH ORGANISATION Published online: March 29, 2018

Αλγεβρικές Υπερομάδες και Διαδρομές Ελαχίστου Μήκους σε Γραφήματα

MDSR. Proposition and Evaluation of MDSR Method for Core Analysis of Multiple Directed Graphs

Εκτεταμένη περίληψη Περίληψη

ΟΙ Υ ΡΟΓΕΩΛΟΓΙΚΕΣ ΣΥΝΘΗΚΕΣ ΣΤΗΝ ΛΕΚΑΝΗ ΠΟΤΑΜΙΑΣ ΚΑΙ Η ΑΛΛΗΛΟΕΠΙ ΡΑΣΗ ΤΟΥ Υ ΑΤΙΚΟΥ ΚΑΘΕΣΤΩΤΟΣ ΜΕ ΤΗ ΜΕΛΛΟΝΤΙΚΗ ΛΙΓΝΙΤΙΚΗ ΕΚΜΕΤΑΛΛΕΥΣΗ ΣΤΗΝ ΕΛΑΣΣΟΝΑ

ΤΕΧΝΟΛΟΓΙΚΟ ΕΚΠΑΙΔΕΥΤΙΚΟ ΙΔΡΥΜΑ ΚΡΗΤΗΣ ΣΧΟΛΗ ΔΙΟΙΚΗΣΗΣ ΚΑΙ ΟΙΚΟΝΟΜΙΑΣ (ΣΔΟ) ΤΜΗΜΑ ΛΟΓΙΣΤΙΚΗΣ ΚΑΙ ΧΡΗΜΑΤΟΟΙΚΟΝΟΜΙΚΗΣ

TMA4115 Matematikk 3

Topic Structure Mining based on Wikipedia and Web Search

CHAPTER 25 SOLVING EQUATIONS BY ITERATIVE METHODS

τεχνική περιγραφή e-situ

CHAPTER 101 FOURIER SERIES FOR PERIODIC FUNCTIONS OF PERIOD

ΕΙΣΑΓΩΓΗ ΣΤΗ ΣΤΑΤΙΣΤΙΚΗ ΑΝΑΛΥΣΗ

ADVANCED STRUCTURAL MECHANICS

ΤΟ ΜΟΝΤΕΛΟ Οι Υποθέσεις Η Απλή Περίπτωση για λi = μi 25 = Η Γενική Περίπτωση για λi μi..35

Distances in Sierpiński Triangle Graphs

Efficient Top-k Search for Random Walk with Restart

1. Εισαγωγή. Περιγραφή Μαθήματος. Ιστορική Αναδρομή. Ορισμοί Ηλεκτρονικού Εμπορίου

Simplex Crossover for Real-coded Genetic Algolithms

Μελέτη Περίπτωσης: Random Surfer

Διδάσκων: Κωνσταντίνος Κώστα Διαφάνειες: Δημήτρης Ζεϊναλιπούρ

GPGPU. Grover. On Large Scale Simulation of Grover s Algorithm by Using GPGPU

Bayesian modeling of inseparable space-time variation in disease risk

Study of urban housing development projects: The general planning of Alexandria City

Phys460.nb Solution for the t-dependent Schrodinger s equation How did we find the solution? (not required)

Δομές Δεδομένων & Αλγόριθμοι

ΕΘΝΙΚΟ ΜΕΤΣΟΒΙΟ ΠΟΛΥΤΕΧΝΕΙΟ

Transcript:

Vol. 44 No. 1 Jan. 2003 Web 1 2, 3 4 Web p OutDegree(p) log 7 1/OutDegree(p) A New Difinition of Subjective Distance between Web Pages Yutaka Matsuo, 1 Yukio Ohsawa 2, 3 and Mitsuru Ishizuka 4 The pages and hyperlinks of the World Wide Web may be viewed as nodes and edges in a directed graph. In this paper, we propose a new definition of the distance between two pages, called average-clicks. It is based on the probability to click a link through random surfing. We compare the average-clicks measure to the classical measure of clicks between two pages, and show the average-clicks fits better to the users intuitions of distance. 1. World Wide Web Web 2) Web Google PageRank 5) hub authority Kleinberg 10) Web 12) 2 1 National Institute of Advanced Industrial Science and Technology 2 Japan Science and Technology Corporation 3 The University of Tsukuba 4 The University of Tokyo http://www.google.com/ 14) 8) 1 7) 1) Web 19 outlink outlink 11) outlink 1

2 Jan. 2003 (clicks) (average-clicks) Web 2 3 4 2. 2.1 p OutDegree(p) URL OutDegree(p) log(outdegree(p)) (1) 1 1 a 1 2 a 2 1 2 log a =loga 2 18) (1) 2.1 () p log n (OutDegree(p)) (2) OutDegree(p) html 7 3) log n 7 2.2 () n =7 (2) p q 2.3 () p q p q p q 1 A 3 log 7 (1/3) 0.56 B 7 1 A C 1.56 A D 2 F 1 0 Yahoo! 1 4 15) 7 e 2 10 7

Vol. 44 No. 1 Web 3 1 -log7 -=0.56 3 A B 1 -log7 -=1.0 7 1.0 C From A to C, 1.56 average-click, 2 clicks. b c a e a b c e 0.56 E 0.36 F 0 D From A to D, 0.92 average-click, 2 clicks. d d 1 2 180 Yahoo! Yahoo! 1 1 2.2 2.1 ( r ) a r (2 7 r 1) p 1 2.1 a r a a ( 2) r a 1/7 r 7 r 2 2 7 r 1 2 97 function Search Shortest Path (s, t, d thre ) α 1.0, n 7. list Add List(s, empty), d(s) 0. p s while p t Fetch page p and extract links which points to page p k (k =1,...,n p) for k 1 to n p d(p k ) d(p) log n(α/n p) if d(p k ) >d thre then next list Add List(p k,list) end if list is empty return failure p Choose Minimal(list, d) end return d(t) Add List(a, list) list a. Choose Minimal(list, d) d(a) a list. d thre. 3 3 685 4 4801 query focused crawling 7) 3. Web

4 Jan. 2003 1. To URL / 1.62 / 2 http://www.miv.t.u-tokyo.ac.jp/ matumura/ 1.62 http://www.miv.t.u-tokyo.ac.jp/jaico/ 1.13 Yahoo! 3.02 / 3 http://www.yahoo.co.jp/ 3.02 http://www.geocities.co.jp/.../6353/whatsnew.html 2.67 http://www.geocities.co.jp/athlete-athene/6353/ 1.13 http://www.miv.t.u-tokyo.ac.jp/ matsuo 0.0 5.15 / 4 http://www.ipsj.or.jp/ 5.15 http://wwwsoc.nacsis.ac.jp/jsai/links.html 2.75 http://wwwsoc.../jsai/whatsai/navigation.html 1.75 http://wwwsoc.nacsis.ac.jp/jsai/whatsai/ 1.18 http://www.miv.t.u-tokyo.ac.jp/ matsuo/ 0.0 IJCAI 5.39 / 5 http://ijcai.org/ 5.39 http://w3.sys.es.osaka-u.ac.jp/~osawa/ailinks.html 3.33 http://www.gssm.otsuka.tsukuba.ac.jp/staff/osawa 1.97 http://www.miv.../~matumura/research.html 1.62 http://www.miv.t.u-tokyo.ac.jp/jaico/ 1.13 J 4.03 / 3 http://www.j-league.or.jp/index.html 4.03 http://www.miv.t.u-tokyo.ac.jp/~tomobe/ 2.50 http://www.miv.../member/present-mem.htm 1.13 9.08 / 5 http://www.wnn.or.jp/wnn-t/ 9.08 http://www.ntt.co.jp/square/www-in-jp.html 6.07 http://www.ic.u-tokyo.ac.jp/index.html 4.30 http://www.u-tokyo.ac.jp/ 2.67 http://www.geocities.co.jp/athlete-athene/6353/ 1.13 3.1 2 s t 3 1 2 1 2 1 16) 2 http://www.miv.t.u-tokyo.ac.jp/ matsuo/ Yahoo! IJCAI J 3.2 30 URL URL 5 1 5 URL 1 4, 5 13 2 3 7) 4 3.3 5 3 13 6 10 3 4 7) 1 4

Vol. 44 No. 1 Web 5 2 4 1 0.524 0.623 0.836 2 0.325 0.435 0.804 3 0.696 0.599 0.715 4 0.517 0.626 0.699 5 0.471 0.483 0.685 6 0.641 0.662 0.674 7 0.268 0.502 0.572 8 0.490 0.482 0.569 9 0.499 0.517 0.528 10 0.435 0.568 0.512 11 0.358 0.477 0.436 12 0.116 0.108 0.400 13 0.302 0.358 0.367 0.434 0.495 0.600 3-0.320-0.359-0.404 0.279 4.37 0.462-0.174-0.269-0.174 5??? 3 4. 5) p OutDegree(p) α/outdegree(p) 1 α ( URL Web α 0 1 Google PageRank Web 17)

6 Jan. 2003 log n (α/outdegree(p)) α Web PageRank α =0.85 α =1 α <1 1 0 α<1 9) 4) 19) 6) href 1 Web 3) Web Web 5. 13) 1) Adamic, L. A.: The Small World Web, Proc. ECDL 99, pp. 443 452 (1999). 2) Adamic, L. A. and Adar, E.: Friends and Neighbors on the Web (to appear). URL://www.hpl.hp.com/ shl/papers/web10/fnn.pdf. 3) Bharat, K. and Broder, A.: A technique for measuring the relative size and overlap of public Web search engines, Proc. 7th WWW Conf. (1998). 4) Botafogo, R. A., Rivlin, E. and Schneiderman, B.: Structural Analysis of Hypertexts: Identifying Hierarchies and Usefule Metrics, ACM Transaction on Information Systems, Vol. 10, No. 2 (1992). 5) Brin, S. and Page, L.: The Anatomy of a Large-Scale Hypertextual Web Search Engine, Proc. 7th WWW Conf. (1998).

Vol. 44 No. 1 Web 7 6) Chakrabarti, S., Dom, B., Raghavan, P., Rajagopalan, S., Gibson, D. and Kleinberg, J.: Automatic resource compilation by analyzing hyperlink structure and associated text, Proc. 7th WWW Conf. (1998). 7) Chakrabarti, S., van den Berg, M. and Dom, B.: Focused crawling: a new approach to topic-specific Web resource discovery, Proc. 8th WWW Conf. (1999). 8) Flake, G. W., Lawrence, S. and Giles, C. L.: Efficient Identification of Web Communities, Proc. ACM SIGKDD-2000, pp. 150 160 (2000). 9) :,, chapter 6 (1996). 10) Kleinberg, J.: The Small-World Phenomenon: An Algorithmic Perspective, Technical Report TR 99-1776, Cornell University (1999). 11) Kleinberg, J. M., Kumar, R., Raghavan, P., Rajagopalan, S. and Tomkins, A. S.: The Web as a graph: measurements, models, and methods, Proc. International Conf. on Combinatorics and Computing (1999). 12) Kumar, S. R., Raghavan, P., Rajagopalan, S. and Tokins, A.: Trawling the web for emerging cyber communities, Proc. 8th WWW Conf. (1999). 13) Lawrence, S.: Context in Web Search, IEEE Data Engineering Bulletin, Vol. 23, No. 3, pp. 25 32 (2000). 14) : Web,, Vol. 16, No. 3, pp. 316 323 (2001). 15) Murray, B. and Moore, A.: Sizing the Internet, White paper, Cyveillance, Inc. (2000). (http://www.cyveillance.com). 16)Najork,M.andWiener,J.L.:Breadth-first search crawling yields high-quality pages, Proc. 10th WWW Conf. (2001). 17) Page, L., Brin, S., Motwani, R. and Winograd, T.: The PageRank Citation Ranking: Bringing order to the Web, Technical report, Stanford University (1998). 18) Shannon, C. E.: A mathematical theory of communication, Bell System Technical Journal, Vol. 27, pp. 379 423 and 623 656 (1948). 19) Spertus, E.: ParaSite: Mining Structural Information on the Web, Proc. 6th WWW Conf. (1997). ( 2002 3 20 ) ( 2002 11 6 ) 1997 2002 2002 AAAI 1990 1995 AAAI IEEE 1994 1998 1998 1971 1976 NTT 1978 1992 2001 IEEE AAAI