DEIM Forum 2012 D2-1

606-8501
150-0002 2-15-1 28F
E-mail: {tsukuda,ohshima,tanaka}@dl.kuis.kyoto-u.ac.jp, {miyamamoto,hiwasaki}@d-itlab.co.jp

1 Wikipedia Wikipedia HITS

1.
Web Web Web 1 3 Wikipedia 2 Web 2 1 2 Web Wikipedia ( ) ( ) Web ( ) Web

2.
Web
[1] Wikipedia Wikipedia [2] SNS Liu [3] 2 Web Web Web Liu Web Web [4] [5] [6] [7] [8] Web [9] Sarwar [10] Kamahara [11] [12] 3. Web 1 1 4 2 3 2
2 5 3 6 4 4 5 6 $t$ $e$ $t$ $e$ $Rel(t, e)$ $e$ $Cog(e)$ $f$:

$$Unexp(t, e) = f(Rel(t, e), Cog(e)) \quad (1)$$

$t$ $e$

4.
1 $t$ $t$ $E_t = \{e_1, e_2, \ldots, e_n\}$
2 $t$
3 $t$ $Rel(t, e_i)$
4 $Cog(e_i)$
5 $Unexp(t, e_i)$

4. 1
$t$ $t$ Web $t$ $t$ Web QA $t$
$t$ Wikipedia Wikipedia 2 1 $t$ Web Wikipedia 2 $t$ QA Web Wikipedia Wikipedia Wikipedia Wikipedia Wikipedia Wikipedia

4. 2
3 ALAGIN 1 Wikipedia 45 45 1

1 http://nlpwww.nict.go.jp/corpus/

4. 3
$t$ $e_i$ $E_t$ $t$ $e_i$ $e_i$ $t$ $e_i$ $t$ $e$ $e$ 2 2 3 2 2 HITS [13] 2

4. 3. 1
$q$ $t$ $hyper(t)$ $t$ $hypo(t)$ $t$ $rel(t)$

$Q = \{q\}$
$H_q = \{x \mid x \in hyper(q)\}$
$C_q = \{x \mid x \in hypo(y),\ y \in H_q\}$
$L_q = \{x \mid x \in rel(q)\}$
$H_{lq} = \{x \mid x \in hyper(y),\ y \in L_q\}$
$L_c = \{x \mid x \in rel(y),\ y \in C_q,\ x \notin L_q\}$

2 $(n_1, n_2)$ $n_1$ $n_2$

$(q, x)$ where $x \in H_q$
$(x, y)$ where $x \in H_q$, $y \in C_q$, and $y = hypo(x)$
$(x, y)$ where $x \in C_q$, $y \in L_c$, and $y = rel(x)$
$(x, y)$ where $x \in C_q$, $y \in L_q$, and $y = rel(x)$
$(x, y)$ where $x \in L_c$, $y \in H_{lq}$, and $y = hyper(x)$
$(x, y)$ where $x \in H_{lq}$, $y \in L_q$, and $x = hyper(y)$

$q$ $x$ $L_q$

4. 3. 2
$q$ bipartite graph $G_1 = (H_q \cup T, E_1)$, $T = Q \cup C_q$. $E_1$ $H_q$ $T$ $h_i \in H_q$ $t_j \in T$ $h_i$ $t_j$ 2 HITS $C_q$ $q$ $h_i$ $x_i$ $t_j$ $y_j$ $x_i$ $y_j$

$$x_i = \sum_{t_j \in T} w^{th}_{ji} y_j \quad (2)$$
$$y_j = \sum_{h_i \in H_q} w^{ht}_{ij} x_i \quad (3)$$

$w^{th}_{ji}$ $w^{ht}_{ij}$ $w^{th}_{ji}$ $t_j$ $h_i$ HITS 1 2 HITS SALSA [14] SALSA $h_i$

$$w^{ht}_{ij} = \frac{1}{|hypo(h_i)|}$$

4. 3. 3
1 $C_q$ $L_q$ $L_c$ bipartite graph $G_2 = (C_q \cup L, E_2)$, $L = L_q \cup L_c$. $E_2$ $C_q$ $L$ Wikipedia $c_i \in C_q$ $l_j \in L$ $c_i$ $l_j$ 2 HITS $c_i \in C_q$ SALSA 4.3.2 $C_q$ Co-HITS [15] $x^0_i$ $c_i$ $y^0_j$ $l_j$ $c_i$ $x_i$ $l_j$ $y_j$ $x_i$ $y_j$

$$x_i = (1 - \lambda_c) x^0_i + \lambda_c \sum_{l_j \in L} w^{lc}_{ji} y_j \quad (4)$$

$$y_j = (1 - \lambda_l) y^0_j + \lambda_l \sum_{c_i \in C_q} w^{cl}_{ij} x_i \quad (5)$$

$\lambda_c \in [0, 1]$ $\lambda_l \in [0, 1]$ $x^0_i$ $y^0_j$ $x^0_i$ $y^0_j$

4. 3. 4
2 $L_q$ $L_c$ $H_{lc}$ bipartite graph $G_3 = (L \cup H_{lc}, E_3)$. $E_3$ $L$ $H_{lc}$ $l_i \in L$ $h_j \in H_{lc}$ $l_i$ $h_j$ 2 4.3.3 Co-HITS SALSA 3 $e_i$ $t$ $e_i$ $Rel(t, e_i)$

4. 4
Wikipedia PageRank [16] Yahoo! API 2 Web $e_i$ Web $Cog(e_i)$

4. 5
$Unexp(t, e_i)$

$$Unexp(t, e_i) = f(Rel(t, e_i), Cog(e_i)) = \frac{1}{\log_{10} Cog(e_i)} \cdot Rel(t, e_i) \quad (6)$$

5.
Web Yahoo! API Web $k$ LexRank [17] LexRank MeCab 3 tf-idf $p$ LexRank $p$

$$p = [dU + (1 - d)B]^T p$$

$U$ $1/k$ $B$ $k$ $B_{ij}$, $i$ $j$ 1 $k$ $d$ damping factor

2 http://developer.yahoo.co.jp/webapi/search/websearch/v1/websearch.html
3 http://mecab.sourceforge.net/
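The LexRank fixed point $p = [dU + (1 - d)B]^T p$ of Section 5 can be approximated by power iteration. The following is only a minimal sketch under stated assumptions, not the implementation used in this paper: the function name `lexrank_scores`, the row normalization of the similarity matrix into $B$, the convergence tolerance, and the toy similarity values are all hypothetical, and the damping factor $d$ is left as a parameter.

```python
from typing import List


def lexrank_scores(sim: List[List[float]], d: float = 0.15,
                   tol: float = 1e-8, max_iter: int = 1000) -> List[float]:
    """Power iteration for p = [dU + (1 - d)B]^T p.

    sim is a k x k non-negative similarity matrix (hypothetical input).
    B is assumed to be sim with every row normalized to sum to 1, and
    U is the k x k matrix whose entries are all 1/k.
    """
    k = len(sim)
    if k == 0:
        return []
    # Row-normalize the similarity matrix to obtain B (assumption).
    b = []
    for row in sim:
        total = sum(row)
        # A row without any similarity mass falls back to a uniform row.
        b.append([v / total for v in row] if total > 0 else [1.0 / k] * k)
    p = [1.0 / k] * k  # start from the uniform distribution
    for _ in range(max_iter):
        # (M^T p)_j = sum_i M_ij * p_i with M = dU + (1 - d)B.
        # Because sum(p) == 1, the U term adds d / k to every entry.
        new_p = [d / k + (1.0 - d) * sum(b[i][j] * p[i] for i in range(k))
                 for j in range(k)]
        converged = max(abs(a - c) for a, c in zip(new_p, p)) < tol
        p = new_p
        if converged:
            break
    return p


# Toy example: three items, the middle one is similar to both others.
toy_sim = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]
scores = lexrank_scores(toy_sim)
```

In this toy input the middle item receives the largest score because it is similar to both of the others, and the scores sum to 1 because $dU + (1 - d)B$ is row-stochastic under the normalization assumed here.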
[Table 1: HITS, SALSA; ranks 1-15]

$d = 0.15$

LexRank MMR [18] MMR

$$MMR = \arg\max_{d_i \in D \setminus S} \left[ \lambda \, Score(d_i) - (1 - \lambda) \max_{d_j \in S} Sim(d_i, d_j) \right]$$

Here $D$ is the candidate set, $Score(d_i)$ is the LexRank score of $d_i$, $S$ is the subset of $D$ that has already been selected, $Sim(d_i, d_j)$ is the similarity between $d_i$ and $d_j$, and $\lambda \in [0, 1]$; $\lambda = 0.5$. MMR $r$ Web 1

6.
6. 1
4.3.2 HITS 1 32 9843 HITS 10 22 23 2 1 1 1 1 1 1 6 7 8 9 10 3 2 1 2 3 4 5 6 7 8 9 10 100 4.3.3 4.3.4 2 3 433 2 5 2 3 3 4.4 4.5 4

6. 2
Web 4 3 5 Web 1 Web 100 5 3
4 1 2 3 4 5 6 7 8 9 10 5 6 5 2 5 $k = 100$ $r = 5$ 1 1 3 2 Web 2 26 Wikipedia Wikipedia Web

5 54 () W 1889 20 1906 29 H5 6 6 2 8 15 1 36

7.
Web COE.

[1] Y. Noda, Y. Kiyota and H. Nakagawa: Proc. of 4th Int'l AAAI Conference on Weblogs and Social Media, ICWSM '10.
[2] ,,,,., 2007, 65, pp. 265-270 (2007).
[3] B. Liu, Y. Ma and P. S. Yu: Discovering unexpected information from your competitors' web sites, Proc. of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '01, pp. 144-153 (2001).
[4] B. Padmanabhan and A. Tuzhilin: Unexpectedness as a measure of interestingness in knowledge discovery, Decis. Support Syst., 27, pp. 303-318 (1999).
[5] B. Liu and W. Hsu: Post-analysis of learned rules, Proc. of the thirteenth national conference on Artificial intelligence - Volume 1, AAAI '96, pp. 828-834 (1996).
[6] A. Tuzhilin: On subjective measures of interestingness in knowledge discovery, Proc. of the First International Conference on Knowledge Discovery and Data Mining, pp. 275-281 (1995).
[7] B. Padmanabhan and A. Tuzhilin: Small is beautiful: discovering the minimal set of unexpected patterns, Proc. of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '00, pp. 54-63 (2000).
[8] B. Padmanabhan and A. Tuzhilin: A belief-driven method for discovering unexpected patterns, KDD, pp. 94-100 (1998).
[9] K. Swearingen and R. Sinha: Beyond algorithms: An HCI perspective on recommender systems, Proc. of the 24th international ACM SIGIR conference on Research and development in information retrieval, SIGIR '01, pp. 393-408 (2001).
[10] B. Sarwar, G. Karypis, J. Konstan and J. Reidl: Item-based collaborative filtering recommendation algorithms, Proc. of the 10th international conference on World Wide Web, WWW '01, pp. 285-295 (2001).
[11] J. Kamahara, T. Asakawa, S. Shimojo and H. Miyahara: A community-based recommendation system to reveal unexpected interests, Proc. of the 11th International Multimedia Modelling Conference, MMM '05, pp. 433-438 (2005).
[12] , ( 21 ), 2007 (2007).
[13] J. M. Kleinberg: Authoritative sources in a hyperlinked environment, J. ACM, 46, pp. 604-632 (1999).
[14] R. Lempel and S. Moran: SALSA: the stochastic approach for link-structure analysis, ACM Trans. Inf. Syst., 19, pp. 131-160 (2001).
[15] H. Deng, M. R. Lyu and I. King: A generalized Co-HITS algorithm and its application to bipartite graphs, Proc. of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '09, ACM, pp. 239-248 (2009).
[16] S. Brin and L. Page: The anatomy of a large-scale hypertextual web search engine, Proc. of the seventh international conference on World Wide Web 7, WWW7, pp. 107-117 (1998).
[17] G. Erkan and D. R. Radev: LexRank: graph-based lexical centrality as salience in text summarization, J. Artif. Int. Res., 22, pp. 457-479 (2004).
[18] J. Carbonell and J. Goldstein: The use of MMR, diversity-based reranking for reordering documents and producing summaries, Proc. of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '98, pp. 335-336 (1998).