Topic Estimation for Microblogs Taking into Account the Relationships between Adjacent Tweets

Naoya Nakamura 1,a)   Ryohei Sasano 2   Hiroya Takamura 2   Manabu Okumura 2

Abstract: In recent years, microblogs have appeared and become prevalent all over the world. Since the characteristics of the information contained in microblogs are considered to be different from those contained in conventional media, we can obtain a great deal of useful information from microblogs. However, commonly used topic models, such as LDA, do not work well on microblogs, since these topic models take into account neither the relationships between adjacent posts nor the shortness of each post. In this paper, we propose a new topic model for microblogs that takes into account the relationships between adjacent posts.

Keywords: Information extraction, Microblog, Topic model

1 Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
2 Precision and Intelligence Laboratory, Tokyo Institute of Technology
a) nakamura0617@lr.pi.titech.ac.jp

1. Introduction

In recent years, microblogging services such as Twitter have become prevalent all over the world. The information posted on Twitter differs in character from that carried by conventional media [4], and it has been exploited for applications such as recommending real-time topical news [7] and detecting events such as earthquakes in real time [8]. Topic models are a natural tool for organizing such a large stream of posts. However, commonly used topic models such as LDA do not work well on Twitter, because they take into account neither the relationships between adjacent tweets nor the shortness of each tweet.
Latent Dirichlet Allocation (LDA) assigns a topic to every word and models each document as a mixture of topics, whereas a tweet is short and is more naturally regarded as dealing with a single topic. Zhao et al. therefore proposed Twitter-LDA [10], which assigns exactly one topic to each tweet. Twitter-LDA fits Twitter better than LDA does, but it still treats the tweets of a user as independent of one another. In this paper, we extend Twitter-LDA so that it also takes the relationships between adjacent tweets into account.

2. Related Work

2.1 Latent Dirichlet Allocation

Blei et al. proposed Latent Dirichlet Allocation (LDA) [1], a generative topic model in which each document is represented as a mixture of topics and each word of the document is generated from one of these topics.

2.2 Twitter-LDA

Because LDA assigns a topic to every word, it does not work well on Twitter, where each post is very short. Twitter-LDA [10] addresses this with two assumptions: ( 1 ) a single tweet deals with a single topic, and ( 2 ) each user has his or her own topic distribution. In Twitter-LDA with T topics and U users, φ = {φ_1, ..., φ_T} are the per-topic word distributions, θ = {θ_1, ..., θ_U} are the per-user topic distributions, and z = {z_1, ..., z_{N_u}} are the topics of the N_u tweets of user u; α and β are the parameters of the Dirichlet priors on θ_u and φ_t, respectively. For each tweet s of user u, a topic z_s is drawn from θ_u, and every word of s is then drawn from φ_{z_s}.

2.3 Other Related Models

Titov and McDonald proposed Multi-grain LDA (MG-LDA) [9], which models online reviews with topics of multiple granularities. Gruber et al. proposed the Hidden Topic Markov Model (HTMM)
[3], in which the topics of adjacent sentences in a document form a Markov chain. Diao et al. proposed TimeUserLDA for finding bursty topics from microblogs such as Twitter [2].

3. Proposed Method

3.1 Motivation

On Twitter, the tweets posted by a single user are not independent of one another: users often post several tweets in a row on the same topic. For example, a user attending the ACL conference may post a series of consecutive tweets about ACL before switching to another topic. Adjacent tweets by the same user are therefore likely to share a topic, and a topic model for Twitter should exploit this tendency.

3.2 Proposed Model

Twitter-LDA draws the topic of every tweet independently from the user's topic distribution θ_u and thus cannot capture the relationship between adjacent tweets. We extend Twitter-LDA by introducing, for each tweet, a binary variable y that indicates whether the tweet inherits the topic of the immediately preceding tweet; the graphical models of Twitter-LDA and of the proposed model are shown in Figures 2 and 3. The generative process of the proposed model is as follows:

( 1 ) for each topic t = 1, ..., T
        draw φ_t ~ Dir(β)
( 2 ) for each user u = 1, ..., U
    ( a ) draw θ_u ~ Dir(α)
    ( b ) draw z_1 ~ Multinomial(θ_u)
    ( c ) for each tweet s = 2, ..., N_u
        ( i ) draw y_s ~ Bernoulli(γ)
        ( ii ) if y_s = 0 then draw z_s ~ Multinomial(θ_u); else z_s ← z_{s-1}
        ( iii ) for each word i = 1, ..., N_{u,s}
                draw w_i ~ Multinomial(φ_{z_s})

Step (2)(b) is identical to Twitter-LDA; step (2)(c) is what distinguishes the proposed model: with probability γ, a tweet inherits the topic of the previous tweet, so that runs of consecutive tweets share a single topic.
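The generative story above translates directly into sampling code. The sketch below follows the numbered steps; the corpus sizes, the fixed tweet length, and the function name are our own illustrative assumptions, not part of the paper.

```python
import numpy as np

def generate_corpus(T, U, V, gamma, alpha, beta, n_tweets=20, tweet_len=10, seed=0):
    """Sample a toy corpus from the generative process in Sec. 3.2."""
    rng = np.random.default_rng(seed)
    phi = rng.dirichlet(np.full(V, beta), size=T)    # (1) phi_t ~ Dir(beta)
    corpus = []
    for u in range(U):
        theta = rng.dirichlet(np.full(T, alpha))     # (2)(a) theta_u ~ Dir(alpha)
        z = rng.choice(T, p=theta)                   # (2)(b) z_1 ~ Multinomial(theta_u)
        tweets = []
        for s in range(n_tweets):
            if s > 0:
                y = rng.random() < gamma             # (2)(c)(i) y_s ~ Bernoulli(gamma)
                if not y:                            # (2)(c)(ii) redraw the topic if y_s = 0,
                    z = rng.choice(T, p=theta)       #            otherwise keep z_{s-1}
            words = rng.choice(V, size=tweet_len, p=phi[z])  # (2)(c)(iii) words ~ phi_{z_s}
            tweets.append((z, words))
        corpus.append(tweets)
    return corpus

# With gamma = 0.0 every tweet redraws its topic: the process reduces to Twitter-LDA.
toy = generate_corpus(T=5, U=3, V=50, gamma=0.6, alpha=0.01, beta=0.01)
```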
3.3 Inference

We estimate the model by collapsed Gibbs sampling. In LDA, a topic z is sampled for every word, so sampling a document d requires on the order of T × N_d operations, where T is the number of topics and N_d is the number of words in d; in Twitter-LDA and in the proposed model, a single topic is sampled for each tweet instead. In the proposed model, a run of consecutive tweets whose y variables are 1 shares a single topic and forms a block, so we resample topics block by block rather than tweet by tweet. When the boundary between tweets n and n + 1 is resampled, two cases arise: the neighboring blocks are merged into one, or a block is split into two (Figure 4). To keep the sampling tractable, the number of consecutive tweets considered in a single step is limited to at most M; Figure 5 illustrates the procedure with M = 8.

The topic of a block is sampled from

\[ p(z_i \mid z_{-i}, S_i, y) \propto p(S_i \mid z_i)\, p(z_i) \propto (n_u^t + \alpha) \prod_{k=0}^{|S_i|-1} \frac{1}{n_{t,\neg S_i}^{(\cdot)} + V\beta + k} \prod_{v=1}^{V} \frac{\Gamma(n_t^{(v)} + \beta)}{\Gamma(n_{t,\neg S_i}^{(v)} + \beta)}, \]

where S_i is the set of words in the i-th block (s_{ij} denotes its j-th word), z_i is the topic of the block, and y are the inheritance variables that determine the block boundaries; n_t^{(v)} is the number of times word v is assigned to topic t, n_t^{(·)} is the total number of words assigned to topic t, n_u^t is the number of blocks of user u assigned to topic t, V is the vocabulary size, and ¬S_i indicates that the counts are computed with the words in S_i excluded.
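To make the update concrete, here is a minimal sketch of the block-level Gibbs step, written to match the equation above. The array layout (n_tv, n_ut), the function names, and the convention that the block's own counts have already been subtracted are our illustrative assumptions.

```python
import numpy as np
from scipy.special import gammaln

def block_topic_log_posterior(block_counts, n_tv, n_ut, u, alpha, beta):
    """Unnormalized log p(z_i = t | ...) for one block, for every topic t.

    block_counts: length-V word counts of the block S_i.
    n_tv: (T, V) topic-word counts with the block's words excluded.
    n_ut: (U, T) per-user block-topic counts, this block excluded.
    """
    T, V = n_tv.shape
    n_words = int(block_counts.sum())
    log_post = np.log(n_ut[u] + alpha)              # prior term (n_u^t + alpha)
    for t in range(T):
        # prod_{k=0}^{|S_i|-1} 1 / (n_t^(.) + V*beta + k)
        denom = n_tv[t].sum() + V * beta
        log_post[t] -= np.log(denom + np.arange(n_words)).sum()
        # prod_v Gamma(n_t^(v) + c_v + beta) / Gamma(n_t^(v) + beta)
        log_post[t] += (gammaln(n_tv[t] + block_counts + beta)
                        - gammaln(n_tv[t] + beta)).sum()
    return log_post

def sample_block_topic(rng, block_counts, n_tv, n_ut, u, alpha, beta):
    """Draw a new topic for the block from the normalized posterior."""
    log_p = block_topic_log_posterior(block_counts, n_tv, n_ut, u, alpha, beta)
    p = np.exp(log_p - log_p.max())
    return rng.choice(len(p), p=p / p.sum())
```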
4. Experiments

4.1 Data and Settings

We collected a large set of tweets from Twitter and removed URLs from them as preprocessing. Table 1 summarizes the statistics of the dataset.

Table 1: Statistics of the dataset.
  Tweets:      2,388,883
  Vocabulary:  348,517
  Users:       1,073

The hyperparameters were set to α = 0.01 and β = 0.01, the number of topics T to 250, and the maximum block length M to 10. The parameter γ was varied from 0.0 to 0.9 in increments of 0.1. When γ = 0.0, no tweet ever inherits the topic of its predecessor, so the proposed model is equivalent to Twitter-LDA, which therefore serves as the baseline.

4.2 Evaluation

We evaluated the models by perplexity and by topic coherence. Perplexity measures how poorly a model predicts held-out data, so lower is better:

\[ \mathrm{perplexity} = \exp\Bigl( -\frac{1}{N} \sum_{s_u} \log p(s_u) \Bigr), \]

where the sum runs over the tweet sequences s_u of the users in the test set and N is the total number of words in the test set. Because the topic of a tweet may depend on the topic of the preceding tweet, p(s) must be computed by marginalizing over all topic sequences:

\[ p(s) = \sum_{z} p(s, z) = \sum_{z} p(z_1)\, p(s_1 \mid z_1) \prod_{j=2}^{|s|} p(z_j \mid z_{j-1})\, p(s_j \mid z_j) = \sum_{z_1} p(z_1)\, p(s_1 \mid z_1)\, p(s_2, \ldots, s_{|s|} \mid z_1). \]

The last form shows that p(s) can be computed recursively, as in the forward algorithm for hidden Markov models; under the model, the transition probability is p(z_j | z_{j-1}) = γ·1[z_j = z_{j-1}] + (1 - γ)·θ_u(z_j).

Figure 6 shows the perplexity for each γ. The proposed model achieved lower perplexity than Twitter-LDA (γ = 0.0), with the lowest value at γ = 0.8.

Figure 6: Perplexity for each γ.

Since perplexity does not necessarily agree with human judgments of topic quality, we also evaluated the models by topic coherence [5], [6]. Let V^{(t)} = (v_1^{(t)}, ..., v_K^{(t)}) be the K most probable words of topic t. Topic coherence is then defined as

\[ C(t; V^{(t)}) = \sum_{k=2}^{K} \sum_{l=1}^{k-1} \log \frac{D(v_k^{(t)}, v_l^{(t)}) + 1}{D(v_l^{(t)})}, \]

where D(v) is the number of documents that contain word v and D(u, v) is the number of documents that contain both u and v; higher values are better.

Figure 7 shows the topic coherence for each γ. The coherence was highest at γ = 0.6, where the proposed model clearly outperformed Twitter-LDA (γ = 0.0).
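As a concrete illustration, the following sketch computes both evaluation measures. It is a minimal sketch under our own assumptions: the function names and data layout are hypothetical, per-topic tweet likelihoods are assumed to be precomputed, and the forward recursion uses the transition probability implied by the model; only the formulas themselves come from this section.

```python
import numpy as np

def sequence_log_prob(tweet_topic_lls, theta_u, gamma):
    """log p(s) for one user's tweet sequence via a forward recursion.

    tweet_topic_lls: (S, T) array; entry [j, t] = log p(s_j | z_j = t),
        the summed word log-likelihood of tweet j under topic t.
    theta_u: length-T topic distribution of the user.
    gamma: probability that a tweet inherits the previous topic.
    """
    # forward[t] = log p(s_1, ..., s_j, z_j = t)
    forward = np.log(theta_u) + tweet_topic_lls[0]
    for lls in tweet_topic_lls[1:]:
        # p(z_j | z_{j-1}) = gamma * 1[z_j = z_{j-1}] + (1 - gamma) * theta_u[z_j]
        m = forward.max()
        total = m + np.log(np.exp(forward - m).sum())        # logsumexp over z_{j-1}
        stay = np.log(gamma) + forward                       # topic inherited
        switch = np.log(1.0 - gamma) + np.log(theta_u) + total  # topic redrawn
        forward = np.logaddexp(stay, switch) + lls
    m = forward.max()
    return m + np.log(np.exp(forward - m).sum())

def perplexity(per_user_lls, per_user_thetas, gamma, n_words):
    """perplexity = exp(-(1/N) * sum_{s_u} log p(s_u))."""
    total = sum(sequence_log_prob(lls, th, gamma)
                for lls, th in zip(per_user_lls, per_user_thetas))
    return np.exp(-total / n_words)

def topic_coherence(top_words, doc_freq, co_doc_freq):
    """C(t; V) = sum_{k=2..K} sum_{l<k} log((D(v_k, v_l) + 1) / D(v_l))."""
    score = 0.0
    for k in range(1, len(top_words)):
        for l in range(k):
            score += np.log((co_doc_freq[top_words[k], top_words[l]] + 1.0)
                            / doc_freq[top_words[l]])
    return score
```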
Figure 7: Topic coherence for each γ.

5. Conclusion

In this paper, we proposed a topic model for microblogs that takes into account the relationships between adjacent tweets, extending Twitter-LDA with a variable that lets a tweet inherit the topic of the preceding tweet. Experiments on a large collection of tweets showed that the proposed model outperforms Twitter-LDA in both perplexity and topic coherence.

References

[1] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent dirichlet allocation. J. Mach. Learn. Res., 3:993–1022, March 2003.
[2] Qiming Diao, Jing Jiang, Feida Zhu, and Ee-Peng Lim. Finding bursty topics from microblogs. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 536–544, Jeju Island, Korea, July 2012. Association for Computational Linguistics.
[3] Amit Gruber, Michal Rosen-Zvi, and Yair Weiss. Hidden topic Markov models. In Proceedings of Artificial Intelligence and Statistics, 2007.
[4] Haewoon Kwak, Changhyun Lee, Hosung Park, and Sue Moon. What is Twitter, a social network or a news media? In Proceedings of the 19th International Conference on World Wide Web, WWW '10, pages 591–600, New York, NY, USA, 2010. ACM.
[5] David Mimno, Hanna M. Wallach, Edmund Talley, Miriam Leenders, and Andrew McCallum. Optimizing semantic coherence in topic models. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP '11, pages 262–272, Stroudsburg, PA, USA, 2011. Association for Computational Linguistics.
[6] David Newman, Youn Noh, Edmund Talley, Sarvnaz Karimi, and Timothy Baldwin. Evaluating topic models for digital libraries. In Proceedings of the 10th Annual Joint Conference on Digital Libraries, JCDL '10, pages 215–224, New York, NY, USA, 2010. ACM.
[7] Owen Phelan, Kevin McCarthy, and Barry Smyth. Using Twitter to recommend real-time topical news. In Proceedings of the Third ACM Conference on Recommender Systems, RecSys '09, pages 385–388, New York, NY, USA, 2009. ACM.
[8] Takeshi Sakaki, Makoto Okazaki, and Yutaka Matsuo. Earthquake shakes Twitter users: real-time event detection by social sensors. In Proceedings of the 19th International Conference on World Wide Web, WWW '10, pages 851–860, New York, NY, USA, 2010. ACM.
[9] Ivan Titov and Ryan McDonald. Modeling online reviews with multi-grain topic models. In Proceedings of the 17th International Conference on World Wide Web, WWW '08, pages 111–120, New York, NY, USA, 2008. ACM.
[10] Wayne Xin Zhao, Jing Jiang, Jianshu Weng, Jing He, Ee-Peng Lim, Hongfei Yan, and Xiaoming Li. Comparing Twitter and traditional media using topic models. In Proceedings of the 33rd European Conference on Advances in Information Retrieval, ECIR '11, pages 338–349, Berlin, Heidelberg, 2011. Springer-Verlag.