1,a) 1,b) 1,c) 2,d) 2,e)

1 The Asahi Shimbun Company
2 Retrieva, Inc.
a) taguchi-y2@asahi.com
b) tamori-h@asahi.com
c) hitomi-y1@asahi.com
d) jiro.nishitoba@retrieva.jp
e) ko.kikuta@retrieva.jp

1. Introduction

Word vectors trained with tools such as word2vec rest on the distributional hypothesis [10]. Faruqui et al. [7] proposed Retrofitting, a fine-tuning method that refines pre-trained word vectors with a semantic lexicon such as WordNet [15]. In this work we apply Retrofitting to Japanese word vectors using the Japanese WordNet [11] and evaluate the retrofitted vectors on the Japanese word similarity dataset of Sakaizawa et al. [19], [24].

2. Related Work

2.1 Word Embeddings and Semantic Lexicons

Building on the distributional hypothesis [10], many methods learn word vectors from raw text [3], [6], [13], [14], [17], word2vec being a representative example. A complementary line of work injects knowledge from lexical resources such as WordNet [15] into word vectors [5], [12], [21]. Faruqui et al. [7] proposed Retrofitting, which fine-tunes pre-trained vectors using lexicons such as WordNet [15], FrameNet [2], and PPDB [8].

© 2017 Information Processing Society of Japan
2.2 Japanese Word Embeddings

Several pre-trained Japanese word vectors are publicly available. Suzuki et al. [26] trained word2vec*1 [13], [14] on Japanese Wikipedia; Bojanowski et al. [4] released fasttext vectors trained on Wikipedia [25]; and nwjc2vec [27] was trained with the word2vec Continuous Bag-of-Words (CBOW) model on the NINJAL Web Japanese Corpus (about 25.8 billion tokens) [1]. Glove*2 [17] is another widely used model. For evaluating such vectors, Sakaizawa et al. [19], [24] constructed a Japanese word similarity dataset.

3. Retrofitting

Faruqui et al. [7] proposed Retrofitting, which fine-tunes pre-trained word vectors with a semantic lexicon as a post-processing step. Retrofitting minimizes the objective of Eq. (1) (notation follows [7]):

    Ψ(Q) = Σ_{i=1}^{V} [ α_i ‖q_i − q̂_i‖² + Σ_{(i,j)∈E} β_{ij} ‖q_i − q_j‖² ]    (1)

Here q̂_i ∈ R^d is the pre-trained vector of the i-th word in Q̂, q_i is the corresponding retrofitted vector in Q ∈ R^{V×d} (V words, d dimensions), E is the set of edges (i, j) connecting words that are related in the lexicon*3, and α and β are hyperparameters. Faruqui et al. [7] minimize Eq. (1) by iterating the update of Eq. (2):

    q_i = ( Σ_{j:(i,j)∈E} β_{ij} q_j + α_i q̂_i ) / ( Σ_{j:(i,j)∈E} β_{ij} + α_i )    (2)

Retrofitting has since been applied with WordNet and domain-specific lexicons in other domains as well [9], [18], [22].

4. Retrofitting Japanese Word Vectors

4.1 Setup

We retrofit Japanese word vectors with the Japanese WordNet and evaluate them on the word similarity dataset of Sakaizawa et al. [19], [24], using word2vec [13], [14] and Glove [17] vectors trained by ourselves as well as published pre-trained vectors.

4.2 Pre-trained Word Vectors

Besides our own word2vec [13], [14] and Glove [17] vectors, we use the 200-dimensional Wikipedia vectors of [26]*4, the 300-dimensional fasttext vectors of Bojanowski et al. [4]*5, and the 200-dimensional nwjc2vec vectors [25]*6.

*1 https://code.google.com/archive/p/word2vec/
*2 https://nlp.stanford.edu/projects/glove/
*3 In this work, E is built from Japanese WordNet relations (Section 4.3).
*4 http://www.cl.ecei.tohoku.ac.jp/~m-suzuki/jawiki_vector/
*5 https://github.com/facebookresearch/fasttext/
*6 http://nwjc-data.ninjal.ac.jp/
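The update of Eq. (2) can be sketched in a few lines of NumPy. This is a minimal illustration, assuming Faruqui et al.'s settings α_i = 1 and β_ij = 1/deg(i); the `retrofit` function name and the toy `edges` adjacency dictionary (standing in for the WordNet-derived graph E) are illustrative, not taken from the paper's code.

```python
import numpy as np

def retrofit(q_hat, edges, alpha=1.0, n_iters=10):
    """Iterate the Retrofitting update of Eq. (2).

    q_hat : (V, d) array of pre-trained vectors (kept fixed).
    edges : dict {word index -> list of neighbor indices} built from a lexicon.
    With beta_ij = 1/deg(i), the update reduces to
    q_i = (mean of neighbor vectors + alpha * q_hat_i) / (1 + alpha).
    """
    q = q_hat.copy()
    for _ in range(n_iters):
        for i, neighbors in edges.items():
            if not neighbors:
                continue  # words with no lexicon neighbors keep q_i = q_hat_i
            beta = 1.0 / len(neighbors)
            numer = beta * np.sum(q[list(neighbors)], axis=0) + alpha * q_hat[i]
            denom = beta * len(neighbors) + alpha
            q[i] = numer / denom
    return q

# Toy example: two "synonyms" with orthogonal pre-trained vectors.
q_hat = np.array([[1.0, 0.0], [0.0, 1.0]])
q = retrofit(q_hat, {0: [1], 1: [0]})
```

Because q̂_i appears in every update, the retrofitted vectors are pulled toward their lexicon neighbors while staying anchored to the original embedding space.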
Table 1  Statistics of the Japanese Wikipedia training corpus
  (…)  7,982,401
  (…)  86,556,288
  (…)  2,361,591,403

Table 2  word2vec training settings
  -cbow       {0, 1} (1 = CBOW, 0 = skip-gram)
  -size       300
  -window     8
  -negative   5
  -hs         0
  -sample     1e-5
  -min-count  3
  -iter       15

Table 3  Glove training settings
  VECTOR_SIZE      300
  WINDOW_SIZE      8
  VOCAB_MIN_COUNT  3
  MAX_ITER         15

We segmented the corpus of Table 1 with MeCab-0.996*7 and IPADIC-2.7.0, then trained word2vec*8 vectors (both CBOW and Skip-gram) with the settings of Table 2 and Glove*9 vectors with the settings of Table 3.

4.3 Retrofitting with the Japanese WordNet

Following Faruqui et al.'s Retrofitting [7], we build the graph E from the Japanese WordNet [11], [15]*10,*11 (11,753 words; 160,661 pairs)*12 and run the update of Eq. (2) for 10 iterations with α = 1 and β = S⁻¹, where S is the number of words linked to the word being updated.

4.4 Evaluation

We evaluate on the Japanese word similarity dataset of Sakaizawa et al. [19], [24]*13, whose statistics are given in Table 4. Dataset entries are segmented with MeCab [24]; when an entry v is split into tokens w1, w2, …, wN, the vector of v is composed from the vectors of those tokens. For each pair (v1, v2) we compute the similarity of the two vectors, and compare the resulting ranking Y with the human ranking X using Spearman's rank correlation coefficient of Eq. (3):

    ρ = 1 − 6 Σ D² / (N³ − N)    (3)

where D is the difference between a pair's ranks in X and Y, and N is the number of pairs. We apply this to six sets of vectors (CBOW, skip-gram, and Glove trained on Wikipedia, plus the published Wikipedia [26], fasttext [4], and nwjc2vec [25] vectors), before and after Retrofitting; Tables 4 and 5 summarize the data and results.

Table 4  Statistics of the word similarity dataset
  POS        pairs   (…)
  verb       1464    158
  adjective   960     93
  noun       1103    411
  adverb      902     39
  ALL        4429    701

4.5 Results

Table 5 shows Spearman's ρ for each part of speech and for the full dataset (ALL). Retrofitting generally improved the correlation; the Skip-gram, Glove, and fasttext vectors in particular benefited from the WordNet information.

*7 https://taku910.github.io/mecab/
*8 https://code.google.com/archive/p/word2vec/
*9 https://github.com/stanfordnlp/glove
*10 http://compling.hss.ntu.edu.sg/wnja/
*11 Japanese WordNet ver. 1.0
*12 http://compling.hss.ntu.edu.sg/wnja/ "Japanese Wordnet and English WordNet in an sqlite3 database"
*13 https://github.com/tmu-nlp/JapaneseWordSimilarityDataset
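Eq. (3) is straightforward to compute. The sketch below assumes no tied scores, so each rank vector is a permutation of 1..N and the closed-form D² formula applies exactly; the function name is illustrative.

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman's rank correlation, Eq. (3): rho = 1 - 6*sum(D^2)/(N^3 - N).

    x, y : scores for the same N word pairs (e.g. human ratings and model
    cosine similarities). Assumes no ties; D is the per-pair rank difference.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    rank_x = np.empty(n)
    rank_x[np.argsort(x)] = np.arange(1, n + 1)  # rank of each score in x
    rank_y = np.empty(n)
    rank_y[np.argsort(y)] = np.arange(1, n + 1)  # rank of each score in y
    d = rank_x - rank_y
    return 1.0 - 6.0 * np.sum(d ** 2) / (n ** 3 - n)
```

With tied scores, average ranks would be needed and the D² form is only approximate, which is why library implementations rank with tie averaging first.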
Table 5  Spearman's ρ (×100); columns follow the order of Table 4
                                   verb          adjective     noun          adverb        ALL
CBOW (…)                           37.7          38.0          27.2          15.0          24.8
CBOW (…) + Retrofitting (…)        36.7 (−1.0)   38.0 (±0)     32.4 (+5.2)   18.9 (+3.9)   29.8 (+5.0)
CBOW (…) + Retrofitting (…)        49.0 (+11.3)  54.6 (+16.6)  34.6 (+7.4)   34.1 (+19.1)  33.5 (+8.7)
Skip-gram (…)                      37.9          41.8          32.6          49.6          33.2
Skip-gram (…) + Retrofitting (…)   35.6 (−2.3)   42.0 (+0.2)   37.1 (+4.5)   50.7 (+1.1)   39.0 (+5.8)
Skip-gram (…) + Retrofitting (…)   48.6 (+10.7)  58.1 (+16.1)  31.8 (−0.8)   41.6 (−8.0)   37.7 (+4.5)
Glove (…)                          29.0          30.2          32.9          25.4          35.2
Glove (…) + Retrofitting (…)       29.0 (±0)     30.2 (±0)     37.2 (+4.3)   27.2 (+1.8)   39.6 (+4.4)
Glove (…) + Retrofitting (…)       44.7 (+15.7)  50.6 (+20.4)  39.0 (+6.1)   37.3 (+21.7)  44.2 (+19.0)
nwjc2vec [25]                      36.0          55.4          32.4          43.4          29.4
nwjc2vec [25] + Retrofitting (…)   34.2 (−1.8)   55.5 (+0.1)   37.9 (+5.5)   47.9 (+4.5)   33.7 (+4.3)
nwjc2vec [25] + Retrofitting (…)   48.3 (+12.3)  63.3 (+7.9)   35.9 (+3.5)   42.4 (−1.0)   36.1 (+8.7)
Wikipedia [26]                     35.4          28.5          29.4          47.4          25.3
Wikipedia [26] + Retrofitting (…)  34.0 (−1.4)   28.7 (+0.2)   35.1 (+5.7)   49.0 (+1.6)   30.8 (+5.5)
Wikipedia [26] + Retrofitting (…)  41.4 (+5.7)   52.3 (+23.7)  33.0 (+3.6)   50.9 (+3.5)   32.3 (+7.0)
fasttext [4]                       −7.4          3.7           22.1          24.6          23.2
fasttext [4] + Retrofitting (…)    −7.4 (±0)     3.9 (+0.2)    28.2 (+6.1)   25.4 (+0.8)   29.0 (+5.8)
fasttext [4] + Retrofitting (…)    22.0 (+29.4)  42.6 (+38.9)  23.4 (+1.3)   20.2 (−4.4)   31.2 (+7.9)

The values in Table 5 are Spearman's ρ multiplied by 100. Table 6 lists example neighbors of the Skip-gram vectors with and without Retrofitting (…).

5. Conclusion

We applied the Retrofitting of Faruqui et al. [7] to Japanese word vectors using the Japanese WordNet and showed that it improves performance on a Japanese word similarity dataset. WordNet is only one possible lexicon: as future work, we plan to build a Japanese paraphrase resource via bilingual pivoting [23]*14, analogous to the PPDB [8] that Faruqui et al. [7] also used for Retrofitting, to exploit the newspaper revision logs analyzed by Tamori et al. [20], and to compare against other fine-tuning approaches such as the counter-fitting of Mrkšić et al. [16].

*14 https://github.com/tmu-nlp/pmi-ppdb

References

[1] Asahara, M., Maekawa, K., Imada, M., Kato, S. and Konishi, H.: Archiving and Analysing Techniques of the Ultra-large-scale Web-based Corpus Project of NINJAL, Japan, Alexandria, Vol. 25, No. 1-2, pp. 129–148 (2014).
[2] Baker, C. F., Fillmore, C. J. and Lowe, J. B.: The Berkeley FrameNet Project, Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1, Association for Computational Linguistics, pp. 86–90 (1998).
[3] Bengio, Y., Ducharme, R., Vincent, P. and Jauvin, C.: A Neural Probabilistic Language Model, Journal of Machine Learning Research, Vol. 3, pp. 1137–1155 (2003).
[4] Bojanowski, P., Grave, E., Joulin, A. and Mikolov, T.: Enriching Word Vectors with Subword Information, Transactions of the Association for Computational Linguistics, Vol. 5, pp. 135–146 (2017).
[5] Chang, K.-W., Yih, W.-t. and Meek, C.: Multi-Relational Latent Semantic Analysis, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 1602–1612 (2013).
[6] Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K. and Kuksa, P.: Natural Language Processing (Almost) from Scratch, Journal of Machine Learning Research, Vol. 12, pp. 2493–2537 (2011).
[7] Faruqui, M., Dodge, J., Jauhar, S. K., Dyer, C., Hovy, E. and Smith, N. A.: Retrofitting Word Vectors to Semantic Lexicons, Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, pp. 1606–1615 (2015).
[8] Ganitkevitch, J., Van Durme, B. and Callison-Burch, C.: PPDB: The Paraphrase Database, HLT-NAACL, pp. 758–764 (2013).
[9] Grouin, C., Hamon, T., Névéol, A. and Zweigenbaum, P. (eds.): Proceedings of the Seventh International Workshop on Health Text Mining and Information Analysis (2016).
[10] Harris, Z. S.: Distributional Structure, Word, Vol. 10, No. 2-3, pp. 146–162 (1954).
[11] Isahara, H., Bond, F., Uchimoto, K., Utiyama, M. and Kanzaki, K.: Development of the Japanese WordNet, LREC (2008).
[12] Kiela, D., Hill, F.
and Clark, S.: Specializing Word Embeddings for Similarity or Relatedness, EMNLP, pp. 2044–2048 (2015).
[13] Mikolov, T., Chen, K., Corrado, G. and Dean, J.: Efficient Estimation of Word Representations in Vector Space, arXiv preprint arXiv:1301.3781 (2013).
[14] Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S. and Dean, J.: Distributed Representations of Words and Phrases and their Compositionality, Advances in Neural Information Processing Systems, pp. 3111–3119 (2013).
[15] Miller, G. A.: WordNet: A Lexical Database for English, Communications of the ACM, Vol. 38, No. 11, pp. 39–41 (1995).
[16] Mrkšić, N., Ó Séaghdha, D., Thomson, B., Gašić, M., Rojas-Barahona, L., Su, P.-H., Vandyke, D., Wen, T.-H. and Young, S.: Counter-fitting Word Vectors to Linguistic Constraints, Proceedings of NAACL-HLT, pp. 142–148 (2016).
[17] Pennington, J., Socher, R. and Manning, C. D.: Glove: Global Vectors for Word Representation, EMNLP, Vol. 14, pp. 1532–1543 (2014).
[18] Roy, A., Park, Y. and Pan, S.: Learning Domain-Specific Word Embeddings from Sparse Cybersecurity Texts, arXiv preprint arXiv:1709.07470 (2017).
[19] Sakaizawa, Y. and Komachi, M.: Construction of a Japanese Word Similarity Dataset, arXiv preprint arXiv:1703.05916 (2017).
[20] Tamori, H., Hitomi, Y., Okazaki, N. and Inui, K.: Analyzing the Revision Logs of a Japanese Newspaper for Article Quality Assessment, Proceedings of the 2017 EMNLP Workshop: Natural Language Processing meets Journalism, Association for Computational Linguistics, pp. 46–50 (2017).
[21] Yu, M. and Dredze, M.: Improving Lexical Embeddings with Semantic Knowledge, ACL (2), pp. 545–550 (2014).
[22] Yu, Z., Wallace, B. C., Johnson, T. and Cohen, T.: Retrofitting Concept Vector Representations of Medical Concepts to Improve Estimates of Semantic Similarity and Relatedness, arXiv preprint arXiv:1709.07357 (2017).
[23] —: … Bilingual Pivoting …, IPSJ SIG Technical Reports, 231, Vol. 2017, No. 21, pp. 1–8 (2017) (in Japanese).
[24] Sakaizawa, Y. and Komachi, M.: Construction of a Japanese Word Similarity Dataset, Proceedings of the 22nd Annual Meeting of the Association for Natural Language Processing (2016) (in Japanese).
[25] —: nwjc2vec: …, Proceedings of the 23rd Annual Meeting of the Association for Natural Language Processing (2017) (in Japanese).
[26] —: Wikipedia …, Proceedings of the 22nd Annual Meeting of the Association for Natural Language Processing (2016) (in Japanese).
[27] —: …, IPSJ SIG Technical Reports, 221, Vol. 2015, No. 4, pp. 1–8 (2015) (in Japanese).