Vol. 43 No. 1 Oct. 22 2 Temporal Difference The Study of Evolutionary Change of Shogi Nobusuke Sasaki and Hiroyuki Iida This study explores how the evolutionary changes of the rules affect the characteristics of the games in the Shogi species from the viewpoint of evolutionary selection. For this purpose, we propose two measures based on the average number of possible moves and the average game length: (1) measure for entertainment, and (2) measure for decision complexity. For games where no grandmaster games are available, the statistics of specific features, such as the average number of possible moves and the average game length, are obtained by the method of self-play experiments. Then we made computer programs that performed lookahead search usingan evaluation function based simply on piece-material balance. To obtain the appropriate piece values of Shogi variants, we apply Temporal Difference Learning method. Based on the data obtained by the above self-play experiments, Shogi variants including the modern Shogi and ancient Shogi variants are analyzed and compared. Through the analysis, we observed that the inclusion of the reuse rule was more important step than the inclusion of the major pieces in the course of evolutionary changes from Heian Shogi toward modern Shogi. We believe that the proposed measures are useful tools for building a genealogical tree of chess-like games. 1. search-space complexity Faculty of Management, Hiroshima Prefectural University Department of Computer Science, Shizuoka University D B B D 1) B D 1 2),3) B D B D 2 299
Vol. 43 No. 1 2991 2 2. 2 2.1 1 G b s B (1) 4) b = B 1 s. (1) E(G) E(G) = b (2) D D G E(G) D b D 1 b E(G) D b E(G) E(G) D b D b 2.3 E(G) 2.2 decision complexity 2) AND/OR 1 G D(G) (3) D(G) =b D 2 (3) 1 2.3
2992 Oct. 22 1 Table 1 The relation between the player s skill and the number of plausible moves. s =1 s =2 s =3 35 5.9 3.3 35 1 s 38 6.2 3.4 38 1 s 8 8.9 4.3 8 1 s 1 b s =1, 2, 3 4) s =1 1) De Groot 5) 1 s 2 3 s =2 s =2 (1) (2) (4) B E(G) = (4) D 1) 2 2 2 B/D 3 3 B/D B/D (3) s =2 (5) s s 2 Table 2 The game data of Chess, Xiangqi and Shogi. B D E(G)(= B D ) D(G) 35 8.74 7.6 1 3 38 95.65 3.3 1 37 8 115.78 5.2 1 54 D(G) =B D 4 (5) (5) (4) (5) 2 3. self-play experiment 6) 2 3.1 7)
Vol. 43 No. 1 2993 TD Temporal Difference Learning TD Samuel 8) Sutton 9) TD Beal 1),11) 3.2 TD TD TD TD TD A ν ν(a) = w j x j (A) (6) j j x j (6) ν(a) = w j (N a (A) N b (A)) (7) j N a (A) j N b (A) j P 1 P = (8) 1+e ν TD t w t t w t = α(p t+1 P t ) λ t k w P k (9) k=1 α λ λ 1 λ (6) TD TD αβ 3 3 6 1 1 1, 1, 1 α.5 1.2 3,.2.2 λ.95 3.3 12) (1) (2) 2 8 9 8 9 8 1
2994 Oct. 22 / - + ) Fig. 1 B @ > < : 8 6 4 2 _ ] [ Y Q Y [ ] _ a a a a a a a a a ` ` ` ` ` ` ` ` ` ^ Z X R X Z ^ E F G H I J K L ( *,. 1 9 8 The initial position of 9 8 type Heian Shogi. 1.6 1.4 1.2 1.8.6.4.2 1.4 2 4 6 8 1 () (a) 3 Table 3 Historical Shogi variants compared. 9 8 I 9 9 II 9 9 9 9 8 8 8 8 2 9 8 1 9 8 I II 4 2 13) 3 3.4 3 TD TD 2 2 II 6, II 1, 4 TD YSS 14) TD 1.2 1.8.6.4.2 2 4 6 8 1 Fig. 2 () (b) 2 Learning process of Heian Shogi. 4 1 II 1, 6, Table 4 Normalized learnt values for Shogi variants (1, runs for Intermediate Shogi variant II, and 6, runs for all other variants). I II 1. 1. 1. 1. 1.87 2.98 2.64 6.85 1.47 1.92 2.7 3.45 1.33 2.92 3.56 4.48 1.26 1.41 2.69 4.26 1.76 2.2 4.17 2.75 1.74 3.13 3.78 5.81 1.98 3.17 3. 5.98 2.46 3.9 7.34 6.22 12.89 8.18 16.36 13.8 19.2 11.51 22.25 17.44 YSS 3 3 6 1,
Vol. 43 No. 1 2995 YSS 15) 3.5 TD 1 1, 3 (1) 1 (2) 1357 4 2 (3) 135 3 5 +3 TD 3 1, D B 1 2 1, 3 1 3 1, 5 1 3.6 3 6 2 3 3 4 2 1 3 2 5 3 B/D.6.5.4.3.2.1 1 2 3 4 5 6 7 3 B D Fig. 3 The relation between the check-mate search depth and B D of Shogi historical variants. B/D Fig. 4.1.8.6.4.2 1 2 3 4 5 4 B D The relation between the search depth and Shogi historical variants. B D 2 1 3 3 E(G) 3 3 B D 5 6 2 5 5 3 5 of
2996 Oct. 22 D() 4 35 3 25 2 15 1 5 5 1 2 3 4 5 D Fig. 5 The relation between the search depth and D (the average game length) of Shogi historical variants. B() 8 7 6 5 4 3 2 1 6 1 2 3 4 5 B Fig. 6 The relation between the search depth and B (the average number of possible moves) of Shogi historical variants. 5 5 Table 5 The comparison of Shogi variants (Search depth is 5). B D E(G) D(G) 2 245.18 4.9 1 79 I 27 275.19 2.5 1 98 II 58 154.5 7.8 1 67 61 11.71 1.2 1 49 5 3 6 5 I II E(G)(= B/D) E(G) II 3 6 5 E(G) E(G) I D(G) E(G) D(G) D(G) D(G) 4. 2 2
Vol. 43 No. 1 2997 16) 1 1 #1287859 1) Matsubara, H., Iida, H. and Grimbergen, R.: Natural Developments in Game Research., ICCA Journal, Vol.19, No.2, pp.13 111 (1996). 2) Allis, L.V., Herschberg, I.S. and van den Herik, H.J.: Which Games Will Survive?, Heuristic Programming in Artificial Intelligence 2: The Second Computer Olympiad, Levy, D.N.L. and Beal, D.F. (Eds.), pp.232 243, Ellis Horwood, Chichester (1991). 3) van den Herik, H.J., Uiterwijk, J.W. and van Rijswijck., J.: Games solved: Now and in the future, Artificial Intelligence, Vol.134, No.1-2, pp.121 144 (22). 4) Vol.99, No.53, pp.91 98 (1999). 5) Groot, A.D.: Thought and Choice in Chess, Mouton, The Hague (1965). 6) Heinz, E.A.: Self-play, Deep Search and Diminishing Returns, ICGA Journal, Vol.24, No.2, pp.75 79 (21). 7) Newborn, M.: A Hypothesis Concerning The Strength of Chess Programs, ICCA Journal, Vol.8, No.4, pp.29 215 (1985). 8) Samuel, A.L.: Some Studies in Machine Learning Using the Game of Checkers, IBM Journal of Research and Development, Vol.3, pp.21 229 (1959). 9) Sutton, R.: Learning to Predict by the Methods of Temporal Differences, Machine Learning, Vol.3, pp.9 44 (1988). 1) Beal, D.F. and Smith, M.: Learning Piece Values Using Temporal Differences, ICCA Journal, Vol.2, No.3, pp.147 151 (1997). 11) Beal, D.F. and Smith, M.: First Results from Using Temporal Difference Learning in Shogi, Computers and Games, Lecture Notes in Computer Science, Vol.LNCS 1558, pp.113 125, Springer-Verlag (1999). 12) Vol.11, pp.1 9 (1999). 13) (1977). 14) YSS 2 pp.112 142 (1998). 15) IPSJ Symposium Series, Vol.21, No.14, pp.14 147 (21). 16) Kraaijeveld, A.R.: Origin of chess A phylogenetic perspective, Board Games Studies, Vol.3, pp.39 49 (2). ( 14 2 25 ) ( 14 9 5 ) 1971 1998 2 1962 1994