Please use this identifier to cite or link to this item: http://hdl.handle.net/10119/18243

Title: Position Control and Production of Various Strategies for Game of Go Using Deep Learning Methods
Authors: SHI, YUAN; FAN, TIANWEN; LI, WANXIANG; HSUEH, CHU-HSUAN; IKEDA, KOKOLO
Keywords: computer Go; position control; various strategies; deep learning; AlphaGo Zero
Issue Date: 2021-03
Publisher: Institute of Information Science and Engineering / The Association for Computational Linguistics and Chinese Language Processing
Volume: 37
Number: 3
Start page: 553
End page: 573
DOI: 10.6688/JISE.202105_37(3).0004
Abstract: Computer Go programs have surpassed top-level human players by using deep learning and reinforcement learning techniques. Beyond playing strength, entertaining Go AI and AI coaches are also interesting directions, but they have not been well investigated. Some researchers have worked on entertaining beginners or intermediate players. One topic is position control, which aims to make strong programs play close games against weaker players. In such a scenario, the naturalness of the moves is likely to influence weaker players’ enjoyment. Another topic is producing various strategies (or preferences), which human players usually have. Methods for both topics have been proposed and evaluated for a traditional Monte-Carlo tree search (MCTS) program. However, there are critical differences between traditional MCTS programs and recent programs based on AlphaGo Zero, such as LeelaZero and KataGo. For example, recent programs do not run random simulations to the ends of games in MCTS, which makes the existing method for producing various strategies inapplicable. In this paper, we first summarize these differences and the resulting problems. We then adapt existing methods and propose new ones to solve these problems, obtaining promising results. For position control, the modified LeelaZero can play gently against a weaker player (winning 48% of games against a weaker program, Ray). A human subject experiment shows that the average number of unnatural moves per game is 1.22, compared with 2.29 for a simple method that does not consider naturalness. We also propose a new position control method specifically for endgames. Finally, for producing various strategies, two methods are introduced. In our experiments, both methods produce center-oriented and edge/corner-oriented strategies, and human players can successfully identify the strategies.
Rights: Copyright (C) 2021 Institute of Information Science and Engineering / The Association for Computational Linguistics and Chinese Language Processing. YUAN SHI, TIANWEN FAN, WANXIANG LI, CHU-HSUAN HSUEH and KOKOLO IKEDA, JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2021, 37, 553-573. This paper is published here with permission of the JOURNAL OF INFORMATION SCIENCE AND ENGINEERING.
URI: http://hdl.handle.net/10119/18243
Material Type: publisher
Appears in Collections: d10-1. 雑誌掲載論文 (Journal Articles)

Files in This Item:

File: I-IKEDA-K0405-20.pdf
Size: 6725Kb
Format: Adobe PDF
