JAIST Repository: Position Control and Production of Various Strategies for Game of Go Using Deep Learning Methods

Please use this identifier to cite or link to this item: http://hdl.handle.net/10119/18243

Title:	Position Control and Production of Various Strategies for Game of Go Using Deep Learning Methods
Authors:	SHI, YUAN; FAN, TIANWEN; LI, WANXIANG; HSUEH, CHU-HSUAN; IKEDA, KOKOLO
Keywords:	computer Go; position control; various strategies; entertainment; coaching; deep learning; AlphaGo Zero
Issue Date:	2021-03
Publisher:	Institute of Information Science and Engineering / The Association for Computational Linguistics and Chinese Language Processing
Journal:	JOURNAL OF INFORMATION SCIENCE AND ENGINEERING
Volume:	37
Issue:	3
Start Page:	553
End Page:	573
DOI:	10.6688/JISE.202105_37(3).0004
Abstract:	Computer Go programs have surpassed top-level human players by using deep learning and reinforcement learning techniques. Beyond playing strength, entertaining Go AI and AI coaches are also interesting directions but have not been well investigated. Some researchers have worked on entertaining beginners or intermediate players. One topic is position control, which aims to make strong programs play close games against weak players. In such a scenario, the naturalness of the moves is likely to influence weaker players’ enjoyment. Another topic is producing various strategies (or preferences), which human players usually have. Some methods for the two topics have been proposed and evaluated for a traditional Monte-Carlo tree search (MCTS) program. However, there are some critical differences between traditional MCTS programs and recent programs based on AlphaGo Zero, such as LeelaZero and KataGo. For example, recent programs do not run random simulations to the ends of games in MCTS, which makes the existing method for producing various strategies inapplicable. In this paper, we first summarize these differences and the resulting problems. We then adapt existing methods and propose new methods to solve these problems, obtaining promising results. For position control, the modified LeelaZero can play gently against a weaker player (winning 48% of games against a weaker program, Ray). A human subject experiment shows that the average number of unnatural moves per game is 1.22, compared with 2.29 for a simple method that does not consider naturalness. We also propose a new position control method specifically for endgames. Finally, for producing various strategies, two methods are introduced. In our experiments, center- and edge/corner-oriented strategies are produced by both methods, and human players can successfully identify the strategies.
Rights:	Copyright (C) 2021 Institute of Information Science and Engineering / The Association for Computational Linguistics and Chinese Language Processing. YUAN SHI, TIANWEN FAN, WANXIANG LI, CHU-HSUAN HSUEH and KOKOLO IKEDA, JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2021, 37, 553-573. This paper is published here with permission of the JOURNAL OF INFORMATION SCIENCE AND ENGINEERING.
URI:	http://hdl.handle.net/10119/18243
Type:	publisher
Appears in Collections:	d10-1. Journal Articles

Files in This Item:

File	Description	Size	Format
I-IKEDA-K0405-20.pdf		6725Kb	Adobe PDF	View/Open

All items stored in this system are protected by copyright.

Contact: Japan Advanced Institute of Science and Technology, Research Promotion Division, Library Information Section (ir-sys[at]ml.jaist.ac.jp)