JAIST Repository: Position Control and Production of Various Strategies for Deep Learning Go Programs

Please use this identifier to cite or link to this item: https://hdl.handle.net/10119/18207

Title:	Position Control and Production of Various Strategies for Deep Learning Go Programs
Authors:	Fan, Tianwen; Shi, Yuan; Li, Wanxiang; Ikeda, Kokolo
Keywords:	Computer Go; Position Control; Various Strategies; Entertainment; Coaching; Deep Learning; AlphaGo Zero
Issue Date:	2019-11-21
Publisher:	Institute of Electrical and Electronics Engineers (IEEE)
Journal Title:	2019 International Conference on Technologies and Applications of Artificial Intelligence (TAAI 2019), Kaohsiung, Taiwan, China
DOI:	10.1109/TAAI48200.2019.8959895
Abstract:	Computer Go programs have surpassed top-level human players by using deep learning and reinforcement learning techniques. On the other hand, “Entertainment Go AI” and “Coaching Go AI” are also interesting directions that have not been well investigated. Several studies have addressed entertaining beginner or intermediate players. Position control and the production of various strategies are important tasks, and some methods have been proposed and evaluated using a traditional Monte-Carlo tree search program. In this paper, we try to adapt the method to LeelaZero, a program based on AlphaGo Zero. There are some critical differences between the previous program and the new program. For example, the new program does not use random simulations to the ends of games, so the previous method for producing various strategies cannot be used. We summarize the differences and some expected problems, and propose several approaches to solve them. The modified LeelaZero was shown to play gently against weaker players (a 48% win rate against the program Ray). In experiments with human subjects, the average number of unnatural moves per game was 1.22, compared with 2.29 for a simple method that does not consider naturalness. We also evaluated the proposed method for training “center-oriented” and “edge/corner-oriented” players, and confirmed that human players could identify the produced strategy (center or edge/corner) with a probability of 71.88%.
Rights:	This is the author's version of the work. Copyright (C) 2019 IEEE. 2019 International Conference on Technologies and Applications of Artificial Intelligence (TAAI 2019). DOI: 10.1109/TAAI48200.2019.8959895. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
URI:	https://hdl.handle.net/10119/18207
Resource Type:	author
Appears in Collections:	b11-1. 会議発表論文・発表資料 (Conference Papers)

Files in This Item:

File	Description	Size	Format
I-IKEDA-K-3130.pdf		2937Kb	Adobe PDF

All items stored in this repository are protected by copyright.

Contact: Japan Advanced Institute of Science and Technology, Research Promotion Division, Academic Information Section (ir-sys[at]ml.jaist.ac.jp)