Chapter 15. AlphaZero

Authors

Hongming Zhang* - Peking University (zhanghongming[at]pku.edu.cn)
Tianyang Yu - Nanchang University

Abstract

This chapter introduces combinatorial games such as chess and Go, and uses Gomoku as an example to present the AlphaZero algorithm, a general algorithm that has achieved superhuman performance in many challenging games. The chapter has three parts: the first introduces the concept of combinatorial games, the second introduces the family of algorithms known as Monte Carlo Tree Search, and the third uses Gomoku as the game environment to demonstrate the details of the AlphaZero algorithm, which combines Monte Carlo Tree Search with deep reinforcement learning from self-play.

Keywords: AlphaZero, Monte Carlo Tree Search, Upper Confidence Bounds for Trees, self-play, deep reinforcement learning, deep neural network

Content

Chinese version (PDF)

Code

The code for this chapter is available here.

Citation

To cite this chapter, please use this BibTeX entry:

@incollection{deepRL-chapter15-2020,
 title={AlphaZero},
 chapter={15},
 author={Hongming Zhang and Tianyang Yu},
 editor={Hao Dong and Zihan Ding and Shanghang Zhang},
 booktitle={Deep Reinforcement Learning: Fundamentals, Research, and Applications},
 publisher={Springer Nature},
 pages={391--416},
 note={\url{http://www.deepreinforcementlearningbook.org}},
 year={2020}
}

If you find any typos or have suggestions for improving the book, do not hesitate to contact the corresponding author (marked with *).