• Huaqing Zhang - Google
• Ruitong Huang - Borealis AI
• Shanghang Zhang* - University of California, Berkeley (shzhang.pku[at]


This chapter analyzes reinforcement learning from the perspective of learning and planning. We first introduce the concepts of a model and of model-based methods, highlighting the advantages of planning with a model. To combine the benefits of model-based and model-free methods, we then present an architecture that integrates learning and planning, illustrated in detail with the Dyna-Q algorithm. Finally, we analyze applications of simulation-based search that integrate learning and planning.

Keywords: model-based, model-free, Dyna, Monte-Carlo Tree Search, Temporal Difference (TD) search


To cite this book, please use this bibtex entry:

 title={Integrating Learning and Planning},
 author={Huaqing Zhang and Ruitong Huang and Shanghang Zhang},
 editor={Hao Dong and Zihan Ding and Shanghang Zhang},
 booktitle={Deep Reinforcement Learning: Fundamentals, Research, and Applications},
 publisher={Springer Nature},

If you find any typos or have suggestions for improving the book, do not hesitate to contact the corresponding author (name marked with *).