MARC View

000			02440nam a22002657a 4500
003			OSt
005			20260603121952.0
008			260603b |||||||| |||| 00| 0 eng d
020			_a9789819739462
041			_aeng
082			_a006.31 ZHA-M
100			_aZhao, Shiyu. _981229
245			_aMathematical foundations of reinforcement learning
260			_aChina: _bTsinghua University Press, _c2025
300			_axvi, 275p.
505			_aFront Matter (pages i-xvi) -- Basic Concepts (pages 1-13) -- State Values and Bellman Equation (pages 15-34) -- Optimal State Values and Bellman Optimality Equation (pages 35-55) -- Value Iteration and Policy Iteration (pages 57-76) -- Monte Carlo Methods (pages 77-99) -- Stochastic Approximation (pages 101-124) -- Temporal-Difference Methods (pages 125-150) -- Value Function Methods (pages 151-189) -- Policy Gradient Methods (pages 191-214) -- Actor-Critic Methods (pages 215-236) -- Back Matter (pages 237-275)
520			_aThis book provides a mathematical yet accessible introduction to the fundamental concepts, core challenges, and classic reinforcement learning algorithms. It aims to help readers understand the theoretical foundations of algorithms, providing insights into their design and functionality. Numerous illustrative examples are included throughout. The mathematical content is carefully structured to ensure readability and approachability. The book is divided into two parts. The first part is on the mathematical foundations of reinforcement learning, covering topics such as the Bellman equation, Bellman optimality equation, and stochastic approximation. The second part explicates reinforcement learning algorithms, including value iteration and policy iteration, Monte Carlo methods, temporal-difference methods, value function methods, policy gradient methods, and actor-critic methods. With its comprehensive scope, the book will appeal to undergraduate and graduate students, post-doctoral researchers, lecturers, industrial researchers, and anyone interested in reinforcement learning.
650			_aArtificial intelligence _981230
650			_aReinforcement learning _981231
650			_aMachine learning. _981232
650			_aMachine learning _xalgorithm _981233
856			_uhttps://doi.org/10.1007/978-981-97-3944-8
942			_cBK
999			_c200032 _d200032