Monte Carlo Tree Search
Intro
Monte Carlo Tree Search (MCTS) works well in practice but poses theoretical challenges.
In this post, I want to describe the MCTS algorithm and explain why it works.
Open-loop planning algorithms like MCTS can plan future actions from an initial state $s_0$. They assume access to a model of the environment, either stochastically or ...
Pinsker's Inequality
Theorem (Pinsker's Inequality)
For all probability distributions $P, Q$ on a measurable space $(U, \Sigma)$,
$\delta(P, Q) \leq \sqrt{\frac{1}{2} D_{\text{KL}}(P \| Q)}$
$\delta(P, Q)$ : total variation distance
$D_{\text{KL}}(P \| Q)$ : KL divergence
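As a quick sanity check of the statement, here is a minimal numerical sketch for the discrete case. It assumes the usual conventions, which the theorem above does not spell out: $\delta(P, Q) = \frac{1}{2} \sum_x |P(x) - Q(x)|$, and KL divergence measured with the natural logarithm (the constant $\frac{1}{2}$ in the bound depends on that choice). The function names and the example distributions are made up for illustration.

```python
import math

def total_variation(p, q):
    """Total variation distance between two discrete distributions,
    assuming the convention delta(P, Q) = 1/2 * sum_x |P(x) - Q(x)|."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

def kl_divergence(p, q):
    """KL divergence D_KL(P || Q) in nats (natural logarithm).
    Terms with P(x) = 0 contribute 0; if P(x) > 0 where Q(x) = 0,
    the divergence is infinite."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total

def pinsker_bound(p, q):
    """Right-hand side of Pinsker's inequality: sqrt(D_KL(P || Q) / 2)."""
    return math.sqrt(0.5 * kl_divergence(p, q))

# Hypothetical example distributions on a two-point space.
p = [0.5, 0.5]
q = [0.9, 0.1]

tv = total_variation(p, q)
bound = pinsker_bound(p, q)
print(f"TV = {tv:.4f}, Pinsker bound = {bound:.4f}, holds: {tv <= bound}")
```

For this pair the total variation distance is 0.4 and the bound is roughly 0.505, so the inequality holds with some slack. A check like this does not replace the proof; it only confirms the conventions are consistent.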
Proof.
I only prove the discrete case.
A special case of ...