Commit cb36290

adds monte carlo tree search
1 parent e5b3f58 commit cb36290

1 file changed

+36
-0
lines changed

md/Monte-Carlo-Tree-Search.md

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
# MONTE-CARLO-TREE-SEARCH

## AIMA4e
__function__ MONTE-CARLO-TREE-SEARCH(_state_) __returns__ an action
&emsp;_tree_ ← NODE(_state_)
&emsp;__while__ TIME\-REMAINING() __do__
&emsp;&emsp;&emsp;_tree_ ← PLAYOUT(_tree_)
&emsp;__return__ the _move_ in ACTIONS(_state_) with highest Q(_state_,_move_)

---

__function__ PLAYOUT(_tree_) __returns__ _updated tree_
&emsp;_node_ ← _tree_
&emsp;__while__ _node_ is not terminal and was already in _tree_ __do__
&emsp;&emsp;&emsp;_move_ ← SELECT(_node_)
&emsp;&emsp;&emsp;_node_ ← FOLLOW\-LINK(_node_,_move_)
&emsp;_outcome_ ← SIMULATION(_node_.STATE)
&emsp;UPDATE(_node_,_outcome_)
&emsp;__return__ _tree_

---

__function__ SELECT(_node_) __returns__ _an action_
&emsp;__return__ argmax<sub>m &isin; FEASIBLE\-ACTIONS(_node_)</sub> UCB(RESULT(_node_,_m_))

---

__function__ UCB(_child_) __returns__ _a number_
&emsp;__return__ _child_.VALUE + C &times; <span class="math">$\sqrt{\frac{\log{\mathit{child}.\mathrm{PARENT}.N}}{\mathit{child}.N}}$</span>

---
__FIGURE ??__ The Monte Carlo tree search algorithm. A game tree, _tree_, is initialized, and then grows by one node with each call to PLAYOUT. The function SELECT chooses a move that best balances exploitation and exploration according to the UCB formula. FOLLOW-LINK traverses from the current node by making a move; this could be to a previously seen node, or to a new node that is added to the tree. Once we have added a new node, we exit the __while__ loop and SIMULATION chooses moves (with a randomized policy that is designed to favor good moves but to compute quickly) until the game is over. Then UPDATE updates every node on the path from _node_ to the root, recording the fact that the path led to the final _outcome_.
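
The pseudocode above can be tried out with a minimal Python sketch. It is an illustration under several assumptions that are not in the original file. The game is a toy Nim variant (take 1 to 3 stones, whoever takes the last stone wins). A fixed `iterations` budget stands in for TIME-REMAINING. The rollout in `simulation` is uniformly random rather than a policy tuned to favor good moves. `C = 1.4` is an arbitrary exploration constant, VALUE is taken to be a win rate, and Q(_state_,_move_) is read as the win rate of the corresponding child. Unvisited children are given an infinite UCB score so that each move is tried once.

```python
import math
import random

C = 1.4  # exploration constant (assumed value)


class Node:
    def __init__(self, stones, parent=None):
        self.stones = stones   # state: stones left in the pile
        self.parent = parent
        self.children = {}     # move -> Node
        self.n = 0             # number of playouts through this node
        self.wins = 0          # wins for the player who moved INTO this node

    def actions(self):
        return [m for m in (1, 2, 3) if m <= self.stones]

    def is_terminal(self):
        return self.stones == 0

    @property
    def value(self):
        return self.wins / self.n if self.n else 0.0


def ucb(child):
    if child.n == 0:
        return math.inf  # assumption: always try unvisited children first
    return child.value + C * math.sqrt(math.log(child.parent.n) / child.n)


def select(node):
    def score(move):
        child = node.children.get(move)
        return math.inf if child is None else ucb(child)
    return max(node.actions(), key=score)


def follow_link(node, move):
    is_new = move not in node.children
    if is_new:
        node.children[move] = Node(node.stones - move, parent=node)
    return node.children[move], is_new


def simulation(stones, rng):
    """Random rollout; True if the player to move at `stones` ends up winning."""
    mover_wins = False      # with no stones left, the player to move has lost
    start_to_move = True
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            mover_wins = start_to_move
        start_to_move = not start_to_move
    return mover_wins


def update(node, to_move_wins):
    # Walk from node to the root, flipping perspective at each level.
    while node is not None:
        node.n += 1
        if not to_move_wins:
            node.wins += 1
        to_move_wins = not to_move_wins
        node = node.parent


def playout(tree, rng):
    node, is_new = tree, False
    while not node.is_terminal() and not is_new:
        move = select(node)
        node, is_new = follow_link(node, move)
    outcome = simulation(node.stones, rng)
    update(node, outcome)
    return tree


def monte_carlo_tree_search(stones, iterations=5000, seed=0):
    rng = random.Random(seed)
    tree = Node(stones)
    for _ in range(iterations):  # stands in for TIME-REMAINING()
        tree = playout(tree, rng)
    return max(tree.children, key=lambda m: tree.children[m].value)
```

Calling `monte_carlo_tree_search(5)` searches from a pile of five stones. In this game the winning reply is to leave the opponent a multiple of four, so the search should settle on taking one stone.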
