Commit 4c9a330

Merge pull request #88 from samagra14/4e_pseudocode: 4e pseudocode

2 parents e5b3f58 + 1bfef67

7 files changed: +153 &minus;2 lines changed

md/Adam-Optimizer.md

Lines changed: 21 additions & 0 deletions

@@ -0,0 +1,21 @@

# ADAM-OPTIMIZER

## AIMA4e

__function__ ADAM-OPTIMIZER(_f_, _L_, ***θ***, _&rho;<sub>1</sub>_, _&rho;<sub>2</sub>_, _&alpha;_, _&delta;_) __returns__ updated ***θ***
&emsp;&emsp;&emsp; /\* _Defaults:_ _&rho;<sub>1</sub>_ = 0.9; _&rho;<sub>2</sub>_ = 0.999; _&alpha;_ = 0.001; _&delta;_ = 10<sup>-8</sup> \*/
&emsp;***s*** &larr; __0__
&emsp;***r*** &larr; __0__
&emsp;_t_ &larr; 0
&emsp;__while__ ***&theta;*** has not converged __do__
&emsp;&emsp;&emsp;***x***, ***y*** &larr; a minibatch of _m_ examples from the training set
&emsp;&emsp;&emsp;***g*** &larr; <a href="https://www.codecogs.com/eqnedit.php?latex=\inline&space;\frac{1}{m}\nabla_{\theta}\sum_{i}L(f(\textbf{x}^{(i)};\boldsymbol{\theta}),\textbf{y}^{(i)})" target="_blank"><img src="https://latex.codecogs.com/gif.latex?\inline&space;\frac{1}{m}\nabla_{\theta}\sum_{i}L(f(\textbf{x}^{(i)};\boldsymbol{\theta}),\textbf{y}^{(i)})" title="\frac{1}{m}\nabla_{\theta}\sum_{i}L(f(\textbf{x}^{(i)};\boldsymbol{\theta}),\textbf{y}^{(i)})" /></a> /\* _Compute gradient_ \*/
&emsp;&emsp;&emsp;_t_ &larr; _t_ &plus; 1
&emsp;&emsp;&emsp;***s*** &larr; _&rho;<sub>1</sub>_***s*** &plus; (1 &minus; _&rho;<sub>1</sub>_)***g*** /\* _Update biased first moment estimate_ \*/
&emsp;&emsp;&emsp;***r*** &larr; _&rho;<sub>2</sub>_***r*** &plus; (1 &minus; _&rho;<sub>2</sub>_)***g***<a href="https://www.codecogs.com/eqnedit.php?latex=\inline&space;\odot" target="_blank"><img src="https://latex.codecogs.com/gif.latex?\inline&space;\odot" title="\odot" /></a>***g*** /\* _Update biased second moment estimate_ \*/
&emsp;&emsp;&emsp;<a href="https://www.codecogs.com/eqnedit.php?latex=\inline&space;\hat{\textbf{s}}\gets\frac{\textbf{s}}{1-\rho_1^t}" target="_blank"><img src="https://latex.codecogs.com/gif.latex?\inline&space;\hat{\textbf{s}}\gets\frac{\textbf{s}}{1-\rho_1^t}" title="\hat{\textbf{s}}\gets\frac{\textbf{s}}{1-\rho_1^t}" /></a> /\* _Correct bias in first moment_ \*/
&emsp;&emsp;&emsp;<a href="https://www.codecogs.com/eqnedit.php?latex=\inline&space;\hat{\textbf{r}}\gets\frac{\textbf{r}}{1-\rho_2^t}" target="_blank"><img src="https://latex.codecogs.com/gif.latex?\inline&space;\hat{\textbf{r}}\gets\frac{\textbf{r}}{1-\rho_2^t}" title="\hat{\textbf{r}}\gets\frac{\textbf{r}}{1-\rho_2^t}" /></a> /\* _Correct bias in second moment_ \*/
&emsp;&emsp;&emsp;<a href="https://www.codecogs.com/eqnedit.php?latex=\inline&space;\boldsymbol{\Delta}\boldsymbol{\theta}&space;=&space;-\epsilon\frac{\hat{\textbf{s}}}{\sqrt{\hat{\textbf{r}}}&plus;\delta}" target="_blank"><img src="https://latex.codecogs.com/gif.latex?\inline&space;\boldsymbol{\Delta}\boldsymbol{\theta}&space;=&space;-\epsilon\frac{\hat{\textbf{s}}}{\sqrt{\hat{\textbf{r}}}&plus;\delta}" title="\boldsymbol{\Delta}\boldsymbol{\theta} = -\epsilon\frac{\hat{\textbf{s}}}{\sqrt{\hat{\textbf{r}}}+\delta}" /></a> /\* _Compute update (operations applied element-wise)_ \*/
&emsp;&emsp;&emsp;<a href="https://www.codecogs.com/eqnedit.php?latex=\inline&space;\boldsymbol{\theta}&space;\gets&space;\boldsymbol{\theta}&space;&plus;&space;\boldsymbol{\Delta}\boldsymbol{\theta}" target="_blank"><img src="https://latex.codecogs.com/gif.latex?\inline&space;\boldsymbol{\theta}&space;\gets&space;\boldsymbol{\theta}&space;&plus;&space;\boldsymbol{\Delta}\boldsymbol{\theta}" title="\boldsymbol{\theta} \gets \boldsymbol{\theta} + \boldsymbol{\Delta}\boldsymbol{\theta}" /></a> /\* _Apply update_ \*/
&emsp;__return__ ***θ***
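As a rough illustration, the ADAM-OPTIMIZER pseudocode above might be rendered in NumPy as follows. This is a minimal sketch, not part of the repository: `grad_fn` stands in for the minibatch gradient computation, and a fixed step count replaces the "has not converged" test.

```python
import numpy as np

def adam_optimizer(grad_fn, theta, rho1=0.9, rho2=0.999, alpha=0.001,
                   delta=1e-8, steps=1000):
    """Sketch of ADAM-OPTIMIZER: grad_fn(theta) returns the (minibatch)
    gradient; a fixed number of steps stands in for convergence."""
    s = np.zeros_like(theta)   # biased first moment estimate
    r = np.zeros_like(theta)   # biased second moment estimate
    t = 0
    for _ in range(steps):
        g = grad_fn(theta)                      # compute gradient
        t += 1
        s = rho1 * s + (1 - rho1) * g           # update first moment
        r = rho2 * r + (1 - rho2) * (g * g)     # update second moment (g ⊙ g)
        s_hat = s / (1 - rho1 ** t)             # correct bias in first moment
        r_hat = r / (1 - rho2 ** t)             # correct bias in second moment
        theta = theta - alpha * s_hat / (np.sqrt(r_hat) + delta)  # apply update
    return theta
```

For example, minimizing f(θ) = θ² (gradient 2θ) from θ = 1 drives θ toward 0.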

md/Cross-Validation-Wrapper.md

Lines changed: 6 additions & 1 deletion

@@ -20,5 +20,10 @@ __function__ CROSS\-VALIDATION(_Learner_, _size_, _k_, _examples_) __returns__ t
&emsp;&emsp;&emsp;_fold\_errV_ &larr; _fold\_errV_ &plus; ERROR\-RATE(_h_, _validation\_set_)
&emsp;__return__ _fold\_errT_ &frasl; _k_, _fold\_errV_ &frasl; _k_

---

__Figure ??__ An algorithm to select the model that has the lowest error rate on validation data by building models of increasing complexity, and choosing the one with the best empirical error rate on validation data. Here _errT_ means error rate on the training data, and _errV_ means error rate on the validation data. _Learner_(_size_, _examples_) returns a hypothesis whose complexity is set by the parameter _size_, and which is trained on the _examples_. PARTITION(_examples_, _fold_, _k_) splits _examples_ into two subsets: a validation set of size _N_ &frasl; _k_ and a training set with all the other examples. The split is different for each value of _fold_.

---

In the fourth edition, the cross-validation wrapper has been renamed MODEL-SELECTION.
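The CROSS-VALIDATION loop above can be sketched directly in Python. This is an illustrative sketch, not repository code; `error_rate` is passed in explicitly to stand in for ERROR-RATE, and PARTITION is implemented as contiguous fold slicing (an assumption).

```python
def partition(examples, fold, k):
    """PARTITION: fold i of k is the validation set; the rest is training."""
    n = len(examples)
    lo, hi = fold * n // k, (fold + 1) * n // k
    return examples[:lo] + examples[hi:], examples[lo:hi]

def cross_validation(learner, size, k, examples, error_rate):
    """Sketch of CROSS-VALIDATION(Learner, size, k, examples): returns
    average training-set and validation-set error rates over k folds."""
    fold_errT = fold_errV = 0.0
    for fold in range(k):
        training_set, validation_set = partition(examples, fold, k)
        h = learner(size, training_set)
        fold_errT += error_rate(h, training_set)
        fold_errV += error_rate(h, validation_set)
    return fold_errT / k, fold_errV / k
```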

md/Gibbs-Ask.md

Lines changed: 19 additions & 1 deletion

@@ -14,4 +14,22 @@ __function__ GIBBS-ASK(_X_, __e__, _bn_, _N_) __returns__ an estimate of __P__(_
&emsp;__return__ NORMALIZE(__N__)

---

__Figure__ ?? The Gibbs sampling algorithm for approximate inference in Bayesian networks; this version cycles through the variables, but choosing variables at random also works.

---

## AIMA4e

__function__ GIBBS-ASK(_X_, __e__, _bn_, _N_) __returns__ an estimate of __P__(_X_ &vert; __e__)
&emsp;__local variables__: __N__, a vector of counts for each value of _X_, initially zero
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;__Z__, the nonevidence variables in _bn_
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;__x__, the current state of the network, initially copied from __e__

&emsp;initialize __x__ with random values for the variables in __Z__
&emsp;__for__ _j_ = 1 __to__ _N_ __do__
&emsp;&emsp;&emsp;choose any variable _Z<sub>i</sub>_ from __Z__ according to any distribution _&rho;(i)_
&emsp;&emsp;&emsp;set the value of _Z<sub>i</sub>_ in __x__ by sampling from __P__(_Z<sub>i</sub>_ &vert; _mb_(_Z<sub>i</sub>_))
&emsp;&emsp;&emsp;__N__\[_x_\] &larr; __N__\[_x_\] &plus; 1 where _x_ is the value of _X_ in __x__
&emsp;__return__ NORMALIZE(__N__)

---

__Figure__ ?? The Gibbs sampling algorithm for approximate inference in Bayesian networks; this version chooses variables at random, but cycling through them in order also works.
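The AIMA4e GIBBS-ASK loop above can be sketched as follows. This is a minimal sketch, not repository code: `sample_given_mb(Z, x)` is an assumed helper that samples a value for variable `Z` from __P__(_Z_ | _mb_(_Z_)) given the current state `x`, and all variables are assumed to share one domain `values` for simplicity.

```python
import random

def gibbs_ask(X, e, nonevidence, sample_given_mb, values, n_samples):
    """Sketch of GIBBS-ASK: estimate P(X | e) by counting states visited
    while resampling one nonevidence variable at a time."""
    counts = {v: 0 for v in values}        # vector N of counts over X
    x = dict(e)                            # current state, copied from e
    for Z in nonevidence:                  # random initialization of Z
        x[Z] = random.choice(values)
    for _ in range(n_samples):
        Z = random.choice(nonevidence)     # choose a variable at random
        x[Z] = sample_given_mb(Z, x)       # sample from P(Z | mb(Z))
        counts[x[X]] += 1                  # tally the value of X in x
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}  # NORMALIZE(N)
```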

md/Model-Selection.md

Lines changed: 28 additions & 0 deletions

@@ -0,0 +1,28 @@

# MODEL-SELECTION

## AIMA4e

__function__ MODEL-SELECTION(_Learner_, _examples_, _k_) __returns__ a hypothesis
&emsp;__local variables__: _err_, an array, indexed by _size_, storing validation\-set error rates
&emsp;__for__ _size_ = 1 __to__ &infin; __do__
&emsp;&emsp;&emsp;_err_\[_size_\] &larr; CROSS\-VALIDATION(_Learner_, _size_, _examples_, _k_)
&emsp;&emsp;&emsp;__if__ _err_ is starting to increase significantly __then do__
&emsp;&emsp;&emsp;&emsp;&emsp;_best\_size_ &larr; the value of _size_ with minimum _err_\[_size_\]
&emsp;&emsp;&emsp;&emsp;&emsp;__return__ _Learner_(_best\_size_, _examples_)

---

__function__ CROSS\-VALIDATION(_Learner_, _size_, _examples_, _k_) __returns__ average validation\-set error rate
&emsp;_errs_ &larr; 0
&emsp;__for__ _fold_ = 1 __to__ _k_ __do__
&emsp;&emsp;&emsp;_training\_set_, _validation\_set_ &larr; PARTITION(_examples_, _fold_, _k_)
&emsp;&emsp;&emsp;_h_ &larr; _Learner_(_size_, _training\_set_)
&emsp;&emsp;&emsp;_errs_ &larr; _errs_ &plus; ERROR\-RATE(_h_, _validation\_set_)
&emsp;__return__ _errs_ &frasl; _k_ // average error rate on validation sets, across k-fold cross-validation

---

__Figure ??__ An algorithm to select the model that has the lowest error rate on validation data by building models of increasing complexity, and choosing the one with the best empirical error rate, _err_, on validation data. _Learner_(_size_, _examples_) returns a hypothesis whose complexity is set by the parameter _size_, and which is trained on the _examples_. PARTITION(_examples_, _fold_, _k_) splits _examples_ into two subsets: a validation set of size _N_ &frasl; _k_ and a training set with all the other examples. The split is different for each value of _fold_.

---

In the fourth edition, the cross-validation wrapper has been renamed MODEL-SELECTION.
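The MODEL-SELECTION loop above can be sketched as follows. This is an illustrative sketch, not repository code: the informal test "_err_ is starting to increase significantly" is approximated by a patience counter, and `cross_validation` (returning a validation error rate) is passed in as a parameter so the sketch stays self-contained; both are assumptions.

```python
def model_selection(learner, examples, k, cross_validation,
                    patience=2, max_size=50):
    """Sketch of MODEL-SELECTION: grow model size until validation error
    has failed to improve `patience` times, then return the best model."""
    err = {}
    best_size, rises = None, 0
    for size in range(1, max_size + 1):
        err[size] = cross_validation(learner, size, examples, k)
        if best_size is None or err[size] < err[best_size]:
            best_size, rises = size, 0     # new minimum of err[size]
        else:
            rises += 1                     # err is increasing
            if rises >= patience:          # "increasing significantly"
                break
    return learner(best_size, examples)    # retrain at the best size
```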

md/Monte-Carlo-Tree-Search.md

Lines changed: 36 additions & 0 deletions

@@ -0,0 +1,36 @@

# MONTE-CARLO-TREE-SEARCH

## AIMA4e

__function__ MONTE-CARLO-TREE-SEARCH(_state_) __returns__ an action
&emsp;_tree_ &larr; NODE(_state_)
&emsp;__while__ TIME\-REMAINING() __do__
&emsp;&emsp;&emsp;_tree_ &larr; PLAYOUT(_tree_)
&emsp;__return__ the _move_ in ACTIONS(_state_) with highest Q(_state_, _move_)

---

__function__ PLAYOUT(_tree_) __returns__ _updated tree_
&emsp;_node_ &larr; _tree_
&emsp;__while__ _node_ is not terminal and was already in _tree_ __do__
&emsp;&emsp;&emsp;_move_ &larr; SELECT(_node_)
&emsp;&emsp;&emsp;_node_ &larr; FOLLOW\-LINK(_node_, _move_)
&emsp;_outcome_ &larr; SIMULATION(_node_.STATE)
&emsp;UPDATE(_node_, _outcome_)
&emsp;__return__ _tree_

---

__function__ SELECT(_node_) __returns__ _an action_
&emsp;__return__ argmax<sub>_m_ &isin; FEASIBLE\-ACTIONS(_node_)</sub> UCB(RESULT(_node_, _m_))

---

__function__ UCB(_child_) __returns__ _a number_
&emsp;__return__ _child_.VALUE + _C_ &times; <a href="https://www.codecogs.com/eqnedit.php?latex=\inline&space;\sqrt{\frac{\log{child.PARENT.N}}{child.N}}" target="_blank"><img src="https://latex.codecogs.com/png.latex?\inline&space;\sqrt{\frac{\log{child.PARENT.N}}{child.N}}" title="\sqrt{\frac{\log{child.PARENT.N}}{child.N}}" /></a>

---

__FIGURE ??__ The Monte Carlo tree search algorithm. A game tree, _tree_, is initialized, and then grows by one node with each call to PLAYOUT. The function SELECT chooses a move that best balances exploitation and exploration according to the UCB formula. FOLLOW-LINK traverses from the current node by making a move; this could be to a previously seen node, or to a new node that is added to the tree. Once we have added a new node, we exit the __while__ loop, and SIMULATION chooses moves (with a randomized policy that is designed to favor good moves but to compute quickly) until the game is over. Then UPDATE updates all the nodes in the tree from _node_ to the root, recording the fact that the path led to the final _outcome_.
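The UCB formula and the SELECT step above can be sketched as follows. This is a minimal sketch, not repository code: children are represented as plain dicts with assumed fields `value` (average outcome), `n` (visit count), and `parent_n` (parent's visit count), and `C = 1.4` (roughly &radic;2) is an assumed exploration constant.

```python
import math

def ucb(child, C=1.4):
    """UCB(child): exploitation term plus exploration bonus
    C * sqrt(log(child.PARENT.N) / child.N)."""
    return child["value"] + C * math.sqrt(
        math.log(child["parent_n"]) / child["n"])

def select(children, C=1.4):
    """SELECT: return the move whose resulting child maximizes UCB."""
    return max(children, key=lambda m: ucb(children[m], C))
```

Note how a rarely visited child can win even with a lower value, because its exploration bonus dominates.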

md/Policy-Iteration.md

Lines changed: 21 additions & 0 deletions

@@ -18,3 +18,24 @@ __function__ POLICY-ITERATION(_mdp_) __returns__ a policy

---

__Figure ??__ The policy iteration algorithm for calculating an optimal policy.

---

## AIMA4e

__function__ POLICY-ITERATION(_mdp_) __returns__ a policy
&emsp;__inputs__: _mdp_, an MDP with states _S_, actions _A_(_s_), transition model _P_(_s&prime;_ &vert; _s_, _a_)
&emsp;__local variables__: _U_, a vector of utilities for states in _S_, initially zero
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;_&pi;_, a policy vector indexed by state, initially random

&emsp;__repeat__
&emsp;&emsp;&emsp;_U_ &larr; POLICY\-EVALUATION(_&pi;_, _U_, _mdp_)
&emsp;&emsp;&emsp;_unchanged?_ &larr; true
&emsp;&emsp;&emsp;__for each__ state _s_ __in__ _S_ __do__
&emsp;&emsp;&emsp;&emsp;&emsp;_a<sup>&#x2a;</sup>_ &larr; argmax<sub>_a_ &isin; _A_(_s_)</sub> Q-VALUE(_mdp_, _s_, _a_, _U_)
&emsp;&emsp;&emsp;&emsp;&emsp;__if__ Q-VALUE(_mdp_, _s_, _a<sup>&#x2a;</sup>_, _U_) &gt; Q-VALUE(_mdp_, _s_, _&pi;_\[_s_\], _U_) __then do__
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;_&pi;_\[_s_\] &larr; _a<sup>&#x2a;</sup>_ ; _unchanged?_ &larr; false
&emsp;__until__ _unchanged?_
&emsp;__return__ _&pi;_

---

__Figure ??__ The policy iteration algorithm for calculating an optimal policy.
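The AIMA4e POLICY-ITERATION pseudocode above can be sketched as follows. This is an illustrative sketch, not repository code: the MDP is a plain dict with assumed keys `S`, `A`, `P`, `R`, `gamma`, POLICY-EVALUATION is approximated by a fixed number of in-place Bellman sweeps, and the initial policy is the first action rather than a random one (all assumptions).

```python
def q_value(mdp, s, a, U):
    """Q-VALUE: expected reward plus discounted utility, summed over s'."""
    return sum(p * (mdp["R"](s, a, s2) + mdp["gamma"] * U[s2])
               for s2, p in mdp["P"](s, a).items())

def policy_iteration(mdp, eval_sweeps=20):
    """Sketch of POLICY-ITERATION: alternate policy evaluation and
    greedy policy improvement until the policy is unchanged."""
    U = {s: 0.0 for s in mdp["S"]}
    pi = {s: mdp["A"](s)[0] for s in mdp["S"]}   # arbitrary initial policy
    while True:
        for _ in range(eval_sweeps):             # POLICY-EVALUATION (approx.)
            for s in mdp["S"]:
                U[s] = q_value(mdp, s, pi[s], U)
        unchanged = True
        for s in mdp["S"]:                       # policy improvement
            a_star = max(mdp["A"](s), key=lambda a: q_value(mdp, s, a, U))
            if q_value(mdp, s, a_star, U) > q_value(mdp, s, pi[s], U):
                pi[s] = a_star
                unchanged = False
        if unchanged:
            return pi
```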

md/Value-Iteration.md

Lines changed: 22 additions & 0 deletions

@@ -18,3 +18,25 @@ __function__ VALUE-ITERATION(_mdp_, _&epsi;_) __returns__ a utility function

---

__Figure ??__ The value iteration algorithm for calculating utilities of states. The termination condition is from Equation (__??__).

---

## AIMA4e

__function__ VALUE-ITERATION(_mdp_, _&epsi;_) __returns__ a utility function
&emsp;__inputs__: _mdp_, an MDP with states _S_, actions _A_(_s_), transition model _P_(_s&prime;_ &vert; _s_, _a_),
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;rewards _R_(_s_, _a_, _s&prime;_), discount _&gamma;_
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;_&epsi;_, the maximum error allowed in the utility of any state
&emsp;__local variables__: _U_, _U&prime;_, vectors of utilities for states in _S_, initially zero
&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;_&delta;_, the maximum change in the utility of any state in an iteration

&emsp;__repeat__
&emsp;&emsp;&emsp;_U_ &larr; _U&prime;_; _&delta;_ &larr; 0
&emsp;&emsp;&emsp;__for each__ state _s_ __in__ _S_ __do__
&emsp;&emsp;&emsp;&emsp;&emsp;_U&prime;_\[_s_\] &larr; max<sub>_a_ &isin; _A_(_s_)</sub> Q-VALUE(_mdp_, _s_, _a_, _U_)
&emsp;&emsp;&emsp;&emsp;&emsp;__if__ &vert;_U&prime;_\[_s_\] &minus; _U_\[_s_\]&vert; &gt; _&delta;_ __then__ _&delta;_ &larr; &vert;_U&prime;_\[_s_\] &minus; _U_\[_s_\]&vert;
&emsp;__until__ _&delta;_ &lt; _&epsi;_(1 &minus; _&gamma;_)&sol;_&gamma;_
&emsp;__return__ _U_

---

__Figure ??__ The value iteration algorithm for calculating utilities of states. The termination condition is from Equation (__??__).
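The AIMA4e VALUE-ITERATION pseudocode above can be sketched as follows. This is an illustrative sketch, not repository code; the MDP is a plain dict with assumed keys `S`, `A`, `P`, `R`, `gamma`, and Q-VALUE is defined inline so the sketch is self-contained.

```python
def q_value(mdp, s, a, U):
    """Q-VALUE: expected reward plus discounted utility, summed over s'."""
    return sum(p * (mdp["R"](s, a, s2) + mdp["gamma"] * U[s2])
               for s2, p in mdp["P"](s, a).items())

def value_iteration(mdp, epsilon=1e-3):
    """Sketch of VALUE-ITERATION with the termination test
    delta < epsilon * (1 - gamma) / gamma."""
    gamma = mdp["gamma"]
    U1 = {s: 0.0 for s in mdp["S"]}          # U', initially zero
    while True:
        U = dict(U1)                         # U <- U'
        delta = 0.0
        for s in mdp["S"]:
            U1[s] = max(q_value(mdp, s, a, U) for a in mdp["A"](s))
            delta = max(delta, abs(U1[s] - U[s]))
        if delta < epsilon * (1 - gamma) / gamma:
            return U
```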
