
Commit 4e90263

Adds Model selection
1 parent ebceece commit 4e90263

2 files changed: +34 −1 lines changed

md/Cross-Validation-Wrapper.md

Lines changed: 6 additions & 1 deletion
@@ -20,5 +20,10 @@ __function__ CROSS\-VALIDATION(_Learner_, _size_, _k_, _examples_) __returns__ t
    _fold\_errV_ ← _fold\_errV_ + ERROR\-RATE(_h_, _validation\_set_)
  __return__ _fold\_errT_ ⁄ _k_, _fold\_errV_ ⁄ _k_
 
+---
+
+Figure ?? An algorithm to select the model that has the lowest error rate on validation data by building models of increasing complexity, and choosing the one with the best empirical error rate on validation data. Here _errT_ means error rate on the training data, and _errV_ means error rate on the validation data. _Learner_(_size_, _examples_) returns a hypothesis whose complexity is set by the parameter _size_, and which is trained on the _examples_. PARTITION(_examples_, _fold_, _k_) splits _examples_ into two subsets: a validation set of size _N_ ⁄ _k_ and a training set with all the other examples. The split is different for each value of _fold_.
+
 ---
-__Figure ??__ An algorithm to select the model that has the lowest error rate on validation data by building models of increasing complexity, and choosing the one with best empirical error rate on validation data. Here _errT_ means error rate on the training data, and _errV_ means error rate on the validation data. _Learner_(_size_, _exmaples_) returns a hypothesis whose complexity is set by the parameter _size_, and which is trained on the _examples_. PARTITION(_examples_, _fold_, _k_) splits _examples_ into two subsets: a validation set of size _N_ ⁄ _k_ and a training set with all the other examples. The split is different for each value of _fold_.
+
+In the fourth edition, the cross-validation wrapper has been renamed to Model-Selection.
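
To make the wrapper concrete, here is a minimal Python sketch of PARTITION and the two-value CROSS-VALIDATION shown above. The `learner` and `error_rate` callables and the list-of-examples representation are illustrative assumptions; they are not defined in the pseudocode.

```python
# Sketch of PARTITION and CROSS-VALIDATION (two-value form), assuming
# learner(size, training_set) returns a hypothesis and
# error_rate(h, examples) returns its error rate on `examples`.

def partition(examples, fold, k):
    """Split examples into (training_set, validation_set) for the given fold.

    The validation set is the fold-th block of roughly N/k examples; the
    training set is everything else, so the split differs for each fold.
    """
    n = len(examples)
    lo, hi = (fold - 1) * n // k, fold * n // k
    return examples[:lo] + examples[hi:], examples[lo:hi]


def cross_validation(learner, size, k, examples, error_rate):
    """Average training-set and validation-set error rates over k folds."""
    fold_errT, fold_errV = 0.0, 0.0
    for fold in range(1, k + 1):
        training_set, validation_set = partition(examples, fold, k)
        h = learner(size, training_set)            # hypothesis of complexity `size`
        fold_errT += error_rate(h, training_set)
        fold_errV += error_rate(h, validation_set)
    return fold_errT / k, fold_errV / k
```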

md/Model-Selection.md

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@

# MODEL-SELECTION

## AIMA4e

__function__ MODEL-SELECTION(_Learner_, _examples_, _k_) __returns__ a hypothesis
 __local variables__: _err_, an array, indexed by _size_, storing validation\-set error rates
 __for__ _size_ = 1 __to__ ∞ __do__
   _err_\[_size_\] ← CROSS\-VALIDATION(_Learner_, _size_, _examples_, _k_)
   __if__ _err_ is starting to increase significantly __then do__
     _best\_size_ ← the value of _size_ with minimum _err_\[_size_\]
     __return__ _Learner_(_best\_size_, _examples_)

---

__function__ CROSS\-VALIDATION(_Learner_, _size_, _examples_, _k_) __returns__ average validation set error rate
 _errs_ ← 0
 __for__ _fold_ = 1 __to__ _k_ __do__
   _training\_set_, _validation\_set_ ← PARTITION(_examples_, _fold_, _k_)
   _h_ ← _Learner_(_size_, _training\_set_)
   _errs_ ← _errs_ + ERROR\-RATE(_h_, _validation\_set_)
 __return__ _errs_ ⁄ _k_ // average error rate on validation sets, across k-fold cross-validation

---

Figure ?? An algorithm to select the model that has the lowest error rate on validation data by building models of increasing complexity, and choosing the one with the best empirical error rate, _err_, on validation data. _Learner_(_size_, _examples_) returns a hypothesis whose complexity is set by the parameter _size_, and which is trained on the _examples_. PARTITION(_examples_, _fold_, _k_) splits _examples_ into two subsets: a validation set of size _N_ ⁄ _k_ and a training set with all the other examples. The split is different for each value of _fold_.

---

In the fourth edition, the cross-validation wrapper has been renamed to Model-Selection.
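
Below is a matching Python sketch of the AIMA4e MODEL-SELECTION loop, reusing the `partition` helper from the previous sketch. This version of `cross_validation` follows the AIMA4e signature (_examples_ before _k_) and returns only the validation error. The unbounded `for size = 1 to ∞` and the informal test "err is starting to increase significantly" have to be made concrete somehow; the `max_size` cap and the `patience` rule used here are illustrative assumptions, not part of the pseudocode.

```python
# Sketch of the AIMA4e MODEL-SELECTION and single-value CROSS-VALIDATION.
# Assumes partition(), learner(size, training_set) and error_rate(h, examples)
# as in the previous sketch.

def cross_validation(learner, size, examples, k, error_rate):
    """Average error rate on the validation sets across k-fold cross-validation."""
    errs = 0.0
    for fold in range(1, k + 1):
        training_set, validation_set = partition(examples, fold, k)
        h = learner(size, training_set)
        errs += error_rate(h, validation_set)
    return errs / k


def model_selection(learner, examples, k, error_rate, patience=3, max_size=100):
    """Return a hypothesis trained at the complexity with the lowest validation error."""
    err = {}                                   # validation error rate, indexed by size
    best_size = 1
    for size in range(1, max_size + 1):        # stand-in for "for size = 1 to infinity"
        err[size] = cross_validation(learner, size, examples, k, error_rate)
        best_size = min(err, key=err.get)
        if size - best_size >= patience:       # err has been rising for `patience` sizes
            break
    return learner(best_size, examples)        # retrain on all the examples at best_size
```

Note that, as in the pseudocode, the winning hypothesis is retrained on all of the _examples_, not just on the k−1 training folds.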

0 commit comments
