Commit dc75bcb
Change SPRT parameters for standard test
Following the discussion in official-stockfish/Stockfish#1859, we aim to:
- Improve precision, i.e. discard zero or negative patches and accept positive ones with a higher margin.
- Manage resource consumption.
After various simulations, here are results for different sets of SPRT parameters:
| Limits (STC + LTC) | [0, 5] + [0, 5] | [0.5, 4.5] + [0, 3.5] | [1, 4] + [0, 3] | [0.5, 4] + [0, 3] | [0.5, 4] + [0, 3.5] | [0.4, 4.4] + [0, 3.2] |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
| 0 Elo pass prob | 0.0025 | 0.00123 | 0.00036 | 0.0011 | 0.0011 | 0.0014 |
| 1 Elo pass prob | 0.0744 | 0.1002 | 0.0805 | 0.1451 | 0.1183 | 0.1276 |
| +1 Elo rejection ratio | 0.8201 | 0.7547 | 0.7625 | 0.6718 | 0.7049 | 0.7137 |
| +1.5 Elo rejection ratio | 0.6310 | 0.5160 | 0.4865 | 0.3968 | 0.4249 | 0.4679 |
| +2 Elo rejection ratio | 0.3936 | 0.2709 | 0.2026 | 0.1700 | 0.1819 | 0.2374 |
| Total Elo gain ratio | 1.0 | 1.3192 | 1.2373 | 1.7596 | 1.5664 | 1.5523 |
| -0 Elo acceptance ratio | 2.5e-04 | 9.1e-05 | 2.2e-05 | 1.0e-04 | 7.77e-05 | 1.0e-04 |
| Avg. STC cost | 18431 | 24456 | 34574 | 30936 | 30936 | 25110 |
| Avg. STC + LTC cost | 27931 | 38039 | 45286 | 50604 | 46093 | 42990 |
From the table above, there is consensus that the [0.5, 4.5] + [0, 3.5] pair offers the best balance between improved sensitivity and resource consumption.
This patch makes the above parameters the default.
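As a rough illustration of how SPRT bounds translate into pass probabilities, the sketch below uses the standard continuous (Brownian-motion) approximation of the log-likelihood ratio. It is not the simulation tool used for the table. The function names `sprt_pass_prob` and `two_stage_pass_prob` are hypothetical, and the sketch assumes alpha = beta = 0.05, a linear Elo scale near zero, and the same true Elo at STC and LTC.

```python
import math


def sprt_pass_prob(true_elo, elo0, elo1, alpha=0.05, beta=0.05):
    """Approximate probability that an SPRT with bounds [elo0, elo1] passes.

    Assumption: the LLR is modelled as a Brownian motion whose drift is
    linear in true_elo; this is a textbook approximation, not fishtest code.
    """
    lower = math.log(beta / (1 - alpha))   # LLR bound for accepting H0 (fail)
    upper = math.log((1 - beta) / alpha)   # LLR bound for accepting H1 (pass)
    # Normalised drift: -1 at true_elo == elo0, +1 at true_elo == elo1.
    k = (2 * true_elo - elo0 - elo1) / (elo1 - elo0)
    if abs(k) < 1e-12:
        # Zero drift: the hitting probability is linear in the bounds.
        return -lower / (upper - lower)
    return (1 - math.exp(-k * lower)) / (math.exp(-k * upper) - math.exp(-k * lower))


def two_stage_pass_prob(true_elo, stc_bounds, ltc_bounds, alpha=0.05, beta=0.05):
    """Probability of passing STC and then LTC, assuming independent stages."""
    stc0, stc1 = stc_bounds
    ltc0, ltc1 = ltc_bounds
    return (sprt_pass_prob(true_elo, stc0, stc1, alpha, beta)
            * sprt_pass_prob(true_elo, ltc0, ltc1, alpha, beta))


if __name__ == "__main__":
    for stc, ltc in [((0, 5), (0, 5)), ((0.5, 4.5), (0, 3.5))]:
        print(stc, ltc, two_stage_pass_prob(0.0, stc, ltc))
```

Under these assumptions a neutral patch passes [0, 5] + [0, 5] with probability alpha squared, about 0.0025, which agrees with the first entry of the 0 Elo row. The rows for positive Elo depend on modelling choices of the simulation tool that are not described here, so this sketch should not be expected to reproduce them.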
A big thank you to everyone involved, particularly @vondele and @vdbergh for developing and tuning the simulation tool, and @Alayan-stk-2, @Chess13234, @xoto10, @Vizvezdenec and @NKONSTANTAKIS for the great, high-quality discussion.
1 parent 39e9b57 commit dc75bcb
1 file changed, +2 -1 lines: original line 67 is replaced by new lines 67 and 68.