The book tests are finished, so I am summarizing the results here. From a technical point of view, the only column that matters is "Normalized Elo", since it determines the number of games needed to detect the given strength difference (the relation is inverse quadratic; for SPRT the approximate formula is 640000/(n.e.)^2).
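To make the inverse quadratic relation concrete, here is a minimal sketch that applies the approximate formula 640000/(n.e.)^2 quoted above. The function names `approx_sprt_games` and `relative_cost` are hypothetical helpers introduced for illustration, and the three sample values are the LTC normalized Elo figures from the table in this post.

```python
def approx_sprt_games(normalized_elo: float) -> float:
    """Approximate number of SPRT games from the rule of thumb 640000 / (n.e.)^2."""
    if normalized_elo <= 0:
        raise ValueError("normalized Elo must be positive")
    return 640000.0 / normalized_elo ** 2


def relative_cost(ne_book: float, ne_reference: float) -> float:
    """Factor by which a book needs more games than a reference book.

    Follows from the inverse quadratic relation: games ~ 1 / (n.e.)^2.
    """
    return (ne_reference / ne_book) ** 2


# LTC normalized Elo values copied from the table in this post.
ltc_normalized_elo = {
    "Drawkiller_balanced_big.epd": 128.6,
    "noob_3moves.epd": 96.9,
    "8moves_v3.pgn": 87.6,
}

reference = ltc_normalized_elo["Drawkiller_balanced_big.epd"]
for name, ne in ltc_normalized_elo.items():
    factor = relative_cost(ne, reference)
    print(f"{name}: {factor:.2f}x the games of the reference book")
```

Under this rule of thumb, a book measuring 87.6 normalized Elo would need roughly twice as many games as one measuring 128.6 to detect the same strength difference. The ratio between books is the informative part here; the absolute game counts depend on the size of the difference being tested.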
For a similar series of tests (reported on Jan 3, 2020), see official-stockfish/fishtest#472 (comment). The issue in loc. cit. also gives more information about the various books. The books themselves can be found in the Fishtest book repository.
| Name | Draw ratio | Bias(Elo) | Elo | Normalized Elo | Test |
|---|---|---|---|---|---|
| Drawkiller_balanced_big.epd (LTC) | 0.63 | 70 | 70.6 [69.0, 72.1] | 128.6 [125.6, 131.6] | Test |
| bjbraams_chessdb_198350_lines.epd (LTC) | 0.65 | 122 | 57.3 [56.0, 58.6] | 123.1 [120.0, 126.2] | Test |
| hybrid_book_beta.pgn (LTC) | 0.70 | 66 | 59.7 [58.3, 61.1] | 121.0 [118.2, 123.9] | Test |
| UHO_XXL_+0.80_+1.09.epd (LTC) | 0.55 | 165 | 57.4 [56.0, 58.8] | 119.0 [115.9, 122.1] | Test |
| UHO_XXL_+0.90_+1.19.epd (LTC) | 0.49 | 189 | 57.6 [56.2, 58.9] | 118.0 [114.9, 121.1] | Test |
| 8mvs_big_+80_+109.epd (LTC) | 0.56 | 156 | 56.6 [55.2, 58.0] | 116.0 [112.9, 119.1] | Test |
| 2moves_v1.pgn (LTC) | 0.76 | 77 | 47.0 [45.8, 48.2] | 110.5 [107.7, 113.3] | Test |
| UHO_XXL_+1.00_+1.29.epd (LTC) | 0.43 | 219 | 52.9 [51.5, 54.3] | 108.7 [105.6, 111.8] | Test |
| closedpos.epd (LTC) | 0.79 | 54 | 42.0 [40.8, 43.2] | 101.7 [99.0, 104.5] | Test |
| noob_3moves.epd (LTC) | 0.83 | 23 | 37.6 [36.5, 38.7] | 96.9 [94.3, 99.5] | Test |
| 8moves_v3.pgn (LTC) | 0.83 | 58 | 32.4 [31.4, 33.4] | 87.6 [84.9, 90.2] | Test |
| endgames.epd (LTC) | 0.63 | 181 | 24.6 [23.6, 25.6] | 66.5 [63.8, 69.3] | Test |
Some STC tests to check scaling.
| Name | Draw ratio | Bias(Elo) | Elo | Normalized Elo | Test |
|---|---|---|---|---|---|
| Drawkiller_balanced_big.epd (STC) | 0.47 | 62 | 86.8 [84.8, 88.8] | 130.9 [127.8, 134.0] | Test |
| 2moves_v1.pgn (STC) | 0.63 | 77 | 64.3 [62.8, 65.9] | 118.6 [115.6, 121.5] | Test |
| bjbraams_chessdb_198350_lines.epd (STC) | 0.57 | 121 | 63.8 [62.2, 65.3] | 117.7 [114.7, 120.8] | Test |
| UHO_XXL_+0.80_+1.09.epd (STC) | 0.50 | 154 | 63.0 [61.4, 64.6] | 114.8 [111.7, 117.9] | Test |
| noob_3moves.epd (STC) | 0.70 | 32 | 59.7 [58.2, 61.2] | 114.2 [111.7, 117.1] | Test |
| UHO_XXL_+0.90_+1.19.epd (STC) | 0.46 | 171 | 61.8 [60.3, 63.4] | 112.3 [109.2, 115.4] | Test |
| UHO_XXL_+1.00_+1.29.epd (STC) | 0.43 | 193 | 60.0 [58.5, 61.6] | 110.3 [107.2, 113.4] | Test |
| 8moves_v3.pgn (STC) | 0.71 | 61 | 49.0 [47.6, 50.4] | 98.8 [95.9, 101.7] | Test |
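One possible way to read the scaling check is to compare, for each book that appears in both tables, the LTC normalized Elo with the STC one. The sketch below does that using the point estimates copied from the two tables (confidence intervals are ignored, so small differences should not be over-interpreted). The ratio LTC/STC is an assumption about what "scaling" means here, not a metric defined in this post.

```python
# (STC normalized Elo, LTC normalized Elo) for books present in both tables.
normalized_elo = {
    "Drawkiller_balanced_big.epd": (130.9, 128.6),
    "2moves_v1.pgn": (118.6, 110.5),
    "bjbraams_chessdb_198350_lines.epd": (117.7, 123.1),
    "UHO_XXL_+0.80_+1.09.epd": (114.8, 119.0),
    "noob_3moves.epd": (114.2, 96.9),
    "UHO_XXL_+0.90_+1.19.epd": (112.3, 118.0),
    "UHO_XXL_+1.00_+1.29.epd": (110.3, 108.7),
    "8moves_v3.pgn": (98.8, 87.6),
}

# Ratio above 1: the book is more sensitive at LTC than at STC.
# Ratio below 1: the book loses sensitivity at the longer time control.
ratios = {name: ltc / stc for name, (stc, ltc) in normalized_elo.items()}

# Print the books from best to worst retention of sensitivity.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:40s} LTC/STC = {ratio:.3f}")
```

On these point estimates, bjbraams_chessdb_198350_lines.epd and the two lower-bias UHO_XXL books have a ratio above 1, while noob_3moves.epd and 8moves_v3.pgn drop the most when moving from STC to LTC.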
EDIT: Added 8mvs_big_+80_+109.epd. This book with 25857 positions is the biggest in the series Unbalanced_Human_Openings_V2.0, created by Stefan Pohl. See https://www.sp-cc.de/unbalanced-human-openings.htm.
EDIT2: Added data for the special purpose books "endgames.epd" and "closedpos.epd".
EDIT3: I started adding STC data to check scaling.
EDIT4: Some more STC data. I did the tests a long time ago but never got around to adding the results.
EDIT5: STC data for bjbraams_chessdb_198350_lines.epd added.
EDIT6: LTC data for bjbraams_chessdb_198350_lines.epd added. Strong performance.
EDIT7: Data for the new UHO series by Stefan Pohl.