Summary of book tests #3323

@vdbergh

The book tests are finished, so I am summarizing the results here. From a technical point of view, the only column that matters is "Normalized Elo", since it determines the number of games needed to detect the given strength difference. The relation is inverse quadratic: for an SPRT, the approximate number of games is 640000/(n.e.)^2, where n.e. is the normalized Elo.
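As a rough illustration of the rule of thumb above, here is a minimal Python sketch that evaluates 640000/(n.e.)^2. The function name `approx_sprt_games` is hypothetical and not part of Fishtest. The SPRT bounds and error rates behind the constant 640000 are not stated here, so the result should be read as an order-of-magnitude estimate only.

```python
def approx_sprt_games(normalized_elo: float) -> float:
    """Approximate SPRT game count from the rule of thumb 640000 / (n.e.)^2.

    Hypothetical helper for illustration; it only evaluates the
    approximate formula quoted in the text.
    """
    if normalized_elo <= 0:
        raise ValueError("normalized Elo must be positive")
    return 640000.0 / normalized_elo ** 2


if __name__ == "__main__":
    # Normalized Elo values taken from the LTC table in this issue.
    books = {
        "Drawkiller_balanced_big.epd": 128.6,
        "8moves_v3.pgn": 87.6,
    }
    for name, ne in books.items():
        print(f"{name}: about {approx_sprt_games(ne):.1f} games")
```

Because the dependence is inverse quadratic, the ratio of game counts between two books is the square of the inverse ratio of their normalized Elo values. For the two books above, that is (128.6/87.6)^2, or roughly 2.2 times as many games with 8moves_v3.pgn.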

For a similar series of tests (reported on Jan 3, 2020), see official-stockfish/fishtest#472 (comment). The issue in loc. cit. also gives more information about the various books. The books themselves can be found in the Fishtest book repository.

| Name | Draw ratio | Bias (Elo) | Elo | Normalized Elo | Test |
|---|---|---|---|---|---|
| Drawkiller_balanced_big.epd (LTC) | 0.63 | 70 | 70.6 [69.0, 72.1] | 128.6 [125.6, 131.6] | Test |
| bjbraams_chessdb_198350_lines.epd (LTC) | 0.65 | 122 | 57.3 [56.0, 58.6] | 123.1 [120.0, 126.2] | Test |
| hybrid_book_beta.pgn (LTC) | 0.70 | 66 | 59.7 [58.3, 61.1] | 121.0 [118.2, 123.9] | Test |
| UHO_XXL_+0.80_+1.09.epd (LTC) | 0.55 | 165 | 57.4 [56.0, 58.8] | 119.0 [115.9, 122.1] | Test |
| UHO_XXL_+0.90_+1.19.epd (LTC) | 0.49 | 189 | 57.6 [56.2, 58.9] | 118.0 [114.9, 121.1] | Test |
| 8mvs_big_+80_+109.epd (LTC) | 0.56 | 156 | 56.6 [55.2, 58.0] | 116.0 [112.9, 119.1] | Test |
| 2moves_v1.pgn (LTC) | 0.76 | 77 | 47.0 [45.8, 48.2] | 110.5 [107.7, 113.3] | Test |
| UHO_XXL_+1.00_+1.29.epd (LTC) | 0.43 | 219 | 52.9 [51.5, 54.3] | 108.7 [105.6, 111.8] | Test |
| closedpos.epd (LTC) | 0.79 | 54 | 42.0 [40.8, 43.2] | 101.7 [99.0, 104.5] | Test |
| noob_3moves.epd (LTC) | 0.83 | 23 | 37.6 [36.5, 38.7] | 96.9 [94.3, 99.5] | Test |
| 8moves_v3.pgn (LTC) | 0.83 | 58 | 32.4 [31.4, 33.4] | 87.6 [84.9, 90.2] | Test |
| endgames.epd (LTC) | 0.63 | 181 | 24.6 [23.6, 25.6] | 66.5 [63.8, 69.3] | Test |

Some STC tests to check scaling.

| Name | Draw ratio | Bias (Elo) | Elo | Normalized Elo | Test |
|---|---|---|---|---|---|
| Drawkiller_balanced_big.epd (STC) | 0.47 | 62 | 86.8 [84.8, 88.8] | 130.9 [127.8, 134.0] | Test |
| 2moves_v1.pgn (STC) | 0.63 | 77 | 64.3 [62.8, 65.9] | 118.6 [115.6, 121.5] | Test |
| bjbraams_chessdb_198350_lines.epd (STC) | 0.57 | 121 | 63.8 [62.2, 65.3] | 117.7 [114.7, 120.8] | Test |
| UHO_XXL_+0.80_+1.09.epd (STC) | 0.50 | 154 | 63.0 [61.4, 64.6] | 114.8 [111.7, 117.9] | Test |
| noob_3moves.epd (STC) | 0.70 | 32 | 59.7 [58.2, 61.2] | 114.2 [111.7, 117.1] | Test |
| UHO_XXL_+0.90_+1.19.epd (STC) | 0.46 | 171 | 61.8 [60.3, 63.4] | 112.3 [109.2, 115.4] | Test |
| UHO_XXL_+1.00_+1.29.epd (STC) | 0.43 | 193 | 60.0 [58.5, 61.6] | 110.3 [107.2, 113.4] | Test |
| 8moves_v3.pgn (STC) | 0.71 | 61 | 49.0 [47.6, 50.4] | 98.8 [95.9, 101.7] | Test |

EDIT: Added 8mvs_big_+80_+109.epd. This book with 25857 positions is the biggest in the series Unbalanced_Human_Openings_V2.0, created by Stefan Pohl. See https://www.sp-cc.de/unbalanced-human-openings.htm.

EDIT2: Added data for the special purpose books "endgames.epd" and "closedpos.epd".

EDIT3: I started adding STC data to check scaling.

EDIT4: Some more STC data. I did the tests a long time ago but never got around to adding the results.

EDIT5: STC data for bjbraams_chessdb_198350_lines.epd added.

EDIT6: LTC data for bjbraams_chessdb_198350_lines.epd added. Strong performance.

EDIT7: Data for the new UHO series by Stefan Pohl.
