trainingdata-tool

Tool to generate lc0 training data. Useful for Supervised Learning from PGN games.

It has been used to train Leela networks that mimic human playing styles, and it can also be used to distill AB engines (and their various personalities) into a Leela network. Experiment and have fun with it!

Compilation Instructions

For Linux

Run ./build-linux.sh

For Mac

Run ./build-mac.sh

Releases

For Windows, you may use this appveyor artifact.

For Ubuntu (20.04), sign up to CircleCI. Once logged in, you will find an artifact in this project folder.

Changes compared to the Original

Generate policy data from move variations and softmax
Set D values using this data
Updated the centipawn to Q formula. See here for reference.
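The exact formula is not reproduced in this README. As a minimal sketch, the mapping below uses the arctangent form commonly associated with lc0, with constants that are an assumption here rather than something stated by this project. It does reproduce the Root Q values shown in the verbose output later in this README (0.63 maps to roughly 0.3287, and -15.07 to roughly -0.9582). The function name eval_to_q is hypothetical.

```python
import math

# Assumed constants (not stated in this README); they match the sample
# Root Q values printed in the verbose output examples.
CP_SCALE = 111.714640912
Q_NORM = 1.5620688421


def eval_to_q(eval_pawns: float) -> float:
    """Map a [%eval] score in pawns to a Q value of roughly [-1, 1]."""
    centipawns = eval_pawns * 100.0
    return math.atan(centipawns / CP_SCALE) / Q_NORM


print(round(eval_to_q(0.63), 4))    # 0.3287
print(round(eval_to_q(-15.07), 4))  # -0.9582
```

The arctangent squashes large evaluations, so a decisive score such as -15.07 lands close to -1 while small advantages stay near zero.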

How do I annotate a pgn with Q and policy data?

To annotate a pgn with Q and policy, you can add evaluations from any chess engine using pgn-extract and uci-analyser.

With uci-analyser, you can generate evals for both the mainline and variations. trainingdata-tool then takes the mainline evaluation and converts it into Q. Then it transforms evals from the mainline and variations into a policy distribution using softmax.
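The softmax step can be illustrated with a short sketch. The temperature parameter and the choice to apply softmax directly to evals in pawns are assumptions for illustration; the tool's actual scaling may differ. The evals are taken from the verbose output example in this README, and softmax_policy is a hypothetical name.

```python
import math


def softmax_policy(evals, temperature=1.0):
    """Turn {move: eval_in_pawns} into {move: probability} with softmax."""
    # Subtract the maximum before exponentiating for numerical stability.
    best = max(evals.values())
    weights = {m: math.exp((e - best) / temperature) for m, e in evals.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}


# Mainline move first, followed by its variations.
evals = {"Ne4": 0.63, "Kg2": 0.00, "Bxh6": -0.26,
         "e3": -0.36, "e4": -0.51, "a4": -0.62}
policy = softmax_policy(evals)
for move, prob in policy.items():
    print(f"{move}: {prob:.3f}")
```

A lower temperature concentrates the probability mass on the best-scoring move, while a higher one spreads it more evenly across the variations.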

However, uci-analyser is only compatible with Stockfish. If you want to distill other engines, such as lc0 weights or other AB engines, you will likely have to modify analyse.cpp in uci-analyser to make it compatible. lichess-artist is an lc0-compatible PGN writer, but it is an order of magnitude slower than uci-analyser.

Usage - No Q Data

Pass the PGN input file and it will output training data in the same way lc0 selfplay does. Example:

trainingdata-tool Kasparov.pgn -max-games-to-convert 1000000 -files-per-dir 100000 -chunks-per-file 135

Usage - Q and Policy data

To train with your generated Q and/or policy data, add the -lichess-mode flag. You'll need to use a pgn writer that uses the lichess format.

Below is an example of a pgn (clipped after the 5th move for space) that uses the lichess format.

[Event "?"]
[Site "?"]
[Date "2019.09.24"]
[Round "1"]
[White "sf10_1"]
[Black "sf10_2"]
[Result "1-0"]
[ECO "A05"]
[Opening "Reti Opening"]
[TimeControl "inf"]
[PlyCount "129"]
[GameDuration "00:00:03"]
[GameEndTime "2019-09-24T13:09:53.478 CDT"]
[GameStartTime "2019-09-24T13:09:49.780 CDT"]

1. Nf3 { [%eval 0.34] } (1. e4 { [%eval 0.34] }) (1. Nc3 { [%eval 0.17] }) (1. d4 { [%eval 0.17] }) (1. e3 { [%eval 0.09] }) (1. a3 { [%eval -0.18] }) 1... Nf6 { [%eval -0.54] } (1... e5 { [%eval -0.54] }) (1... d5 { [%eval -0.67] }) (1... Nc6 { [%eval -0.67] }) (1... e6 { [%eval -0.90] }) (1... a6 { [%eval -1.00] }) 2. c4 { [%eval 0.78] } (2. e4 { [%eval 0.78] }) (2. d4 { [%eval 0.32] }) (2. Nc3 { [%eval 0.12] }) (2. g4 { [%eval 0.02] }) (2. a4 { [%eval -0.55] }) 2... g6 { [%eval 0.37] } (2... e5 { [%eval 0.37] }) (2... Nc6 { [%eval -0.01] }) (2... d5 { [%eval -0.15] }) (2... e6 { [%eval -0.18] }) (2... d6 { [%eval -0.41] }) (2... h6 { [%eval -0.48] }) 3. Nc3 { [%eval 0.70] } (3. e4 { [%eval 0.70] }) (3. d4 { [%eval 0.32] }) (3. a3 { [%eval 0.09] }) (3. d3 { [%eval 0.05] }) (3. g3 { [%eval -0.01] }) 3... d5 { [%eval -0.63] } (3... Bg7 { [%eval -0.63] }) (3... d6 { [%eval -0.63] }) (3... e5 { [%eval -0.69] }) (3... c5 { [%eval -1.01] }) (3... c6 { [%eval -1.30] }) 4. Qa4+ { [%eval 0.88] } (4. cxd5 { [%eval 0.88] }) (4. d4 { [%eval 0.59] }) (4. e3 { [%eval 0.29] }) (4. Nxd5 { [%eval -0.03] }) (4. d3 { [%eval -0.27] }) (4. c5 { [%eval -0.43] }) 4... c6 { [%eval 0.35] } (4... Bd7 { [%eval 0.35] }) (4... Qd7 { [%eval -0.48] }) (4... Nc6 { [%eval -0.60] }) (4... b5 { [%eval -0.82] }) (4... Nbd7 { [%eval -1.25] }) 5. cxd5 { [%eval 0.48] } (5. Nxd5 { [%eval -0.09] }) (5. d4 { [%eval -0.12] }) (5. d3 { [%eval -0.29] }) (5. c5 { [%eval -0.63] }) (5. e3 { [%eval -0.67] }) 5... b5 { [%eval -0.32] } (5... Bg7 { [%eval -0.60] }) (5... Bd7 { [%eval -0.74] }) (5... a6 { [%eval -0.88] }) (5... Nxd5 { [%eval -1.01] }) (5... Qb6 { [%eval -1.03] })
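To check that a PGN follows this format before converting it, the annotations can be pulled out with a regular expression. This is a rough sketch and not the parser the tool uses: it assumes every eval is a plain decimal number (mate scores such as #3 are ignored) and that comments contain no parentheses. The names MOVE_EVAL and extract_evals are hypothetical.

```python
import re

# A SAN move followed by a lichess-style { [%eval X] } comment.
MOVE_EVAL = re.compile(
    r"([A-Za-z][A-Za-z0-9+#=\-]*)\s*\{\s*\[%eval\s+(-?\d+(?:\.\d+)?)\]\s*\}"
)


def extract_evals(movetext):
    """Return (san, eval_in_pawns, is_variation) for each annotated move."""
    results = []
    for m in MOVE_EVAL.finditer(movetext):
        # A move sits inside a variation if a parenthesis is still open.
        opened = movetext.count("(", 0, m.start())
        closed = movetext.count(")", 0, m.start())
        results.append((m.group(1), float(m.group(2)), opened > closed))
    return results


sample = ("1. Nf3 { [%eval 0.34] } (1. e4 { [%eval 0.34] }) "
          "(1. Nc3 { [%eval 0.17] }) 1... Nf6 { [%eval -0.54] }")
print(extract_evals(sample))
# [('Nf3', 0.34, False), ('e4', 0.34, True), ('Nc3', 0.17, True), ('Nf6', -0.54, False)]
```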

In verbose mode, this is what the output looks like with variations:

Read move: Ne4
Ne4 main line comment:  [%eval 0.63]
 |-- Read variation: Kg2
    |-- Kg2 variation comment:  [%eval 0.00]
 |-- Read variation: Bxh6
    |-- Bxh6 variation comment:  [%eval -0.26]
 |-- Read variation: e3
    |-- e3 variation comment:  [%eval -0.36]
 |-- Read variation: e4
    |-- e4 variation comment:  [%eval -0.51]
 |-- Read variation: a4
    |-- a4 variation comment:  [%eval -0.62]
Write chunk: [Move: c3e4, Result: 1-0, Root Q: 0.328718, Best Q: 0.328718, Root D: 0.503461, Best D: 0.75]

If you don't have variations, it also works fine:

Read move: Ka7
Ka7 main line comment:  [%eval -15.07]
Write chunk: [Move: b1a2, Result: 1-0, Root Q: -0.958218, Best Q: -0.958218, Root D: 0.0271585, Best D: 0.0271585]

Usage - Other options

The following options are available:

-v: Verbose mode
-lichess-mode: Lichess mode. Will extract SF evaluation score from Lichess commented games. Non-commented games will be filtered out.
-files-per-dir <integer number>: Max games to store in a single directory. When that number is reached, a new directory is created to store the new games, to avoid stressing the file system too much.
-max-files-to-convert <integer number>: Stop after this many files have been written.
-chunks-per-file <integer number>: How many training data chunks (positions) to write in each file (game). There are, on average, 135 positions per game, so I suggest setting 135.
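When conversions are scripted, the options can be assembled programmatically. The sketch below only builds the argument list from flags documented in this README; it assumes the binary is on the PATH as trainingdata-tool, and both the input file name annotated.pgn and the helper build_command are hypothetical.

```python
import shlex


def build_command(pgn_path, lichess_mode=False, verbose=False,
                  files_per_dir=None, chunks_per_file=None):
    """Build an argument list for trainingdata-tool from documented flags."""
    cmd = ["trainingdata-tool", pgn_path]
    if verbose:
        cmd.append("-v")
    if lichess_mode:
        cmd.append("-lichess-mode")
    if files_per_dir is not None:
        cmd += ["-files-per-dir", str(files_per_dir)]
    if chunks_per_file is not None:
        cmd += ["-chunks-per-file", str(chunks_per_file)]
    return cmd


cmd = build_command("annotated.pgn", lichess_mode=True, verbose=True,
                    files_per_dir=100000, chunks_per_file=135)
print(shlex.join(cmd))
# trainingdata-tool annotated.pgn -v -lichess-mode -files-per-dir 100000 -chunks-per-file 135
```

The resulting list can be passed to subprocess.run(cmd, check=True) to start the conversion.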

Possible Future Updates

Update from v4 training data to v5 (MLH head).

Special Thanks

Leela Chess Zero
DeepMind Alpha Zero
Daniel Uranga and the original contributors of trainingdata-tool
