Conversation

@Sopel97
Member

@Sopel97 Sopel97 commented May 18, 2021

https://tests.stockfishchess.org/tests/view/60a159c65085663412d0921d
TC: 10s+0.1s, 1 thread
ELO: 21.74 +-3.4 (95%) LOS: 100.0%
Total: 10000 W: 1559 L: 934 D: 7507
Ptnml(0-2): 38, 701, 2972, 1176, 113

https://tests.stockfishchess.org/tests/view/60a187005085663412d0925b
TC: 60s+0.6s, 1 thread
ELO: 5.85 +-1.7 (95%) LOS: 100.0%
Total: 20000 W: 1381 L: 1044 D: 17575
Ptnml(0-2): 27, 885, 7864, 1172, 52 

https://tests.stockfishchess.org/tests/view/60a2beede229097940a03806
TC: 20s+0.2s, 8 threads
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 34272 W: 1610 L: 1452 D: 31210
Ptnml(0-2): 30, 1285, 14350, 1439, 32

https://tests.stockfishchess.org/tests/view/60a2d687e229097940a03c72
TC: 60s+0.6s, 8 threads
LLR: 2.94 (-2.94,2.94) <-2.50,0.50>
Total: 45544 W: 1262 L: 1214 D: 43068
Ptnml(0-2): 12, 1129, 20442, 1177, 12

This network was trained by @vondele using this trainer (the trainer master branch will be updated in the near future), using a combination of data: d8, d9, d10, fishtest_d9.

This network also contains a few architectural changes with respect to the current master:

  • Size changed from 256x2-32-32-1 to 512x2-16-32-1
    • ~15-20% slower
    • ~2x larger
    • adds a special path for 16-valued ClippedReLU
    • fixes the affine transform code for 16 inputs/outputs by using InputDimensions instead of PaddedInputDimensions
      • this is safe now because the inputs are processed in groups of 4 in the current affine transform code
  • The feature set changed from HalfKP to HalfKAv2
    • Includes information about the kings, like HalfKA
    • Packs king features better, resulting in an 8% size reduction compared to HalfKA
  • The board is flipped for the black perspective, instead of rotated as in the current master
  • PSQT values for each feature
    • the feature transformer now outputs a part that is forwarded directly to the output, which allows learning piece values more directly than the previous network architecture. The effect is visible in highly imbalanced positions, where the current master network outputs evaluations skewed towards zero.
    • 8 PSQT values per feature, chosen based on (popcount(pos.pieces()) - 1) / 4
    • initialized to classical material values at the start of the training
  • 8 subnetworks (512x2->16->32->1), chosen based on (popcount(pos.pieces()) - 1) / 4
    • only one subnetwork is evaluated for any given position, so there is no or only marginal speed loss

Additionally, we observed a lot of high weights in most nets from the pytorch trainer, which cause the slow affine transform path to significantly reduce the performance of the net, so that path was removed for the duration of testing. A revert can be attempted for up to 3% speedup, depending on how many large weights there are.

Diagram of the new architecture:
HalfKAv2-45056-512x2P8x2 -16-32-1 x8

@vondele
Member

vondele commented May 18, 2021

appveyor fails because the appveyor script picks up the wrong reference number from the commit message. It just needs the Bench on a different line.

@locutus2
Member

@Sopel97 @vondele
Thanks for inventing and training this new architecture, and especially for the nice diagram and explanation!


featureTransformer->transform(pos, transformedFeatures);
const auto output = network->propagate(transformedFeatures, buffer);
const std::size_t bucket = (popcount(pos.pieces()) - 1) / 4;
Contributor

Nit: pos.count<ALL_PIECES>() is probably faster on most machines than popcount(pos.pieces()).

@snicolet
Member

snicolet commented May 18, 2021

I'm fine with merging this pull request to change the net architecture. Congrats :-)

And if I find that I have a simpler/alternative scaling scheme for STC or LTC, we can just resume the normal testing procedure on fishtest.

Concerning the merge, I think it would perhaps be best to keep two commits for the pull request:

  1. first commit with all of Tomasz's changes, up to "Use adds instead of add", adding support for the new architecture, and using the net nn-8a08400ed089.nnue directly
  2. second commit with all the tweaks by Joost to get the Elo, namely the changes in search.cpp, the tuned scaling formula in evaluate.cpp and the removal of random eval.

@Fanael
Contributor

Fanael commented May 18, 2021

Great job!

@snicolet
Member

The easiest way to create these two commits is probably to use the diff .. trick to get the global diff from master to the PR version:
https://github.com/vondele/Stockfish/compare/61e1c66b7c..3dbba284ab

Then, in that diff, for the first commit use all changes except those in evaluate.cpp and search.cpp.
For the second commit, use all the remaining changes in evaluate.cpp and search.cpp.

@vondele
Member

vondele commented May 18, 2021

@snicolet I would prefer to have a single commit, as only the combination of these things has been tested on fishtest. However, the changes you refer to are 8b8eb2b and 9f03668 and so can be found easily via this PR.

@snicolet
Member

OK
bye :-)

vondele pushed a commit to vondele/Stockfish that referenced this pull request May 18, 2021
Introduces a new NNUE network architecture and associated network parameters,
as obtained by a new pytorch trainer.

The network is already very strong at short TC, without regression at longer TC,
and has potential for further improvements.

https://tests.stockfishchess.org/tests/view/60a159c65085663412d0921d
TC: 10s+0.1s, 1 thread
ELO: 21.74 +-3.4 (95%) LOS: 100.0%
Total: 10000 W: 1559 L: 934 D: 7507
Ptnml(0-2): 38, 701, 2972, 1176, 113

https://tests.stockfishchess.org/tests/view/60a187005085663412d0925b
TC: 60s+0.6s, 1 thread
ELO: 5.85 +-1.7 (95%) LOS: 100.0%
Total: 20000 W: 1381 L: 1044 D: 17575
Ptnml(0-2): 27, 885, 7864, 1172, 52

https://tests.stockfishchess.org/tests/view/60a2beede229097940a03806
TC: 20s+0.2s, 8 threads
LLR: 2.93 (-2.94,2.94) <0.50,3.50>
Total: 34272 W: 1610 L: 1452 D: 31210
Ptnml(0-2): 30, 1285, 14350, 1439, 32

https://tests.stockfishchess.org/tests/view/60a2d687e229097940a03c72
TC: 60s+0.6s, 8 threads
LLR: 2.94 (-2.94,2.94) <-2.50,0.50>
Total: 45544 W: 1262 L: 1214 D: 43068
Ptnml(0-2): 12, 1129, 20442, 1177, 12

The network has been trained (by vondele) using the https://github.com/glinscott/nnue-pytorch/ trainer (started by glinscott),
specifically the branch https://github.com/Sopel97/nnue-pytorch/tree/experiment_56.
The data used comprise 64 billion positions (193GB total), generated and scored with the current master net:
d8: https://drive.google.com/file/d/1hOOYSDKgOOp38ZmD0N4DV82TOLHzjUiF/view?usp=sharing
d9: https://drive.google.com/file/d/1VlhnHL8f-20AXhGkILujnNXHwy9T-MQw/view?usp=sharing
d10: https://drive.google.com/file/d/1ZC5upzBYMmMj1gMYCkt6rCxQG0GnO3Kk/view?usp=sharing
fishtest_d9: https://drive.google.com/file/d/1GQHt0oNgKaHazwJFTRbXhlCN3FbUedFq/view?usp=sharing

This network also contains a few architectural changes with respect to the current master:

    Size changed from 256x2-32-32-1 to 512x2-16-32-1
        ~15-20% slower
        ~2x larger
        adds a special path for 16-valued ClippedReLU
        fixes the affine transform code for 16 inputs/outputs by using InputDimensions instead of PaddedInputDimensions
            this is safe now because the inputs are processed in groups of 4 in the current affine transform code
    The feature set changed from HalfKP to HalfKAv2
        Includes information about the kings, like HalfKA
        Packs king features better, resulting in an 8% size reduction compared to HalfKA
    The board is flipped for the black perspective, instead of rotated as in the current master
    PSQT values for each feature
        the feature transformer now outputs a part that is forwarded directly to the output, which allows learning piece values more directly than the previous network architecture. The effect is visible in highly imbalanced positions, where the current master network outputs evaluations skewed towards zero.
        8 PSQT values per feature, chosen based on (popcount(pos.pieces()) - 1) / 4
        initialized to classical material values at the start of the training
    8 subnetworks (512x2->16->32->1), chosen based on (popcount(pos.pieces()) - 1) / 4
        only one subnetwork is evaluated for any given position, so there is no or only marginal speed loss

A diagram of the network is available: https://user-images.githubusercontent.com/8037982/118656988-553a1700-b7eb-11eb-82ef-56a11cbebbf2.png

closes official-stockfish#3474

Bench: 3806488
@vondele vondele closed this in e8d64af May 18, 2021
@snicolet snicolet added the to be merged Will be merged shortly label Jun 2, 2021