Commit a05cb91
Update default Network to [C5E342AE]
Summary of all training to get to this point:
Step 1: We trained 277DCBF8 using the s44, s51, s52, s53, s54, s55, and s56 data, for 390 epochs (1024x1024x128 steps each) at LR=0.001, then for 60 epochs at LR=0.000025, a 40x drop.
Step 2: We trained A258642E from 277DCBF8, with 100 additional epochs at LR=0.000025.
Step 3: We trained ECF0F596 by converting A258642E back into the trainer format. The conversion lost the pre-quantization precision, the split between the real and factorized weights, and the Adam optimizer state. We trained for 74 epochs on the same data, but with 100% WDL and 0% Eval, at LR=0.000025.
Step 4: We trained DB0BF8E7 from ECF0F596 by adding the s60/s61 data into the mix and repeating the above process for 71 epochs. This data was accidentally generated with normalization enabled, but since we are doing 100% WDL, that no longer matters.
Step 5: We trained C5E342AE from DB0BF8E7. Normally in Adversarial data (s51->s61), we toss out all positions where it is Stockfish's turn, since we can't use the Stockfish evals. Now that we are doing 100% WDL, we can keep those positions. This adds ~4b positions to the data, raising the adversarial share from ~33% to ~50% of the overall data. This was done for 122 epochs, at LR=0.000025.
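To illustrate why 100% WDL makes the stored evals irrelevant (Steps 3 to 5), here is a minimal sketch of a blended training target as it is commonly formulated in NNUE-style trainers. The trainer used for these networks is not shown in this commit, so the function name, the `eval_weight` parameter, and the `scale` constant are all hypothetical.

```python
import math

def blended_target(eval_cp, wdl, eval_weight, scale=400.0):
    """Blend a sigmoid-squashed search eval with the game result.

    eval_cp     -- search eval in centipawns (hypothetical units)
    wdl         -- game result from the side to move: 1.0, 0.5 or 0.0
    eval_weight -- fraction of the target taken from the eval, in [0, 1]
    scale       -- sigmoid scaling constant (an assumption, not from the commit)
    """
    eval_prob = 1.0 / (1.0 + math.exp(-eval_cp / scale))
    return eval_weight * eval_prob + (1.0 - eval_weight) * wdl

# With 0% Eval the target depends only on the game result, so an unusable
# or differently normalized eval cannot influence training.
print(blended_target(+300, 1.0, eval_weight=0.0))
print(blended_target(-9999, 1.0, eval_weight=0.0))
```

Under this formulation, both calls yield the same target, which is consistent with keeping positions whose evals cannot be used once the eval weight is zero.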
| Dataset | Before (positions) | Before (%) | After (positions) | After (%) | Games |
| S44 | 8,384,993,675 | 67.37% | 8,384,993,675 | 50.62% | 100,000,000 |
| S51 | 349,252,596 | 2.81% | 709,156,450 | 4.28% | 10,000,000 |
| S52 | 698,714,372 | 5.61% | 1,418,730,770 | 8.56% | 20,000,000 |
| S53 | 781,050,465 | 6.28% | 1,565,349,298 | 9.45% | 20,000,000 |
| S54 | 365,505,860 | 2.94% | 731,446,934 | 4.42% | 10,000,000 |
| S55 | 146,908,394 | 1.18% | 295,992,833 | 1.79% | 3,983,786 |
| S56 | 769,986,626 | 6.19% | 1,552,959,976 | 9.37% | 20,000,000 |
| S60 | 134,424,304 | 1.08% | 271,047,958 | 1.64% | 3,724,670 |
| S61 | 814,585,699 | 6.55% | 1,635,754,868 | 9.87% | 20,000,000 |
| Total | 12,445,421,991 | 100.00% | 16,565,432,762 | 100.00% | 207,708,456 |
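The shares quoted in Step 5 can be re-derived from the table. The sketch below copies the position counts from the table and assumes that every dataset other than S44 counts as adversarial, as the s51->s61 range suggests; the helper names are hypothetical.

```python
# (before, after) position counts copied from the table above.
counts = {
    "S44": (8_384_993_675, 8_384_993_675),
    "S51": (349_252_596, 709_156_450),
    "S52": (698_714_372, 1_418_730_770),
    "S53": (781_050_465, 1_565_349_298),
    "S54": (365_505_860, 731_446_934),
    "S55": (146_908_394, 295_992_833),
    "S56": (769_986_626, 1_552_959_976),
    "S60": (134_424_304, 271_047_958),
    "S61": (814_585_699, 1_635_754_868),
}

def total(column):
    """Sum one column: 0 = before, 1 = after."""
    return sum(pair[column] for pair in counts.values())

def adversarial_share(column):
    """Fraction of positions not coming from S44 (assumed non-adversarial)."""
    return (total(column) - counts["S44"][column]) / total(column)

added_positions = total(1) - total(0)

print(f"added positions: {added_positions:,}")
print(f"adversarial share before: {adversarial_share(0):.2%}")
print(f"adversarial share after:  {adversarial_share(1):.2%}")
```

The column totals match the Total row, and the difference between them is the roughly 4b positions mentioned in Step 5.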
To the Stockfish developers, who read all of my commits, even though I seldom read theirs when it comes to Torch: This is not useful to you, since you don't generate your own data anymore, and opt to use Leela data. But thanks for the Adversarial data!
Elo | 3.91 +- 3.29 (95%)
SPRT | 10.0+0.10s Threads=1 Hash=8MB
LLR | 2.95 (-2.94, 2.94) [0.00, 3.00]
Games | N: 20536 W: 5059 L: 4828 D: 10649
Penta | [103, 2399, 5070, 2556, 140]
http://chess.grantnet.us/test/36465/
Elo | 5.30 +- 4.04 (95%)
SPRT | 60.0+0.60s Threads=1 Hash=64MB
LLR | 3.00 (-2.94, 2.94) [0.00, 3.00]
Games | N: 13034 W: 3099 L: 2900 D: 7035
Penta | [10, 1439, 3428, 1622, 18]
http://chess.grantnet.us/test/36466/
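As a sanity check on the two results above, the following sketch recomputes the Elo estimate from the win/loss/draw counts and the SPRT stopping bounds. It assumes the standard logistic Elo model and alpha = beta = 0.05; neither is stated in this commit, and the testing framework may compute its figures differently. The function names are hypothetical.

```python
import math

def elo_from_wdl(wins, losses, draws):
    """Logistic Elo difference implied by a W/L/D record."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return 400.0 * math.log10(score / (1.0 - score))

def sprt_bounds(alpha=0.05, beta=0.05):
    """Lower and upper log-likelihood-ratio thresholds for an SPRT."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper

# W/L/D counts copied from the two tests above.
print(f"STC Elo: {elo_from_wdl(5059, 4828, 10649):.2f}")
print(f"LTC Elo: {elo_from_wdl(3099, 2900, 7035):.2f}")
print("LLR bounds: (%.2f, %.2f)" % sprt_bounds())
```

Under these assumptions the estimates land close to the reported 3.91 and 5.30, and the bounds agree with the (-2.94, 2.94) shown in the LLR lines.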
BENCH: 2,492,187
1 parent 15f1e12 commit a05cb91
1 file changed, +1 -1 lines changed