Commit 23c385e

MaximMolchanov authored and vondele committed
Affine transform refactoring.
Reordered the weights so that the accumulated sums map directly onto the outputs. Weights are grouped in blocks of four elements because four int8 values (the weight type) correspond to one int32 (the output type). No horizontal additions are needed. The AVX512, AVX2 and SSSE3 implementations are grouped together, and repeated code was removed.

An earlier version passed STC:
LLR: 2.97 (-2.94,2.94) {-0.25,1.25}
Total: 15336 W: 1495 L: 1355 D: 12486
Ptnml(0-2): 44, 1054, 5350, 1158, 62
https://tests.stockfishchess.org/tests/view/5ff60e106019e097de3eefd5

The speedup depends on the architecture, with up to 4% measured on an NNUE-only bench.

closes #3287

No functional change
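The blocked weight layout described in the message can be illustrated with a small scalar sketch. This is not the code from the commit: the function names (`blocked_index`, `reorder_weights`, `affine_blocked`, `affine_row_major`) are hypothetical, the inputs are assumed to be unsigned 8-bit values, and the input count is assumed to be a multiple of four. The point it tries to show is that once the four int8 weights belonging to one output sit next to each other, each group of four products accumulates straight into that output's int32 slot, which is what lets a SIMD version avoid horizontal additions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: position of weight (out, in) in the blocked layout.
// The four weights for inputs 4g..4g+3 of one output are stored together,
// and the blocks of all outputs for the same input group g are adjacent.
std::size_t blocked_index(std::size_t out, std::size_t in, std::size_t num_outputs) {
    return (in / 4) * num_outputs * 4 + out * 4 + in % 4;
}

// Reorder row-major weights (row_major[out * num_inputs + in]) into blocks of four.
std::vector<std::int8_t> reorder_weights(const std::vector<std::int8_t>& row_major,
                                         std::size_t num_inputs,
                                         std::size_t num_outputs) {
    assert(num_inputs % 4 == 0);  // assumption: inputs are padded to a multiple of 4
    assert(row_major.size() == num_inputs * num_outputs);
    std::vector<std::int8_t> blocked(row_major.size());
    for (std::size_t out = 0; out < num_outputs; ++out)
        for (std::size_t in = 0; in < num_inputs; ++in)
            blocked[blocked_index(out, in, num_outputs)] = row_major[out * num_inputs + in];
    return blocked;
}

// Affine transform over the blocked layout. For each group of four inputs,
// every output receives the sum of four products in its own int32 slot,
// so no reduction across slots is needed at the end.
std::vector<std::int32_t> affine_blocked(const std::vector<std::int8_t>& blocked,
                                         const std::vector<std::int32_t>& biases,
                                         const std::vector<std::uint8_t>& input) {
    const std::size_t num_outputs = biases.size();
    const std::size_t num_inputs = input.size();
    assert(num_inputs % 4 == 0);
    assert(blocked.size() == num_inputs * num_outputs);
    std::vector<std::int32_t> acc(biases);  // accumulators start at the biases
    for (std::size_t g = 0; g < num_inputs / 4; ++g) {
        const std::uint8_t* in4 = &input[g * 4];
        const std::int8_t* w = &blocked[g * num_outputs * 4];
        for (std::size_t out = 0; out < num_outputs; ++out)
            for (std::size_t k = 0; k < 4; ++k)
                acc[out] += std::int32_t(in4[k]) * std::int32_t(w[out * 4 + k]);
    }
    return acc;
}

// Reference: the same transform over the original row-major weights.
std::vector<std::int32_t> affine_row_major(const std::vector<std::int8_t>& row_major,
                                           const std::vector<std::int32_t>& biases,
                                           const std::vector<std::uint8_t>& input) {
    const std::size_t num_inputs = input.size();
    std::vector<std::int32_t> acc(biases);
    for (std::size_t out = 0; out < biases.size(); ++out)
        for (std::size_t in = 0; in < num_inputs; ++in)
            acc[out] += std::int32_t(input[in]) * std::int32_t(row_major[out * num_inputs + in]);
    return acc;
}
```

In a vectorized version of this idea, the inner loops over `out` and `k` would presumably be replaced by multiply-add instructions that produce one int32 lane per output, with the four input bytes broadcast across lanes. The scalar form above only demonstrates that the reordering leaves the result unchanged.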
1 parent d21e421 commit 23c385e

1 file changed: +137 -496 lines changed