Commit d862ba4

mstembera authored and vondele committed
AVX512, AVX2 and SSSE3 speedups
Improves throughput by summing 2 intermediate dot products using 16-bit addition before upconverting to 32-bit.

Potential saturation is detected and the code path is avoided in this case. The saturation can't happen with the current nets, but nets can be constructed that trigger this check.

STC https://tests.stockfishchess.org/tests/view/5fd40a861ac1691201888479
LLR: 2.94 (-2.94,2.94) {-0.25,1.25}
Total: 25544 W: 2451 L: 2296 D: 20797
Ptnml(0-2): 92, 1761, 8925, 1888, 106

about 5% speedup

closes #3261

No functional change
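The idea in the message can be illustrated with a portable scalar model. This is only a sketch, not the code from this commit: the function names (`pair_product16`, `dot_widen_each`, `dot_add16_then_widen`, `may_saturate`, `dot`) are hypothetical, the 16-bit addition is assumed to be saturating, and inputs are assumed to lie in [0, 127]. Each `pair_product16` call stands in for one 16-bit lane of a multiply-add instruction that combines two unsigned 8-bit inputs with two signed 8-bit weights.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Clamp a 32-bit value into the signed 16-bit range (models saturating arithmetic).
inline std::int16_t saturate16(std::int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return static_cast<std::int16_t>(v);
}

// One 16-bit lane: u8*i8 + u8*i8, saturated to 16 bits.
inline std::int16_t pair_product16(const std::uint8_t* in, const std::int8_t* w) {
    return saturate16(std::int32_t(in[0]) * w[0] + std::int32_t(in[1]) * w[1]);
}

// Baseline: widen every pair product to 32 bits before accumulating.
// n must be a multiple of 4.
std::int32_t dot_widen_each(const std::uint8_t* in, const std::int8_t* w, std::size_t n) {
    assert(n % 4 == 0);
    std::int32_t sum = 0;
    for (std::size_t i = 0; i < n; i += 2)
        sum += pair_product16(in + i, w + i);
    return sum;
}

// Faster variant: add two pair products in 16 bits, then widen once.
// This halves the number of widening steps but can saturate.
std::int32_t dot_add16_then_widen(const std::uint8_t* in, const std::int8_t* w, std::size_t n) {
    assert(n % 4 == 0);
    std::int32_t sum = 0;
    for (std::size_t i = 0; i < n; i += 4) {
        std::int16_t p0 = pair_product16(in + i, w + i);
        std::int16_t p1 = pair_product16(in + i + 2, w + i + 2);
        sum += saturate16(std::int32_t(p0) + p1);  // 16-bit saturating add
    }
    return sum;
}

// Conservative check on the weights alone: could any group of four
// products exceed the 16-bit range for inputs in [0, max_input]?
bool may_saturate(const std::int8_t* w, std::size_t n, int max_input = 127) {
    for (std::size_t i = 0; i + 3 < n; i += 4) {
        int bound = max_input * (std::abs(w[i]) + std::abs(w[i + 1]) +
                                 std::abs(w[i + 2]) + std::abs(w[i + 3]));
        if (bound > INT16_MAX) return true;
    }
    return false;
}

// Dispatch: take the fast path only when saturation is ruled out.
std::int32_t dot(const std::uint8_t* in, const std::int8_t* w, std::size_t n) {
    return may_saturate(w, n) ? dot_widen_each(in, w, n)
                              : dot_add16_then_widen(in, w, n);
}
```

With small weights both paths agree, so the faster one is safe to use. With four weights of 127 and four inputs of 127, the 16-bit sum would clamp at 32767 while the true result is 64516, which is the kind of constructed case the check is meant to catch.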
1 parent d706ae6 commit d862ba4

1 file changed: +199 -156 lines changed
