Update evaluate.cpp #3080

FauziAkram · 2020-08-30T11:01:33Z

Passed STC (NNUE=False): https://tests.stockfishchess.org/tests/view/5f42edfe5089a564a10d84a0

LLR: 2.96 (-2.94,2.94) {-0.25,1.25}
Total: 13840 W: 2591 L: 2336 D: 8913
Ptnml(0-2): 194, 1453, 3387, 1676, 210

Passed LTC (NNUE=False): https://tests.stockfishchess.org/tests/view/5f4369795089a564a10d84d8

LLR: 2.95 (-2.94,2.94) {0.25,1.25}
Total: 159744 W: 19430 L: 18850 D: 121464
Ptnml(0-2): 960, 14185, 49030, 14709, 988

Bench: 3506446
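
For readers who want to relate the W/L/D totals above to an Elo figure, here is a minimal sketch using the plain logistic Elo formula. The function name `elo_from_wld` is made up for this illustration, and this is not necessarily how fishtest computes its own estimates (it also reports pentanomial statistics, which this sketch ignores).

```python
import math


def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Logistic Elo difference implied by a win/loss/draw record.

    Hypothetical helper for illustration only; it raises an error
    if the score is exactly 0 or 1.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # average points per game
    return 400.0 * math.log10(score / (1.0 - score))


# Totals copied from the STC and LTC results quoted above.
stc = elo_from_wld(2591, 2336, 8913)
ltc = elo_from_wld(19430, 18850, 121464)
print(f"STC: {stc:+.2f} Elo, LTC: {ltc:+.2f} Elo")
```

The point estimate is larger at STC than at LTC, which is a common pattern for evaluation patches, but this sketch gives no error bars.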

vondele · 2020-08-30T12:24:11Z

thanks!

BTW, for pure evaluation patches, testing with Use NNUE=false is correct, and there is no need to do non-regression tests with Use NNUE=true.

FauziAkram · 2020-08-30T12:27:58Z

@vondele I understand, but consider this hypothetical scenario:
Suppose an evaluation patch makes Stockfish gain, say, 3 Elo in the middlegame and lose 1 Elo in the endgame.
This patch will pass the NNUE=false tests, but when it is merged into main Stockfish it will do harm, since main Stockfish uses the classical evaluation only in the endgame (or when material is low), right?
That is why I like to make sure that my patch is good enough to pass with NNUE=false, and also that it doesn't lose anything with NNUE=true.
What are your thoughts on this?

vondele · 2020-08-30T12:54:39Z

I understand the scenario you sketch. However, I want to keep development of classical eval as straightforward as possible, and keep it optimized for classical evaluation.

Furthermore, the impact of classical eval on play with Use NNUE=true is small: it only enters via the hybrid scheme (now mostly for positions with an imbalanced psq score, not just endgame or midgame), so most patches to eval will be essentially inconsequential there. Yet our testing will still reject 20% of 0 Elo patches, so we would quite commonly reject a good Elo gainer just because its impact with Use NNUE=true is 0.
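
The 20% figure can be roughly reproduced with a back-of-the-envelope sketch. It assumes a non-regression SPRT with Elo bounds {-1.5, 0.5}, which is an assumption not stated in this thread, and error rates alpha = beta = 0.05, which match the LLR bounds (-2.94, 2.94) quoted above. The LLR is approximated as a Brownian motion with constant drift; the function name `sprt_pass_probability` is made up for this illustration and this is not fishtest code.

```python
import math


def sprt_pass_probability(true_elo, elo0, elo1, alpha=0.05, beta=0.05):
    """Approximate probability that an SPRT(elo0, elo1) accepts H1.

    Hypothetical helper: treats the LLR as a Brownian motion, so the
    result is an approximation and ignores overshoot at the bounds.
    """
    lower = math.log(beta / (1.0 - alpha))   # LLR bound for failing (about -2.94)
    upper = math.log((1.0 - beta) / alpha)   # LLR bound for passing (about +2.94)
    # Ratio 2 * drift / variance of the LLR walk; it is +1 when the true
    # Elo equals elo1 and -1 when it equals elo0.
    k = (2.0 * true_elo - elo0 - elo1) / (elo1 - elo0)
    if abs(k) < 1e-12:
        # Driftless case: hitting probabilities are linear in the bounds.
        return -lower / (upper - lower)
    s_lower = math.exp(-k * lower)
    s_upper = math.exp(-k * upper)
    return (1.0 - s_lower) / (s_upper - s_lower)


# Assumed non-regression bounds {-1.5, 0.5}; a patch worth exactly 0 Elo.
p = sprt_pass_probability(0.0, -1.5, 0.5)
print(f"0 Elo patch: passes {p:.1%}, rejected {1.0 - p:.1%}")
```

Under these assumptions a neutral patch fails the non-regression test a little under one time in five, which is consistent with the figure quoted above.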

FauziAkram · 2020-08-30T13:00:43Z

I understand and agree with your vision

Passed STC (NNUE=False): https://tests.stockfishchess.org/tests/view/5f42edfe5089a564a10d84a0

LLR: 2.96 (-2.94,2.94) {-0.25,1.25}
Total: 13840 W: 2591 L: 2336 D: 8913
Ptnml(0-2): 194, 1453, 3387, 1676, 210

Passed LTC (NNUE=False): https://tests.stockfishchess.org/tests/view/5f4369795089a564a10d84d8

LLR: 2.95 (-2.94,2.94) {0.25,1.25}
Total: 159744 W: 19430 L: 18850 D: 121464
Ptnml(0-2): 960, 14185, 49030, 14709, 988

closes official-stockfish/Stockfish#3080

bench: 3736029


FauziAkram added 3 commits August 30, 2020 13:58

Update evaluate.cpp

e137bd0

Update AUTHORS

a870009

Bench: 3506446

6098dab


vondele added the "to be merged" label Aug 30, 2020

vondele closed this in e0bafa1 Aug 30, 2020
