Reduce depth after fail high #1768
|
This is an interesting patch. Can we also see what happens at SMP LTC? Some patches dealing with fail highs failed at SMP LTC while passing everything else. SMP behavior becomes interesting with this patch: threads that were skipping certain depths can now search some of the skipped depths as well. |
|
So the suggested changes by @locutus2 for completedDepth and lastBestMoveDepth are stronger and included in this pull request. @lp-- Thanks for the comment. I leave it to @locutus2 to decide what he wants to do. |
|
By the way I'd like to try something like this patch to cap the depth reduction at lower depths once this is committed... but anyways, there's much room for experiments, I guess. |
|
@mstembera It's interesting that a patch doing almost the exact opposite once passed very quickly
Test on top of PR official-stockfish#1768 bench: 3456274
Test on top of PR official-stockfish#1768 bench: 2187615
Test on top of PR official-stockfish#1768 bench: 3384611
Test on top of PR official-stockfish#1768 bench: 2187615
|
@pb00068 Maybe I'm misreading but I don't think it was doing the opposite. It was skipping resolving fail high completely while this resolves it at a reduced depth. |
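To make the distinction concrete, here is a minimal sketch of an aspiration loop that re-searches at a reduced depth after each consecutive fail high. It is not the patch itself: `aspirate`, `SearchFn`, `AspirationResult`, the window constants, and the rule "reduce by one ply per consecutive fail high" are assumptions made for illustration; only the name `adjustedDepth` comes from this discussion.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>

// Hypothetical stand-in for the engine's root search: returns a score for
// the window [alpha, beta] at the given depth.
using SearchFn = std::function<int(int alpha, int beta, int depth)>;

struct AspirationResult {
    int value;          // score that finally fell inside the window
    int searchedDepth;  // depth of the last search (may be below rootDepth)
    int failHighs;      // consecutive fail highs before the window held
};

// Sketch only: assumes the true score is strictly inside (-INF, INF),
// otherwise the loop would never terminate.
AspirationResult aspirate(const SearchFn& search, int prevValue, int rootDepth,
                          bool reduceOnFailHigh) {
    assert(rootDepth >= 1);
    const int INF = 32000;  // illustrative bound
    int delta = 18;         // illustrative initial half-window
    int alpha = std::max(prevValue - delta, -INF);
    int beta  = std::min(prevValue + delta, INF);
    int failedHighCnt = 0;

    while (true) {
        // Assumed rule: drop one ply per consecutive fail high, never below 1.
        int adjustedDepth = reduceOnFailHigh
                                ? std::max(1, rootDepth - failedHighCnt)
                                : rootDepth;
        int value = search(alpha, beta, adjustedDepth);

        if (value <= alpha) {            // fail low: widen downwards
            beta = (alpha + beta) / 2;
            alpha = std::max(value - delta, -INF);
            failedHighCnt = 0;           // the streak of fail highs is broken
        } else if (value >= beta) {      // fail high: widen upwards
            beta = std::min(value + delta, INF);
            ++failedHighCnt;
        } else {
            return {value, adjustedDepth, failedHighCnt};
        }
        delta += delta / 4 + 5;          // grow the window for the next try
    }
}
```

With a toy search whose true score is 100 and a previous score of 0, the window fails high three times, so at `rootDepth` 20 the final search in this sketch runs at depth 17 instead of 20. Skipping the fail high entirely would instead mean leaving the loop without that last search.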
|
@mstembera @pb00068 Yes, that patch was just skipping fail highs, while this one, in addition to skipping the fail high at the current depth, resolves it at a lower depth. The problem was that it was not passing the LTC SMP test, sprt[0, 5] with 7 threads. But that patch was passing STC and LTC faster than this one. So I don't see the added value of resolving fail highs at a lower depth until this patch is tested at LTC SMP. |
|
@pb00068 I see you ran variations of skipping fail highs. I think I or someone else tried something like that. Or maybe I tried it locally. Instead of increasing thinking time, the current patch uses the time to resolve the fail high at a lower depth, so maybe an increase or decrease is not needed. But it is also good to check what is going on with time here. |
bench: 3641758
|
@lp-- Thanks for your comments. I have rebased the patch and am happy to run any additional tests the maintainers request. Alternately feel free to schedule any test you think necessary. |
… out by @locutus2 Rebased bench: 3314347
|
@mstembera I agree that a SMP LTC test could be done in this case... with 4 or 8 threads, they're relatively quick these days. |
|
FYI: I started a SMP test here: http://tests.stockfishchess.org/tests/view/5bc47f860ebc592439f80682 (STC for now) |
|
I think an LTC SMP should also be done |
|
Ok given the SMP test failed I plan on closing this pull request shortly unless someone has a better idea on how to proceed. |
bench: 3314347
|
@mstembera Maybe we can bring it through as an improvement for single-thread only. I have submitted http://tests.stockfishchess.org/tests/view/5bc857920ebc592439f85765 |
|
But if we do this, aren't we making the single thread tests less relevant in general for the multi-thread performance? I think the reverse is more logical, to focus on improving multi-thread and non regress (or even regress!) on single. Single is only for lists anyway and to look good, every user and tournament uses multi. |
|
I think the main problem for multiple threads is that we have a lower completed depth for the thread that fails high with this patch. Hence it will not be chosen (via the voting) as best thread. An idea to overcome this is needed. |
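The concern can be illustrated with a toy voting scheme. This is a sketch under an assumed rule, not the engine's actual code: each thread votes for its best move with weight `(score - minScore) + completedDepth`, and the thread whose move collects the most votes wins. `ThreadResult` and `pickBestThread` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-thread summary at the end of a search.
struct ThreadResult {
    std::string bestMove;
    int score;
    int completedDepth;
};

// Assumed voting rule: weight = (score - minScore) + completedDepth.
// Returns the index of the thread whose best move has the most votes;
// ties keep the earlier thread.
std::size_t pickBestThread(const std::vector<ThreadResult>& threads) {
    assert(!threads.empty());

    int minScore = threads[0].score;
    for (const auto& t : threads)
        minScore = std::min(minScore, t.score);

    std::map<std::string, long> votes;
    for (const auto& t : threads)
        votes[t.bestMove] += (t.score - minScore) + t.completedDepth;

    std::size_t best = 0;
    for (std::size_t i = 1; i < threads.size(); ++i)
        if (votes[threads[i].bestMove] > votes[threads[best].bestMove])
            best = i;
    return best;
}
```

Under this assumed rule, a thread that resolved a fail high at a reduced depth reports a lower `completedDepth`, so a slightly better score may not be enough for its move to win the vote.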
|
that could be easily tested by keeping completedDepth equal to rootDepth |
|
I agree that multithread nowadays is more important than singlethread, but that is no reason for throwing away this idea: if it can improve play on singlecore while not hurting SMP, then why not commit it? |
|
@ElbertoOne I think best-thread voting plays almost no role if we are just facing fail highs while our thinking time runs out. Anyhow, if LTC passes, a multi-threaded [-3,1] test will show us if there really is an (unsolvable) problem. |
|
@pb00068 My concern is that by making single-thread a special case, we distance its behavior from multi-thread. Hence future generic single-thread testing could be indirectly and unpredictably less related to multi-thread. I might be wrong on this. |
|
@NKONSTANTAKIS Actually I don't make single-thread a special case, I just let the main thread do something different. The main thread is already very specialized today; for example, it's the only one that does not skip rootDepths during iterative deepening. |
|
@pb00068 I will be unavailable for the next few days. Please feel free to open a new pull request if it passes the relevant tests. Good luck! |
This helps resolve consecutive FHs during aspiration more efficiently.
STC: http://tests.stockfishchess.org/tests/view/5bc857920ebc592439f85765
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 4992 W: 1134 L: 980 D: 2878 Elo +10.72
LTC: http://tests.stockfishchess.org/tests/view/5bc868050ebc592439f857ef
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 8123 W: 1363 L: 1210 D: 5550 Elo +6.54
No-Regression test with 8 threads, tc=15+0.15: http://tests.stockfishchess.org/tests/view/5bc874ca0ebc592439f85938
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 24740 W: 3977 L: 3863 D: 16900 Elo +1.60
This was a cooperation between me and Michael Stembera:
- me recognizing that SF has problems resolving FHs efficiently at high depths, and starting some tests based on consecutive FHs.
- mstembera picking up the idea, with first success at STC & LTC (so full credit to him!)
- me suggesting how to resolve the issues pinpointed by S.G on PR official-stockfish#1768, and finally restricting the logic to the main thread so that it doesn't regress at multi-thread.
bench: 3314347
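As a cross-check, the Elo figures quoted with these totals are consistent with the standard logistic conversion from the match score. The sketch below assumes that conversion; `eloFromWLD` is a hypothetical helper, not part of the testing framework.

```cpp
#include <cassert>
#include <cmath>

// Logistic Elo estimate from win/loss/draw counts, counting a draw as half
// a point. Assumes the score is strictly between 0 and 1.
double eloFromWLD(long wins, long losses, long draws) {
    const double games = static_cast<double>(wins + losses + draws);
    assert(games > 0.0);
    const double score = (wins + 0.5 * draws) / games;
    assert(score > 0.0 && score < 1.0);
    return -400.0 * std::log10(1.0 / score - 1.0);
}
```

For example, `eloFromWLD(1134, 980, 2878)` gives a score of 2573/4992, about 0.5154, which converts to roughly +10.7 Elo, in line with the STC result above.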
STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 26142 W: 5717 L: 5460 D: 14965
http://tests.stockfishchess.org/tests/view/5b9823ab0ebc592cf275aeae
LTC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 72395 W: 11980 L: 11574 D: 48841
http://tests.stockfishchess.org/tests/view/5b98490b0ebc592cf275b2cf
Variant of an original idea by @pb00067. Credit and thanks to him.
@locutus2
I have tried several tests that fix or limit the amount of reduction but the best I got was a green STC and yellow LTC for this http://tests.stockfishchess.org/tests/view/5b9a506b0ebc592cf275d960 version capping the reductions to 4. Running instrumented bench to a depth of 24 revealed that the highest reduction that ever happened was by 9. The number of reductions by amount reduced was as follows:
R1 = 596, R2 = 329, R3=158, R4=78, R5=29, R6=13, R7=6, R8=3, R9=2
Given the data I think the capped version just got a bit less lucky than this original. We could try capping to say 3 or 5 and see what happens if you like.
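To put the counts in perspective, a small helper can tally how many of the recorded reductions a given cap would actually change. The numbers are the ones listed above; `kReductionCounts`, `totalReductions`, and `reductionsAbove` are hypothetical names used only for this sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// kReductionCounts[i] = number of reductions by (i + 1) plies, R1..R9,
// copied from the instrumented bench figures quoted in this comment.
constexpr std::array<int, 9> kReductionCounts = {{596, 329, 158, 78, 29, 13, 6, 3, 2}};

// Total number of reductions recorded.
int totalReductions() {
    int total = 0;
    for (int c : kReductionCounts)
        total += c;
    return total;
}

// Number of recorded reductions strictly larger than `cap` plies, i.e. the
// ones a cap at that value would have altered.
int reductionsAbove(int cap) {
    assert(cap >= 0);
    int affected = 0;
    for (std::size_t i = 0; i < kReductionCounts.size(); ++i)
        if (static_cast<int>(i) + 1 > cap)
            affected += kReductionCounts[i];
    return affected;
}
```

Of the 1214 reductions recorded, 53 were larger than 4 plies, so a cap of 4 changes only about 4.4% of them; a cap of 3 would change 131 and a cap of 5 would change 24.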
I have an SMP test scheduled to see if setting completedDepth and lastBestMoveDepth to adjustedDepth is stronger or weaker than to rootDepth.
http://tests.stockfishchess.org/tests/view/5b9ae8a10ebc592cf275e353
If that makes sense to you please approve the test. Otherwise please advise.
Rebased bench: 3314347