Simplify Pawn Scoring #2258

protonspring · 2019-07-27T22:56:22Z

This is a functional simplification.

STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 113400 W: 25245 L: 25306 D: 62849
http://tests.stockfishchess.org/tests/view/5d3b866a0ebc5925cf0f3339

LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 33159 W: 5683 L: 5582 D: 21894
http://tests.stockfishchess.org/tests/view/5d3c727d0ebc5925cf0f4810

bench 3951957
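
For readers who want a rough Elo figure from the W/L/D totals above, here is a minimal sketch using the standard logistic Elo formula. The function names are hypothetical and not part of the test framework. It gives only a point estimate and says nothing about the SPRT error bars.

```cpp
#include <cmath>

// Score fraction: a win counts 1, a draw 0.5, a loss 0.
double score_fraction(long wins, long losses, long draws) {
    const double games = static_cast<double>(wins + losses + draws);
    return (static_cast<double>(wins) + 0.5 * static_cast<double>(draws)) / games;
}

// Logistic Elo difference implied by a score fraction strictly between 0 and 1.
double elo_from_score(double score) {
    return -400.0 * std::log10(1.0 / score - 1.0);
}

// Convenience wrapper (hypothetical name): point estimate from raw totals.
double elo_from_wld(long wins, long losses, long draws) {
    return elo_from_score(score_fraction(wins, losses, draws));
}
```

Fed the totals quoted above, this gives a slightly negative estimate for the STC run and roughly +1 Elo for the LTC run, both well inside the noise.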

Alayan-stk-2 · 2019-07-28T13:18:24Z

This patch probably doesn't lose any significant strength, but it is a good example of why I take issue with [-3;1] as universal simplification bounds no matter if the simplification is massive or minimal.

Here * (r-2)/4 becomes -15; the gain in simplicity and maintainability is minimal. Losing 0.5 Elo would already cost far more than this simplicity is worth, yet a -0.5 Elo patch has over 25% odds of passing [-3;1].

Analyses that pool a batch of simplifications and test their combined effect usually show that they are neutral and that SF doesn't lose Elo from them (while gaining in maintainability). But this doesn't mean each individual simplification is neutral; rather, the slightly Elo-gaining ones usually compensate for the losses from the slightly Elo-losing ones.

ddugovic · 2019-07-28T13:37:40Z

Without the nested ternary operators, this scope's complexity is revealed:

if (support | phalanx)
{
	int v = Connected[r];
	if (phalanx)
		v *= 3;
	else
		v *= 2;
	if (opposed)
		v = v / 2;
	v += 17 * popcount(support);

	score += make_score(v, v * (r - 2) / 4);
}
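
As a sanity check on the expansion above, the sketch below compares the step-by-step form with a compact one-liner. The one-liner is only my assumption of what the nested-ternary version looks like, since the thread does not show it. The `Connected` values are placeholders, and `supportCount` stands in for `popcount(support)`.

```cpp
// Placeholder per-rank bonuses, for illustration only (not taken from the thread).
constexpr int Connected[8] = {0, 7, 8, 12, 29, 48, 86, 0};

// Step-by-step form, mirroring the expanded code above.
int connected_expanded(int r, bool phalanx, bool opposed, int supportCount) {
    int v = Connected[r];
    if (phalanx)
        v *= 3;
    else
        v *= 2;
    if (opposed)
        v = v / 2;
    v += 17 * supportCount;
    return v;
}

// Assumed compact form using ternaries; should compute the same value.
int connected_compact(int r, bool phalanx, bool opposed, int supportCount) {
    return Connected[r] * (phalanx ? 3 : 2) / (opposed ? 2 : 1) + 17 * supportCount;
}

// Exhaustively compare both forms over ranks 0..7 and 0..2 supporting pawns.
bool forms_agree() {
    for (int r = 0; r < 8; ++r)
        for (int p = 0; p < 2; ++p)
            for (int o = 0; o < 2; ++o)
                for (int n = 0; n <= 2; ++n)
                    if (connected_expanded(r, p != 0, o != 0, n)
                        != connected_compact(r, p != 0, o != 0, n))
                        return false;
    return true;
}
```

Both forms multiply before the integer division, so they truncate identically.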

protonspring · 2019-07-28T14:22:39Z

@Alayan-stk-2 Would you feel better if -3,1 simplifications were tested twice? Presumably, this would reduce the chance of passing to 6% instead of 25%. Also, a min # of games could also reduce the chance of introducing any regression.

protonspring · 2019-07-28T14:40:31Z

Also, I place FAR more stock in the LTC test. If the LTC does not pass rather quickly (< ~70k games), I don't usually submit simplifications.

Alayan-stk-2 · 2019-07-28T15:16:15Z

> @Alayan-stk-2 Would you feel better if -3,1 simplifications were tested twice? Presumably, this would reduce the chance of passing to 6% instead of 25%.

The 25% figure already accounts for two passes (the 50% point for a single pass is at -0.56 Elo, to be exact). So a third [-3, 1] pass would still leave about 12%.
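
The arithmetic behind these figures can be sketched as follows. It assumes every run is independent and has the same pass probability, which is a simplification (STC and LTC need not behave identically). The function name is hypothetical.

```cpp
#include <cmath>

// Probability that a patch passes `runs` independent tests, assuming each run
// passes with the same probability `p_single` (a simplifying assumption).
double pass_all(double p_single, int runs) {
    return std::pow(p_single, runs);
}
```

With a 50% chance per run, two passes give 25%, three give 12.5%, and four give 6.25%. The 6% quoted above therefore corresponds to four passes in total, i.e. the STC/LTC pair run twice.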

Testing directly at [-2, 2], or whatever bounds best suit minor simplifications, would make more sense imho, though it leaves open the question of judging what counts as a "minor" simplification.

> Also, a min # of games could also reduce the chance of introducing any regression.

I don't have a strong opinion on this.

> Also, I place FAR more stock in the LTC test.

Yes, I agree, the LTC result is what matters most. From the STC and LTC results, this patch looks OK. But given the error bars there is uncertainty, and after seeing the actual code change I seized the opportunity to bring up my concerns, which are not so much about this single patch as about the process.

snicolet · 2019-07-28T17:26:44Z

Another issue is proposing several simplifications with overlapping areas against the same master: which one are we supposed to prefer?

protonspring · 2019-07-28T17:44:32Z

Here's an idea: add simplifications to a separate branch like trivial fixes. Then merge when 4-5 pass stc/ltc in combination?

snicolet · 2019-07-28T17:52:46Z

Good idea! Another possibility is that those proposing simplifications may prefer not to rush, and wait a little before proposing conflicting simplifications. This would put less pressure on the maintainers :-) Here I merged the PR "simplify weak levers", which was a clear, unproblematic simplification targeting the pawn evaluation code, so the two other, non-obvious pawn "simplifications" (Remove connected array #2259, Simplify Pawn Scoring #2258) should be retested against it, imho.

protonspring · 2019-07-28T18:30:35Z

I will test all of the current ones in combination and report.

protonspring · 2019-07-28T23:24:05Z

Combo Test passed:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 25465 W: 5684 L: 5570 D: 14211
http://tests.stockfishchess.org/tests/view/5d3e17070ebc5925cf0f6a00

protonspring · 2019-07-29T04:16:54Z

If we make Score 64-bit and Values 32-bit, all of the type casting goes away. Also, we probably wouldn't need to worry about overflow with the * operator. Is this better?

https://github.com/protonspring/Stockfish/tree/ps_scores8

Alayan-stk-2 · 2019-07-29T10:20:48Z

Making scores 64-bit would lose cache and hash efficiency, wouldn't it? I think that's the whole point of the current way tapered eval is implemented, with an int32 containing both mg and eg values.

If going 64-bit, it could be an interesting experiment to try out 3-parts tapered eval with it... The linear scaling between mg and eg values is obviously a practical limitation in Stockfish's eval, there might be some hidden elo potential with a middle point.
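
To make the int32 packing concrete, here is a minimal sketch of storing a middlegame and an endgame term as two 16-bit halves of one 32-bit word, then blending them linearly by game phase. It illustrates the general technique only and is not Stockfish's actual code. All names are hypothetical, and the phase scale of 128 is an assumption.

```cpp
#include <cstdint>

// Interpret the low 16 bits of `bits` as a signed 16-bit value.
constexpr int to_signed16(std::uint32_t bits) {
    return bits >= 0x8000u ? static_cast<int>(bits) - 0x10000 : static_cast<int>(bits);
}

// Pack mg into the low half and eg into the high half (unsigned arithmetic wraps safely).
constexpr std::uint32_t pack_score(int mg, int eg) {
    return (static_cast<std::uint32_t>(eg) << 16) + static_cast<std::uint32_t>(mg);
}

constexpr int mg_of(std::uint32_t s) { return to_signed16(s & 0xFFFFu); }

// Adding 0x8000 undoes the borrow that a negative mg takes from the upper half.
constexpr int eg_of(std::uint32_t s) { return to_signed16(((s + 0x8000u) >> 16) & 0xFFFFu); }

constexpr int kPhaseMax = 128;  // assumed phase scale: 128 = pure middlegame, 0 = pure endgame

// Linear interpolation between the mg and eg terms.
constexpr int tapered(std::uint32_t s, int phase) {
    return (mg_of(s) * phase + eg_of(s) * (kPhaseMax - phase)) / kPhaseMax;
}
```

Adding two packed words adds both halves at once, as long as each half stays within 16 bits. A score occupies 4 bytes this way, against 8 bytes for two 32-bit halves, which is the cache concern raised above.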

xoto10 · 2019-07-29T13:37:38Z

Yes. I've wondered about changing values and scores before, but a key restriction is that we store 16-bit values in the TT (hash), so I think values probably need to stay 16-bit. I wondered if we could shift values up 1 or 2 bits and still keep the 16-bit size. The range between 8k and 32k (roughly 40 to 160 pawns) doesn't seem needed to me. Maybe even ~20 pawns is high enough for mate values? That suggests we could get an extra 2 or 3 bits for more accurate evals, more randomization, search info, or anything else we can think of in the future.
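
The "extra 2 or 3 bits" estimate can be checked with a little arithmetic. The sketch below assumes a pawn is worth about 200 internal units, which is what "8k is roughly 40 pawns" implies. The names are hypothetical.

```cpp
constexpr int kInt16Max = 32767;   // largest value a signed 16-bit TT field can hold
constexpr int kPawnUnits = 200;    // assumed units per pawn, inferred from "8k ~ 40 pawns"

constexpr int pawns_to_units(int pawns) { return pawns * kPawnUnits; }

// Number of bits by which values bounded by `max_needed` could be shifted left
// while still fitting in a signed 16-bit field.
int spare_bits(int max_needed) {
    if (max_needed <= 0)
        return 0;  // guard: the loop below needs a positive bound
    int bits = 0;
    while ((static_cast<long long>(max_needed) << (bits + 1)) <= kInt16Max)
        ++bits;
    return bits;
}
```

Capping the needed range at 40 pawns (8000 units) frees 2 bits, and capping it at 20 pawns (4000 units) frees 3, which matches the estimate above.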

protonspring · 2019-07-29T14:10:04Z

Let's move this discussion to the forum. I will post a new thread.

protonspring · 2019-07-29T20:02:02Z

I've tried a few tests trying to force Values to 16-bit across the board. I get the same bench, but some machines on the framework get different benches. Any ideas on how to figure that out? http://tests.stockfishchess.org/tests/view/5d3f21e90ebc5925cf0f788d http://tests.stockfishchess.org/tests/view/5d3f4a970ebc5925cf0f7a8c


protonspring · 2019-09-10T18:43:50Z

Just curious if there was any resistance to this one? It's a pretty clean simplification with good performance.

xoto10 · 2019-09-11T09:43:22Z

I would guess Stephane, as far as possible, is not merging patches while TCEC is running. No big deal.

protonspring · 2019-09-26T22:34:16Z

Related code has recently changed, so I'm closing this one.

protonspring added 3 commits July 26, 2019 16:58

-5 (38f78b6)
-15 (cf65823)
Merge branch 'master' into ps_connected28 (13ca21f)
bench 3951957

protonspring referenced this pull request in protonspring/Stockfish Sep 23, 2019

tune more flexible connected pawns.

f0ae60a

protonspring closed this Sep 26, 2019
