@protonspring

This is a functional simplification.

STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 113400 W: 25245 L: 25306 D: 62849
http://tests.stockfishchess.org/tests/view/5d3b866a0ebc5925cf0f3339

LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 33159 W: 5683 L: 5582 D: 21894
http://tests.stockfishchess.org/tests/view/5d3c727d0ebc5925cf0f4810

bench 3951957
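
For a quick point estimate from the totals above, the win/loss/draw counts can be converted into an Elo difference with the standard logistic formula. This is only a minimal sketch: the function name `elo_from_wld` is made up for illustration, and it yields a point estimate, not the LLR or the error bars that the SPRT works with.

```cpp
#include <cassert>
#include <cmath>

// Logistic Elo difference implied by a win/loss/draw record.
// Hypothetical helper: point estimate only, no confidence interval.
double elo_from_wld(long wins, long losses, long draws) {
    const long games = wins + losses + draws;
    assert(games > 0);

    // Average score per game, counting a draw as half a point.
    const double score = (wins + 0.5 * draws) / games;
    assert(score > 0.0 && score < 1.0);  // formula is undefined at 0 and 1

    return 400.0 * std::log10(score / (1.0 - score));
}
```

Called as `elo_from_wld(25245, 25306, 62849)` for the STC totals it gives roughly -0.2 Elo, and `elo_from_wld(5683, 5582, 21894)` for the LTC totals gives roughly +1.1 Elo.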

@Alayan-stk-2

This patch probably doesn't lose any significant strength, but it is a good example of why I take issue with [-3;1] as universal simplification bounds no matter if the simplification is massive or minimal.

Here * (r-2)/4 becomes -15; the gain in simplicity and maintainability is minimal. Losing 0.5 Elo would already cost far more than this simplicity is worth, yet a -0.5 Elo patch has over 25% odds of passing [-3;1].

Analyses that bundle a batch of simplifications and test their overall effect usually show that they are neutral and that SF doesn't lose Elo from them (while gaining in maintainability); but this doesn't mean each individual simplification is neutral. Rather, the slightly Elo-gaining ones usually compensate for the losses from the slightly Elo-losing ones.

@ddugovic

Without the nested ternary operators, this scope's complexity is revealed:

if (support | phalanx)
{
	int v = Connected[r];
	if (phalanx)
		v *= 3;
	else
		v *= 2;
	if (opposed)
		v = v / 2;
	v += 17 * popcount(support);

	score += make_score(v, v * (r - 2) / 4);
}
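
The unrolled block above should compute the same value `v` as a compact one-line form with ternaries. The sketch below checks that exhaustively over a small input range. Everything beyond the arithmetic is an assumption: the `Connected` values are placeholders (the real table is not shown in this thread), `supportCount` stands in for `popcount(support)`, the function names are hypothetical, and the one-line expression is one plausible compact form rather than a quote from the patch.

```cpp
#include <cassert>

// Placeholder per-rank bonuses, chosen only for illustration.
constexpr int Connected[8] = {0, 7, 8, 12, 29, 48, 86, 0};

// Step-by-step form, mirroring the unrolled block in the comment above.
int connected_unrolled(int r, bool phalanx, bool opposed, int supportCount) {
    assert(r >= 0 && r < 8);
    int v = Connected[r];
    if (phalanx)
        v *= 3;
    else
        v *= 2;
    if (opposed)
        v = v / 2;
    v += 17 * supportCount;
    return v;
}

// Compact form: multiply, then integer-divide, then add the support bonus.
int connected_ternary(int r, bool phalanx, bool opposed, int supportCount) {
    assert(r >= 0 && r < 8);
    return Connected[r] * (phalanx ? 3 : 2) / (opposed ? 2 : 1)
         + 17 * supportCount;
}

// Returns true if both forms agree for every rank, flag combination,
// and support count from 0 to 2.
bool forms_agree() {
    for (int r = 0; r < 8; ++r)
        for (int p = 0; p < 2; ++p)
            for (int o = 0; o < 2; ++o)
                for (int s = 0; s <= 2; ++s)
                    if (connected_unrolled(r, p != 0, o != 0, s)
                        != connected_ternary(r, p != 0, o != 0, s))
                        return false;
    return true;
}
```

Both forms multiply before dividing, so the integer truncation happens at the same point and the results match.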

@protonspring
Author

protonspring commented Jul 28, 2019 via email

@protonspring
Author

Also, I place FAR more stock in the LTC test. If the LTC does not pass rather quickly (< ~70k games), I don't usually submit simplifications.

@Alayan-stk-2

> @Alayan-stk-2 Would you feel better if -3,1 simplifications were tested twice? Presumably, this would reduce the chance of passing to 6% instead of 25%.

25% is considering two passes already (the 50% odds point is at -0.56 to be exact). So a third [-3, 1] pass would mean 12% still.
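
The arithmetic here treats each SPRT run as an independent trial with the same pass probability p, so the chance of passing n runs in a row is p^n. A tiny sketch under that independence assumption (the function name `pass_all` is hypothetical):

```cpp
#include <cassert>
#include <cmath>

// Probability that a patch passes `runs` independent tests in a row,
// given a per-test pass probability `p`.
double pass_all(double p, int runs) {
    assert(p >= 0.0 && p <= 1.0 && runs >= 0);
    return std::pow(p, runs);
}
```

At the 50% point, `pass_all(0.5, 2)` is 0.25 and `pass_all(0.5, 3)` is 0.125, which matches the 25% and roughly 12% figures quoted above.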

Testing directly at [-2, 2], or whatever bounds best suit minor simplifications, would make more sense imho, though it leaves open the question of what counts as a "minor" simplification.

> Also, a min # of games could also reduce the chance of introducing any regression.

I don't have a strong opinion on this.

> Also, I place FAR more stock in the LTC test.

Yes, I agree, the LTC result is what matters the most. From the STC & LTC, this patch looks ok. But with the error bars, there is uncertainty, and after seeing the actual code change I seized the opportunity to bring up my concerns, which are not so much about this single patch as about the process.

@snicolet
Member

Another issue is proposing several simplifications with overlapping areas against the same master: which one are we supposed to prefer?

@protonspring
Author

Here's an idea: add simplifications to a separate branch, like trivial fixes. Then merge when 4-5 pass STC/LTC in combination?

@snicolet
Member

snicolet commented Jul 28, 2019

Good idea! Another possibility is for simplifiers not to rush, and to wait a little before proposing conflicting simplifications. This would put less pressure on the maintainers :-) Here I merged the PR "simplify weak levers", which was a clear, non-problematic simplification targeting the pawn evaluation code, so the two other, non-obvious pawn "simplifications" (Remove connected array #2259, Simplify Pawn Scoring #2258) should be retested against it, imho.

@protonspring
Author

I will test all of the current ones in combination and report.

@protonspring
Author

Combo Test passed:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 25465 W: 5684 L: 5570 D: 14211
http://tests.stockfishchess.org/tests/view/5d3e17070ebc5925cf0f6a00

@protonspring
Author

If we make Score 64-bit and Values 32-bit, all of the type casting goes away. Also, we probably don't need to worry about overflow with the * operator. Is this better?

https://github.com/protonspring/Stockfish/tree/ps_scores8

@Alayan-stk-2

Making scores 64-bit would lose cache and hash efficiency, wouldn't it? I think that's the whole point of the current way tapered eval is implemented, with an int32 containing both mg and eg values.
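
To make the "int32 containing both mg and eg values" idea concrete, here is a minimal sketch of packing two signed 16-bit values into one 32-bit word. It is written in the spirit of what this comment describes and is not taken from Stockfish's source; the names `PackedScore`, `pack_score`, `mg_value`, `eg_value` and `to_signed16` are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>

using PackedScore = std::uint32_t;  // hypothetical name: eg in high half, mg in low half
static_assert(sizeof(PackedScore) == 4, "both values fit in four bytes");

// Pack two values that each fit in a signed 16-bit range.
PackedScore pack_score(int mg, int eg) {
    assert(mg >= -32768 && mg <= 32767 && eg >= -32768 && eg <= 32767);
    // Unsigned arithmetic wraps modulo 2^32, so negative inputs are well defined.
    return (static_cast<std::uint32_t>(eg) << 16) + static_cast<std::uint32_t>(mg);
}

// Reinterpret a 16-bit pattern as a signed value without relying on casts.
int to_signed16(std::uint32_t low16) {
    return low16 < 0x8000u ? static_cast<int>(low16)
                           : static_cast<int>(low16) - 0x10000;
}

int mg_value(PackedScore s) { return to_signed16(s & 0xFFFFu); }

// A negative mg borrows one from the high half; adding 0x8000 before the
// shift undoes that borrow.
int eg_value(PackedScore s) { return to_signed16(((s + 0x8000u) >> 16) & 0xFFFFu); }
```

With this layout a single 32-bit addition adds both components at once, for example `pack_score(3, -4) + pack_score(-5, 6)` unpacks to mg -2 and eg 2. Widening the type would double the storage of every table of such scores, which is the cache concern raised here.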

If going 64-bit, it could be an interesting experiment to try out 3-parts tapered eval with it... The linear scaling between mg and eg values is obviously a practical limitation in Stockfish's eval, there might be some hidden elo potential with a middle point.
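
The "linear scaling between mg and eg values" and the suggested middle point can be sketched as two interpolation functions. This is only an illustration of the idea: the 0 to 128 phase scale, the placement of the middle point at half phase, and all names (`PhaseMax`, `tapered`, `tapered3`) are assumptions, not details given in this thread.

```cpp
#include <cassert>

constexpr int PhaseMax = 128;  // assumed scale: 128 = full middlegame, 0 = pure endgame

// Two-point tapered eval: straight line from eg (phase 0) to mg (phase 128).
int tapered(int mg, int eg, int phase) {
    assert(phase >= 0 && phase <= PhaseMax);
    return (mg * phase + eg * (PhaseMax - phase)) / PhaseMax;
}

// Three-point variant: piecewise linear through eg, mid, and mg,
// with the middle value reached at half phase (hypothetical choice).
int tapered3(int mg, int mid, int eg, int phase) {
    assert(phase >= 0 && phase <= PhaseMax);
    constexpr int Half = PhaseMax / 2;
    if (phase >= Half)
        return (mg * (phase - Half) + mid * (PhaseMax - phase)) / Half;
    return (mid * phase + eg * (Half - phase)) / Half;
}
```

For mg 100 and eg 20, `tapered` gives 60 at phase 64, while `tapered3` with a middle value of 50 gives 50 there, so the extra parameter lets the curve bend away from the straight line.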

@xoto10
Contributor

xoto10 commented Jul 29, 2019 via email

@protonspring
Author

Let's move this discussion to the forum. I will post a new thread.

@protonspring
Author

protonspring commented Jul 29, 2019 via email

@protonspring
Author

Just curious if there was any resistance to this one? It's a pretty clean simplification with good performance.

@xoto10
Contributor

xoto10 commented Sep 11, 2019 via email

protonspring referenced this pull request in protonspring/Stockfish Sep 23, 2019
@protonspring
Author

Related code has recently changed, so I'm closing this one.
