Conversation

@joergoster (Contributor) commented Jan 27, 2020

We can simplify the calculation of the hashfull info by looping over exactly 1,000 entries,
and then dividing the result by ClusterSize.
The additional effort of looping over roughly three times as many entries as current master seems to be compensated by the simpler calculation.
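For reference, a sketch of the before and after in tt.cpp. This is a reconstruction of the 2020-era code, so treat the details (in particular the field names and the genBound8 generation mask) as assumptions rather than an exact diff:

// master: scan the first 1000 / ClusterSize clusters, then rescale
int TranspositionTable::hashfull() const {
  int cnt = 0;
  for (int i = 0; i < 1000 / ClusterSize; ++i)
      for (int j = 0; j < ClusterSize; ++j)
          cnt += (table[i].entry[j].genBound8 & 0xF8) == generation8;
  return cnt * 1000 / (ClusterSize * (1000 / ClusterSize));
}

// this patch: scan the first 1000 clusters, then simply divide
int TranspositionTable::hashfull() const {
  int cnt = 0;
  for (int i = 0; i < 1000; ++i)
      for (int j = 0; j < ClusterSize; ++j)
          cnt += (table[i].entry[j].genBound8 & 0xF8) == generation8;
  return cnt / ClusterSize;
}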

Passed non-regression LTC https://tests.stockfishchess.org/tests/view/5e30079dab2d69d58394fd5d
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 30125 W: 3987 L: 3926 D: 22212
Ptnml(0-2): 177, 2504, 9558, 2642, 141

No functional change.

@joergoster (Contributor, Author) commented:

And another output, this time with 512 MB Hash, to show the growing fill rate.

info depth 20 seldepth 31 multipv 1 score cp 76 lowerbound nodes 3683949 nps 1164700 hashfull 60 tbhits 0 time 3163 pv e2e4
info depth 19 currmove e2e4 currmovenumber 1
info depth 20 seldepth 31 multipv 1 score cp 91 lowerbound nodes 3794504 nps 1164315 hashfull 61 tbhits 0 time 3259 pv e2e4
info depth 18 currmove e2e4 currmovenumber 1
info depth 18 currmove g1f3 currmovenumber 2
info depth 18 currmove d2d4 currmovenumber 3
info depth 18 currmove c2c3 currmovenumber 4
info depth 18 currmove b1c3 currmovenumber 5
info depth 18 currmove a2a3 currmovenumber 6
info depth 18 currmove c2c4 currmovenumber 7
info depth 18 currmove d2d3 currmovenumber 8
info depth 18 currmove h2h3 currmovenumber 9
info depth 18 currmove g2g3 currmovenumber 10
info depth 18 currmove b2b4 currmovenumber 11
info depth 18 currmove f2f3 currmovenumber 12
info depth 18 currmove b1a3 currmovenumber 13
info depth 18 currmove e2e3 currmovenumber 14
info depth 18 currmove f2f4 currmovenumber 15
info depth 18 currmove a2a4 currmovenumber 16
info depth 18 currmove g2g4 currmovenumber 17
info depth 18 currmove g1h3 currmovenumber 18
info depth 18 currmove b2b3 currmovenumber 19
info depth 18 currmove h2h4 currmovenumber 20
info depth 20 seldepth 31 multipv 1 score cp 87 nodes 4104112 nps 1162637 hashfull 67 tbhits 0 time 3530 pv e2e4 e7e6 g1f3 d7d5 e4d5 e6d5 d2d4 g8f6 f1d3 b8c6 e1g1 f8d6 b1c3 e8g8 c1g5 c8e6 c3b5 c6b4 b5d6 c7d6 h2h3 h7h6
...
info depth 25 seldepth 37 multipv 1 score cp 53 nodes 15966032 nps 1161334 hashfull 248 tbhits 0 time 13748 pv d2d4 d7d5 c2c4 e7e6 g1f3 g8f6 c1g5 f8b4 b1c3 e8g8 e2e3 c7c5 c4d5 e6d5 f1e2 c8e6 e1g1 b8d7 a1c1 h7h6 g5h4 b4c3 b2c3 d8a5 d1c2 f8c8 d4c5 c8c5 f3d4 a8c8 c2b2 c5c3
...
info depth 28 seldepth 38 multipv 1 score cp 46 nodes 46316328 nps 1158081 hashfull 621 tbhits 0 time 39994 pv d2d4 e7e6 c2c4 g8f6 g1f3 b7b6 e2e3 d7d5 f1e2 f8d6 b2b3 b8d7 e1g1 f6e4 c1b2 e8g8 b1c3 e4c3 b2c3 d7f6 f3e5 f6e4 c3b2 d8g5 a1c1 c8b7 e2d3 c7c5 c4d5 b7d5
...
info depth 31 seldepth 50 multipv 1 score cp 37 nodes 168466608 nps 1149307 hashfull 995 tbhits 0 time 146581 pv d2d4 e7e6 c2c4 g8f6 b1c3 f8b4 e2e3 e8g8 f1d3 c7c5 g1e2 c5d4 e3d4 d7d5 c4d5 e6d5 e1g1 b8c6 c1g5 b4d6 h2h3 c8e6 f2f4 h7h6 g5h4 d6e7 f4f5 e6d7 h4g3 f8e8 a1c1 a8c8 a2a3 d8b6

@mstembera (Contributor) commented Jan 27, 2020

table[0] is used as often as any other position. The real issue, however, is that the calculation will now take 3 times as long and access 3 times as much memory.
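
Concretely, a back-of-envelope estimate, under the assumption that tt.h at the time packed ClusterSize = 3 ten-byte TTEntry objects plus 2 bytes of padding into a 32-byte cluster (one cache line):

master:  1000 / ClusterSize = 333 clusters  ->  333 * 32 B ≈ 10.4 KiB touched
patch:   1000 clusters                      -> 1000 * 32 B ≈ 31.3 KiB touched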

@joergoster (Contributor, Author) commented:

@mstembera You're right about the usage of table[0]. I changed the patch accordingly.

However, looping over 1,000 entries is more straightforward, imho. If this really mattered in terms of speed, it would be more a consequence of Stockfish's overly extensive output behavior. I will submit a non-regression LTC test to be on the safe side.

@Rocky640 commented:

How important is this hashfull info? Master only loops over the first 333 clusters.
I'm not sure why master is using the following code:
cnt * 1000 / (ClusterSize * (1000 / ClusterSize));

At first sight, a clamp(cnt, 0, 999) would do the same thing and be clearer.
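
A quick check of what that expression actually does, with the then-current ClusterSize == 3 (a minimal standalone sketch, not Stockfish code):

constexpr int ClusterSize = 3; // Stockfish's cluster size at the time

// master's rescaling: the denominator is 3 * (1000 / 3) = 999,
// so this computes cnt * 1000 / 999
constexpr int rescale(int cnt) {
    return cnt * 1000 / (ClusterSize * (1000 / ClusterSize));
}

int main() {
    static_assert(rescale(0)   == 0);
    static_assert(rescale(500) == 500);  // identity for every cnt in 0..998
    static_assert(rescale(999) == 1000); // the one case that differs from a clamp
    return 0;
}

So the expression is the identity on 0..998 and maps a fully saturated count of 999 to 1000; clamp(cnt, 0, 999) would report 999 there instead, which is why the two are not quite interchangeable.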

@mstembera (Contributor) commented:

@Rocky640 FYI #1830

@vondele closed this in a910ba7 on Jan 28, 2020
@vondele (Member) commented Jan 28, 2020

Thanks!

I think in the future non-functional changes should be tested with LTC bounds but an STC time control.

@mstembera (Contributor) commented:

The patch only passed because the code doesn't get executed often. However, it's 100% obvious from inspecting the code that it's 3x slower. IMO the minor amount of simplification doesn't justify traversing 666 extra clusters for no additional benefit. It goes too far, putting too much priority on syntax over function. I would even rather just go back to the old

return cnt * 1000 / 999;

@vondele (Member) commented Jan 28, 2020

Yes, it is 3x slower, but the speed doesn't matter, as you point out. The reported quantity now also has an error that is roughly a factor of sqrt(3) smaller, which I considered a benefit.
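
To make that estimate explicit: treating each sampled entry as an independent Bernoulli draw with fill probability p, the standard error of the estimated fill fraction is

sigma = sqrt(p * (1 - p) / n)

Master samples n = 333 * ClusterSize = 999 entries, while the patch samples n = 3000, so the error shrinks by a factor of sqrt(999 / 3000) ≈ 1 / sqrt(3) ≈ 0.58. (The entries are not strictly independent samples, but the scaling argument holds.)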

@mstembera (Contributor) commented:

That's true; however, I have never known anyone to care whether the hash is, say, exactly 50.1% or 50.3% full. Since hashfull is reported in permille, 1000 entries is the minimum sample needed for full resolution, which is why Marco chose it.

joergoster added a commit to joergoster/Stockfish that referenced this pull request Sep 11, 2022
We can simplify the calculation of the hashfull info by looping over exactly 1,000 entries,
and then dividing the result by ClusterSize. Somewhat more memory accesses, somewhat more accurate.

Passed non-regression LTC
https://tests.stockfishchess.org/tests/view/5e30079dab2d69d58394fd5d
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 30125 W: 3987 L: 3926 D: 22212
Ptnml(0-2): 177, 2504, 9558, 2642, 141

closes official-stockfish#2523

No functional change.