Tweak histories update with several conditions. #6040
Scale history bonus and malus depending on 9 conditions. The factors were tuned first at STC and then further at LTC (the values from halfway through the tuning were used):
- STC SPSA: https://tests.stockfishchess.org/tests/view/680e2e4b3629b02d74b15d6d
- LTC SPSA: https://tests.stockfishchess.org/tests/view/680e94a43629b02d74b15e70

Passed STC:
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 27776 W: 7352 L: 7054 D: 13370
Ptnml(0-2): 68, 3126, 7221, 3386, 87
https://tests.stockfishchess.org/tests/view/680ec0f83629b02d74b1605b

Passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 161304 W: 41432 L: 40864 D: 79008
Ptnml(0-2): 61, 17305, 45357, 17863, 66
https://tests.stockfishchess.org/tests/view/680ec7f93629b02d74b16084

Bench: 2067412
Could this be verified at VVLTC? There have been prior cases where history tunes significantly altered VVLTC scaling (e.g. #5907). The Elo trend from STC to LTC doesn't look great either.
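To put a number on the STC-to-LTC trend, here is a minimal sketch that converts the W/L/D totals quoted in this PR into a plain logistic Elo estimate. The helper name `elo_from_wld` is made up for illustration, and this is not necessarily the exact statistic fishtest reports; it is only the textbook conversion from average score to Elo.

```python
import math


def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Logistic Elo difference implied by a win/loss/draw record."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # average points per game
    return -400.0 * math.log10(1.0 / score - 1.0)


# W/L/D totals copied from the passed STC and LTC tests in this PR.
results = {
    "STC": (7352, 7054, 13370),
    "LTC": (41432, 40864, 79008),
}

elo = {tc: elo_from_wld(*wld) for tc, wld in results.items()}
for tc, value in elo.items():
    print(f"{tc}: {value:+.2f} Elo")
```

Under this approximation the gain comes out at roughly +3.7 Elo at STC and roughly +1.2 Elo at LTC, which is the downward trend the comment refers to.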
I am closing this PR because the VVLTC non-regression test failed: https://tests.stockfishchess.org/tests/view/6810d0533629b02d74b16756
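For a rough sense of how far below zero the failed VVLTC run landed, the sketch below estimates Elo and a 95% interval from the run's pentanomial counts (`Ptnml(0-2): 38, 29689, 97782, 29199, 25`). The function names are hypothetical, and it assumes a simple normal approximation over game pairs, so it will not match fishtest's own reporting exactly.

```python
import math


def elo_from_score(score: float) -> float:
    """Logistic Elo difference for an average per-game score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def pentanomial_elo(ptnml):
    """Return (elo, lower, upper) from game-pair counts.

    ptnml[i] is the number of game pairs scoring i/2 points (0 to 2),
    i.e. an average per-game score of 0, 0.25, 0.5, 0.75 or 1.
    """
    pairs = sum(ptnml)
    per_game = [0.0, 0.25, 0.5, 0.75, 1.0]
    mean = sum(n * x for n, x in zip(ptnml, per_game)) / pairs
    var = sum(n * (x - mean) ** 2 for n, x in zip(ptnml, per_game)) / pairs
    half_width = 1.96 * math.sqrt(var / pairs)  # 95% normal approximation
    return (
        elo_from_score(mean),
        elo_from_score(mean - half_width),
        elo_from_score(mean + half_width),
    )


# Pair counts from the failed VVLTC non-regression test of this PR.
vvltc_ptnml = [38, 29689, 97782, 29199, 25]
elo, lower, upper = pentanomial_elo(vvltc_ptnml)
print(f"VVLTC: {elo:+.2f} Elo, 95% interval [{lower:+.2f}, {upper:+.2f}]")
```

With these counts the estimate is roughly -0.6 Elo with an error bar of about half an Elo, so under this approximation the patch is unlikely to be gaining at VVLTC.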
It has been repeatedly shown that such tunes are susceptible to becoming anti-scaling. Below are some recent examples:

2e91a86 d11f49b

Passed STC:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 32448 W: 8651 L: 8342 D: 15455
Ptnml(0-2): 81, 3695, 8408, 3914, 126
https://tests.stockfishchess.org/tests/view/6899489b0049e8ccef9d64ad

Passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 73854 W: 19042 L: 18649 D: 36163
Ptnml(0-2): 37, 7908, 20659, 8271, 52
https://tests.stockfishchess.org/tests/view/689abbe7fd8719b088c8d514

Revert VVLTC with STC bound:
LLR: 2.95 (-2.94,2.94) <0.00,2.00>
Total: 53802 W: 14030 L: 13740 D: 26032
Ptnml(0-2): 5, 4924, 16754, 5212, 6
https://tests.stockfishchess.org/tests/view/68a9a9f575da51a345a5a675

Revert VVLTC with LTC bound:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 53658 W: 14022 L: 13699 D: 25937
Ptnml(0-2): 3, 4894, 16712, 5217, 3
https://tests.stockfishchess.org/tests/view/68a8d2b2b6fb3300203bca77

https://tests.stockfishchess.org/tests/view/688cf38bf17748b4d23c8057
https://tests.stockfishchess.org/tests/view/6890bc7792fcad741b804a19

Passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 74928 W: 19466 L: 19071 D: 36391
Ptnml(0-2): 37, 8048, 20901, 8439, 39

Failed Non-regression VLTC:
LLR: -2.94 (-2.94,2.94) <-1.75,0.25>
Total: 57704 W: 14643 L: 14928 D: 28133
Ptnml(0-2): 5, 5925, 17280, 5634, 8
https://tests.stockfishchess.org/tests/view/6890bc7792fcad741b804a19

(Note that an STC-tuned version passed non-regression, but was simplified shortly afterwards.)

#6040

Passed STC:
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 27776 W: 7352 L: 7054 D: 13370
Ptnml(0-2): 68, 3126, 7221, 3386, 87
https://tests.stockfishchess.org/tests/view/680ec0f83629b02d74b1605b

Passed LTC:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 161304 W: 41432 L: 40864 D: 79008
Ptnml(0-2): 61, 17305, 45357, 17863, 66
https://tests.stockfishchess.org/tests/view/680ec7f93629b02d74b16084

Failed Non-regression VVLTC:
LLR: -2.94 (-2.94,2.94) <-1.75,0.25>
Total: 313466 W: 80573 L: 81089 D: 151804
Ptnml(0-2): 38, 29689, 97782, 29199, 25
https://tests.stockfishchess.org/tests/view/6810d0533629b02d74b16756

#5907 #5887

Passed LTC with STC bounds:
https://tests.stockfishchess.org/tests/view/67b115dd6c6b9e172ad1592f
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 75756 W: 19393 L: 19044 D: 37319
Ptnml(0-2): 60, 8251, 20913, 8588, 66

Passed LTC with LTC bounds:
https://tests.stockfishchess.org/tests/view/67af5f5d6c6b9e172ad15765
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 108126 W: 27880 L: 27412 D: 52834
Ptnml(0-2): 85, 11786, 29866, 12228, 98

Revert VVLTC w/ STC bounds:
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 56342 W: 14536 L: 14246 D: 27560
Ptnml(0-2): 7, 5061, 17741, 5359, 3
https://tests.stockfishchess.org/tests/view/67be4f8ad8d5c2c657c52d10

Revert VVLTC w/ LTC bounds:
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 66562 W: 17364 L: 17016 D: 32182
Ptnml(0-2): 3, 6145, 20637, 6493, 3
https://tests.stockfishchess.org/tests/view/67bcd25ff6b602bd7222ea40

closes #6284

no functional change