Refined loss function #205

vondele · 2022-07-05T16:26:15Z

This refines the loss function to the form used for the new master net in official-stockfish/Stockfish#4100.

The new loss function uses the expected game score to learn,
making the learning more sensitive to scores between loss and draw, and between draw and win.

The effect is most visible for smaller values of the scaling parameter, but the current ones have been optimized.

It also introduces param_index for simpler exploration of parameters, i.e. simple parameter scans.

Sopel97 · 2022-07-05T16:37:43Z

The --param_index is needlessly limiting in my opinion, more so considering it's not being used right now. If we want to keep it, it would be better to expose in/out scaling as CLI parameters directly. I'm fine with the rest.

vondele · 2022-07-05T16:39:33Z

Ah, wait, I didn't commit the needed cleanup changes..

vondele · 2022-07-05T16:41:36Z

The idea of param_index is just to simplify exploration. I don't think we need to expose everything as parameters, but it is quite tedious to pass something down from the CLI to the core (ideally even to the C++ data reader).

Maybe there is a more elegant approach to doing so?
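
For illustration only, one way a single integer index could select a combination of hidden parameters for a scan is sketched below. Everything here is an assumption: the grid values, the names `IN_SCALINGS`, `OUT_SCALINGS`, and `params_from_index` are hypothetical, and this is not the code from the PR.

```python
import argparse
import itertools

# Hypothetical candidate values for a scan; NOT the values used in the PR.
IN_SCALINGS = [300.0, 350.0, 400.0]
OUT_SCALINGS = [300.0, 350.0, 400.0]


def params_from_index(param_index: int) -> tuple[float, float]:
    """Map one integer to an (in_scaling, out_scaling) pair on the grid."""
    grid = list(itertools.product(IN_SCALINGS, OUT_SCALINGS))
    # Wrap around so any non-negative index selects a valid grid point.
    return grid[param_index % len(grid)]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="parameter scan sketch")
    parser.add_argument("--param_index", type=int, default=0,
                        help="index into the (hypothetical) parameter grid")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    in_scaling, out_scaling = params_from_index(args.param_index)
    print(f"in_scaling={in_scaling} out_scaling={out_scaling}")
```

With a layout like this, each run of a scan only differs in the value passed to `--param_index`, so only one value has to be threaded from the CLI down to the core.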

vondele · 2022-07-05T16:42:08Z

BTW, the previous state of the commit showed how I have been using it.

Sopel97 · 2022-07-05T17:15:09Z

That makes sense now. Indeed, I've usually been adding a special parameter for stuff like that (usually varied per run). Looks good to me now.

snicolet · 2022-07-08T06:54:24Z

Do we have any picture(s) to help visualize the idea behind the double sigmoid for the loss function, and the effect of param_index?

vondele · 2022-07-08T07:01:45Z

param_index is just 'a hack' to be able to twiddle with parameters that we don't want to expose, so it can be ignored. The following illustrates the idea:

gnuplot> sigmoid(x)=1/(1+exp(-x))
gnuplot> win_rate(x, b)=sigmoid((x-270) / b)
gnuplot> score(x, b)=0.5 * ( 1.0 + win_rate(x, b) - win_rate(-x, b))
gnuplot> set xlabel "eval"
gnuplot> set ylabel "score"
gnuplot> set xrange[-800:800]
gnuplot> plot score(x, 50), score(x, 300)

vondele · 2022-07-08T07:10:59Z

To make this more intuitive: we know games are largely won near 135 cp (270 in the above internal units) and largely lost near -135 cp. They are also most likely drawn in an interval around 0 cp. This is reflected in the step-like behavior for smaller b. The score formula is just the result of taking the win_rate_model and computing the probabilities of loss, draw, and win. The 'b' parameter describes how smooth that transition is. For the high-quality games in fishtest it is about 30-50; for the quick training data it is maybe closer to 300.

See also the plots in official-stockfish/Stockfish#3981
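
As a rough sketch, the gnuplot definitions above can be written in Python with the loss, draw, and win probabilities made explicit. The 270 offset and the two values of b come from the gnuplot session; the function and argument names (`expected_score`, `offset`) are my own, and this is not the training code from the PR.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def win_rate(x: float, b: float, offset: float = 270.0) -> float:
    """Win probability for eval x (internal units), as in the gnuplot session."""
    return sigmoid((x - offset) / b)


def expected_score(x: float, b: float) -> float:
    """Expected game score: 1 for a win, 0.5 for a draw, 0 for a loss."""
    p_win = win_rate(x, b)
    p_loss = win_rate(-x, b)       # the opponent's win probability
    p_draw = 1.0 - p_win - p_loss
    # Algebraically equal to 0.5 * (1 + win_rate(x, b) - win_rate(-x, b)).
    return p_win + 0.5 * p_draw


if __name__ == "__main__":
    for x in (-800, -270, 0, 270, 800):
        print(x, round(expected_score(x, 50), 4), round(expected_score(x, 300), 4))
```

For small b the curve has a plateau near 0.5 around eval 0 and steep steps near ±270, which is the step-like behavior described above; for b = 300 the curve is a single smooth transition.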

vondele force-pushed the lossScan4PR branch from 309ab0c to b6865d5 Compare July 5, 2022 16:39

Sopel97 merged commit e3f70f2 into official-stockfish:master Jul 5, 2022
