Use winning_percentage_wdl in learn #90

tttak · 2020-08-24T15:01:23Z

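Good evening.
As in my earlier pull request (#36),
I have added an option to use WDL when training the evaluation function.
When you specify learn use_wdl 1 ..., training uses a win rate derived from WDL via the win rate model
instead of the sigmoid-based winning_percentage().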

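In the earlier pull request, differentiating the cross-entropy proved difficult, and you kindly reverted the change for the time being.
In this implementation I gave up on an analytic derivative and used numerical differentiation instead.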
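
For reference, the idea of replacing the analytic derivative with a numerical one can be written out as follows. This is only a sketch under assumptions: $q(s)$ stands for whichever win-rate function is in use (sigmoid-based or WDL-based) evaluated at score $s$, $t$ is the target probability, and the central difference with step $\varepsilon$ is an illustrative choice. The pull request text does not state which difference scheme or step size it uses.

$$
L(s) = -\bigl[\, t \log q(s) + (1 - t) \log\bigl(1 - q(s)\bigr) \bigr],
\qquad
\frac{\partial L}{\partial s} \approx \frac{L(s + \varepsilon) - L(s - \varepsilon)}{2\varepsilon}
$$

Only evaluations of $q$ are needed, so the same formula applies even when $q$ has no convenient closed-form derivative.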

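For debugging, I used the ordinary sigmoid-based win rate (not WDL)
and compared the result of numerical differentiation with the analytic derivatives (q - p) and (q - t).
I confirmed that they agree to within numerical error.
(To match the scale, the numerical derivative is divided by winning_probability_coefficient.)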
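
The check described above can be sketched roughly as below. This is not the code from this pull request: every name here (kCoefficient, sigmoid_win_rate, cross_entropy, numeric_gradient, analytic_gradient) is hypothetical, the coefficient value is made up for illustration, and a central difference is assumed. It relies on the fact that for q = sigmoid(a * s) the derivative of the cross-entropy with respect to s is a * (q - t), so dividing the numerical derivative by a should recover (q - t).

```cpp
#include <cassert>
#include <cmath>

// Illustrative scale factor standing in for winning_probability_coefficient.
// The value is an assumption chosen for this sketch only.
constexpr double kCoefficient = 0.0025;

// Sigmoid win rate as a function of the evaluation score.
double sigmoid_win_rate(double score) {
    return 1.0 / (1.0 + std::exp(-kCoefficient * score));
}

// Cross-entropy between a target probability and a predicted win rate q.
double cross_entropy(double target, double q) {
    assert(q > 0.0 && q < 1.0);
    return -(target * std::log(q) + (1.0 - target) * std::log(1.0 - q));
}

// Central-difference derivative of the cross-entropy with respect to the
// score, divided by the coefficient so its scale matches (q - target).
double numeric_gradient(double (*win_rate)(double), double target,
                        double score, double eps = 1e-3) {
    const double hi = cross_entropy(target, win_rate(score + eps));
    const double lo = cross_entropy(target, win_rate(score - eps));
    return (hi - lo) / (2.0 * eps) / kCoefficient;
}

// Analytic counterpart for the sigmoid case: q - target.
double analytic_gradient(double target, double score) {
    return sigmoid_win_rate(score) - target;
}
```

Calling numeric_gradient(sigmoid_win_rate, 0.6, 300.0) and analytic_gradient(0.6, 300.0) should give values that differ only by floating-point and truncation error. Because numeric_gradient takes the win-rate function as a parameter, a WDL-based function could be passed in its place.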

I have also tried to support the case where the recently introduced convert_teacher_signal_to_winning_probability is specified.
As for LOSS_FUNCTION, this change supports only LOSS_FUNCTION_IS_ELMO_METHOD.

However, when I ran a short training session locally, the option did not seem to help training much. (If anything, it was counterproductive.)

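Note: I saw on Twitter that you are currently running regression tests after merging Stockfish master, in preparation for releasing the machine-learning binaries.
If you are willing to merge this pull request, doing so after the next release is fine.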

nodchip · 2020-08-24T15:19:27Z

Thank you for the pull request. I'm planning to release a new binary, and a regression test is ongoing. I will merge this pull request after the new binary set is released.

nodchip · 2020-08-27T14:52:09Z

I decided to postpone the release because there is a critical bug in the training data generator and the trainer. I want to merge this pull request before the next release. Could you please resolve the conflict?

tttak · 2020-08-28T00:27:51Z

OK.
I resolved the conflict and committed.

nodchip · 2020-08-28T01:06:54Z

Thank you for resolving the conflict. I merged it.

Use winning_percentage_wdl in learn

4ce30d9

Merge branch 'nodchip_master' into wdl_20200824

03b4307

nodchip merged commit def6ec4 into nodchip:master Aug 28, 2020

tttak mentioned this pull request Sep 8, 2020

Learner cleanup #118

Merged
