
Conversation

@Jiaming-Liu (Contributor) commented Mar 27, 2018

This might cause `list(params)` to return parameters in a random order. In that case, in `load_state_dict()`, the keys and values of `id_map` would not be matched correctly.
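
For context, the pairing that `load_state_dict()` relies on looks roughly like this (a paraphrased sketch on my part, not the exact PyTorch source; `saved_groups` are the param groups from the checkpoint, `groups` are the optimizer's current ones):

from itertools import chain

# Sketch of the id_map construction: saved parameter ids are zipped against
# the current parameters purely by iteration order.
def build_id_map(saved_groups, groups):
    return {old_id: p for old_id, p in
            zip(chain.from_iterable(g['params'] for g in saved_groups),
                chain.from_iterable(g['params'] for g in groups))}

If `params` came from a set, the right-hand side of that zip can iterate in a different order than when the checkpoint was written, so saved state ends up attached to the wrong tensors.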
@Jiaming-Liu (Contributor, Author) commented Mar 27, 2018

Just to make it easier to understand:

optim = torch.optim.Adam(set(p for p in model.parameters()))

should be avoided.

A reasonable use case:

biases = set(param for name, param in model.named_parameters() if 'bias' in name)
weights = [p for p in model.parameters() if p not in biases]  # `biases` is a set to make the `in` test fast
groups = [
    dict(params=weights, lr=0.1, weight_decay=5e-4),
    dict(params=biases, lr=0.2, weight_decay=0),  # a set is passed as `params`: its iteration order is not stable
]
optim = torch.optim.Adam(groups)

This might raise an error after `optim.load_state_dict()`, and it is very hard to debug.

Traceback (most recent call last):
  File "xxxxxxxxxxxxxx.py", line 29, in <module>
    optim.step()
  File "/xxxxxxxxxxxxxx/site-packages/torch/optim/adam.py", line 69, in step
    exp_avg.mul_(beta1).add_(1 - beta1, grad)
RuntimeError: inconsistent tensor size, expected xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
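
For reference, an order-preserving variant of the grouping above could look like this (my own sketch, not from this PR): keep the set only for fast membership tests and hand the optimizer plain lists, whose order follows `model.parameters()` deterministically.

import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # placeholder module for the sketch; use your own network

bias_set = set(param for name, param in model.named_parameters() if 'bias' in name)
biases = [p for p in model.parameters() if p in bias_set]       # ordered list
weights = [p for p in model.parameters() if p not in bias_set]  # ordered list

groups = [
    dict(params=weights, lr=0.1, weight_decay=5e-4),
    dict(params=biases, lr=0.2, weight_decay=0),
]
optim = torch.optim.Adam(groups)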

@ezyang (Contributor) commented Mar 27, 2018

@pytorchbot test this please

Looks reasonable to me

@Jiaming-Liu (Contributor, Author) commented Mar 27, 2018

Maybe add some suggestions to the error message?

@apaszke (Contributor) commented Mar 27, 2018

I'd suggest this: "Optimizer parameters need to be organized in ordered collections, but the ordering of tensors in sets will change between runs. Please use a list instead."

Also, can you please add a warning to the optim docs that `params` needs to yield a deterministically ordered iterator? Thanks!
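
For reference, such a guard could look roughly like this (my sketch of the kind of check being discussed; the exact placement and wording in the merged code may differ):

def check_params_type(params):
    # Reject sets up front, since their iteration order is not stable across runs.
    if isinstance(params, set):
        raise TypeError("optimizer parameters need to be organized in ordered "
                        "collections, but the ordering of tensors in sets will "
                        "change between runs. Please use a list instead.")
    return params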

@apaszke (Contributor) commented Mar 27, 2018

@pytorchbot test this please

@Jiaming-Liu (Contributor, Author):

Warning added, but I haven't had time to compile and check it yet.

@Jiaming-Liu (Contributor, Author):

@pytorchbot test this please

@codinfox:

Nice work.

One small suggestion for the PyTorch team: I think it would be better if each parameter were assigned a unique identifier (based on its hierarchy in the graph). In the current design, `optimizer.load_state_dict()` assumes that the order of the stored state_dict matches the order of the parameters currently defined in the network. This is fragile and error-prone. I would prefer this function to be implemented the way `module.load_state_dict()` is, which does not rely on ordering. Is there a reason why PyTorch does not assign identifiers to parameters?
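
To illustrate the idea, optimizer state could be re-keyed by parameter name, which is order-independent (a hypothetical helper, not part of the PyTorch API; it assumes every parameter the optimizer holds appears in `model.named_parameters()`):

def optim_state_by_name(model, optim):
    # Map each parameter object to its name, then key the optimizer's
    # per-parameter state by that name instead of by position.
    names = {id(p): name for name, p in model.named_parameters()}
    return {names[id(p)]: state for p, state in optim.state.items()}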

@Jiaming-Liu (Contributor, Author):

@codinfox I agree with that, but it's hard to find anything other than the name; even hierarchy can make things opaque and fragile.

@codinfox:

@Jiaming-Liu Yeah, name is a good identifier.

@apaszke (Contributor) commented Mar 28, 2018

@codinfox We don't do that simply because there's no good way to generate identifiers deterministically. I guess we could extend the optimizer API to accept named lists of parameters, but we also need to keep the current API.

@apaszke (Contributor) commented Mar 28, 2018

@pytorchbot test this please

@soumith merged commit 31c0e23 into pytorch:master on Mar 28, 2018
@soumith (Contributor) commented Mar 28, 2018

Thanks, @Jiaming-Liu!

@Whu-wxy commented Oct 4, 2019

I ran into this problem when loading state from a checkpoint. How can I solve it?

File "/xxxxxxxxxxxxxx/site-packages/torch/optim/adam.py", line 69, in step
    exp_avg.mul_(beta1).add_(1 - beta1, grad)
RuntimeError: The size of tensor a (256) must match the size of tensor b (1024) at non-singleton dimension 1
