Conversation

@apaszke (Contributor) commented Nov 12, 2017

Right now optimizers can load state dicts of other optimizers only if all parameters match in type and device (in contrast to nn.Modules). This is too strict for many use cases, and is addressed in this patch.

The only problem is that optimizer state isn't typed in any way, so the code in this PR tries to make reasonable guesses: only state that's bound to particular parameters is cast, with the parameter serving as the template, and floating point tensors in that state are assumed to match the type of the parameter (I can't think of a better way to handle load_state_dict across sets of parameters with different floating point types). All other tensors are only moved to the parameter's device.
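
For illustration, a minimal sketch of the per-value casting rule described above, written against the current torch API (the helper name and structure are illustrative, not the exact code in this PR):

```python
import torch

def cast_like(param, value):
    """Illustrative helper: cast a loaded state entry to match its parameter.

    Floating point tensors follow both the dtype and the device of the
    parameter they are bound to; other tensors are only moved to the
    parameter's device; non-tensor values pass through unchanged.
    """
    if isinstance(value, torch.Tensor):
        if value.is_floating_point():
            return value.to(dtype=param.dtype, device=param.device)
        return value.to(device=param.device)
    if isinstance(value, dict):
        return {k: cast_like(param, v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(cast_like(param, v) for v in value)
    return value
```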

Fixes #2830, #1442.
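
As an example of the use case this enables (model, hyperparameters, and file name are made up), loading state saved from a run on a different device no longer requires the parameters to match in type and device:

```python
import torch

model = torch.nn.Linear(4, 2)  # CPU, float32 parameters
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# State dict previously saved from a run on a different device/dtype;
# the path is illustrative.
saved = torch.load("sgd_state.pt")

# With this patch the momentum buffers are moved/cast to match each
# parameter instead of failing on a type/device mismatch.
opt.load_state_dict(saved)
```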

@apaszke requested a review from colesbury on November 16, 2017
@colesbury (Member) left a comment

lgtm


Successfully merging this pull request may close these issues.

optimizer load_state_dict() problem?
