Optimizers can't be moved to a different GPU #1442

@gregjohnso

Description

There is currently no good way to set the GPU id for an optimizer's state. This is particularly relevant when model training stops and needs to be restarted on a different GPU.

I'm currently working around this problem with the following:

import torch

def set_gpu_recursive(var, gpu_id):
    """Recursively move every tensor in a (possibly nested) state dict to gpu_id."""
    for key in var:
        if isinstance(var[key], dict):
            var[key] = set_gpu_recursive(var[key], gpu_id)
        elif torch.is_tensor(var[key]):
            var[key] = var[key].cuda(gpu_id)
    return var

opt.load_state_dict(torch.load(opt_save_path))
opt.state = set_gpu_recursive(opt.state, gpu_id)
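For context, `torch.load` also accepts a `map_location` argument that remaps every saved storage onto a chosen device at load time, which can avoid walking the state dict by hand. A minimal sketch; the toy model, file path, and device choice below are illustrative, not from the issue:

```python
import os
import tempfile
import torch
import torch.nn as nn
import torch.optim as optim

# Hypothetical toy model/optimizer; the issue itself only shows `opt`.
model = nn.Linear(4, 2)
opt = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# One training step so the optimizer actually has state (momentum buffers).
model(torch.randn(3, 4)).sum().backward()
opt.step()

path = os.path.join(tempfile.mkdtemp(), "opt.pt")
torch.save(opt.state_dict(), path)

# On restart, remap every saved tensor onto the target device at load time.
# Falls back to CPU here so the sketch runs on machines without a second GPU.
target = "cuda:1" if torch.cuda.device_count() > 1 else "cpu"
state = torch.load(path, map_location=target)
opt.load_state_dict(state)
```

Because `map_location` is applied as the checkpoint is deserialized, the tensors never land on the original (possibly unavailable) GPU at all.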

Metadata
Labels: feature (a request for a proper, new feature); triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
