backward hangs in multiprocess after single-process #3966

@adamlerer

Description

  1. Call backward on something in the parent process.
  2. Launch N subprocesses that each call backward.
  3. The subprocesses hang in backward.

Here's a repro:

import torch
import torch.nn as nn
from torch.autograd import Variable
import torch.multiprocessing as mp


def train():
    # One trivial regression step: forward pass, loss, backward pass.
    x = torch.randn(1000, 1)
    y = torch.randn(1000, 1)
    model = nn.Linear(1, 1)
    mse = nn.MSELoss()
    model.zero_grad()
    pred = model(Variable(x))
    loss = mse(pred, Variable(y))
    loss.backward() # hangs here
    return model


def worker(rank):
    print("rank %d start" % rank)
    model = train()
    print("rank %d done" % rank)


def run_distributed(N):
    ps = []

    # Launch N workers, each of which calls backward in its own process.
    for rank in range(N):
        p = mp.Process(target=worker,
                       args=(rank,))
        p.start()
        ps.append(p)

    for p in ps:
        p.join()

train()  # run_distributed hangs unless this line is commented out
print("Done main train")
run_distributed(5)
print("Done distributed train")
