Give broadcast_coalesced tensors different version counters #13594

ssnl · 2018-11-05T21:45:42Z

In broadcast_coalesced, multiple variables can be "views" of one big flattened tensor, so they can end up sharing the same version counter. However, this base flat tensor is never exposed and the views don't share any memory locations with each other, so sharing the counter is not necessary. Furthermore, it can cause problems: e.g., when two buffers are broadcast together in DataParallel and one of them is modified in-place during forward while the other is needed in backward, the autograd engine will complain.
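To make the failure mode concrete, here is a minimal, self-contained sketch of the saved-version check. It is a toy model, not PyTorch code: `VersionCounter`, `ToyVariable`, `SavedToyVariable`, and both helper functions are hypothetical names invented for illustration, and the real autograd bookkeeping is more involved.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Toy stand-in for a version counter (assumption: not the real PyTorch type).
struct VersionCounter {
  std::uint32_t value = 0;
};

// Toy stand-in for a Variable; only the version-counter handle is modeled.
struct ToyVariable {
  std::shared_ptr<VersionCounter> version;
  void modify_inplace() { ++version->value; }  // an in-place op bumps the counter
  std::uint32_t current_version() const { return version->value; }
};

// Toy stand-in for a tensor saved for backward: it records the version at save time.
struct SavedToyVariable {
  const ToyVariable* var;
  std::uint32_t saved_version;
  explicit SavedToyVariable(const ToyVariable& v)
      : var(&v), saved_version(v.current_version()) {}
  // Models the check that rejects a tensor modified in-place after it was saved.
  bool is_valid() const { return var->current_version() == saved_version; }
};

// Two buffers share one counter, as views of one flat tensor would.
inline bool saved_buffer_survives_with_shared_counter() {
  auto shared = std::make_shared<VersionCounter>();
  ToyVariable a{shared};
  ToyVariable b{shared};
  SavedToyVariable saved_b(b);  // b is needed in backward
  a.modify_inplace();           // only a is modified during forward
  return saved_b.is_valid();    // the check fails even though b's data is untouched
}

// The same scenario with one counter per buffer.
inline bool saved_buffer_survives_with_separate_counters() {
  ToyVariable a{std::make_shared<VersionCounter>()};
  ToyVariable b{std::make_shared<VersionCounter>()};
  SavedToyVariable saved_b(b);
  a.modify_inplace();
  return saved_b.is_valid();    // b's counter never moved, so the check passes
}
```

In this model, the shared-counter case reports the saved buffer as invalid although nothing wrote to it, which is the kind of false positive described above.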

Fixes the bug discovered in #13350 (comment)

edit: This is a very real problem. E.g., consider using Spectral Norm + Batch Norm together.

torch/csrc/cuda/comm.cpp

+          // See NOTE [ Version Counter in comm.*_coalesced ]
+          AT_ASSERT(t.is_variable());
+          Variable var = t;
+          device_outputs.push_back(std::move(make_variable(var.data(), false)));


torch/csrc/cuda/comm.cpp

+          // See NOTE [ Version Counter in comm.*_coalesced ]
+          AT_ASSERT(t.is_variable());
+          Variable var = t;
+          device_outputs.push_back(std::move(make_variable(var.data(), false)));
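The change above re-wraps each output so that it keeps its data but no longer shares a version counter with the other outputs. The following self-contained sketch models that idea only; `ToyVariable`, `toy_views_sharing_counter`, and `rewrap_with_fresh_counter` are hypothetical names, and the assumption that `make_variable(var.data(), false)` yields a variable with a fresh counter is inferred from the PR description rather than shown in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Toy stand-in for a Variable: a handle to the data plus a handle to the counter.
struct ToyVariable {
  std::shared_ptr<std::vector<float>> data;  // underlying values
  std::shared_ptr<std::uint32_t> version;    // version counter
};

// Models un-flattening: every chunk gets its own data but inherits one counter.
inline std::vector<ToyVariable> toy_views_sharing_counter(std::size_t n) {
  auto counter = std::make_shared<std::uint32_t>(0);
  std::vector<ToyVariable> views;
  for (std::size_t i = 0; i < n; ++i) {
    views.push_back(ToyVariable{std::make_shared<std::vector<float>>(4, 0.0f), counter});
  }
  return views;
}

// Hypothetical analogue of the re-wrapping step: same data, brand-new counter.
inline ToyVariable rewrap_with_fresh_counter(const ToyVariable& var) {
  return ToyVariable{var.data, std::make_shared<std::uint32_t>(0)};
}
```

After re-wrapping, bumping one output's counter leaves the others at zero, while each output still points at the same data as the view it came from.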


facebook-github-bot

@ssnl is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

ssnl mentioned this pull request Nov 5, 2018

Fix more spectral norm bugs #13350

Closed

ssnl force-pushed the bc_ver branch from 4098370 to c55b54b Compare November 5, 2018 22:09

ssnl added 3 commits November 5, 2018 18:42

Give broadcast_coalesced tensors different version counters

d981e77

Add test, fix reduce_add_coalesced too

6c0c1e6

one more comment

5813a1a

ssnl force-pushed the bc_ver branch from c55b54b to 5813a1a Compare November 5, 2018 23:43

colesbury approved these changes Nov 7, 2018


remove std::move

799cf86

ssnl closed this Nov 7, 2018

ssnl reopened this Nov 7, 2018

facebook-github-bot reviewed Nov 7, 2018


facebook-github-bot closed this in 2448a83 Nov 8, 2018

ssnl deleted the bc_ver branch November 8, 2018 06:48

ezyang added open source merged labels Jun 24, 2019
