@ssnl commented Nov 5, 2018

In broadcast_coalesced, the broadcast outputs are created as "views" of a big flattened tensor, so they all share that flat tensor's version counter. But the base flat tensor is never exposed, and the outputs don't actually share any memory locations with one another, so this sharing is unnecessary. Worse, it causes spurious errors: e.g., when two buffers are broadcast together in DataParallel and one of them is modified in-place during forward while the other is needed in backward, the autograd engine complains that a saved variable was modified by an in-place operation.

Fixing the bug discovered at #13350 (comment)

Edit: this is a very real problem, e.g., when using Spectral Norm + Batch Norm together: both update their buffers in-place during forward, and with a shared version counter an in-place update to one buffer invalidates autograd's saved reference to the other.
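A minimal, hypothetical Python sketch of the failure mode (not the PR's test; names like `u` and `running_mean` are illustrative stand-ins, and the shared counter is emulated with views of a `torch.cat` result, since the real flat tensor is internal):

```python
import torch

weight = torch.randn(3, requires_grad=True)
u = torch.randn(3)             # stand-in for spectral norm's power-iteration buffer
running_mean = torch.randn(3)  # stand-in for a batch norm buffer

# Emulate the pre-fix situation: both buffers act like views of one flat
# tensor and therefore share a single version counter.
flat = torch.cat([u, running_mean])
u_view, mean_view = flat[:3], flat[3:]

out = weight * mean_view  # autograd saves mean_view to compute weight's grad
u_view.add_(1.0)          # in-place update of the *other* view bumps the
                          # version counter shared through `flat`
out.sum().backward()      # RuntimeError: a variable needed for gradient
                          # computation has been modified by an inplace operation
```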

The relevant lines from the diff:

```cpp
// See NOTE [ Version Counter in comm.*_coalesced ]
AT_ASSERT(t.is_variable());
Variable var = t;
// Rewrap the output's data in a fresh Variable so it gets its own version
// counter instead of inheriting the internal flat tensor's.
device_outputs.push_back(make_variable(var.data(), /*requires_grad=*/false));
```
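With this change, each broadcast output is a fresh Variable wrapping the same underlying data, so it carries an independent version counter; an in-place update to one output no longer invalidates autograd's saved references to the others.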


@ssnl closed this Nov 7, 2018
@ssnl reopened this Nov 7, 2018
@facebook-github-bot left a comment
@ssnl is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
