
Conversation


@peterbell10 peterbell10 commented Aug 21, 2020

Stack from ghstack:

Differential Revision: D23298652


dr-ci bot commented Aug 21, 2020

💊 CI failures summary and remediations

As of commit 84b1711 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



peterbell10 added a commit that referenced this pull request Aug 21, 2020
@peterbell10 peterbell10 requested a review from zou3519 August 22, 2020 16:01

ngimel commented Aug 25, 2020

cc @neerajprad @fritzo for changes to distribution tests. My impression was that broadcasted tensors are routinely used in distributions, so requiring non-self-overlapping outputs could lead to a significant memory penalty.

@ngimel ngimel added the triaged label (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Aug 25, 2020
```diff
-value, probs = broadcast_all(value, self.probs.clone(memory_format=torch.contiguous_format))
+value, probs = broadcast_all(value, self.probs)
+probs = probs.clone(memory_format=torch.contiguous_format)
 probs[(probs == 1) & (value == 0)] = 0
```
peterbell10 (Collaborator, Author) replied:

> My impression was that broadcasted tensors are routinely used in distributions, so requiring non-self-overlapping outputs could lead to a significant memory penalty.

This is a bug fix. If probs is broadcasted here, then this line can write to locations where value != 0 if it happens to overlap with a location where value is zero. The full tensor size will be required in the following calculation anyway, so I think this is fine.

The same reasoning applies to the change in multinomial.
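
For illustration (not code from the PR), here is a minimal sketch of the aliasing problem, using a size-1 tensor expanded to three elements:

```python
import torch

# One stored element viewed as three: every logical index aliases the
# same memory, so probs.stride() == (0,).
probs = torch.tensor([1.0]).expand(3)
value = torch.tensor([0.0, 1.0, 2.0])

# An in-place masked write through this view would update the single
# storage element, i.e. probs at every index, even where value != 0.
# Cloning after the broadcast gives each position its own memory first:
probs = probs.clone(memory_format=torch.contiguous_format)
probs[(probs == 1) & (value == 0)] = 0
print(probs)  # tensor([0., 1., 1.])
```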

peterbell10 added a commit that referenced this pull request Aug 25, 2020
peterbell10 added a commit that referenced this pull request Aug 25, 2020
peterbell10 added a commit that referenced this pull request Aug 26, 2020
peterbell10 added a commit that referenced this pull request Aug 27, 2020
peterbell10 added a commit that referenced this pull request Aug 29, 2020
peterbell10 added a commit that referenced this pull request Aug 29, 2020

codecov bot commented Aug 29, 2020

Codecov Report

❗ No coverage uploaded for pull request base (gh/peterbell10/9/base@bc2fcac).
The diff coverage is n/a.


@@                   Coverage Diff                    @@
##             gh/peterbell10/9/base   #43423   +/-   ##
========================================================
  Coverage                         ?   69.31%           
========================================================
  Files                            ?      378           
  Lines                            ?    46747           
  Branches                         ?        0           
========================================================
  Hits                             ?    32405           
  Misses                           ?    14342           
  Partials                         ?        0           


@fritzo fritzo (Collaborator) left a comment

LGTM thanks for fixing this!

@ngimel I believe the only place where broadcasting would help reduce memory footprint is in MyDist(...).log_prob(), but those already perform arithmetic operations like (-probs).log1p(), so I believe the max extra overhead of this PR would be +25%.


fritzo commented Aug 31, 2020

Can someone else please review the added at::assert_no_overlap()s?


zou3519 commented Aug 31, 2020

> Can someone else please review the added at::assert_no_overlap()s?

I can review it after I finish reviewing #43422

peterbell10 added a commit that referenced this pull request Aug 31, 2020
@zou3519 zou3519 (Contributor) left a comment

Some not-uncommon use cases of masked_fill_ and index_put_ are idempotent. For example, I've seen user code out in the wild that zeroes out NaNs as follows, but I don't know how many users actually try to do this on broadcasted tensors.

```python
x[torch.isnan(x)] = 0
y.masked_fill_(torch.isnan(y), 0)
```

The other operators in this PR look fine, though.

My opinion is to not change masked_fill_ and index_put_, at least not in this PR, because it is a common paradigm to do x[x == blah] = value. If we do decide we want to fix masked_fill_ and index_put_, I think we should consider that change to be BC-breaking and have a deprecation cycle.
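
For context, a hedged sketch (illustrative tensor names, not code from this PR) of why the NaN-zeroing pattern is idempotent even with overlapping storage, and the clone() workaround the deprecation message suggests:

```python
import torch

# A single row expanded to two rows: both rows alias the same storage.
base = torch.tensor([[float('nan'), 1.0, 2.0]])
x = base.expand(2, 3)

# Every logical position holding NaN aliases a storage element holding NaN,
# and each receives the identical fill value 0, so the overlapping writes
# cannot disagree: the operation is idempotent. The distributions bug fixed
# above is different because the written value depends on a second tensor.
# To write through safely, materialize the view first:
x = x.clone()
x[torch.isnan(x)] = 0
print(x)  # tensor([[0., 1., 2.], [0., 1., 2.]])
```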


peterbell10 commented Sep 1, 2020

The original issue (#39639) was about masked_fill_ specifically, so I don't want to ignore it completely here. I've lowered it to a deprecation warning, but I could move that to another PR as well if you prefer.

Comment on lines 297 to 298
```cpp
TORCH_WARN("Use of index_put_ on expanded tensors is deprecated. "
           "Please clone() the tensor before performing this operation.");
```
zou3519 (Contributor) commented:

Another way to get here is via advanced indexing. We should mention that in the error message as well, so the user can connect the two ops: "Use of index_put_ on expanded tensors is deprecated. Please clone() the tensor before performing this operation. You may also encounter this error message when assigning items to a tensor using advanced indexing, which is also deprecated."
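
A small sketch (illustrative names, not code from the PR) of how the two spellings meet, assuming the post-PR behavior where writes through expanded tensors are deprecated (and an error in later releases):

```python
import torch

x = torch.zeros(1).expand(3)  # self-overlapping view
idx = torch.tensor([0, 2])

# Both spellings reach the same underlying operation, so both trigger
# the index_put_ deprecation on expanded tensors:
#   x[idx] = 1.0                              # advanced-indexing assignment
#   x.index_put_((idx,), torch.tensor(1.0))   # explicit call
# The suggested fix is to clone() before writing:
x = x.clone()
x[idx] = 1.0
print(x)  # tensor([1., 0., 1.])
```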

peterbell10 (Collaborator, Author) replied:

Good idea. Added a note in the warning for index_put_, and for masked_fill_ as well.


zou3519 commented Sep 1, 2020

> The original issue (#39639) was about masked_fill_ specifically, so I don't want to ignore it completely here. I've lowered it to a deprecation warning, but I could move that to another PR as well if you prefer.

Sure, let's do the deprecation warning in this PR then

@zou3519 zou3519 (Contributor) left a comment

Let's add a mention of advanced indexing to the index_put_ deprecation message. After that, this should be good to go.

peterbell10 added a commit that referenced this pull request Sep 1, 2020

@zou3519 merged this pull request in c88ac25.


Labels

Merged, module: deprecation, open source, triaged


8 participants