Expose raw saved tensors for custom functions #60551
Conversation
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]
💊 CI failures summary and remediations

As of commit a7ccff0 (more details on the Dr. CI page and at hud.pytorch.org/pr/60551):

🕵️ 2 new failures recognized by patterns. The following CI failures do not appear to be due to upstream breakages.
soulitzer
left a comment
Looks ok. The main issue, I think, is how to deal with the case where backward has already freed the SavedVariable. Maybe there's some way to do this in pybind, but if we use the plain old PyObject struct, we can have it hold a weak_ptr and raise an error if the C++ side has already been freed.
Thanks for the review! I can rebase to include it in this PR.

Np!

Removing myself as @soulitzer is handling this one.
Good point. My proposal is the following.

That's a good plan, but aren't there also some cases where the entire SavedVariable is destroyed (not just its data field)?
While I do agree with you, I've written tests to cover those cases (here and here). What is confusing to me is that the following code does not raise an error:

```python
import torch
from torch.autograd import Function

class MyFn(Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x

    @staticmethod
    def backward(ctx, g):
        return g

y = MyFn.apply(torch.ones(5, requires_grad=True))
t = y.grad_fn.raw_saved_tensors
# calls grad_fn->release_variables() and the destructors for each
# SavedVariable (but not `.reset()`)
y.sum().backward()
t[0].register_hooks(lambda x: x, lambda x: x)
```
It could be because vector.clear() just updates the size while the capacity is required to stay the same, so the memory is still reserved by the vector.
ghstack-source-id: 06d3801
Pull Request resolved: pytorch#60551
soulitzer
left a comment
LGTM
albanD
left a comment
Just a couple of small comments.
@Varal7 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Differential Revision: [D29466228](https://our.internmc.facebook.com/intern/diff/D29466228)
This pull request has been merged in f54290f.
Stack from ghstack:
Summary: This PR is part of a stack aimed at addressing #58512.

Provide bindings for the SavedVariable class (for custom functions only). The SavedTensor object cannot be created from Python and currently does not expose any method.

Undefined behavior can arise if the user decides to keep a reference to this SavedTensor after the underlying SavedVariable has been freed.

Test Plan:
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: D29466228