Move backward and set_data off of Type #21963
Conversation
Move backward and set_data off of Type gh-metadata: pytorch pytorch 21963 gh/li-roy/31/head
//example
//Tensor * add(Tensor & b);
void backward(const Tensor & gradient={}, bool keep_graph=false, bool create_graph=false) const;
I think the original signature was intentionally optional<Tensor>, to make it clear that passing nullopt is valid (it's not obviously valid for other cases.) Does the codegen choke on this?
we've been planning to make Tensor? translate to optional<Tensor> for a while now -- maybe now is a good time to do it?
Yeah, this doesn't work in the codegen yet.
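For reference, a minimal sketch of the two signature shapes under discussion (assuming c10::optional as shipped in the c10 headers; these are illustrative declarations, not the generated code, and they are alternatives rather than overloads meant to coexist):

#include <ATen/ATen.h>          // at::Tensor
#include <c10/util/Optional.h>  // c10::optional, c10::nullopt

// Form used in this PR: a default-constructed (undefined) Tensor stands in for "no gradient".
void backward(const at::Tensor& gradient = {}, bool keep_graph = false,
              bool create_graph = false);

// Alternative raised above: an explicit optional makes passing "no gradient" unambiguous,
// but the codegen does not yet translate `Tensor?` into optional<Tensor>.
void backward(c10::optional<at::Tensor> gradient = c10::nullopt,
              bool keep_graph = false, bool create_graph = false);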
ezyang left a comment:
The signature change to backward seems harmless enough since it was defaulted, but I would like to understand a little more. Everything else looks fine.
torch/csrc/autograd/variable.cpp (Outdated)
std::vector<Variable> inputs;
if (!gradient.has_value()) {
  gradient = at::ones_like(*this);
  Tensor gradient_;
wait, where do you assign gradient to gradient_?
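For context, one way the snippet above is presumably meant to read, with the local gradient_ holding the possibly-defaulted gradient (a sketch of the intent, not the code in the PR):

at::Tensor gradient_ = gradient;      // copy the (possibly undefined) argument
if (!gradient_.defined()) {
  gradient_ = at::ones_like(*this);   // default to a gradient of ones
}
// ... use gradient_ from here on ...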
}
THPUtils_assertRet(-1, grad.type() == var.type() || gradIsSparse,
THPUtils_assertRet(-1, grad.type() == var.type() || grad.is_sparse(),
unless I'm missing something, this doesn't look equivalent? The new code allows you to assign a sparse gradient, even if it's the wrong dtype.
const Tensor& gradient,
bool keep_graph,
bool create_graph) const {
bool create_graph) {
is there some reason the const disappeared on this backward and set_data?
They are static functions now.
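A minimal sketch of the distinction being discussed (hypothetical names TensorLike and backward_impl, not the generated code):

#include <ATen/ATen.h>
using at::Tensor;

// As a member function, the trailing const qualifies the receiver (*this):
struct TensorLike {
  void backward(const Tensor& gradient, bool keep_graph, bool create_graph) const;
};

// As a static/free function, the tensor is an explicit parameter, so there is no
// member const qualifier to carry over:
void backward_impl(Tensor& self, const Tensor& gradient, bool keep_graph, bool create_graph);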
// all static inline to allow for inlining of the non-dynamic part of dispatch
inline void Tensor::backward(const Tensor & gradient, bool keep_graph, bool create_graph) const {
  static auto table = globalATenDispatch().getOpTable("aten::backward(Tensor(a!) self, Tensor? gradient=None, bool keep_graph=False, bool create_graph=False) -> void");
  return table->getOp<void (const Tensor &, const Tensor &, bool, bool)>(tensorTypeIdToBackend(type_id()), is_variable())(*this, gradient, keep_graph, create_graph);
cool.
- func: _cast_Half(Tensor self, bool non_blocking=False) -> Tensor
  variants: function

- func: backward(Tensor(a!) self, Tensor? gradient=None, bool keep_graph=False, bool create_graph=False) -> void
how did you choose these annotations? For example, why is backward's self marked as writeable, but set_data isn't?
Oh I guess I have these reversed. I guess technically set_data should be set_data_, but not sure I want to make that change here.
I'm not even really sure what these annotations mean in this context, as they are pretty different than the usual meanings.
For example, backward is the only place where we mutate self.grad -- does that count as mutable? self itself doesn't actually change.
Maybe @suo can give some guidance here.
backward() is a special case that isn't expressible in the alias annotation language: it writes to every tensor in the autograd graph as gradients are accumulated to the leaves. So if we enabled it in the JIT, it would have to be a special case in the alias analysis pass.
The point about mutating only the grad property isn't that relevant; for the purposes of optimization, you can't reorder a call to backward() across a read of any tensor in the autograd graph anyway, so it's fine to consider self.grad part of self's memory.
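A small illustration of that reordering constraint, using the C++ frontend (assuming libtorch is available; names and values are illustrative):

#include <torch/torch.h>
#include <iostream>

int main() {
  auto x = torch::ones({2, 2}, torch::requires_grad());
  auto y = (x * 3).sum();

  // Reading x.grad() here (before backward) would observe an undefined gradient;
  // after backward() it observes the accumulated gradient. Because backward()
  // writes into every tensor in the autograd graph, an optimizer cannot reorder
  // the call across reads of those tensors.
  y.backward();
  std::cout << x.grad() << std::endl;  // 2x2 tensor of 3s, filled in by backward()
}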
gchanan left a comment:
see comments.
Move backward and set_data off of Type gh-metadata: pytorch pytorch 21963 gh/li-roy/31/head
Summary: Pull Request resolved: pytorch/pytorch#21963
ghimport-source-id: 4d9d66ba2c8587503d892b67f535cc2a62e2d19e
Test Plan: Imported from OSS
Differential Revision: D15897423
Pulled By: li-roy
fbshipit-source-id: 2dd55ceb80971df7c86545b7bfff733387f13572
gradIsSparse = gradType == sparseType;
}

bool gradIsSparse = var.dtype() == grad.dtype() && toSparse(tensorTypeIdToBackend(var.type_id())) == tensorTypeIdToBackend(grad.type_id());
@li-roy this change is breaking backend extensions: when the var backend type is a backend extension type, toSparse() just throws a runtime error. https://github.com/pytorch/pytorch/blob/master/c10/core/Backend.h#L31
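A hypothetical, self-contained illustration of the failure mode being described (this is not the real toSparse(), just the shape of the problem):

#include <stdexcept>

enum class Backend { CPU, CUDA, SparseCPU, SparseCUDA, MyExtensionBackend };

// Only the in-tree backends have a sparse counterpart here, so an extension
// backend falls through to the error branch; the new gradIsSparse check above
// then throws at runtime instead of evaluating to false.
Backend toSparseIllustration(Backend b) {
  switch (b) {
    case Backend::CPU:  return Backend::SparseCPU;
    case Backend::CUDA: return Backend::SparseCUDA;
    default: throw std::runtime_error("Unknown backend");
  }
}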
Stack from ghstack:
These functions only work on variables; calling them on a Tensor errors out. I moved the Tensor error stubs to native functions, and changed the variable implementations to be manual overrides in VariableType.
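Roughly, the shape of that change looks like the following sketch (hypothetical free-function names; the actual code is generated and lives in the native functions and VariableType):

#include <ATen/ATen.h>
#include <c10/util/Exception.h>

// Native-function side: a stub that errors out when called on a plain Tensor.
void backward_stub(at::Tensor& /*self*/, const at::Tensor& /*gradient*/,
                   bool /*keep_graph*/, bool /*create_graph*/) {
  AT_ERROR("backward is only supported on Variables, not plain Tensors");
}

// VariableType side: a manual override with the real behavior. Sketched here as a
// free function; in the PR it is a manually written method in VariableType.
void backward_variable(at::Tensor& self, const at::Tensor& gradient,
                       bool keep_graph, bool create_graph) {
  // ... dispatch into the autograd engine for the Variable `self` ...
  (void)self; (void)gradient; (void)keep_graph; (void)create_graph;
}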
Differential Revision: D15897423