[vulkan] inplace add_, relu_ #41380
Conversation
💊 CI failures summary and remediations: as of commit 5ee5ca8 (more details on the Dr. CI page):

🚧 6 fixed upstream failures: these were probably caused by upstream breakages that were already fixed. Please rebase on the
Differential Revision: [D22754939](https://our.internmc.facebook.com/intern/diff/D22754939)
```cpp
Tensor& vulkan_add_(Tensor& self, const Tensor& other, Scalar alpha) {
  auto& x = vtensor_from_vulkan(self.is_vulkan() ? self : self.vulkan());
  auto& y = vtensor_from_vulkan(other.is_vulkan() ? other : other.vulkan());
```
These should be `const`. I think `vtensor_from_vulkan` should be overloaded on `const`.
I will add a `const` overload for `vtensor_from_vulkan` in this PR.
```cpp
  output.allocate_storage();
  vulkan::detail::add(output, x, y, a);
  x = std::move(output);
  return self;
```
I'm not sure I understand the logic here. If `self` is already allocated and the goal of this function is to implement `c += a * b`, do we need to allocate `output` as well? Can we not do it in place in `self`? Also, why the move? Can't we just pass `output`, `output`, `input`, and `a` (in that order) to `vulkan::detail::add`?
Here we read from `x` (`self`) and `y` (`other`) and write the result to `output`.
At the moment all shaders operate with separate output and input(s); this reuses the same shader as the out-of-place variant of this op.
`x = std::move(output)` replaces the content of `x` (`self`), i.e. its pointer to `VulkanTensor::Impl`, with that of `output`.
`vulkan::detail::add()` uses only the `image()` representation, so we do not really need an allocated `buffer()`.
For this implementation we need `output.allocate_storage()`, but with the changes in #42569 `allocate_storage` can be removed.
```cpp
      x,
      min ? min.value().to<float>() : -std::numeric_limits<float>::infinity(),
      max ? max.value().to<float>() : std::numeric_limits<float>::infinity());
  x = std::move(output);
```
Same question: if this operation is in place, why do we need to `allocate_storage`?
(As in the previous comment) the output `VulkanTensor` is used here as a temporary variable so that shaders with separate output and inputs can be reused; `allocate_storage` will be removed in #42569. :)
@IvanKobzarev merged this pull request in 5dd230d.