Land remaining parts of Torchscript Lazy Tensor backend #74111
Conversation
This pull request was exported from Phabricator. Differential Revision: D34772403
Summary: per-operator-headers is a strict build mode where compilation units aren't allowed to depend on bulk headers like ATen/Functions.h, but must instead depend only on the specific operator headers they actually use. (In other configurations, the reverse is required.) Differential Revision: D35002666 fbshipit-source-id: a1747cf994fa1dc55095dceebe67cc819dee03d5
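For context, a minimal sketch of what a translation unit looks like under that strict mode (the function and the specific operators here are illustrative, not taken from this PR):

```cpp
// Bulk header, disallowed under per-operator-headers:
// #include <ATen/Functions.h>   // declares every ATen operator

// Instead, include only the generated headers for the operators this file calls:
#include <ATen/core/Tensor.h>
#include <ATen/ops/empty.h>
#include <ATen/ops/add.h>

at::Tensor make_and_add(const at::Tensor& x) {
  at::Tensor y = at::empty(x.sizes(), x.options());  // declared in ATen/ops/empty.h
  return at::add(x, y);                              // declared in ATen/ops/add.h
}
```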
Summary: Also enables bazel build to run lazy codegen. Bazel (oss) build feeds off the same filelists as cmake/buck (build_variables.bzl), so enabling it is easier than keeping it disabled. Pull Request resolved: pytorch#74111 Test Plan: Run CI and verify test_lazy_ops is running via OSS cmake builds Differential Revision: D34772403 fbshipit-source-id: 55a3db507ef9f5bfbb13e8e4692984b2ad6fa065
// no longer test ::Lazy key here
// since it is now registered to TS backend in-tree and thus behaves differently,
// does not throw the expected 'could not run..' messages
I was trying to think of why your changes would affect these tests, since they all use a custom operator (_test::dummy). And... it's probably because we now register a CPU fallback kernel to the Lazy key in core? It would probably be better if we just arranged for this test to get built without any of the dispatcher registrations running :p.
Either way, removing seems totally fine.
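For reference, registering a boxed CPU fallback for a dispatch key follows roughly the pattern below (a hedged sketch modeled on how out-of-tree backends do it; the lazy_cpu_fallback wrapper name is made up and this is not the PR's exact registration):

```cpp
#include <ATen/native/CPUFallback.h>
#include <torch/library.h>

// Boxed wrapper: any op without an explicit Lazy kernel is redirected to the
// generic CPU fallback instead of failing with "could not run ...".
static void lazy_cpu_fallback(const c10::OperatorHandle& op, torch::jit::Stack* stack) {
  at::native::cpu_fallback(op, stack);
}

TORCH_LIBRARY_IMPL(_, Lazy, m) {
  m.fallback(torch::CppFunction::makeFromBoxedFunction<&lazy_cpu_fallback>());
}
```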
Yes, the fallback affects these.
Actually, there is another PR out (#73973) which I wanted you to take a look at, and I plan to land it after this. Once that's in, I think we could more easily run tests like this. However, it also doesn't seem like that important a test to me given the other coverage we'll be adding.
// TODO(alanwaketan): Update the following unit tests once the TorchScript backend is merged.
#ifdef FBCODE_CAFFE2
// Lazy Tensor is disabled in FBCODE until addressing non-virtual methods (e.g. sizes) in TensorImpl
Yea, I spoke to him briefly; his fix only fixed one method, and I think we need to fix all the other ones too
Does comparing with nullptr mean LTC_TS_CUDA=0 would also set the device to kCUDA? I see we are using GetEnvBool in our branch.
Yea, it is not ideal. I copied the way that the LTC_.._dynamic shapes mode is set (see ir.cpp).
I think we should change this, but I propose to unify them all in a later PR and have one system that lets us override any of the 'FLAGS_' variables via env vars with consistent naming.
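To make the concern concrete, here is a small sketch (not the PR's code) contrasting a presence-only check with a value-aware helper along the lines of the GetEnvBool mentioned above:

```cpp
#include <cstdlib>
#include <cstring>

// Presence-only check: true for any set value, so LTC_TS_CUDA=0 would still
// count as "set" and select the CUDA device.
bool EnvIsSet(const char* name) {
  return std::getenv(name) != nullptr;
}

// Value-aware check: unset, empty, or "0" all mean false.
bool EnvIsTrue(const char* name, bool default_value = false) {
  const char* raw = std::getenv(name);
  if (raw == nullptr) {
    return default_value;
  }
  return raw[0] != '\0' && std::strcmp(raw, "0") != 0;
}
```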
| "#include <ATen/ops/empty.h>", | ||
| "#include <ATen/ops/empty_strided.h>"] | ||
| "#include <ATen/ops/empty_strided.h>", | ||
| "#include <ATen/ops/_copy_from_and_resize.h>", |
I don't fully understand how per_operator_headers works, so just curious why we need to add the extra headers here?
tldr: we used to get away with #include <ATen/Functions.h>, which would include every tensor operator (easy for development, but bloats all of our C++ files). The per_operator_headers stuff basically generates a new header file for every aten operator, and enforces that you only include the headers for the operators that you actually need.
The codegen uses the metadata in native_functions.yaml to figure out which #includes to stamp out, but there are a few ops that are hardcoded directly into the RegisterDispatchKey.cpp template file (we need to hardcode those headers here, because the codegen wouldn't know to include them just based on the yaml).
My guess for why we needed that extra _copy_from_and_resize op is because the LTC codegen uses it directly (to handle inplace/out= ops)
dispatch_key: DispatchKey,
selector: 'SelectiveBuilder') -> None:
selector: 'SelectiveBuilder',
build_in_tree: bool = False,
Why do we need the flag in gen_backend_stubs.py (I thought it wasn't used by LTC?)
Also, I think a light comment here on why we have build_in_tree would be nice. Basically, gen_backend_stubs.py was originally meant as a codegen pipeline for external backends that don't have the ability to take advantage of native_functions.yaml and our core codegen pipeline. And now that LTC is a part of core, it may or may not get built in-tree (depending on whether the backend that hooks into LTC is in-tree or not).
ohh it's just because gen_lazy_tensor.py calls into the function in gen_backend_stubs.py (so if you're adding a new flag for the LTC codegen, the backend_stubs function needs to know about it). I guess that's fine, since the gen_backend_stubs.py entrypoint isn't affected - just the helper functions it uses. (Maybe just include in the comment here that it's only actually needed/used by LTC codegen)
I added a comment locally, but if my CI is clean this time I'm gonna land the comment in a separate diff haha
bdhirsh left a comment
I looked at the codegen, and skimmed the build system changes (but didn't look in detail at the LTC C++ code changes). Looks good!
desertfire left a comment
LTC C++ part looks good to me.
Summary: Also enables bazel build to run lazy codegen. Bazel (oss) build feeds off the same filelists as cmake/buck (build_variables.bzl), so enabling it is easier than keeping it disabled. Pull Request resolved: #74111 Test Plan: Run CI and verify test_lazy_ops is running via OSS cmake builds Reviewed By: bdhirsh Differential Revision: D34772403 fbshipit-source-id: 8a63f58b9536e6ac1be530667932176ef2549496 (cherry picked from commit e807ffb)