Add non-eager registration to dispatch autogen #74557
Conversation
💊 CI failures summary and remediations, as of commit 835a272 (Dr. CI): 💚 Looks good so far! There are no failures yet. 💚
This pull request was exported from Phabricator. Differential Revision: D35051464
Are all the changes here part of this PR? (Or is there any chance you can split the eager changes into their own PR for review?)
Sorry, I have been exporting these PRs from a stack in Phabricator, and I didn't realize that the way GitHub would render them is unusable. If you click on just the top commit shown here, it will show you the small set of changes that actually goes with this PR.
"external backends" -> "everything that isn't LTC"? (Technically I think this applies both our eager backends in-tree like CPU/CUDA, and also any other backends out-of-tree like Intel XPU)
Yes, I meant external LTC backends, such as one developed out of tree. I should probably update it to say more clearly that only the in-tree lazy TorchScript backend gets special behavior, and others don't.
I'm a little confused by this split between DispatchKeyNativeFunctions.h and RegisterDispatchKey.cpp.
Don't we always end up #include-ing DispatchKeyNativeFunctions.h in this file, so the Register${BackendName}${DispatchKey}Modules function will already be declared (and the macro above will already be defined)?
Maybe in practice this is true, but it's not explicitly clear just from the template definition that this is the case, which is why I added the #ifndef clause. If there happens to be a case (either now or in the future) where DispatchKeyNativeFunctions.h is not included, this avoids a compile error here and falls back to the default behaviour.
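For reference, a minimal sketch of the guard being described, assuming EAGER_REGISTRATION is the macro that the generated DispatchKeyNativeFunctions.h defines (the fallback value here is an assumption):

```cpp
// In the RegisterDispatchKey.cpp template: if the generated
// DispatchKeyNativeFunctions.h was included, it already defined
// EAGER_REGISTRATION; if not, default to eager (load-time)
// registration so the template still compiles unchanged.
#ifndef EAGER_REGISTRATION
#define EAGER_REGISTRATION 1
#endif
```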
I think the "delay registration using a lambda" feels like a niche enough use case to me that we should try to implement it with less clutter to this file. Instead of using the macros, how do you feel about unconditionally generating the Register{...}Modules lambda? And still using ${eager_registration} to decide in the codegen whether or not we call it directly in the file or not (to run the registrations at load time).
The initTSBackend() function wouldn't need to check a macro; it would just always call the lambda to run the registrations. You might run into problems if we got the codegen wrong and accidentally registered the TS backend on startup, but you could also just add a unit test for this (assert in one of the C++ tests that there aren't any kernels registered to the Lazy key on startup).
I think that might not be a bad idea, but perhaps it's better left to a follow-up PR, seeing as this is just a port of a previously merged PR?
Perhaps what could be done is: if the Register{...}Modules function is called at C++ static initialization, we could also reset the value to nullptr so it can't be called again. Then we can just run the function only if it hasn't been called before. This would eliminate the need for the EAGER_REGISTRATION macro. Thoughts?
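A minimal sketch of that reset-to-nullptr idea, assuming the generated hook is held in a std::function; all names here are hypothetical stand-ins for the codegen output:

```cpp
#include <functional>

// Always generated: a callable holding the dispatcher registrations.
std::function<void()> RegisterTorchScriptLazyModules = []() {
  // ... dispatcher registrations for the Lazy key ...
};

// Run the registrations at most once; resetting the hook to nullptr
// afterwards makes any later call detectable (and skippable).
void initTorchScriptBackend() {
  if (RegisterTorchScriptLazyModules) {
    RegisterTorchScriptLazyModules();
    RegisterTorchScriptLazyModules = nullptr;
  }
}
```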
Yep, that sounds reasonable.
I kind of don't want to clutter up this file with the extra macros, mostly because this is a pretty core template used in a bunch of different contexts (it's used for most of our eager-mode dispatcher registrations too), so anyone looking at this file for non-LTC purposes would now have to reason about the macros.
@wconstab would you rather make changes to this PR directly, or address feedback later?
I think we can make the changes here. I already landed this as-is on the staging branch, but we are almost at the point of abandoning the staging branch so I think we can start to do things differently here instead of striving to stay in sync there.
It also depends on how we make the changes; if you have some clear ideas of what I should change, I can just make the changes. If you actually want to try to modify it yourself, I could either land it first and you change it, or maybe you could commandeer this diff.
I think what @antoniojkim put sounds reasonable:
(1) kill the macro and unconditionally generate the Register${BackendName}${DispatchKey}Modules();
(2) In the codegen, if the eager_registration argument is true, emit the code to run the function immediately (which this PR already does), and then also re-assign it to a lambda that no-ops, so you won't accidentally call it and register things multiple times. (Although I think the risk of this being a problem in practice is pretty small, since nobody should ever call this function except for the TorchScript one, so if this is hard to do then I think it's OK if you don't add it.)
Technically this should always be set to 1 for TS, right? Maybe we should just assert it.
Yea, I think that's true.
This might just be from a different PR, but I think we technically don't need BackendName here? (the function name will be unique without it). Plumbing a new BackendName argument through the codegen for the name seems like overkill to me, but maybe there are other places in the codegen where BackendName actually is required.
@bdhirsh This is something I added to differentiate between backends. As far as I am aware, there is no guarantee as to the order in which the symbols are statically initialized. This means that if we had just a RegisterLazyModules function being defined by both the TS backend and a custom backend, there would be no way of knowing who assigned the value of that function pointer last. To avoid ambiguity, I plumbed the backend name through to make things clear and unambiguous.
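A minimal sketch of the ambiguity being described; the file names and hook are hypothetical, and only the static-initialization-order problem is the point:

```cpp
#include <functional>

// Shared hook, declared in some common header:
std::function<void()> RegisterLazyModules;

// In ts_backend.cpp, a static initializer assigns it...
static bool ts_init =
    (RegisterLazyModules = []() { /* TS registrations */ }, true);

// ...and in custom_backend.cpp, another one does too:
static bool custom_init =
    (RegisterLazyModules = []() { /* custom registrations */ }, true);

// Across translation units, which assignment runs last is unspecified,
// so the hook's final value is ambiguous. Baking the backend name into
// the symbol (RegisterTorchScriptLazyModules vs. RegisterCustomLazyModules)
// gives each backend its own hook and removes the ambiguity.
```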
Yeah, good point; I can't think of an easy way to get around this. (cc @ezyang if you happen to have a different opinion. Context: this PR updates the codegen'd dispatcher registrations so that they can be delayed and registered through a lambda instead of at load time.)
tools/codegen/gen_lazy_tensor.py
@wconstab I think there's a problem here. According to the function definition of gen_dispatchkey_nativefunc_headers, the positional argument after autograd_dispatch_key is eager_registration, but instead you passed in backend_name. Since you also set the value of eager_registration using a keyword, the following error occurs:
TypeError: gen_dispatchkey_nativefunc_headers() got multiple values for argument 'eager_registration'
Diff context under review:
```
${call_register_dispatchkey_modules};
```
This is pretty invasive, I'm not sure I like it.
@wconstab this seems like a pretty circuitous way to do something that fundamentally should be relatively simple. So I'd expect TORCH_LIBRARY to move into the codegen code, and then you switch between defining a function or using TORCH_LIBRARY, and then all of the extra faffing about with std::function goes away.
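A minimal sketch of that suggestion; the operator, wrapper, and function names are illustrative, not the actual generated output:

```cpp
#include <torch/library.h>

// Illustrative kernel, assumed to be defined elsewhere.
at::Tensor wrapper_add_Tensor(const at::Tensor& self, const at::Tensor& other,
                              const at::Scalar& alpha);

// Eager case: the codegen emits a TORCH_LIBRARY_IMPL block, which
// registers the kernels via a static initializer at load time.
TORCH_LIBRARY_IMPL(aten, CPU, m) {
  m.impl("add.Tensor", &wrapper_add_Tensor);
}

// Deferred case: the codegen instead emits a plain function that the
// backend calls explicitly; no std::function indirection is needed.
TORCH_API void RegisterTorchScriptLazyNativeFunctions() {
  static auto m = MAKE_TORCH_LIBRARY_IMPL(aten, Lazy);
  m.impl("add.Tensor", &wrapper_add_Tensor);
}
```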
tools/codegen/gen_backend_stubs.py
nit: you can probably just use one argument here instead of two in the template file (dispatch_registrations), since right now one of these two arguments is always an empty string.
But I need to put the content in two different places: one is inside the anonymous namespace, and the other is inside the at:: namespace. Maybe I'm missing something?
Ah yep, I forgot about that. Makes sense!
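For context, a rough sketch of the generated file layout this exchange implies; the substitution-point names are partly assumptions (deferred_dispatch_registrations appears in the snippet below, but its static counterpart's name is guessed):

```cpp
// Generated registration file (sketch).
namespace {
// One substitution point: load-time registrations get internal linkage
// inside the anonymous namespace.
// ${static_dispatch_registrations}
} // namespace

namespace at {
// The other substitution point: the deferred registration function has
// to be externally callable, so it lives in the at:: namespace.
// ${deferred_dispatch_registrations}
} // namespace at
```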
tools/codegen/gen_backend_stubs.py
minor nit: most of our codegen uses f-strings now instead of CodeTemplate, e.g.:
```python
deferred_dispatch_registrations = f"""\
TORCH_API void Register{backend_name}{dispatch_key}NativeFunctions() {{
...
```
Is it bad to use CodeTemplate, though? I started off using f-strings, but since I am substituting a multi-line string (with indentation), CodeTemplate actually provides value here.
I think f-strings are pretty much superior to CodeTemplate (you can do multi-line f-strings just fine, like in the codegen here). It's not too big of a deal though, so I don't want to get hung up on it.
Hmm, I seem to have totally broken this (for the TS backend; other keys seem OK). I took the liberty of using MAKE_TORCH_LIBRARY_IMPL instead of literally what @ezyang suggested; maybe I misunderstood. It seems like either nothing is getting registered (in time?) for TS, or maybe just the fallback is not working, and that's breaking virtually all tests, since we fall back for some factory function. It might matter what namespace the deferred registration function is defined in?
Oops, MAKE_TORCH_LIBRARY_IMPL is the right call. I don't see anything obviously wrong; I guess I would stare at the diff between here and the previous working version.
@wconstab has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
test/cpp/lazy/test_lazy_ops.cpp
It looks like ASAN isn't happy (the ASAN test is complaining about initialization-order-fiasco: https://github.com/pytorch/pytorch/runs/5746632034?check_suite_focus=true)
I think the ordering problem might be between this static bool (which eventually calls RegisterTorchScriptLazyNativeFunctions), and the static library object that that function uses in the codegen: static auto m = MAKE_TORCH_LIBRARY_IMPL(aten, $dispatch_key);.
You might need to either access m through a function, or maybe not make this bool static?
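A minimal sketch of the first suggested fix, wrapping the library object in a function-local static so it is constructed on first use rather than at load time; the accessor and kernel names are illustrative:

```cpp
#include <torch/library.h>

// Illustrative kernel, assumed to be defined elsewhere.
at::Tensor wrapper_add_Tensor(const at::Tensor& self, const at::Tensor& other,
                              const at::Scalar& alpha);

// Accessing the library through a function means it is constructed the
// first time it is needed, sidestepping cross-TU static init order.
torch::Library& lazyAtenLibrary() {
  static auto m = MAKE_TORCH_LIBRARY_IMPL(aten, Lazy);
  return m;
}

TORCH_API void RegisterTorchScriptLazyNativeFunctions() {
  lazyAtenLibrary().impl("add.Tensor", &wrapper_add_Tensor);
}
```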
bdhirsh left a comment:
lgtm!
Summary:
Previously, the torchscript backend would be (partially) initialized at startup:
- the dispatcher registrations would be registered,
- but other backend components would not be initialized until explicitly calling the backend init function.
With this change, the torchscript backend is not initialized until its explicit
initialization function is called.
This enables external backends to register their own backend instead of the torchscript
backend to the same (Lazy) key.
Lands a change contributed by @antoniojkim via lazy_tensor_staging branch (#73973)
Differential Revision: D35051464