[AOTI] Refine the C shim autogen mechanism #125589

desertfire · 2024-05-06T14:22:08Z

Stack from ghstack (oldest at bottom):

-> [AOTI] Refine the C shim autogen mechanism #125589

Summary: Based on the discussions in #120513. Instead of auto-generating C shim fallback ops for thousands of ops, we maintain a list of fallback ops based on torch/_inductor/lowering.py, and only generate C shim functions for those ops. At torchgen time, we re-generate the C shim files and compare the header file contents against the existing C shim headers. If there is any change, the compilation fails with a prompt on how to proceed. This keeps the ABI-compatible C shim layer small enough to maintain in the long run.
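To make the regenerate-and-compare step concrete, here is a minimal sketch. It is not the code in torchgen/gen.py; the function names `check_header_unchanged` and `demo`, the toy declarations, and the error wording are all assumptions for illustration.

```python
import difflib
import tempfile
from pathlib import Path


def check_header_unchanged(checked_in: Path, regenerated_text: str) -> None:
    """Raise if the regenerated header text differs from the checked-in file."""
    existing_text = checked_in.read_text()
    if existing_text == regenerated_text:
        return
    # Build a unified diff so the developer can see exactly what changed.
    diff = "".join(
        difflib.unified_diff(
            existing_text.splitlines(keepends=True),
            regenerated_text.splitlines(keepends=True),
            fromfile=str(checked_in),
            tofile="regenerated",
        )
    )
    raise RuntimeError(
        f"C shim header {checked_in} is out of date.\n{diff}\n"
        "Inspect the diff for ABI breakage before updating the header."
    )


def demo() -> bool:
    """Return True when a changed declaration is detected (toy example)."""
    with tempfile.TemporaryDirectory() as tmp:
        header = Path(tmp) / "c_shim_cpu.h"
        header.write_text("int shim_op_a(void);\n")
        # Identical content passes silently.
        check_header_unchanged(header, "int shim_op_a(void);\n")
        try:
            # A changed signature must be reported.
            check_header_unchanged(header, "int shim_op_a(int);\n")
        except RuntimeError:
            return True
    return False
```

Failing loudly rather than overwriting the header is the point of the design: the checked-in header acts as a record of the ABI, so any drift needs a human decision.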

Differential Revision: D57004046

cc @albanD @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @chauhang

pytorch-bot · 2024-05-06T14:22:12Z

✅ You can merge normally! (1 Unrelated Failure)

As of commit 395be7b with merge base b37bef9:

BROKEN TRUNK - The following job failed but was also present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

inductor / rocm6.1-py3.8-inductor / test (inductor, 1, 1, linux.rocm.gpu.2) (gh) (trunk failure)
test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard2DTraining::test_train_parity_2d_mlp

This comment was automatically generated by Dr. CI and updates every 15 minutes.

desertfire · 2024-05-06T16:42:31Z

@desertfire has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

albanD · 2024-05-06T19:09:32Z

Why is it ok to skip sanity check here?
Also why are these generated files checked into the PT repo?

ezyang · 2024-05-08T14:54:48Z

Looking over the PR, I am wondering if we should only check in the header file, and allow the cpp file to be continuously generated. The reason being that the header is the ABI and what has to be stable, whereas the cpp file is implementation details and can change (and would definitely change if we, e.g., changed the details of how tensor handles worked).

There is a cost to doing it this way: if the ABI evolves, we are obligated to keep the codegen around for both the old-style headers and the new-style headers. So you essentially can never stop maintaining the old codegen; it has to be continuously maintained. But maybe this is a price we should be willing to pay, for optionality in how the cpp implementation proceeds. It would certainly make @albanD happier with fewer lines of code checked in.

I'm also kind of wondering if we need the linalg functions, but this is less of a big deal if we aren't checking in the cpp.

albanD

Thanks for the update. Definitely less scary with only headers being shipped here as a way to track BC changes.

One thing I would mention is that you want to make sure that your explanation of what to do when a native/fallback op changes is clear. Otherwise, it is quite likely that devs will just do whatever silences that error, which might actually break your ABI without you seeing it.

albanD · 2024-05-08T16:34:34Z

torchgen/gen.py

+fallback op in a file, e.g. torch/csrc/inductor/aoti_torch/c/shim.h, bump up the version
+number of that fallback op in the newly generated C shim files, and update the cpp wrapper
+codegen to generate the correct cpp call for this op. Contact AOTInductor team for assistance.


Can you put the full locations of the files here? Maybe even use the new_header/old_header variables to get the locations for the current user.
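One way to act on this suggestion is to build the message from the two paths directly. The sketch below is illustrative only: `format_mismatch_message` is a hypothetical helper, and the message wording is not the text used in torchgen/gen.py. It only assumes that `old_header` and `new_header` are path-like values, as the comment implies.

```python
from pathlib import Path


def format_mismatch_message(old_header: Path, new_header: Path) -> str:
    """Build an error message that names both files by absolute path."""
    old_abs = old_header.resolve()
    new_abs = new_header.resolve()
    return (
        "The generated AOTInductor C shim header differs from the checked-in one.\n"
        f"  checked-in header:  {old_abs}\n"
        f"  regenerated header: {new_abs}\n"
        f"Compare them with: diff -u {old_abs} {new_abs}"
    )
```

Printing absolute paths means the developer can copy the diff command straight from the build log, whatever directory the build ran in.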

I agree this is a little involved with a lot of manual work. But for now, I will just kick the can down the road.

Sure, but the people impacted by this are all the other devs. So I think we should have a rule: if more than 3 different people are confused and open issues or post in groups asking what to do about this, then we must do something about it. Does that sound fair?

albanD · 2024-05-08T16:35:22Z

torchgen/gen.py

+1. You added a fallback op to the inductor_fallback_ops list in torchgen/aoti/fallback_ops.py.
+If that's the case, run `python torchgen/gen.py --update-aoti-c-shim` to update the existing
+C shim header files.


Can we set the right CMake dependencies here for this to happen automatically?

I actually think it is dangerous to make it automatic, because it is hard to tell what is a "safe" change to torchgen/aoti/fallback_ops.py. E.g. someone could remove a line from the file, and it is hard for CMake to tell that it is BC-breakage.
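The asymmetry described here (adding an op is usually fine, removing one is not) can be sketched as a set comparison. This is a hedged illustration, not part of the PR: the helper names are hypothetical, and it assumes every entry in the list corresponds to a shim function that already-compiled models may call.

```python
from typing import Dict, Iterable, List


def classify_fallback_list_change(
    old_ops: Iterable[str], new_ops: Iterable[str]
) -> Dict[str, List[str]]:
    """Split a change to the fallback op list into added and removed entries."""
    old_set, new_set = set(old_ops), set(new_ops)
    return {
        "added": sorted(new_set - old_set),
        "removed": sorted(old_set - new_set),
    }


def is_bc_breaking(old_ops: Iterable[str], new_ops: Iterable[str]) -> bool:
    """Treat any removal as BC-breaking; additions alone are assumed safe."""
    return bool(classify_fallback_list_change(old_ops, new_ops)["removed"])
```

A check like this catches removals only. It says nothing about whether a regenerated signature for an existing op changed, which is why a manual look at the header diff is still needed.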

Summary: Based on the discussions in #120513. Instead of auto-generating C shim fallback ops for thousands of ops, we maintain a list of fallback ops based on torch/_inductor/lowering.py, and only generate C shim functions for those ops. At torchgen time, we re-generate the C shim files and compare the header file contents against the existing C shim headers. If there is any change, the compilation fails with a prompt on how to proceed. This keeps the ABI-compatible C shim layer small enough to maintain in the long run. ghstack-source-id: b66a94b Pull Request resolved: #125589

desertfire · 2024-05-08T20:40:33Z

@desertfire has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

ezyang · 2024-05-09T00:22:41Z

torch/csrc/inductor/aoti_torch/generated/c_shim_cpu.h

@@ -0,0 +1,112 @@
+
+// WARNING: THIS FILE IS AUTOGENERATED BY torchgen. DO NOT MODIFY BY HAND.


Need to say how to update this file, no? Because it doesn't get updated as part of the build process?

ezyang · 2024-05-09T00:26:17Z

torchgen/aoti/fallback_ops.py

@@ -0,0 +1,129 @@
+# This list is based on the fallback ops from torch/_inductor/lowering.py
+# If you add a new op to the list, remember to run `python torchgen/gen.py --update-aoti-c-shim`
+# to update C shim files.


There's more, right? You also have to inspect that the diff is not ABI breaking
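As a rough aid for that inspection, a script could list which old declarations no longer appear in the regenerated header. The sketch below is an assumption-heavy illustration: the helper names are hypothetical, the sample declarations are made up, and it assumes each declaration sits on a single line ending in `;`, which real generated headers may not satisfy.

```python
from typing import List, Set


def declarations(header_text: str) -> Set[str]:
    """Collect non-comment lines ending in ';', with whitespace normalized."""
    result = set()
    for line in header_text.splitlines():
        stripped = " ".join(line.split())
        if stripped.endswith(";") and not stripped.startswith("//"):
            result.add(stripped)
    return result


def removed_declarations(old_header: str, new_header: str) -> List[str]:
    """Return old declarations missing from the new header.

    A changed signature shows up here too, because the old line disappears.
    """
    return sorted(declarations(old_header) - declarations(new_header))
```

An empty result suggests the change is purely additive; anything else deserves a closer look before the header is updated.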

ezyang · 2024-05-09T00:27:07Z

torchgen/gen.py

+        "--update-aoti-c-shim",
+        action="store_true",
+        help="Update AOTInductor C shim after changing torchgen/aoti/fallback_ops.py. "
+        "WARNING: Do not use this unless you are sure what you are doing!!!",


lol, can you just link to the instructions lol
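A sketch of what the suggestion could look like with `argparse`. The flag name and `store_true` action come from the diff above; the pointer in the help text is an assumption about where the instructions live, and `build_parser` is a hypothetical wrapper, not the structure of torchgen/gen.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the update flag and a pointer to instructions."""
    parser = argparse.ArgumentParser(description="Sketch of a torchgen-like CLI")
    parser.add_argument(
        "--update-aoti-c-shim",
        action="store_true",
        help="Update AOTInductor C shim after changing "
        "torchgen/aoti/fallback_ops.py. "
        # Assumed location of the instructions, for illustration only.
        "See the comment at the top of torchgen/aoti/fallback_ops.py "
        "for the steps to follow.",
    )
    return parser
```

Pointing the help text at one canonical set of instructions avoids the warning text and the instructions drifting apart.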

facebook-github-bot · 2024-05-09T02:46:30Z

@pytorchbot merge -f 'Landed internally'

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

pytorchmergebot · 2024-05-09T02:48:07Z

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Summary: Make some improvements to #125589
* Add a .default suffix to default ops in fallback_ops.py, to make it clear that those are OpOverload.
* Update warnings and comments based on feedback to #125589

Pull Request resolved: #125928
Approved by: https://github.com/angelayi
ghstack dependencies: #125291, #125730, #125731
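The `.default` convention can be illustrated with a small normalizer. This is a sketch, not code from #125928: `with_default_overload` is a hypothetical helper, and it assumes op names are written as `namespace.op` or `namespace.op.overload`.

```python
def with_default_overload(op_name: str) -> str:
    """Append '.default' when the name carries no overload component."""
    # Assumes 'namespace.op' (one dot) or 'namespace.op.overload' (two dots).
    if op_name.count(".") == 1:
        return f"{op_name}.default"
    return op_name
```

Spelling out the overload makes each entry name a single OpOverload rather than a whole overload packet, so the list says exactly which function gets a shim.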

pytorch-bot bot added ciflow/inductor module: inductor labels May 6, 2024

desertfire added suppress-bc-linter Suppresses the failures of API backward-compatibility linter (Lint/bc_linter) topic: not user facing topic category labels May 6, 2024

desertfire added the skip-pr-sanity-checks label May 6, 2024

desertfire mentioned this pull request May 6, 2024

aoti_torch_cpu_cumsum is missing while aoti_torch_cpu_cumsum_out exists #123050

Closed

albanD approved these changes May 8, 2024

desertfire removed the skip-pr-sanity-checks label May 8, 2024

ezyang reviewed May 9, 2024

ezyang approved these changes May 9, 2024

pytorchmergebot added the merging label May 9, 2024

pytorchmergebot closed this in ed48ea9 May 9, 2024

pytorchmergebot added Merged and removed merging labels May 9, 2024

desertfire mentioned this pull request May 10, 2024

[AOTI][torchgen] Minor improvements to C shim torchgen #125928

Closed

desertfire mentioned this pull request May 13, 2024

Enable AOTI shim v2 build and add into libtorch #125211

Closed

github-actions bot deleted the gh/desertfire/376/head branch June 9, 2024 01:58

8 participants
