[aoti] AOTI mingw cross compilation #163188

Closed
yushangdi wants to merge 5 commits into main from aoti_windows_mingw_2

@yushangdi yushangdi commented Sep 17, 2025

To run this, you need to install mingw64-gcc-c++ and download the Windows CUDA toolkit libraries.

See design doc and demo instructions in https://docs.google.com/document/d/1iDaChqA5nNKkBFTzsdkmoomvQlXHbnlb1Z4yEp7xaJA/edit?tab=t.0

If cross_platform_target is windows, we do the following:

  • Do not link to sleef. This can be improved later if we need it; for now I avoid it because it requires extra setup on the Linux side.
  • Use mingw64-gcc-c++ to compile.
  • Use WINDOWS_CUDA_HOME instead of CUDA_HOME when linking to CUDA.

To run the test:

 python test/inductor/test_aot_inductor_windows.py -k so

Other changes:

  • De-couples the compile_standalone config from the dynamic link flag.
  • Creates a new aot_inductor_mode config module, which controls configs in aot_inductor.
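
The Windows-target rules listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Toolchain` and `select_toolchain` are hypothetical names, the executable name `x86_64-w64-mingw32-g++` is an assumption about what the `mingw64-gcc-c++` package installs, and the native branch (host `g++`, `CUDA_HOME`, sleef linked) is only my reading of what the non-Windows path does.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Toolchain:
    """Hypothetical summary of the build settings that differ per target."""

    compiler: str             # C++ compiler executable to invoke
    cuda_home: Optional[str]  # CUDA root used for link paths, if set
    link_sleef: bool          # whether sleef is linked


def select_toolchain(
    cross_platform_target: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Toolchain:
    """Pick build settings following the three rules in the description."""
    if cross_platform_target == "windows":
        return Toolchain(
            # Assumed executable name for the MinGW-w64 cross compiler.
            compiler="x86_64-w64-mingw32-g++",
            # Windows CUDA libraries live under a separate root.
            cuda_home=env.get("WINDOWS_CUDA_HOME"),
            # sleef is skipped to avoid extra setup on the Linux host.
            link_sleef=False,
        )
    # Native build (assumed defaults): host compiler and the usual CUDA_HOME.
    return Toolchain(
        compiler=env.get("CXX", "g++"),
        cuda_home=env.get("CUDA_HOME"),
        link_sleef=True,
    )
```

Passing `env` explicitly keeps the helper easy to test without touching the real process environment.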

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben

pytorch-bot bot commented Sep 17, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/163188

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 Cancelled Job

As of commit c9d7065 with merge base 232dd65:

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

  self.prefix.splice(
  f"""
- if ((long({input_name}.data_ptr()) & ({GPU_ALIGN_BYTES} -1)) != 0) {{
+ if ((reinterpret_cast<std::uintptr_t>({input_name}.data_ptr()) & ({GPU_ALIGN_BYTES} -1)) != 0) {{
Contributor Author

This has to change for Windows cross-compilation. It should also work on Linux.
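
For context, and as my reading rather than something stated in the thread: on 64-bit Windows `long` is 32 bits wide, so a C++ cast from a pointer to `long` loses precision and GCC rejects it, whereas `std::uintptr_t` is pointer-sized on both platforms. The check itself is a plain bit test on the address. A minimal Python sketch of that test, with a hypothetical helper name and alignment values chosen only for illustration:

```python
def is_misaligned(address: int, align_bytes: int) -> bool:
    """Return True if `address` is not a multiple of `align_bytes`.

    Mirrors the generated C++ test `(addr & (ALIGN - 1)) != 0`, which is
    only valid when the alignment is a power of two.
    """
    if align_bytes <= 0 or align_bytes & (align_bytes - 1):
        raise ValueError("align_bytes must be a positive power of two")
    # The low bits of an aligned address are all zero.
    return (address & (align_bytes - 1)) != 0
```

For example, `is_misaligned(0x1000, 16)` is false, while `is_misaligned(0x1008, 16)` is true because the low four bits are not all zero.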

@yushangdi yushangdi force-pushed the aoti_windows_mingw_2 branch 2 times, most recently from 8b8abf2 to 0436a69 Compare September 17, 2025 22:39
  #pragma once
  #ifdef _WIN32
- #include <Windows.h>
+ #include <windows.h>
Contributor Author

This has to be lowercase for cross-compilation: Windows file systems are not case-sensitive, but Linux is.
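
A quick way to catch this class of problem before cross-compiling is to scan sources for mixed-case spellings of headers that MinGW ships in lowercase. The helper below is a hypothetical sketch, not part of this PR, and the default header list is an assumption that would need extending for real use:

```python
import re
from typing import Iterable, List

# Matches `#include <name>` at the start of a line, allowing whitespace.
_INCLUDE = re.compile(r"^\s*#\s*include\s*<([^>]+)>", re.MULTILINE)


def mixed_case_includes(
    source: str,
    lowercase_headers: Iterable[str] = ("windows.h",),
) -> List[str]:
    """Return include names that match a known lowercase header only
    after lowercasing, i.e. spellings that break on a case-sensitive
    file system."""
    known = set(lowercase_headers)
    flagged = []
    for name in _INCLUDE.findall(source):
        if name.lower() in known and name != name.lower():
            flagged.append(name)
    return flagged
```

Running it on a snippet containing `#include <Windows.h>` returns `["Windows.h"]`, while the lowercase spelling and unrelated headers are left alone.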

@yushangdi yushangdi marked this pull request as ready for review September 18, 2025 17:49
@yushangdi yushangdi changed the title mingw cross compilation v2 [aoti] AOTI mingw cross compilation Sep 18, 2025
ezyang commented Sep 21, 2025

Neat! Who is using AOTI on Windows? Can you show evidence that this is running on our Windows CI? Thanks!

(Not a full review, deferring to AOTI peeps)

@angelayi angelayi requested a review from xuhancn September 22, 2025 01:10
yushangdi commented Sep 22, 2025

> Neat! Who is using AOTI on Windows? Can you show evidence that this is running on our Windows CI? Thanks!
>
> (Not a full review, deferring to AOTI peeps)

@ezyang This is for ExecuTorch (aka limited unified runtime) to use AOTI as a backend on Windows. I haven't added this to Windows CI yet, but that's the next step!

@mergennachin mergennachin self-requested a review September 23, 2025 22:55
return x


class TestAOTInductorWindowsCrossCompilation(TestCase):
Collaborator

I am really not sure how to test this cross compilation workflow in CI.

@seemethere for context: we build a binary on Linux with mingw and then run it on Windows.
Any recommendation on how to test that?

The only thing I can think of would be to have two workflows, one after the other, but that might be a lot of setup work?

Contributor Author

If we have WSL on the windows CI, we can build this in WSL, and then run the rest on windows. This is how I'm testing it locally as well.

@albanD albanD left a comment

Sounds good to me!
We can follow up with Eli on the testing.

I will let Bin approve this one though since I might have missed some things here.

_IS_LINUX = sys.platform.startswith("linux")
_IS_MACOS = sys.platform.startswith("darwin")
_IS_WINDOWS = sys.platform == "win32"
AOTI_SHIM_LIB = os.environ.get("AOTI_SHIM_LIB") # used for AOTI cross-compilation
Contributor

I feel this naming does not distinguish from config.aot_inductor.aoti_shim_library.

Contributor Author

@yushangdi yushangdi Sep 25, 2025

Maybe AOTI_SHIM_LIBRARY_PATH?

Contributor

Better. Also, why don't we have a corresponding config for this one? An ad hoc environment variable is hard for users to discover.

Contributor Author

Makes sense. I've changed this to a config now.
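
One common shape for this kind of change is a config field whose default falls back to the environment variable, so the setting is discoverable in one place while existing environment-based setups keep working. The sketch below is hypothetical: `CrossCompileConfig` and the field name are illustrative, the environment variable name is the one floated in this thread, and none of it is the actual config added in the PR.

```python
import os
from dataclasses import dataclass, field
from typing import Optional

# Name suggested earlier in this thread; treated here as an assumption.
ENV_VAR = "AOTI_SHIM_LIBRARY_PATH"


@dataclass
class CrossCompileConfig:
    """Hypothetical config holder for cross-compilation settings."""

    # Path to the shim library used when cross-compiling. The default is
    # read from the environment at construction time, not at import time.
    aoti_shim_library_path: Optional[str] = field(
        default_factory=lambda: os.environ.get(ENV_VAR)
    )
```

An explicit value such as `CrossCompileConfig(aoti_shim_library_path="/path/to/shim")` takes precedence over the environment.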

@desertfire desertfire self-requested a review September 26, 2025 00:12
@yushangdi
Contributor Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Sep 29, 2025
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 mandatory check(s) failed. The first few are:

Dig deeper by viewing the failures on hud

Details for Dev Infra team Raised by workflow job

Failing merge rule: Core Maintainers

@yushangdi
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 mandatory check(s) failed. The first few are:

Dig deeper by viewing the failures on hud

Details for Dev Infra team Raised by workflow job

Failing merge rule: Core Maintainers

@yushangdi
Contributor Author

@pytorchbot merge -i

@pytorchmergebot
Collaborator

Merge started

Your change will be merged while ignoring the following 1 checks: Lint / lintrunner-clang / linux-job

Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Oct 21, 2025
Pull Request resolved: pytorch#163188
Approved by: https://github.com/desertfire
@github-actions github-actions bot deleted the aoti_windows_mingw_2 branch November 1, 2025 02:20