Skip to content

[Triton] [Inductor] Pruned failed compilations from Autotuning candidates#162673

Closed
njriasan wants to merge 1 commit intopytorch:mainfrom
njriasan:export-D82172207
Closed

[Triton] [Inductor] Pruned failed compilations from Autotuning candidates#162673
njriasan wants to merge 1 commit intopytorch:mainfrom
njriasan:export-D82172207

Conversation

@njriasan
Copy link
Contributor

@njriasan njriasan commented Sep 11, 2025

Summary:
When exahaustively autotuning a new template you may hit situations that lead to compilation failures. This template will still attempt to autotune because nothing was marking this as failed and in my experiments lead to a crash/segfault if I didn't set TORCHINDUCTOR_AUTOTUNE_IN_SUBPROC=1.

To help eliminate this issue this PR marks any template that fails to compile as "failed" and then removes all of the failed templates from the choice candidates. In the case where it would have just failed to compile twice, this should at least reduce compilation time.

Test Plan:
Tested locally when experminenting with the new blackwell templates and a Triton version that contains a bug related to num_warps < 4.

Rollback Plan:

Differential Revision: D82172207

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @mlazos

@pytorch-bot
Copy link

pytorch-bot bot commented Sep 11, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/162673

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 28907b4 with merge base f654cff (image):

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D82172207

njriasan added a commit to njriasan/pytorch that referenced this pull request Sep 11, 2025
…ates (pytorch#162673)

Summary:

When exahaustively autotuning a new template you may hit situations that lead to compilation failures. This template will still attempt to autotune because nothing was marking this as failed and in my experiments lead to a crash/segfault if I didn't set `TORCHINDUCTOR_AUTOTUNE_IN_SUBPROC=1`.

To help eliminate this issue this PR marks any template that fails to compile as "failed" and then removes all of the failed templates from the choice candidates. In the case where it would have just failed to compile twice, this should at least reduce compilation time.

Test Plan:
Tested locally when experminenting with the new blackwell templates and a Triton version that contains a bug related to `num_warps < 4`.

Rollback Plan:

Differential Revision: D82172207
…ates (pytorch#162673)

Summary:

When exahaustively autotuning a new template you may hit situations that lead to compilation failures. This template will still attempt to autotune because nothing was marking this as failed and in my experiments lead to a crash/segfault if I didn't set `TORCHINDUCTOR_AUTOTUNE_IN_SUBPROC=1`.

To help eliminate this issue this PR marks any template that fails to compile as "failed" and then removes all of the failed templates from the choice candidates. In the case where it would have just failed to compile twice, this should at least reduce compilation time.

Test Plan:
Tested locally when experminenting with the new blackwell templates and a Triton version that contains a bug related to `num_warps < 4`.

Rollback Plan:

Differential Revision: D82172207
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D82172207

1 similar comment
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D82172207

Copy link
Contributor

@PaulZhang12 PaulZhang12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an awesome fix! Thank you, cc @eellison for any thoughts

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Sep 11, 2025
@njriasan
Copy link
Contributor Author

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

else "max_autotune_conv_backends"
)
raise NoValidChoicesError(
return NoValidChoicesError(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be good to modify this to indicate the reason for no valid choices, I think this could be a good idea (e.g. no compileable choices vs no choices at the beginning)

I had some plans to update this since this is the most common error I've seen with users by far. They usually end up adding aten, but it would be useful to know why.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to submit a followup. Thanks for the suggestion.

markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
…ates (pytorch#162673)

Summary:
When exahaustively autotuning a new template you may hit situations that lead to compilation failures. This template will still attempt to autotune because nothing was marking this as failed and in my experiments lead to a crash/segfault if I didn't set `TORCHINDUCTOR_AUTOTUNE_IN_SUBPROC=1`.

To help eliminate this issue this PR marks any template that fails to compile as "failed" and then removes all of the failed templates from the choice candidates. In the case where it would have just failed to compile twice, this should at least reduce compilation time.

Test Plan:
Tested locally when experminenting with the new blackwell templates and a Triton version that contains a bug related to `num_warps < 4`.

Rollback Plan:

Differential Revision: D82172207

Pull Request resolved: pytorch#162673
Approved by: https://github.com/PaulZhang12, https://github.com/mlazos
mansiag05 pushed a commit to mansiag05/pytorch that referenced this pull request Sep 22, 2025
…ates (pytorch#162673)

Summary:
When exahaustively autotuning a new template you may hit situations that lead to compilation failures. This template will still attempt to autotune because nothing was marking this as failed and in my experiments lead to a crash/segfault if I didn't set `TORCHINDUCTOR_AUTOTUNE_IN_SUBPROC=1`.

To help eliminate this issue this PR marks any template that fails to compile as "failed" and then removes all of the failed templates from the choice candidates. In the case where it would have just failed to compile twice, this should at least reduce compilation time.

Test Plan:
Tested locally when experminenting with the new blackwell templates and a Triton version that contains a bug related to `num_warps < 4`.

Rollback Plan:

Differential Revision: D82172207

Pull Request resolved: pytorch#162673
Approved by: https://github.com/PaulZhang12, https://github.com/mlazos
cleonard530 pushed a commit to cleonard530/pytorch that referenced this pull request Sep 22, 2025
…ates (pytorch#162673)

Summary:
When exahaustively autotuning a new template you may hit situations that lead to compilation failures. This template will still attempt to autotune because nothing was marking this as failed and in my experiments lead to a crash/segfault if I didn't set `TORCHINDUCTOR_AUTOTUNE_IN_SUBPROC=1`.

To help eliminate this issue this PR marks any template that fails to compile as "failed" and then removes all of the failed templates from the choice candidates. In the case where it would have just failed to compile twice, this should at least reduce compilation time.

Test Plan:
Tested locally when experminenting with the new blackwell templates and a Triton version that contains a bug related to `num_warps < 4`.

Rollback Plan:

Differential Revision: D82172207

Pull Request resolved: pytorch#162673
Approved by: https://github.com/PaulZhang12, https://github.com/mlazos
dsashidh pushed a commit to dsashidh/pytorch that referenced this pull request Sep 26, 2025
…ates (pytorch#162673)

Summary:
When exahaustively autotuning a new template you may hit situations that lead to compilation failures. This template will still attempt to autotune because nothing was marking this as failed and in my experiments lead to a crash/segfault if I didn't set `TORCHINDUCTOR_AUTOTUNE_IN_SUBPROC=1`.

To help eliminate this issue this PR marks any template that fails to compile as "failed" and then removes all of the failed templates from the choice candidates. In the case where it would have just failed to compile twice, this should at least reduce compilation time.

Test Plan:
Tested locally when experminenting with the new blackwell templates and a Triton version that contains a bug related to `num_warps < 4`.

Rollback Plan:

Differential Revision: D82172207

Pull Request resolved: pytorch#162673
Approved by: https://github.com/PaulZhang12, https://github.com/mlazos
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants