[ROCm][inductor] heuristic improvements for reduction kernels#161280

Closed
naromero77amd wants to merge 8 commits into pytorch:main from ROCm:mi350_perf_tuning
Conversation

@naromero77amd
Collaborator

@naromero77amd naromero77amd commented Aug 22, 2025

@pytorch-bot

pytorch-bot bot commented Aug 22, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/161280

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Unrelated Failure

As of commit bb52a1d with merge base d74f9ec (image):

NEW FAILURE - The following job has failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added module: inductor module: rocm AMD GPU support for Pytorch labels Aug 22, 2025
@naromero77amd naromero77amd marked this pull request as draft August 22, 2025 16:59
@pytorch-bot pytorch-bot bot added ciflow/inductor ciflow/rocm Trigger "default" config CI on ROCm labels Aug 22, 2025
@pytorch-bot

pytorch-bot bot commented Aug 22, 2025

To add the ciflow label ciflow/inductor please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@pytorch-bot

pytorch-bot bot commented Aug 22, 2025

To add the ciflow label ciflow/rocm please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@pytorch-bot pytorch-bot bot removed ciflow/inductor ciflow/rocm Trigger "default" config CI on ROCm labels Aug 22, 2025
@naromero77amd naromero77amd added the release notes: rocm mandatorylabel label Aug 22, 2025
@facebook-github-bot
Contributor

@haoyuz has imported this pull request. If you are a Meta employee, you can view this in D80836468.

@naromero77amd naromero77amd changed the title [WIP][inductor][ROCm] MI350 perf tuning [WIP][inductor][ROCm] MI350 reduction heuristics improvements Sep 11, 2025
@naromero77amd naromero77amd marked this pull request as ready for review September 11, 2025 22:24
@naromero77amd naromero77amd marked this pull request as draft September 11, 2025 22:24
@facebook-github-bot
Contributor

@haoyuz has imported this pull request. If you are a Meta employee, you can view this in D80836468.

@jataylo
Collaborator

jataylo commented Sep 23, 2025

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict.

@pytorchmergebot
Collaborator

Rebase failed due to Command git -C /home/runner/work/pytorch/pytorch rebase refs/remotes/origin/viable/strict pull/161280/head returned non-zero exit code 1

Rebasing (1/18)
Auto-merging torch/_inductor/codegen/triton.py
Auto-merging torch/_inductor/runtime/triton_heuristics.py
CONFLICT (content): Merge conflict in torch/_inductor/runtime/triton_heuristics.py
error: could not apply d1e5254957a... Added triton perf improvement changes
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
Could not apply d1e5254957a... # Added triton perf improvement changes

Raised by https://github.com/pytorch/pytorch/actions/runs/17948121352
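
For readers unfamiliar with the hints in the log above, the following is a minimal sketch of the same failure mode and its manual resolution, run in a throwaway repository. The file name `heuristics.py`, its contents, and the commit messages are hypothetical placeholders, not taken from this PR; only the `git` commands themselves mirror what the hints describe.

```shell
#!/bin/sh
set -eu

# Throwaway repository; nothing here touches a real checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
base=$(git symbolic-ref --short HEAD)   # default branch name (main or master)

# Hypothetical file standing in for the conflicted source file.
echo "value = 4" > heuristics.py
git add heuristics.py
git commit -q -m "base"

# Feature branch edits the line one way...
git checkout -q -b feature
echo "value = 8" > heuristics.py
git commit -q -am "feature: tune value"

# ...while the base branch edits the same line another way.
git checkout -q "$base"
echo "value = 2" > heuristics.py
git commit -q -am "upstream: change value"

# Rebasing the feature branch now stops with a content conflict.
git checkout -q feature
if git rebase "$base" >/dev/null 2>&1; then
    echo "unexpected: rebase succeeded without a conflict"
    exit 1
fi

# Resolve by hand (here: keep the feature value), mark the file as
# resolved, and continue the rebase without opening an editor.
echo "value = 8" > heuristics.py
git add heuristics.py
GIT_EDITOR=true git rebase --continue >/dev/null

git log --format=%s
```

In a real checkout the resolved branch would then still need to be pushed back to the PR branch, typically with `git push --force-with-lease`, since a rebase rewrites its commits.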

@naromero77amd naromero77amd changed the title [WIP][inductor][ROCm] MI350 reduction heuristics improvements [inductor][ROCm] MI350 reduction heuristics improvements Sep 26, 2025
@pytorch-bot pytorch-bot bot added ciflow/inductor ciflow/rocm Trigger "default" config CI on ROCm labels Sep 26, 2025
@naromero77amd naromero77amd changed the title [inductor][ROCm] MI350 reduction heuristics improvements [ROCm][inductor] MI350 reduction heuristics improvements Sep 26, 2025
@pytorch-bot pytorch-bot bot removed ciflow/inductor ciflow/rocm Trigger "default" config CI on ROCm labels Sep 27, 2025
@naromero77amd naromero77amd added ciflow/inductor-rocm Trigger "inductor" config CI on ROCm ciflow/inductor ciflow/rocm Trigger "default" config CI on ROCm and removed ciflow/trunk Trigger trunk jobs on your pull request ciflow/inductor ciflow/inductor-rocm Trigger "inductor" config CI on ROCm labels Dec 16, 2025
@pytorch-bot pytorch-bot bot added the ciflow/rocm-mi300 Trigger "default" config CI on ROCm MI300 label Dec 16, 2025
@naromero77amd
Collaborator Author

Resolved conflict and will try to merge.

@naromero77amd
Collaborator Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Dec 19, 2025
@naromero77amd
Collaborator Author

@pytorchbot rebase

pytorchmergebot pushed a commit that referenced this pull request Dec 20, 2025
Improvements to reduction kernel heuristics for MI350.

Contributions from several members of the AMD Inductor and Triton teams: @jataylo @iupaikov-amd @AmdSampsa @xiaohuguo2023

**Duplicate of this PR:** #161280 (which has already been approved multiple times, but we are unable to merge due to some Meta Internal Check that cannot be cleared).

Pull Request resolved: #170931
Approved by: https://github.com/jeffdaily
@naromero77amd
Collaborator Author

Duplicate PR here was landed: #170931

FWIW, I think the real issue might have been pytorchbot's token not conforming to AMD security policy.

xgz2 pushed a commit that referenced this pull request Dec 22, 2025
krastogi-in pushed a commit to krastogi-in/pytorch that referenced this pull request Jan 9, 2026
…h#170931)
naromero77amd added a commit to ROCm/pytorch that referenced this pull request Jan 23, 2026
…h#170931)

(cherry picked from commit 5eceb87)

Labels

ciflow/inductor
ciflow/inductor-rocm (Trigger "inductor" config CI on ROCm)
ciflow/rocm (Trigger "default" config CI on ROCm)
ciflow/rocm-mi300 (Trigger "default" config CI on ROCm MI300)
ciflow/trunk (Trigger trunk jobs on your pull request)
keep-going (Don't stop on first failure, keep running tests until the end)
Merged
module: inductor
module: rocm (AMD GPU support for Pytorch)
open source
release notes: inductor
release notes: rocm (mandatorylabel)
triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)