
[XPU][Fix] Register convolution_overrideable for flops count #166839

Closed
Stonepia wants to merge 4 commits into pytorch:main from Stonepia:tong/conv_flop

Conversation

@Stonepia
Contributor

@Stonepia Stonepia commented Nov 3, 2025

Fixes #166838

  1. Register the convolution_overrideable key for the flop counter. CUDA relies on keys such as cudnn_convolution; devices like XPU fall back to convolution_overrideable. Without this registration, the flop_counter silently returns 0 for XPU at the line below (a minimal registry-lookup sketch follows this list):

    if op_obj is None or op_obj not in flop_registry:
        return 0

  2. Enable the corresponding tests for XPU in test_analysis.py.
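
A minimal sketch of the registry lookup behind this fix, assuming torch.utils.flop_counter still exposes the module-level flop_registry dict (illustrative only, not the PR's actual code):

```python
# Sketch: check whether the XPU convolution op resolves in the flop registry.
# Before this PR the lookup failed, so the analysis hit the "return 0" branch;
# with the registration in place the op maps to the existing conv FLOP formula.
import torch
from torch.utils.flop_counter import flop_registry

op = torch.ops.aten.convolution_overrideable
print(op in flop_registry)  # expected True once convolution_overrideable is registered
```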

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @gujinghui @fengyuan14 @guangyey

@pytorch-bot

pytorch-bot bot commented Nov 3, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/166839

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ No Failures

As of commit 4ef8cb5 with merge base 392acee:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@Stonepia
Contributor Author

Stonepia commented Nov 3, 2025

@pytorchbot label "module: xpu"

@pytorch-bot pytorch-bot bot added the module: xpu Intel XPU related issues label Nov 3, 2025
@Stonepia Stonepia changed the title from "[XPU][Bug] Register convolution_overrideable for flops count" to "[XPU][Fix] Register convolution_overrideable for flops count" Nov 3, 2025
@Stonepia
Contributor Author

Stonepia commented Nov 3, 2025

@pytorchbot label "topic: not user facing"

@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Nov 3, 2025
@Stonepia Stonepia marked this pull request as ready for review November 3, 2025 06:56
"Requires XPU or CUDA SM80",
)
@skipXPUIf(TEST_WITH_SLOW, "Skip because test too slow on XPU")
@dtypes(torch.float, torch.float16)
Collaborator

Should we add convolution_overrideable in line 479?

Contributor Author

Well, the logic already has the `or` branch, so we don't need to add it explicitly. But yes, adding it would make the code more readable.

if name.startswith(
    (
        "aten::cudnn_convolution",
        "aten::convolution",
        "aten::_convolution",
    )
) or "conv" in name

Contributor Author

Added in f583ecf
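
For reference, a sketch of what spelling the prefix out explicitly could look like (illustrative only; the extra tuple entry is hypothetical and the exact diff in f583ecf is not reproduced here):

```python
# Sketch only: add an explicit prefix for the XPU-style dispatch key.
if (
    name.startswith(
        (
            "aten::cudnn_convolution",
            "aten::convolution",
            "aten::_convolution",
            "aten::convolution_overrideable",  # hypothetical explicit entry
        )
    )
    or "conv" in name
):
    ...
```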

@guangyey guangyey added the ciflow/xpu Run XPU CI tasks label Nov 3, 2025
@pytorch-bot

pytorch-bot bot commented Nov 3, 2025

To add the ciflow label ciflow/xpu please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@pytorch-bot pytorch-bot bot removed the ciflow/xpu Run XPU CI tasks label Nov 3, 2025
@guangyey guangyey added ciflow/trunk Trigger trunk jobs on your pull request ciflow/xpu Run XPU CI tasks labels Nov 3, 2025
@pytorch-bot pytorch-bot bot removed ciflow/trunk Trigger trunk jobs on your pull request ciflow/inductor ciflow/xpu Run XPU CI tasks labels Nov 3, 2025
@guangyey guangyey added the ciflow/xpu Run XPU CI tasks label Nov 3, 2025
@guangyey guangyey requested a review from jansel November 3, 2025 08:13
@guangyey guangyey added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 3, 2025
@guangyey
Collaborator

guangyey commented Nov 4, 2025

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@Stonepia Stonepia deleted the tong/conv_flop branch November 4, 2025 05:58
pytorch-bot bot pushed a commit that referenced this pull request Nov 4, 2025
Fixes #166838
1. Register the `convolution_overrideable` key for flop_counter. CUDA relies on keys such as `cudnn_convolution`; devices like `XPU` fall back to `convolution_overrideable`. Without the correct registration, the flop_counter silently returns 0 for XPU at:
https://github.com/pytorch/pytorch/blob/e1d011d6eb571cd98ec7c7ed8e8b518a5463ec97/torch/_inductor/analysis/profile_analysis.py#L178-L179

2. Enable the corresponding tests for XPU in `test_analysis.py`.

Pull Request resolved: #166839
Approved by: https://github.com/guangyey, https://github.com/EikanWang, https://github.com/jansel
pytorchmergebot pushed a commit that referenced this pull request Nov 11, 2025
This PR enables XPU devices in test_analysis.py.

For performance reasons, it skips some slow tests; the full scope can be enabled with:

```
export PYTORCH_TEST_WITH_SLOW=1
```
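
As a usage note, the slow tests can then be run directly; the test path below is an assumption (this PR touches test_analysis.py under the inductor test directory), not a quote from the PR:

```
PYTORCH_TEST_WITH_SLOW=1 python test/inductor/test_analysis.py -v
```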

**PR Stack:**

- #166840 : This PR enables the tests and skips the ones that currently fail
- #166839 : This PR fixes the bug and enables the full test scope for XPU

**Some skipped test time:**

```
test_augment_trace_against_flop_counter_maxat0_xpu_float16 [49.0863s]
test_augment_trace_against_flop_counter_maxat0_xpu_float32 [18.2268s]
test_augment_trace_against_flop_counter_maxat1_xpu_float16 [85.6549s]
test_augment_trace_against_flop_counter_maxat1_xpu_float32 [329.0832s]
test_augment_trace_against_flop_counter_maxat2_xpu_float16 [24.4825s]
test_augment_trace_against_flop_counter_maxat2_xpu_float32 [19.0688s]
```

Pull Request resolved: #166840
Approved by: https://github.com/guangyey, https://github.com/jansel
Silv3S pushed a commit to Silv3S/pytorch that referenced this pull request Nov 18, 2025

Labels

ciflow/inductor
ciflow/trunk (Trigger trunk jobs on your pull request)
ciflow/xpu (Run XPU CI tasks)
Merged
module: inductor
module: xpu (Intel XPU related issues)
open source
topic: not user facing (topic category)

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

[XPU] Flops count for xpu models is not correct

6 participants