[xpu] Support high stream for ProcessGroupXCCL by Chao1Han · Pull Request #163049 · pytorch/pytorch

Chao1Han · 2025-09-16T07:10:15Z

Add high-priority stream support for ProcessGroupXCCL. Like CUDA streams, XPU streams can run at a higher priority than other streams. The implementation is in intel/torch-xpu-ops#1715; this PR adds the registration here.

cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta @msaroufim @dcci

pytorch-bot · 2025-09-16T07:10:20Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/163049

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 65d78d2 with merge base f2bb22f:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Support high-priority streams for XCCL; the test case is added in #2049. This PR needs to be merged first, followed by the upstream op registration in pytorch/pytorch#163049, before the test case can pass.

Co-authored-by: mengfei25 <mengfei.li@Intel.com>

Copilot

Pull Request Overview

This PR adds high priority stream support for ProcessGroupXCCL, bringing it in line with CUDA's stream priority capabilities. The implementation enables XPU streams to execute with higher priority compared to other streams.

Adds a new constructor overload for ProcessGroupXCCL that accepts store, rank, and size parameters with default low priority stream configuration
Extends the Options class to include is_high_priority_stream parameter with proper Python bindings
Provides read/write access to the high priority stream option through Python properties
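
As a rough illustration of the Python-side surface summarized above, the sketch below builds an options object and reads the flag back. The helper name `build_xccl_options` is hypothetical, and the exact binding shape (a keyword-only-style `is_high_priority_stream` argument on `ProcessGroupXCCL.Options`) is an assumption inferred from this overview, not confirmed against the diff. The helper returns `None` on builds without XCCL support, so it is safe to run anywhere.

```python
import importlib


def build_xccl_options(high_priority: bool = False):
    """Return a ProcessGroupXCCL.Options object, or None if unavailable.

    Hypothetical helper: the Options signature used here is assumed from
    the PR overview (an `is_high_priority_stream` parameter and property).
    """
    try:
        # ProcessGroupXCCL is only exposed on builds compiled with XCCL.
        c10d = importlib.import_module("torch.distributed.distributed_c10d")
    except Exception:
        return None  # torch is missing or was built without distributed

    pg_cls = getattr(c10d, "ProcessGroupXCCL", None)
    options_cls = getattr(pg_cls, "Options", None)
    if options_cls is None:
        return None  # no XCCL backend, or a build that predates this PR

    try:
        # Assumed constructor parameter, per the overview above.
        return options_cls(is_high_priority_stream=high_priority)
    except TypeError:
        return None  # binding does not accept this parameter


opts = build_xccl_options(high_priority=True)
if opts is None:
    print("ProcessGroupXCCL.Options is not available in this build")
else:
    # The flag is exposed as a read/write property.
    print("high priority stream:", opts.is_high_priority_stream)
```

Per the overview, the constructor overload that takes only store, rank, and size falls back to a low-priority stream, so a caller would need to pass an options object like this one explicitly to request high priority.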


torch/csrc/distributed/c10d/init.cpp


guangyey · 2025-09-18T03:07:16Z

@Chao1Han You need to update torch-xpu-ops as well.

Chao1Han · 2025-09-18T03:10:50Z

> @Chao1Han You need to update torch-xpu-ops as well.

Sure, I'll update the pinned commit here as well.

guangyey

LGTM.

pytorch-bot · 2025-09-18T03:28:46Z

To add the ciflow label ciflow/trunk please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

guangyey · 2025-09-18T03:30:17Z

I think you'd better update the pin in a separate PR.

Chao1Han · 2025-09-18T04:39:59Z

> I think you'd better update the pin in a separate PR.

Sure, I will wait for the pin commit update before merging this PR.

Chao1Han · 2025-09-19T02:12:25Z

@pytorchbot rebase -b main

pytorch-bot · 2025-10-21T08:19:06Z

To add the ciflow label ciflow/xpu please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

pytorch-bot · 2025-10-21T08:19:29Z

To add the ciflow label ciflow/xpu please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

albanD

SGTM

Chao1Han · 2025-10-22T00:46:53Z

@pytorchmergebot merge

pytorchmergebot · 2025-10-22T00:48:40Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team


Add high-priority stream support for ProcessGroupXCCL. Like CUDA streams, XPU streams can run at a higher priority than other streams. The implementation is in intel/torch-xpu-ops#1715; this PR adds the registration here.

Pull Request resolved: pytorch#163049
Approved by: https://github.com/guangyey, https://github.com/gujinghui, https://github.com/EikanWang, https://github.com/albanD

Feature #1715 and the registration in pytorch/pytorch#163049 have merged. Add some high-priority stream test cases.

After #163049, this PR fixes the type annotations to match the actual implementation for ProcessGroupXCCL::Options.

Pull Request resolved: #166418
Approved by: https://github.com/guangyey, https://github.com/ezyang


pytorch-bot bot added the oncall: distributed and release notes: distributed (c10d) labels Sep 16, 2025

Chao1Han mentioned this pull request Sep 16, 2025

support high priority stream intel/torch-xpu-ops#1715

Merged

Chao1Han marked this pull request as draft September 16, 2025 07:13

pytorchbot added the open source label Sep 16, 2025

EikanWang requested a review from Copilot September 17, 2025 01:25

Copilot AI reviewed Sep 17, 2025

torch/csrc/distributed/c10d/init.cpp (outdated)

Chao1Han force-pushed the high_stream branch from 203f701 to b4f70bf Compare September 17, 2025 01:28

guangyey added this to PyTorch Intel Sep 18, 2025

guangyey moved this to Review Required in PyTorch Intel Sep 18, 2025

guangyey moved this from Review Required to Pre-Review Required in PyTorch Intel Sep 18, 2025

Chao1Han marked this pull request as ready for review September 18, 2025 03:15

Chao1Han requested review from EikanWang and gujinghui as code owners September 18, 2025 03:15

guangyey added the ciflow/xpu label Sep 18, 2025

guangyey approved these changes Sep 18, 2025

guangyey added the release notes: xpu and ciflow/trunk labels Sep 18, 2025

pytorch-bot bot removed the ciflow/trunk label Sep 18, 2025

gujinghui approved these changes Sep 18, 2025

pytorch-bot bot removed the ciflow/xpu label Sep 18, 2025

guangyey added the ciflow/xpu label and removed the NNC, module: inductor, module: dynamo, module: compiled autograd, release notes: quantization, and release notes: inductor (aoti) labels Oct 21, 2025

pytorch-bot bot removed the ciflow/xpu label Oct 21, 2025

guangyey added the ciflow/xpu label Oct 21, 2025

pytorch-bot bot removed the ciflow/xpu label Oct 21, 2025

guangyey added the ciflow/trunk and ciflow/xpu labels Oct 21, 2025

albanD approved these changes Oct 21, 2025

pytorchmergebot added the merging label Oct 22, 2025

pytorchmergebot added the Merged label Oct 22, 2025

pytorchmergebot closed this in a100542 Oct 22, 2025

github-project-automation bot moved this from Review Required to Done in PyTorch Intel Oct 22, 2025

pytorchmergebot removed the merging label Oct 22, 2025

Chao1Han mentioned this pull request Oct 27, 2025

add xccl high priority stream test intel/torch-xpu-ops#2049

Merged

github-merge-queue bot pushed a commit to intel/torch-xpu-ops that referenced this pull request Oct 27, 2025

add xccl high priority stream test (#2049)

91665cb


frost-intel mentioned this pull request Oct 28, 2025

[xpu] Fix type annotation for ProcessGroupXCCL #166418

Closed


7 participants
