Skip to content

Use stable topological sort in fuse_by_partitions#167397

Closed
SherlockNoMad wants to merge 7 commits intogh/SherlockNoMad/18/basefrom
gh/SherlockNoMad/18/head
Closed

Use stable topological sort in fuse_by_partitions#167397
SherlockNoMad wants to merge 7 commits intogh/SherlockNoMad/18/basefrom
gh/SherlockNoMad/18/head

Conversation

@SherlockNoMad
Copy link
Contributor

@SherlockNoMad SherlockNoMad commented Nov 8, 2025

Stack from ghstack (oldest at bottom):

legalize_graph() performs a topo sort that shuffles the nodes is a global way, making the result unpredictable.
We should avoid this in graph pass in general.

This problem is discovered when testing regional_inductor, a single fuse region trigger the global reordering.

Before
https://www.internalfb.com/intern/diffing/?before_paste_number=2029217728&after_paste_number=2029218006&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

After
https://www.internalfb.com/intern/diffing/?paste_number=2029162294&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

Left is gm before regional_inductor, right is after.

cc @ezyang @EikanWang @jgong5 @wenzhe-nrv

@pytorch-bot
Copy link

pytorch-bot bot commented Nov 8, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/167397

Note: Links to docs will display an error until the docs builds have been completed.

⏳ 2 Pending, 1 Unrelated Failure

As of commit 059d973 with merge base c6ae757 (image):

UNSTABLE - The following jobs are marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the release notes: fx release notes category label Nov 8, 2025
SherlockNoMad added a commit that referenced this pull request Nov 8, 2025
ghstack-source-id: 537c359
Pull Request resolved: #167397
@SherlockNoMad SherlockNoMad added the topic: not user facing topic category label Nov 8, 2025
erase_nodes(gm, sorted_nodes)

# topological sort original gm with newly created sub_gm
legalize_graph(gm)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

legalize_graph performs a topo sort that shuffles the nodes is a global way, making the result unpredictable.

We should avoid this in graph pass in general.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option here is to have a version of legalize graph that does a stable sort

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@SherlockNoMad SherlockNoMad requested a review from ezyang November 8, 2025 01:50
@ezyang ezyang requested a review from eellison November 8, 2025 21:48
@ezyang
Copy link
Contributor

ezyang commented Nov 8, 2025

I support the motivation for the change but I'm not an expert on the original pass here so I can't review. I do see CI failures as well. Do you want me to try to page this in?

Copy link
Contributor

@eellison eellison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are doing manipulations by looking at DAG, you need to topo sort after, or do more work to keep it legalized while doing the matching. This is what we discussed in the meeting with @anijain2305.

Maybe stable_topological_sort will work better here ?

output_node = sub_gm.graph.output_node()

next_node = module_node.next
with gm.graph.inserting_before(next_node):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I sat down and understood the algorithm. The reason I think this doesn't work is because you can be in this situation:

input0 = ..
input1 = ...
output0 = ...
not_in_subgraph = f(output0)
input2 = ...
output1 = ...

Here, we DO need to do some code motion for the not_in_subgraph node, since it is "inside" the set of nodes we want to fuse and we need to push it to the end. But it's not necessarily obvious if we should push an arbitrary node in the middle of the fusion region to the end or the beginning. If instead it didn't depend on output0 but produced input2, then we would need to move it to the beginning. But I think you can make this work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed, the UT failure caught this case.

I changed the impl to use stable topo sort instead.

Copy link
Contributor

@ezyang ezyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

algo no work


legalize_graph() performs a topo sort that shuffles the nodes is a global way, making the result unpredictable. 
We should avoid this in graph pass in general. 

This problem is discovered when testing regional_inductor, a single fuse region trigger the global reordering. 

Before
https://www.internalfb.com/intern/diffing/?before_paste_number=2029217728&after_paste_number=2029218006&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

After
https://www.internalfb.com/intern/diffing/?paste_number=2029162294&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

Left is gm before regional_inductor, right is after. 

cc ezyang EikanWang jgong5 wenzhe-nrv

[ghstack-poisoned]
SherlockNoMad added a commit that referenced this pull request Nov 11, 2025
ghstack-source-id: 0f2e961
Pull Request resolved: #167397
@SherlockNoMad SherlockNoMad changed the title Avoid topological sort in fuse_by_partitions Use stable topological sort in fuse_by_partitions Nov 11, 2025
@SherlockNoMad SherlockNoMad requested a review from ezyang November 11, 2025 00:28
@SherlockNoMad
Copy link
Contributor Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 11, 2025
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@jeanschmidt
Copy link
Contributor

@pytorchbot revert -m "seems to be breaking executorch signals internally, see D86780724" -c ghfirst

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot added a commit that referenced this pull request Nov 12, 2025
This reverts commit 7886070.

Reverted #167397 on behalf of https://github.com/jeanschmidt due to seems to be breaking executorch signals internally, see D86780724 ([comment](#167397 (comment)))
@pytorchmergebot pytorchmergebot added the ci-no-td Do not run TD on this PR label Nov 12, 2025
Silv3S pushed a commit to Silv3S/pytorch that referenced this pull request Nov 18, 2025
legalize_graph() performs a topo sort that shuffles the nodes is a global way, making the result unpredictable.
We should avoid this in graph pass in general.

This problem is discovered when testing regional_inductor, a single fuse region trigger the global reordering.

Before
https://www.internalfb.com/intern/diffing/?before_paste_number=2029217728&after_paste_number=2029218006&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

After
https://www.internalfb.com/intern/diffing/?paste_number=2029162294&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

Left is gm before regional_inductor, right is after.

Pull Request resolved: pytorch#167397
Approved by: https://github.com/ezyang
Silv3S pushed a commit to Silv3S/pytorch that referenced this pull request Nov 18, 2025
…7397)"

This reverts commit 7886070.

Reverted pytorch#167397 on behalf of https://github.com/jeanschmidt due to seems to be breaking executorch signals internally, see D86780724 ([comment](pytorch#167397 (comment)))

legalize_graph() performs a topo sort that shuffles the nodes is a global way, making the result unpredictable. 
We should avoid this in graph pass in general. 

This problem is discovered when testing regional_inductor, a single fuse region trigger the global reordering. 

Before
https://www.internalfb.com/intern/diffing/?before_paste_number=2029217728&after_paste_number=2029218006&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

After
https://www.internalfb.com/intern/diffing/?paste_number=2029162294&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

Left is gm before regional_inductor, right is after. 

cc ezyang EikanWang jgong5 wenzhe-nrv

[ghstack-poisoned]
SherlockNoMad added a commit that referenced this pull request Nov 24, 2025
ghstack-source-id: 810da0b
Pull Request resolved: #167397

legalize_graph() performs a topo sort that shuffles the nodes is a global way, making the result unpredictable. 
We should avoid this in graph pass in general. 

This problem is discovered when testing regional_inductor, a single fuse region trigger the global reordering. 

Before
https://www.internalfb.com/intern/diffing/?before_paste_number=2029217728&after_paste_number=2029218006&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

After
https://www.internalfb.com/intern/diffing/?paste_number=2029162294&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

Left is gm before regional_inductor, right is after. 

cc ezyang EikanWang jgong5 wenzhe-nrv

[ghstack-poisoned]
SherlockNoMad added a commit that referenced this pull request Nov 25, 2025
ghstack-source-id: 9d46e18
Pull Request resolved: #167397
@SherlockNoMad
Copy link
Contributor Author

@pytorchbot rebase


legalize_graph() performs a topo sort that shuffles the nodes is a global way, making the result unpredictable. 
We should avoid this in graph pass in general. 

This problem is discovered when testing regional_inductor, a single fuse region trigger the global reordering. 

Before
https://www.internalfb.com/intern/diffing/?before_paste_number=2029217728&after_paste_number=2029218006&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

After
https://www.internalfb.com/intern/diffing/?paste_number=2029162294&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

Left is gm before regional_inductor, right is after. 

cc ezyang EikanWang jgong5 wenzhe-nrv

[ghstack-poisoned]
SherlockNoMad added a commit that referenced this pull request Nov 26, 2025
ghstack-source-id: 91b23a4
Pull Request resolved: #167397

legalize_graph() performs a topo sort that shuffles the nodes is a global way, making the result unpredictable. 
We should avoid this in graph pass in general. 

This problem is discovered when testing regional_inductor, a single fuse region trigger the global reordering. 

Before
https://www.internalfb.com/intern/diffing/?before_paste_number=2029217728&after_paste_number=2029218006&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

After
https://www.internalfb.com/intern/diffing/?paste_number=2029162294&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

Left is gm before regional_inductor, right is after. 

cc ezyang EikanWang jgong5 wenzhe-nrv

[ghstack-poisoned]
SherlockNoMad added a commit that referenced this pull request Dec 3, 2025
ghstack-source-id: d56de85
Pull Request resolved: #167397

legalize_graph() performs a topo sort that shuffles the nodes is a global way, making the result unpredictable. 
We should avoid this in graph pass in general. 

This problem is discovered when testing regional_inductor, a single fuse region trigger the global reordering. 

Before
https://www.internalfb.com/intern/diffing/?before_paste_number=2029217728&after_paste_number=2029218006&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

After
https://www.internalfb.com/intern/diffing/?paste_number=2029162294&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

Left is gm before regional_inductor, right is after. 

cc ezyang EikanWang jgong5 wenzhe-nrv

[ghstack-poisoned]
@SherlockNoMad SherlockNoMad requested a review from albanD as a code owner December 3, 2025 15:53
SherlockNoMad added a commit that referenced this pull request Dec 3, 2025
ghstack-source-id: 5693a83
Pull Request resolved: #167397
@SherlockNoMad
Copy link
Contributor Author

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

"get_node_target",
"is_node_output_tensor",
"legalize_graph",
"stable_topological_sort",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New entries here are not ok. Please make sure to either properly document the function on the website or make it private by prepending an _

"Tuple",
"compatibility",
"legalize_graph",
"stable_topological_sort",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New entries here are not ok either. Make the API properly public.

@SherlockNoMad
Copy link
Contributor Author

I would prefer we don't revert this, the forward fix for D88319332 is #169541

JacobSzwejbka pushed a commit that referenced this pull request Dec 8, 2025
legalize_graph() performs a topo sort that shuffles the nodes is a global way, making the result unpredictable.
We should avoid this in graph pass in general.

This problem is discovered when testing regional_inductor, a single fuse region trigger the global reordering.

Before
https://www.internalfb.com/intern/diffing/?before_paste_number=2029217728&after_paste_number=2029218006&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

After
https://www.internalfb.com/intern/diffing/?paste_number=2029162294&regex_remove_pattern=&enable_regex_remove=0&strip_empty_lines=0&line_wrap=0&selected_tab=plain_diff

Left is gm before regional_inductor, right is after.

Pull Request resolved: #167397
Approved by: https://github.com/ezyang
JacobSzwejbka pushed a commit that referenced this pull request Dec 8, 2025
@github-actions github-actions bot deleted the gh/SherlockNoMad/18/head branch January 4, 2026 02:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci-no-td Do not run TD on this PR ciflow/trunk Trigger trunk jobs on your pull request fx Merged release notes: fx release notes category Reverted topic: not user facing topic category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants