
[inductor] make mix order reduction work with dynamic shapes #168117

Closed

shunting314 wants to merge 8 commits into gh/shunting314/266/base from gh/shunting314/266/head

Conversation

@shunting314 shunting314 (Contributor) commented Nov 18, 2025

shunting314 added a commit that referenced this pull request Nov 18, 2025

pytorch-bot bot commented Nov 18, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/168117

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 8ab71d2 with merge base 7392106:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

shunting314 added a commit that referenced this pull request Nov 18, 2025
@shunting314 shunting314 marked this pull request as draft November 18, 2025 22:27
@shunting314 (Contributor Author)

Need a bit more work to properly print dynamic shape expressions

@PaulZhang12 (Contributor)

@PaulZhang12 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@pytorch-bot pytorch-bot bot added the ciflow/trunk label Nov 18, 2025
shunting314 added a commit that referenced this pull request Nov 18, 2025
shunting314 added a commit that referenced this pull request Nov 19, 2025
shunting314 added a commit that referenced this pull request Nov 19, 2025
@shunting314 shunting314 marked this pull request as ready for review November 19, 2025 00:45

Internal models have dynamic shapes for rms/layer norm. Mix order reduction would be bypassed without this PR.

Co-authored with Paul Zhang.
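
As a rough illustration (a hypothetical sketch, not from this PR's test suite; the shapes, names, and device choice are assumptions), the pattern is an RMSNorm-style reduction whose row count is dynamic while the reduction size is a fixed model parameter:

```python
# Hypothetical repro sketch: the row (batch) dimension is dynamic, the
# reduction dimension is static. Without this PR, Inductor would bypass
# mix order reduction for such dynamic-shape inputs.
import torch

def rms_norm(x, weight, eps=1e-6):
    # row-wise reduction over the last (static) dimension
    rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)
    return x * rms * weight

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(128, 1024, device=device)
w = torch.randn(1024, device=device)
torch._dynamo.mark_dynamic(x, 0)  # mark the row count as dynamic
out = torch.compile(rms_norm)(x, w)
```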


cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 amjames chauhang aakhundov coconutruben

Differential Revision: [D87378475](https://our.internmc.facebook.com/intern/diff/D87378475)

[ghstack-poisoned]
shunting314 added a commit that referenced this pull request Nov 19, 2025
@PaulZhang12 (Contributor)

@PaulZhang12 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

1 similar comment

# since we already add guards in MixOrderReduction.can_fuse
# for dynamic shapes.
if not V.graph.sizevars.statically_known_gt(
numel,
Contributor

I don't believe this is true. It can still be the case that this is always false and we have infinite recursion. Better to use the hint here?

Contributor

Same with the assert below that numel is greater than rnumel.


shunting314 added a commit that referenced this pull request Nov 19, 2025

@shunting314 shunting314 requested a review from v0i0 November 20, 2025 01:17
@PaulZhang12 (Contributor)

@PaulZhang12 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

numel,
rnumel,
)
assert V.graph.sizevars.evaluate_expr(sympy.Gt(numel, rnumel))
Contributor

Just for my information, as someone who's new: how are those expressions different? I guess one doesn't force guarding?

Contributor Author

Right. For statically_known_gt, we return False if we cannot decide due to dynamic shapes. For evaluate_expr, we add a guard and recompile if the guard later fails.
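
A tiny sketch of the difference (a hypothetical helper using the sizevar calls from the snippet above; it only makes sense inside an Inductor lowering where V.graph is set):

```python
import sympy
from torch._inductor.virtualized import V

def mix_order_ok(numel, rnumel):
    # statically_known_gt is a pure query: it returns False when the relation
    # cannot be proven from known facts and never installs a guard.
    if V.graph.sizevars.statically_known_gt(numel, rnumel):
        return True
    # evaluate_expr decides the relation from the shape env and installs a
    # guard, so an input that later violates numel > rnumel triggers a
    # recompile instead of silently taking the wrong branch.
    return V.graph.sizevars.evaluate_expr(sympy.Gt(numel, rnumel))
```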

@shunting314 shunting314 added the topic: not user facing label Nov 20, 2025
@shunting314 (Contributor Author)

@pytorchbot merge

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

numel,
rnumel,
):
if not V.graph.sizevars.evaluate_expr(sympy.Gt(numel, rnumel)):
Contributor

Do we not want to guard here?

@shunting314 shunting314 (Contributor Author) Nov 21, 2025

I'm a bit conservative about adding guards. In the real use cases we see, numel may have a dynamic shape (e.g. jagged inputs), but rnumel always has a static shape determined by model parameters.

if (
shared_data_score < config.score_fusion_memory_threshold
) and not MixOrderReduction.can_fuse(node1, node2):
if MixOrderReduction.can_fuse(node1, node2):
Contributor

For my knowledge: why do we not care about shared data?

Contributor Author

score_fusion_memory returns 0 for MixOrderReduction (the next PR fixes that), so this PR skips the check. Not a big deal, since mix order reduction always has decent shared data (otherwise we don't fuse).

# Call evaluate_expr rather than statically_known_geq since nrow can
# have dynamic shape in real models.
# Don't use hint directly since hint can be non-representative.
if not V.graph.sizevars.evaluate_expr(sympy.Ge(nrow * ncol, size_thres)):
Contributor

Similarly, wondering why not to guard?

Contributor Author

This one does add a guard. nrow may contain dynamic shapes.
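
For context, a small sketch of the three options touched on in the code comment above (hypothetical helper, assuming the torch._inductor sizevar API; it only applies inside a lowering where V.graph is set):

```python
import sympy
from torch._inductor.virtualized import V

def big_enough(nrow, ncol, size_thres):
    sv = V.graph.sizevars
    # size_hint would use the example value only: no guard, but the hint can
    # be non-representative of later runtime shapes.
    #   sv.size_hint(nrow * ncol) >= size_thres
    # statically_known_geq would prove the bound for all shapes or give up
    # (return False): it never guards, so a dynamic nrow disables the fusion.
    #   sv.statically_known_geq(nrow * ncol, size_thres)
    # evaluate_expr decides from the current shapes and installs a guard,
    # recompiling if a later input violates it. This is what the snippet
    # above does.
    return sv.evaluate_expr(sympy.Ge(nrow * ncol, size_thres))
```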

JacobSzwejbka pushed a commit that referenced this pull request Dec 8, 2025
Internal models have dynamic shapes for rms/layer norm. Mix order reduction would be bypassed without this PR.

Co-authored with Paul Zhang.

Differential Revision: [D87378475](https://our.internmc.facebook.com/intern/diff/D87378475)
Pull Request resolved: #168117
Approved by: https://github.com/v0i0, https://github.com/jansel
@github-actions github-actions bot deleted the gh/shunting314/266/head branch December 21, 2025 02:20