[inductor] make mix order reduction work with dynamic shapes #168117
shunting314 wants to merge 8 commits into gh/shunting314/266/base
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/168117
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure) As of commit 8ab71d2 with merge base 7392106. UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Need a bit more work to properly print dynamic shape expressions.
@PaulZhang12 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
torch/_inductor/codegen/simd.py (outdated)

    # since we already add guards in MixOrderReduction.can_fuse
    # for dynamic shapes.
    if not V.graph.sizevars.statically_known_gt(
        numel,
I don't believe this is true. It can still be the case that this is always false, and then we have infinite recursion. Better to use the hint here?
Same with the assert below that numel is greater than rnumel.
        numel,
        rnumel,
    )
    assert V.graph.sizevars.evaluate_expr(sympy.Gt(numel, rnumel))
Just for my information, as someone who's new: how are those expressions different? I guess one doesn't force guarding?
Right. For statically_known_gt, we return False if we cannot decide due to dynamic shapes. For evaluate_expr, we add a guard and recompile if the guard later fails.
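For readers new to sizevars, here is a minimal self-contained sketch of the two behaviors in plain sympy; this is not the actual SizeVarAllocator implementation, and the symbol names and numbers are illustrative only:

    import sympy

    s0 = sympy.Symbol("s0", positive=True, integer=True)  # a dynamic dim
    guards = []

    def statically_known_gt(lhs, rhs):
        # True only when lhs > rhs is provable for every value the symbols
        # may take; undecidable comparisons return False, and no guard is added.
        return (lhs - rhs).is_positive is True

    def evaluate_expr(expr, hints):
        # Decide using the runtime hint, then record the outcome as a guard
        # so the compiled code is recompiled if the answer ever changes.
        result = bool(expr.subs(hints))
        guards.append((expr, result))
        return result

    print(statically_known_gt(2 * s0, s0))   # True: provable, no guard needed
    print(statically_known_gt(s0, 128))      # False: undecidable, no guard
    print(evaluate_expr(sympy.Gt(s0, 128), {s0: 4096}))  # True, guard recorded
    print(guards)                            # [(s0 > 128, True)]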
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
        numel,
        rnumel,
    ):
        if not V.graph.sizevars.evaluate_expr(sympy.Gt(numel, rnumel)):
Do we not want to guard here?
I'm a bit conservative about adding guards. In the real use cases we see, numel may have a dynamic shape (e.g., a jagged form of inputs), but rnumel is always a static shape decided by model parameters.
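For illustration, a hypothetical RMS-norm setup matching that description (sizes made up): the row count (numel) comes from a dynamic batch dimension, while the reduction length (rnumel) is a fixed model parameter:

    import torch

    hidden = 1024                         # rnumel: static, a model parameter

    def rms_norm(x, w, eps=1e-6):
        # reduce only over the last, statically sized dimension
        scale = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)
        return x * scale * w

    x = torch.randn(512, hidden)          # 512 rows: numel, varies per input
    w = torch.ones(hidden)
    torch._dynamo.mark_dynamic(x, 0)      # mark the row dimension dynamic
    out = torch.compile(rms_norm)(x, w)   # numel is symbolic, rnumel == 1024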
    if (
        shared_data_score < config.score_fusion_memory_threshold
    ) and not MixOrderReduction.can_fuse(node1, node2):
    if MixOrderReduction.can_fuse(node1, node2):
For my knowledge: why do we not care about shared data?
score_fusion_memory returns 0 for MixOrderReduction (the next PR fixes that), so this PR skips the check. Not a big deal, since we know mix order reductions always have decent shared data (otherwise we don't fuse).
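A minimal sketch of the control flow being described, with hypothetical names rather than the actual scheduler code:

    def should_fuse(shared_data_score, threshold, is_mix_order_candidate):
        # score_fusion_memory currently reports 0 shared data for mix order
        # reduction pairs, so the threshold test would always reject them;
        # bypass it for those candidates, which are known to share plenty
        # of data (otherwise they would not be fusion candidates at all).
        if shared_data_score < threshold:
            return is_mix_order_candidate
        return True

    assert should_fuse(0, 10, is_mix_order_candidate=True)       # bypassed
    assert not should_fuse(0, 10, is_mix_order_candidate=False)  # rejected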
    # Call evaluate_expr rather than statically_known_geq since nrow can
    # have dynamic shape in real models.
    # Don't use hint directly since hint can be non-representative.
    if not V.graph.sizevars.evaluate_expr(sympy.Ge(nrow * ncol, size_thres)):
Similarly, wondering why not to guard here?
This one does add a guard. nrow may contain dynamic shapes.
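To see why a raw hint can mislead where a guard stays safe, a small sketch in plain sympy with made-up numbers:

    import sympy

    nrow = sympy.Symbol("s0", positive=True, integer=True)  # dynamic
    ncol, size_thres = 1024, 1 << 17                        # static

    # Hint-only decision: correct for the sample input that produced the
    # hint, but silently wrong when a later input has a much larger nrow.
    hint = {nrow: 16}
    big_enough = bool(sympy.Ge(nrow * ncol, size_thres).subs(hint))  # False

    # A guarded decision (what evaluate_expr does) takes the same answer
    # but records nrow * 1024 < 131072 as a guard, so a later input with
    # nrow = 4096 fails the guard and triggers a recompile instead of
    # silently running a kernel chosen for the wrong size.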
Internal models have dynamic shapes for rms/layer norm. Mix order reduction would be bypassed without this PR. Co-authored with Paul Zhang.
Differential Revision: [D87378475](https://our.internmc.facebook.com/intern/diff/D87378475)
Pull Request resolved: #168117
Approved by: https://github.com/v0i0, https://github.com/jansel
Stack from ghstack (oldest at bottom):
Internal models have dynamic shapes for rms/layer norm. Mix order reduction would be bypassed without this PR.
Co-authored with Paul Zhang.
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben
Differential Revision: D87378475