
explicitly remove call_mod_node_to_replace after inlining the submodule in `const_fold._inline_module` #166871

Closed

jazlyn5 wants to merge 1 commit into pytorch:main from jazlyn5:export-D86056354

Conversation

@jazlyn5 (Contributor) commented Nov 3, 2025

Summary:
#166609 updated the `is_impure` check to also inspect the ops inside a subgraph when deciding whether a `call_module` node is pure or not.

This behavior change affects dead code elimination, commonly run as `gm.graph.eliminate_dead_code()`. Specifically, dead code elimination will not erase a node that has no users if that node has side effects or is impure. With the above-mentioned PR, dead code elimination no longer eliminates unused subgraphs that contain side-effectful ops.
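As a hedged illustration (not code from this PR), here is a minimal sketch of that rule using a hand-built FX graph; the exact set of ops treated as impure depends on the PyTorch version's `is_impure` defaults:

```
import operator

import torch
import torch.fx as fx

# One unused pure op and one unused random op, then dead code elimination.
g = fx.Graph()
x = g.placeholder("x")
g.call_function(operator.add, (x, 1))                  # no users, pure
g.call_function(torch.ops.aten.bernoulli.p, (x, 0.6))  # no users, but random/side-effectful
g.output(x)

g.eliminate_dead_code()
# Expected on recent PyTorch: the unused add is erased, while the bernoulli
# call survives because eliminate_dead_code skips nodes whose is_impure()
# returns True.
print(g)
```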

This affects `const_fold.split_const_subgraph`, which does the following (a usage sketch follows the list):

  1. split the graph into two submodules, one containing all const ops and one containing the non-const ops
  2. inline the submodule containing the non-const ops back into the main graph
  3. run dead code elimination to remove the now-unused non-const submodule
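For context, here is a minimal usage sketch of the pass; the toy module and call pattern below are illustrative assumptions (based on the public torch.fx.experimental.const_fold API), not code from this PR:

```
import torch
from torch.fx.experimental import const_fold

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.w = torch.nn.Parameter(torch.randn(4, 4))

    def forward(self, x):
        # `self.w + 1` depends only on module state, so it is const-foldable;
        # the matmul depends on `x` and stays in the non-const part.
        return x @ (self.w + 1)

gm = torch.fx.symbolic_trace(M())
folded = const_fold.split_const_subgraph(gm)  # steps 1-3 above happen here
folded.run_folding()                          # materializes _FX_CONST_FOLDED_ATTRS
print(folded.graph)
```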

With PR #166609, step 3 no longer erases the unused submodule. As an example, consider this exported graph:

 graph():
    %x : [num_users=2] = placeholder[target=x]
    %_guards_fn : [num_users=0] = call_module[target=_guards_fn](args = (%x,), kwargs = {})
    %empty_permuted : [num_users=1] = call_function[target=torch.ops.aten.empty_permuted.default](args = ([5, 10], [0, 1]), kwargs = {device: cpu, pin_memory: False})
    %bernoulli : [num_users=1] = call_function[target=torch.ops.aten.bernoulli.p](args = (%empty_permuted, 0.6), kwargs = {})
    %mul : [num_users=1] = call_function[target=torch.ops.aten.mul.Tensor](args = (%x, %bernoulli), kwargs = {})
    %div : [num_users=1] = call_function[target=torch.ops.aten.div.Tensor](args = (%mul, 0.6), kwargs = {})
    return (div,)

After running const_fold, `empty_permuted` is const-folded while the rest of the ops are not, and the main graph looks like:

graph():
    %x : [num_users=3] = placeholder[target=x]
    %_fx_const_folded_attrs : [num_users=2] = get_attr[target=_FX_CONST_FOLDED_ATTRS]
    %_guards_fn : [num_users=0] = call_module[target=_guards_fn](args = (%x,), kwargs = {})
    %bernoulli_p : [num_users=1] = call_function[target=torch.ops.aten.bernoulli.p](args = (%_fx_const_folded_attrs, 0.6), kwargs = {})
    %mul_tensor : [num_users=1] = call_function[target=torch.ops.aten.mul.Tensor](args = (%x, %bernoulli_p), kwargs = {})
    %div_tensor : [num_users=1] = call_function[target=torch.ops.aten.div.Tensor](args = (%mul_tensor, 0.6), kwargs = {})
    %submod_1 : [num_users=0] = call_module[target=submod_1](args = (%x, %_fx_const_folded_attrs), kwargs = {})
    return (div_tensor,)

`submod_1` is dangling and unused; its contents have already been inlined into the graph.

Fix

This PR updates the `const_fold._inline_module` function to explicitly remove the now-unused non-const submodule after its ops have been inlined into the main graph.
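In sketch form, the change amounts to an explicit erase at the end of `_inline_module`, mirroring the snippet under review further down (variable names follow that snippet):

```
# After the submodule's nodes have been copied into the main graph and the
# users of its output have been rerouted, drop the now-dangling call_module:
if len(call_mod_node_to_replace.users) == 0:
    gm.graph.erase_node(call_mod_node_to_replace)
```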

Test Plan:
Added a test in test_fx_const_fold.py.

The test would have failed before this PR because it yields the example graph above, leaving an unused `call_module[target=submod_1]` op.

With the PR, the module is correctly erased from the main graph.
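A hypothetical sketch of the kind of check such a test can perform (the module, names, and assertion here are illustrative only; the real test lives in test_fx_const_fold.py):

```
import torch
from torch.fx.experimental import const_fold

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.w = torch.nn.Parameter(torch.randn(5, 10))

    def forward(self, x):
        folded = self.w + 1                              # const-foldable
        mask = torch.bernoulli(torch.full_like(x, 0.6))  # random op, keeps the non-const part impure
        return (x + folded) * mask / 0.6

gm = torch.fx.symbolic_trace(M())
split = const_fold.split_const_subgraph(gm)

# Regression check: after inlining, no dangling call_module node
# (e.g. submod_1) should remain in the main graph.
assert all(n.op != "call_module" for n in split.graph.nodes)
```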

Differential Revision: D86056354

cc @ezyang @EikanWang @jgong5 @wenzhe-nrv

pytorch-bot bot commented Nov 3, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/166871

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ No Failures

As of commit 12a0499 with merge base aa4a8c9:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot bot added the release notes: fx (release notes category) label Nov 3, 2025
meta-codesync bot commented Nov 3, 2025

@jazlyn5 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D86056354.

jazlyn5 added a commit to jazlyn5/pytorch that referenced this pull request Nov 3, 2025
…le in const_fold._inline_module` (pytorch#166871)

Comment on lines 133 to 136
# Explicitly remove the module that was just inlined,
# there should not be any users, but check just in case.
if len(call_mod_node_to_replace.users) == 0:
    gm.graph.erase_node(call_mod_node_to_replace)
@blaine-rister (Contributor) commented Nov 3, 2025

Should we assert this if we expect it to always be true? Although that may be redundant, as I have a feeling gm.graph.erase_node will raise an error if the node being removed has users.

Suggested change
- # Explicitly remove the module that was just inlined,
- # there should not be any users, but check just in case.
- if len(call_mod_node_to_replace.users) == 0:
-     gm.graph.erase_node(call_mod_node_to_replace)
+ # Remove the module that was just inlined.
+ assert len(call_mod_node_to_replace.users) == 0, "Failed to erase inlined submodule because it is still in use!"
+ gm.graph.erase_node(call_mod_node_to_replace)

I guess it depends on whether the pass would still function properly if this were false. Would the transform be sound if we didn't erase the node, or would we end up with 2 copies of the module, one inlined and one outlined?

@jazlyn5 (Contributor, Author) replied:

So the transformed graph is still runnable if you don't remove the unused call_module node. But we certainly do expect to erase the node, per the logic in `_inline_module` and `_verify_const_fold_mod` in `test_fx_const_fold.py`.

> I have a feeling gm.graph.erase_node will raise an error if we try to remove a node with users

Yup, tested that this is correct: erasing a node that still has users will raise an error.

> Should we assert this if we expect it to always be true?

From my understanding and code read, yes, we expect len(call_mod_node_to_replace.users) == 0. I was being extra cautious with an if check just in case, since the graph is still runnable if we don't end up removing this module.
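For reference, a tiny sketch of the erase_node behavior discussed above (standard torch.fx behavior, not code from this PR):

```
import operator
import torch.fx as fx

g = fx.Graph()
x = g.placeholder("x")
y = g.call_function(operator.add, (x, 1))
g.output(y)  # the output node is a user of y

try:
    g.erase_node(y)  # erasing a node that still has users
except RuntimeError as e:
    # Expected: erase_node refuses and reports the remaining users.
    print(e)
```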

@blaine-rister (Contributor) left a comment

Mostly LGTM. Stamping this to unblock, but please see my comment below before merging.

pytorch-bot bot added the ciflow/trunk (Trigger trunk jobs on your pull request) label Nov 3, 2025
@jazlyn5 (Contributor, Author) commented Nov 3, 2025

@pytorchbot merge

@pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

pytorch-bot bot pushed a commit that referenced this pull request Nov 4, 2025
…le in const_fold._inline_module` (#166871)

Pull Request resolved: #166871
Approved by: https://github.com/blaine-rister, https://github.com/mlazos

Labels

ciflow/trunk, fb-exported, fx, Merged, meta-exported, release notes: fx
