
Thread deterministic config vars to subproc compilation #165729

Closed
drisspg wants to merge 5 commits into gh/drisspg/210/base from gh/drisspg/210/head

Conversation

@drisspg
Contributor

@drisspg drisspg commented Oct 17, 2025

Stack from ghstack (oldest at bottom):

Summary

TIL (AFTER WAYYYY TOO MUCH INSANITY) that we do not serialize the full set of configs for subproc compilation.

I found this while working on Flex-attention determinism: meta-pytorch/attention-gym#168

It might be good to audit whether we need to thread through any more.
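
For readers unfamiliar with the failure mode, here is a minimal standalone sketch (plain Python, not Inductor code) of why a config value set in the parent process is invisible to a spawned compile worker unless it is explicitly serialized and re-applied; the `Config` class and field name are made up for illustration.

```python
# Minimal standalone illustration of the failure mode; Config and its field are
# made up and have nothing to do with torch._inductor.config internals.
import multiprocessing as mp


class Config:
    deterministic = False  # module-level default


def compile_in_worker(overrides):
    # Without explicitly re-applying overrides, a spawned worker re-imports the
    # module and only ever sees the defaults above.
    for name, value in overrides.items():
        setattr(Config, name, value)
    return Config.deterministic


if __name__ == "__main__":
    Config.deterministic = True  # parent-side override, e.g. set by the user
    ctx = mp.get_context("spawn")
    with ctx.Pool(1) as pool:
        # Forgetting to thread the override through: worker reports the default.
        print(pool.apply(compile_in_worker, ({},)))
        # Threading it through explicitly: worker sees the parent's value.
        print(pool.apply(compile_in_worker, ({"deterministic": True},)))
```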

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @mlazos

[ghstack-poisoned]
drisspg added a commit that referenced this pull request Oct 17, 2025
ghstack-source-id: 5c708f7
Pull-Request: #165729
@pytorch-bot

pytorch-bot bot commented Oct 17, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/165729

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 477046a with merge base 36371b8:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

[ghstack-poisoned]
drisspg added a commit that referenced this pull request Oct 17, 2025
ghstack-source-id: de237d3
Pull-Request: #165729
[ghstack-poisoned]
drisspg added a commit that referenced this pull request Oct 17, 2025
ghstack-source-id: 66e43b1
Pull-Request: #165729
[ghstack-poisoned]
drisspg added a commit that referenced this pull request Oct 17, 2025
ghstack-source-id: 5721e64
Pull-Request: #165729
@drisspg drisspg marked this pull request as ready for review October 17, 2025 17:54
@drisspg drisspg changed the title determnistic mode debugging Thread deterministic config vars to subproc compilation Oct 17, 2025
@shunting314
Contributor

Config overrides are usually added here: https://github.com/pytorch/pytorch/blob/main/torch/_inductor/codegen/triton.py#L4764

There are already multiple settings there.
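
As a rough sketch of that pattern (the field and key names below are illustrative, not the exact ones used in triton.py): the config value is snapshotted into the inductor_meta dict at codegen time, and downstream code reads the dict instead of the live torch._inductor.config module, so the value survives the trip to the compile subprocess.

```python
# Illustrative sketch only; the field/key names are placeholders.
from torch._inductor import config


def build_inductor_meta() -> dict:
    # Snapshot the config values the autotuner needs so the subprocess reads
    # them from this serialized dict, not from torch._inductor.config (which
    # may hold stale defaults in the child process).
    return {
        "deterministic": getattr(config, "deterministic", False),
    }


def filter_configs_for_determinism(configs, inductor_meta):
    # Downstream code consults the dict, never the live config module.
    if inductor_meta.get("deterministic", False):
        return configs[:1]  # e.g. pin to a single candidate config
    return configs
```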

[ghstack-poisoned]
drisspg added a commit that referenced this pull request Oct 17, 2025
ghstack-source-id: 53e69ea
Pull-Request: #165729
@drisspg
Contributor Author

drisspg commented Oct 17, 2025

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) Oct 17, 2025
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

pytorchmergebot pushed a commit that referenced this pull request Oct 19, 2025
@@ -2973,7 +2973,7 @@ def filter_reduction_configs_for_determinism(
def _do_filter_due_to_inductor_config():
return (
Contributor

Can we do anything defensively to avoid this happening again? E.g., factor out the torch._inductor.config accesses in this file into a helper that asserts the attribute is among the fields we patch through to the subprocess?
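
One possible shape for that guard, sketched under the assumption that we maintain an explicit allow-list of threaded fields (the helper name and list contents below are hypothetical):

```python
# Hypothetical guard, not the actual fix: all config reads in this file would
# go through get_subproc_safe_config(), which rejects fields that are not part
# of the set serialized for the compile subprocess.
from torch._inductor import config

# Placeholder entries; this would have to be kept in sync with whatever is
# actually patched through to the subprocess.
_FIELDS_THREADED_TO_SUBPROC = frozenset({"deterministic", "max_autotune"})


def get_subproc_safe_config(name: str):
    assert name in _FIELDS_THREADED_TO_SUBPROC, (
        f"torch._inductor.config.{name} is read here but is not threaded to "
        f"the compile subprocess; add it to the serialized fields first."
    )
    return getattr(config, name)
```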

Contributor Author

cc @shunting314

I like this idea and agree we should do something.

Contributor

@shunting314 shunting314 Oct 20, 2025

I think it's not hard to go through the file and make sure there is no direct access to torch._inductor.config (e.g., access the values through the inductor_meta dict), but it would be hard to:

  1. prevent future code in this file from accessing the inductor config directly, and
  2. prevent the code this file relies on from accessing the inductor config directly; those accesses are harder to find.

Alternatively, maybe we should just detect and patch every config override when submitting a compilation task to the child process.
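
A sketch of that alternative, assuming the config defaults are available as a dict somewhere (the helper names are hypothetical; only config.patch itself is an existing context manager): diff the live config against its defaults in the parent, ship the non-default overrides with the task, and re-apply them around the compile in the child.

```python
# Hypothetical sketch of "patch every override in the child"; only config.patch
# is an existing API here, the rest is illustrative.
from torch._inductor import config


def snapshot_overrides(defaults: dict) -> dict:
    # Collect every field whose current value differs from its default.
    return {
        name: getattr(config, name)
        for name, default in defaults.items()
        if getattr(config, name, default) != default
    }


def run_task_in_child(task, overrides: dict):
    # Re-apply the parent's overrides for the duration of the compile task.
    with config.patch(overrides):
        return task()
```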

Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Oct 21, 2025
Pull Request resolved: pytorch#165729
Approved by: https://github.com/shunting314, https://github.com/eellison
Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Oct 21, 2025
zhudada0120 pushed a commit to zhudada0120/pytorch that referenced this pull request Oct 22, 2025
zhudada0120 pushed a commit to zhudada0120/pytorch that referenced this pull request Oct 22, 2025
@github-actions github-actions bot deleted the gh/drisspg/210/head branch November 20, 2025 02:13
