Skip to content

Conversation

@kshitij12345
Copy link
Collaborator

@kshitij12345 kshitij12345 commented Nov 11, 2022

@pytorch-bot
Copy link

pytorch-bot bot commented Nov 11, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/88913

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 Failures

As of commit fa63eaf:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@kshitij12345 kshitij12345 marked this pull request as ready for review November 25, 2022 15:53
@kshitij12345 kshitij12345 changed the title [DO-NOT-MERGE][follow-up] Python Attr Serialization [follow-up] Python Attr Serialization Nov 25, 2022
@kshitij12345
Copy link
Collaborator Author

@albanD do you think it would be good to merge this now? Is this past FC period?

@kshitij12345
Copy link
Collaborator Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 28, 2022
@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: PR #88913 has not been reviewed yet (Rule superuser)

Details for Dev Infra team Raised by workflow job

@kshitij12345
Copy link
Collaborator Author

Oops, Sorry! Wrong PR 😓🙏

Also, ping @albanD

Copy link
Collaborator

@albanD albanD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@albanD
Copy link
Collaborator

albanD commented Nov 29, 2022

@pytorchbot merge -g

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks on your PR pass since you used the green (-g) flag (ETA: 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: 2 additional jobs have failed, first few of them are: trunk ,trunk / cuda11.6-py3.10-gcc7-sm86 / test (default, 1, 4, linux.g5.4xlarge.nvidia.gpu)

Details for Dev Infra team Raised by workflow job

@kshitij12345
Copy link
Collaborator Author

pytorchbot merge -f"Unrelated CI failure: RuntimeError: test_jit_cuda_fuser failed! Received signal: SIGSEGV"

@kshitij12345
Copy link
Collaborator Author

@pytorchbot merge -f"Unrelated CI failure: RuntimeError: test_jit_cuda_fuser failed! Received signal: SIGSEGV"

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@facebook-github-bot
Copy link
Contributor

@pytorchbot revert -m="Diff reverted internally" -c="ghfirst"

This Pull Request has been reverted by a revert inside Meta. To re-land this change, please open another pull request, assign the same reviewers, fix the CI failures that caused the revert and make sure that the failing CI runs on the PR by applying the proper ciflow label (e.g., ciflow/trunk).)

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

@pytorchmergebot
Copy link
Collaborator

@kshitij12345 your PR has been successfully reverted.

pytorchmergebot added a commit that referenced this pull request Dec 2, 2022
This reverts commit 086b251.

Reverted #88913 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally
@mehtanirav
Copy link
Contributor

@kshitij12345 Unfortunately this PR had to been reverted as the earlier changes haven't been released yet to production owing to the code freeze. @singlaiiit can add more context if I missed anything.

@albanD albanD reopened this Dec 5, 2022
@kshitij12345
Copy link
Collaborator Author

For now, I have updated _acc_grad and _optimizer_hook_handles to be mapped to a parameter with a WeakTensorKeyDictionary (will follow-up in a separate PR for other attributes if it is deemed to be worth avoiding their serialisation).

The distributed tests are passing. Only failure in the Windows CI is unrelated. torch_test_cpp_extension.cpp fails with following error

2023-01-12T12:51:45.3403220Z FAILED: C:/actions-runner/_work/pytorch/pytorch/test/cpp_extensions/build/temp.win-amd64-cpython-39/Release/extension.obj 
2023-01-12T12:51:45.3406353Z cl /showIncludes /nologo /O2 /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc -IC:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\include -IC:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\include\torch\csrc\api\include -IC:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\include\TH -IC:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\include\THC -IC:\actions-runner\_work\pytorch\pytorch\test\cpp_extensions\self_compiler_include_dirs_test -IC:\Jenkins\Miniconda3\include -IC:\Jenkins\Miniconda3\Include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.28.29333\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c C:\actions-runner\_work\pytorch\pytorch\test\cpp_extensions\extension.cpp /FoC:\actions-runner\_work\pytorch\pytorch\test\cpp_extensions\build\temp.win-amd64-cpython-39\Release\extension.obj /sdl /permissive- -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=cpp -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++17
2023-01-12T12:51:45.3409533Z C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\include\c10/macros/Macros.h(138): warning C4067: unexpected tokens following preprocessor directive - expected a newline
2023-01-12T12:51:45.3410543Z C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t<T>': destructor was implicitly defined as deleted
2023-01-12T12:51:45.3411050Z         with
2023-01-12T12:51:45.3411272Z         [
2023-01-12T12:51:45.3411516Z             T=c10::SymInt
2023-01-12T12:51:45.3411752Z         ]

@kshitij12345
Copy link
Collaborator Author

Ping @albanD @awgu for another review. Thanks :)

Copy link
Collaborator

@albanD albanD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty clean!
Sounds good to me!

@kshitij12345
Copy link
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: The following mandatory check(s) failed (Rule superuser):

Dig deeper by viewing the failures on hud

Details for Dev Infra team Raised by workflow job

@kshitij12345
Copy link
Collaborator Author

@pytorchbot merge -f"Unrelated Windows CI failure"

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@albanD albanD added the module: bc-breaking Related to a BC-breaking change label Jan 19, 2023
@pytorch-bot pytorch-bot bot added the topic: bc breaking topic category label Jan 19, 2023
fegin added a commit that referenced this pull request Feb 10, 2023
…pickling errors

After #88913, user-defined parameter states will be pickled. For a FlatParameter, this means `_local_shard` will also be pickled. Since state_dict and load_state_dict only require the tensor, returning the full FlatParameter does not give us any extra benefit. This PR changes the behavior to simply return a view of the FlatParameter.

Differential Revision: [D43205127](https://our.internmc.facebook.com/intern/diff/D43205127/)

[ghstack-poisoned]
fegin added a commit that referenced this pull request Feb 12, 2023
…of FlatParameters to avoid pickling errors"

After #88913, user-defined parameter states will be pickled. For a FlatParameter, this means `_local_shard` will also be pickled. Since state_dict and load_state_dict only require the tensor, returning the full FlatParameter does not give us any extra benefit. This PR changes the behavior to simply return a view of the FlatParameter.

Differential Revision: [D43205127](https://our.internmc.facebook.com/intern/diff/D43205127/)

[ghstack-poisoned]
fegin added a commit that referenced this pull request Feb 12, 2023
…pickling errors

Pull Request resolved: #94637

After #88913, user-defined parameter states will be pickled. For a FlatParameter, this means `_local_shard` will also be pickled. Since state_dict and load_state_dict only require the tensor, returning the full FlatParameter does not give us any extra benefit. This PR changes the behavior to simply return a view of the FlatParameter.
ghstack-source-id: 179983735

Differential Revision: [D43205127](https://our.internmc.facebook.com/intern/diff/D43205127/)
fegin added a commit that referenced this pull request Feb 12, 2023
…s to avoid pickling errors"

After #88913, user-defined parameter states will be pickled. For a FlatParameter, this means `_local_shard` will also be pickled. Since state_dict and load_state_dict only require the tensor, returning the full FlatParameter does not give us any extra benefit. This PR changes the behavior to simply return a view of the FlatParameter.

Differential Revision: [D43205127](https://our.internmc.facebook.com/intern/diff/D43205127/)

[ghstack-poisoned]
pytorchmergebot pushed a commit that referenced this pull request Feb 12, 2023
…pickling errors (#94637)

After #88913, user-defined parameter states will be pickled. For a FlatParameter, this means `_local_shard` will also be pickled. Since state_dict and load_state_dict only require the tensor, returning the full FlatParameter does not give us any extra benefit. This PR changes the behavior to simply return a view of the FlatParameter.

Differential Revision: [D43205127](https://our.internmc.facebook.com/intern/diff/D43205127/)
Pull Request resolved: #94637
Approved by: https://github.com/rohan-varma
awgu pushed a commit that referenced this pull request Feb 13, 2023
This reverts commit 745fe35.

[ghstack-poisoned]
awgu pushed a commit that referenced this pull request Feb 13, 2023
This reverts commit 745fe35.

ghstack-source-id: 47fc202
Pull Request resolved: #94741
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk Trigger trunk jobs on your pull request Merged module: bc-breaking Related to a BC-breaking change open source Reverted topic: bc breaking topic category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants