-
Notifications
You must be signed in to change notification settings - Fork 26.3k
[Environment Variable][1/N] Use thread-safe env variable API in c10 #119449
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/119449
Note: Links to docs will display an error until the docs builds have been completed. ❌ 9 New FailuresAs of commit 6531b55 with merge base 6e10f7d ( NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
albanD
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very cool!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please write a better description at what you are trying to achieve here?
I haven't read the standard, but being familiar with the implementation details, I don't see why getenv on its own is not thread safe.
It is probably unsafe when there are a concurrent putenv calls, but you are not adding that one to your env utils.
[Edit] To quote https://en.cppreference.com/w/cpp/utility/program/getenv looks like it is part of the standard:
This function is thread-safe (calling it from multiple threads does not introduce a data race) as long as no other function modifies the host environment. In particular, the POSIX functions setenv(), unsetenv(), and putenv() would introduce a data race if called without synchronization.
[Edit2] Is this about suppressing some clang-tidy warnings or solving a real problem? If later, then solution is to provide thread-safe extern "C" implementation of those methods in the code, that would take over libc ones (all libc functions are weak symbols by design so that one can app can choose to override them) and provide thread safety from getenv calls fro say oneDNN, libcuda etc
|
I would expect that a user setting something on |
Yes, but the current implementation would do nothing to avoid it, as |
A specific reproducing example may be hard to create. Also note that Torch may be used in a pure multi-threading C++ environment. |
fa9f374 to
b760fdf
Compare
The purpose is to suppress clang-tidy warnings and solve possible future problems by providing thread-safe getenv and setenv. Overwriting libc implementations may not be a good idea because we have to support MSVC and other no GNU platforms so that I prefer to use explicit c10 API. |
|
@pytorchbot label "topic: not user facing" |
|
@pytorchbot label ciflow/trunk |
|
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as |
5e05925 to
e66b7e7
Compare
|
commenting that this is a real issue we've seen in practice pytorch/kineto#984 but problematic interactions might be more likely between PyTorch and third party libraries as above |
|
@pytorchbot rebase |
|
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
|
Successfully rebased |
1372467 to
4c70b74
Compare
3137eed to
24841d9
Compare
|
Rewrite code to avoid changing existing API. |
|
@pytorchmergebot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Merge failedReason: 5 jobs have failed, first few of them are: macos-arm64-binary-wheel / wheel-py3_9-cpu-build, macos-arm64-binary-wheel / wheel-py3_10-cpu-build, macos-arm64-binary-wheel / wheel-py3_12-cpu-build, macos-arm64-binary-wheel / wheel-py3_11-cpu-build, macos-arm64-binary-libtorch-cxx11-abi / libtorch-cpu-shared-with-deps-cxx11-abi-build Details for Dev Infra teamRaised by workflow job |
|
@pytorchbot rebase -b main |
|
@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here |
|
Successfully rebased |
24841d9 to
6531b55
Compare
|
@pytorchbot merge -f "Unrelated failures" |
Merge startedYour change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
…ytorch#119449) This PR is the beginning of attempts to wrap thread-unsafe getenv and set_env functions inside a RW mutex. Pull Request resolved: pytorch#119449 Approved by: https://github.com/malfet, https://github.com/albanD, https://github.com/eqy
This follows #119449 to make setenv thread-safe. Pull Request resolved: #124485 Approved by: https://github.com/eqy
This PR is the beginning of attempts to wrap thread-unsafe getenv and set_env functions inside a RW mutex.