c10d: retry dns lookup failures #74641
💊 CI failures summary and remediations
As of commit db8033b (more details on the Dr. CI page): ✅ None of the CI failures appear to be your fault 💚
🚧 1 fixed upstream failure: These were probably caused by upstream breakages that were already fixed.
This pull request was exported from Phabricator. Differential Revision: D35092284 |
Summary:
Pull Request resolved: pytorch#74641

This makes dns hostname lookup failures retryable since in some environments such as Kubernetes they're not guaranteed to be resolvable until the job starts. Retrying this eliminates the race condition.

This also fixes `sandcastle_skip_if` when used on the class instead of the method. Previously they wouldn't inherit from TestCase so just wouldn't run under buck at all.

Fixes pytorch#73682

Test Plan:
Added a unit test
```
buck test //caffe2/test/distributed:test_store
```

Reviewed By: aivanou
Differential Revision: D35092284
fbshipit-source-id: ab067f08a2cff6d6d7a17d3f1267226cb2f9293b
cbalioglu
left a comment
LGTM
Hey @d4l3k. |
Summary:
Pull Request resolved: #74641

This makes dns hostname lookup failures retryable since in some environments such as Kubernetes they're not guaranteed to be resolvable until the job starts. Retrying this eliminates the race condition.

This also fixes `sandcastle_skip_if` when used on the class instead of the method. Previously they wouldn't inherit from TestCase so just wouldn't run under buck at all.

Fixes #73682

Test Plan:
Added a unit test
```
buck test //caffe2/test/distributed:test_store
```

Reviewed By: aivanou
Differential Revision: D35092284
fbshipit-source-id: d40bf187e52c41f551e4fe41c536b2b0015588ee

(cherry picked from commit f890830)
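For readers unfamiliar with the class-versus-method issue in the commit message above, here is a rough sketch of a skip decorator that works on both. It is not the actual `sandcastle_skip_if` code: the name `skip_if_flag` and its parameters are hypothetical, and it relies only on the standard `unittest` module. The point is that a decorated class is returned as a class, so it still inherits from `TestCase` and is discovered by the test runner.

```python
import functools
import unittest


def skip_if_flag(flag, reason):
    """Hypothetical skip decorator usable on test methods and TestCase classes."""
    def decorator(obj):
        if isinstance(obj, type):
            # Class: unittest.skip marks the class and returns the same class,
            # so it remains a TestCase subclass that the runner can discover.
            return unittest.skip(reason)(obj) if flag else obj

        @functools.wraps(obj)
        def wrapper(*args, **kwargs):
            # Method: decide at call time and report the test as skipped.
            if flag:
                raise unittest.SkipTest(reason)
            return obj(*args, **kwargs)
        return wrapper
    return decorator


@skip_if_flag(True, "not supported in this environment")
class ExampleStoreTest(unittest.TestCase):
    def test_something(self):
        self.fail("should have been skipped")
```

Wrapping a class in a plain function, by contrast, would hand the runner a function object instead of a `TestCase` subclass, which matches the symptom described in the commit message.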
Summary:
This makes DNS hostname lookup failures retryable, since in some environments, such as Kubernetes, hostnames are not guaranteed to be resolvable until the job starts. Retrying the lookup eliminates the race condition.
Fixes #73682
Test Plan:
Added a unit test
Reviewed By: aivanou
Differential Revision: D35092284
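The retry idea described in the summary can be sketched roughly as follows. This is an illustration in Python only, not the c10d implementation from this PR: `resolve_with_retry` and its `timeout_s`, `interval_s`, `resolver`, `sleep`, and `clock` parameters are hypothetical names chosen for the example. It treats a failed lookup (`socket.gaierror`) as transient and keeps retrying until a deadline passes.

```python
import socket
import time


def resolve_with_retry(host, port, timeout_s=30.0, interval_s=1.0,
                       resolver=socket.getaddrinfo, sleep=time.sleep,
                       clock=time.monotonic):
    """Resolve host:port, retrying DNS failures until timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        try:
            return resolver(host, port, type=socket.SOCK_STREAM)
        except socket.gaierror:
            # The name may simply not be registered yet, for example when
            # a peer is still starting. Give up only once another wait
            # would go past the deadline.
            if clock() + interval_s > deadline:
                raise
            sleep(interval_s)
```

A caller might use `resolve_with_retry("worker-0", 29500)` before connecting, where the hostname and port are placeholders. The `resolver` and `sleep` parameters exist so the behavior can be exercised in a unit test without real DNS or real delays.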