-
Notifications
You must be signed in to change notification settings - Fork 26.3k
[Distributed] getNumKeys API to c10d TCPStore #43962
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
TCPStore needs a getNumKeys API for our logging needs. Differential Revision: [D22985085](https://our.internmc.facebook.com/intern/diff/D22985085/) **NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D22985085/)! [ghstack-poisoned]
💊 CI failures summary and remediationsAs of commit c857acf (more details on the Dr. CI page):
🕵️ 1 new failure recognized by patternsThe following CI failures do not appear to be due to upstream breakages:
|
TCPStore needs a getNumKeys API for our logging needs. Differential Revision: [D22985085](https://our.internmc.facebook.com/intern/diff/D22985085/) **NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D22985085/)! [ghstack-poisoned]
TCPStore needs a getNumKeys API for our logging needs. Differential Revision: [D22985085](https://our.internmc.facebook.com/intern/diff/D22985085/) **NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D22985085/)! [ghstack-poisoned]
TCPStore needs a getNumKeys API for our logging needs. Differential Revision: [D22985085](https://our.internmc.facebook.com/intern/diff/D22985085/) **NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D22985085/)! [ghstack-poisoned]
TCPStore needs a getNumKeys API for our logging needs. Differential Revision: [D22985085](https://our.internmc.facebook.com/intern/diff/D22985085/) **NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D22985085/)! [ghstack-poisoned]
Codecov Report
@@ Coverage Diff @@
## gh/osalpekar/79/base #43962 +/- ##
========================================================
+ Coverage 68.06% 68.10% +0.04%
========================================================
Files 393 393
Lines 50987 50945 -42
========================================================
- Hits 34706 34698 -8
+ Misses 16281 16247 -34
Continue to review full report at Codecov.
|
| } | ||
|
|
||
| int64_t FileStoreHandler::getNumKeys() { | ||
| CHECK(false) << "getNumKeys not implemented for FileStoreHandler"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this be TORCH_CHECK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in the Caffe2 codebase so I don't think we should use TORCH_CHECK. We could use CAFFE_ENFORCE, but the other unimplemented function in this class used CHECK(false) so I used it as well.
torch/lib/c10d/FileStore.cpp
Outdated
| } | ||
|
|
||
| int64_t FileStore::getNumKeys() { | ||
| throw std::runtime_error("getNumKeys not implemented for FileStore"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TORCH_CHECK here for consistency?
torch/lib/c10d/HashStore.cpp
Outdated
| } | ||
|
|
||
| int64_t HashStore::getNumKeys() { | ||
| throw std::runtime_error("getNumKeys not implemented for HashStore"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TORCH_CHECK/
lw
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since currently this method (and deleteKey in the next pull request) is only supported in the TcpStore, does it make sense to add it to the virtual superclass? Most "users" of the stores (I am thinking of the RPC agents, but I suppose it's the same elsewhere) only accept a pointer to the virtual superclass. How can they reliably call getNumKeys on it, knowing that such a call would fail on most concrete implementations?
An alternative approach could be to add getNumKeys and deleteKey only to TcpStore, and not to the superclass. If someone wants to call these methods, they can first dynamic_cast the Store pointer to a TcpStore pointer (and, by doing so, confirm that it is indeed a TcpStore and not some other type). And, if that works, they can confidently call getNumKeys.
This PR adds a getNumKeys API to the TCP Store, which essentially returns the number of keys in the store at that point. This API will be useful for some applications related to debug logging in ProcessGroupNCCL going forward. This PR also adds some C++ tests for this API and Python tests are added in #45223. We will build on this functionality in the future by implementing this API for FileStore, HashStore, RedisStore, and ZeusStore. Differential Revision: [D22985085](https://our.internmc.facebook.com/intern/diff/D22985085/) **NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D22985085/)! [ghstack-poisoned]
This PR adds a getNumKeys API to the TCP Store, which essentially returns the number of keys in the store at that point. This API will be useful for some applications related to debug logging in ProcessGroupNCCL going forward. This PR also adds some C++ tests for this API and Python tests are added in #45223. We will build on this functionality in the future by implementing this API for FileStore, HashStore, RedisStore, and ZeusStore. Differential Revision: [D22985085](https://our.internmc.facebook.com/intern/diff/D22985085/) **NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D22985085/)! [ghstack-poisoned]
This PR adds a getNumKeys API to the TCP Store, which essentially returns the number of keys in the store at that point. This API will be useful for some applications related to debug logging in ProcessGroupNCCL going forward. This PR also adds some C++ tests for this API and Python tests are added in #45223. We will build on this functionality in the future by implementing this API for FileStore, HashStore, RedisStore, and ZeusStore. Differential Revision: [D22985085](https://our.internmc.facebook.com/intern/diff/D22985085/) **NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D22985085/)! [ghstack-poisoned]
|
This pull request has been merged in 304e1d1. |
Stack from ghstack:
This PR adds a getNumKeys API to the TCP Store, which essentially returns the number of keys in the store at that point. This API will be useful for some applications related to debug logging in ProcessGroupNCCL going forward. This PR also adds some C++ tests for this API and Python tests are added in #45223. We will build on this functionality in the future by implementing this API for FileStore, HashStore, RedisStore, and ZeusStore.
Differential Revision: D22985085
NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on Phabricator!