Shard test_nn to reduce runtime for each test target #8678
Conversation
test/test_nn.py
Outdated
fullname='AdaptiveLogSoftmax'))


def disable_tests_not_in_shard(module, shard):
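As context, a minimal sketch of what a helper with this signature could do (hypothetical code, not necessarily this PR's implementation): assign test methods to shards round-robin over their sorted names, and replace anything outside the requested shard with a skipped stub.

import unittest

NUM_SHARDS = 30  # the constant added to test/test_nn.py in this PR

def disable_tests_not_in_shard(module, shard):
    # Sketch: walk every TestCase class in the module and keep only the test
    # methods whose (stable) index falls in the requested shard.
    for name in dir(module):
        cls = getattr(module, name)
        if not (isinstance(cls, type) and issubclass(cls, unittest.TestCase)):
            continue
        test_names = sorted(attr for attr in dir(cls) if attr.startswith('test'))
        for i, attr in enumerate(test_names):
            if i % NUM_SHARDS != shard:
                # Replace the method with a stub marked as skipped, so unittest
                # reports it as skipped instead of running it.
                setattr(cls, attr, unittest.skip('not in this shard')(lambda self: None))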
test/common.py
Outdated
setattr(self, method_name, self.wrap_with_cuda_memory_check(test_method))
# This logic is similar to https://github.com/python/cpython/blob/master/Lib/unittest/case.py#L406-L413
try:
    test_method = getattr(self, method_name)
@pytorchbot retest this please
test/common.py
Outdated
parser = argparse.ArgumentParser(add_help=False)
parser.add_argument('--seed', type=int, default=1234)
parser.add_argument('--accept', action='store_true')
parser.add_argument('--shard', type=int, required=False)
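One way a --shard flag like this can coexist with unittest's own command-line handling (a hypothetical sketch of parse_args, not necessarily the version in test/common.py) is to use parse_known_args and hand the leftover arguments back to unittest:

import argparse
import sys

def parse_args():
    # Parse only the custom flags; leave everything else (-v, test names, ...)
    # in sys.argv for unittest.main() to consume later.
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument('--seed', type=int, default=1234)
    parser.add_argument('--accept', action='store_true')
    parser.add_argument('--shard', type=int, required=False)
    options, remaining = parser.parse_known_args()
    sys.argv = [sys.argv[0]] + remaining
    return options

Since --shard defaults to None, running a test file without the flag keeps the current behavior of executing every test.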
test/test_nn.py
Outdated
torch.double: 1e-5,
torch.half: 1e-2}


NUM_SHARDS = 30
if __name__ == '__main__':
    options = parse_args()
    if options.shard is not None:
        def load_tests(loader, tests, pattern):
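For readers unfamiliar with the hook: load_tests is a standard unittest protocol; when a module defines it, the test loader calls it and runs whatever suite it returns. A minimal sketch of a shard-filtering version (hypothetical helpers iter_tests and make_load_tests; the PR's actual code may differ):

import unittest

NUM_SHARDS = 30

def iter_tests(suite):
    # Flatten nested TestSuites into individual test cases.
    for item in suite:
        if isinstance(item, unittest.TestSuite):
            for test in iter_tests(item):
                yield test
        else:
            yield item

def make_load_tests(shard):
    def load_tests(loader, tests, pattern):
        # Keep only tests whose position in the sorted list falls in `shard`;
        # sorting by id() keeps the assignment stable across processes.
        suite = unittest.TestSuite()
        for i, test in enumerate(sorted(iter_tests(tests), key=lambda t: t.id())):
            if i % NUM_SHARDS == shard:
                suite.addTest(test)
        return suite
    return load_tests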
ezyang left a comment:
Fix as many of the CR comments you can do easily, and then merge this please.
* upstream/master: (92 commits)
  more formatting (pytorch#8701)
  Fix pytorch#8692 (pytorch#8699)
  Create captured inputs recursively for loop to resolve loop-carried dependencies across nested blocks (pytorch#8345)
  Shard test_nn to reduce runtime for each test target (pytorch#8678)
  Create at::tensor (pytorch#8475)
  Clarify mp note about sharing a tensor's grad field. (pytorch#8688)
  Add owner rule for cpp_extension.py (pytorch#8700)
  fix formatting in :math: in fold docstring (pytorch#8696)
  Some 0-sized dimension support, port catArray away from resizeLegacy. (pytorch#8666)
  Implement flatten function (pytorch#8578)
  Created Tensor::to functions (pytorch#8643)
  Add a warning in gradcheck if inputs precision < float64 (pytorch#8663)
  Fix parsing of floating point defaults in python_arg_parser (pytorch#8681)
  Export ProcessGroupGloo options to Python (pytorch#8664)
  Fix build error in pybind_state_ideep (pytorch#8684)
  Compatibility: write nDimension/_nDimension corresponding to dim()/_dim(). (pytorch#8676)
  Improve win-build.sh for local build (pytorch#8674)
  don't do unnecessary copies for bernoulli_ (pytorch#8682)
  Use parallel if get_num_threads 0 (pytorch#8677)
  Fix serialization for Parameters (pytorch#8633)
  ...
* Shard test_nn to reduce runtime for each test target
* Use load_tests for selecting tests to enable
* fix lint
* Use arg parser from common.py
The current test_nn would time out and be disabled in GreenWarden, so we need an option to split it up in order to pass the stress test. Right now GreenWarden roughly allows running 100 test cases in test_nn before timing out, and this change adds an option to divide test_nn into 30 shards (with ~40 tests in each shard) to allow for some test suite growth in the future. This will unblock the addition of GPU tests to Sandcastle.
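To illustrate the intended usage (a hypothetical CI driver, not part of this PR), each shard can then be run as its own target:

import subprocess
import sys

NUM_SHARDS = 30  # each shard runs roughly 1200 / 30 ≈ 40 tests

# Run every shard of test_nn as a separate process so no single target
# exceeds the roughly-100-test budget before GreenWarden times out.
for shard in range(NUM_SHARDS):
    ret = subprocess.call([sys.executable, 'test/test_nn.py', '--shard', str(shard)])
    if ret != 0:
        sys.exit(ret)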
We need to make sure that the CUDA memory leak checks are still working correctly in CI after this change.
cc @orionr