Conversation

@yf225 (Contributor) commented Jun 20, 2018

The current test_nn would time out and be disabled in GreenWarden, so we need an option to split it up in order to pass the stress test. GreenWarden currently allows roughly 100 test cases in test_nn before timing out; this PR adds an option to divide test_nn into 30 shards (~40 tests per shard), which also leaves headroom for future test suite growth.
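
For illustration only, here is a minimal sketch of the round-robin sharding idea, using a hypothetical helper name; the actual selection in this PR goes through the load_tests / disable_tests_not_in_shard hooks shown further down.

# Hypothetical sketch, not the code from this PR: assign each test to one of
# NUM_SHARDS buckets by its position in a sorted list of test names.
NUM_SHARDS = 30

def tests_in_shard(test_names, shard, num_shards=NUM_SHARDS):
    # Round-robin assignment keeps shard sizes within one test of each other,
    # e.g. ~1200 tests / 30 shards ~= 40 tests per shard.
    return [name for index, name in enumerate(sorted(test_names))
            if index % num_shards == shard]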

This will unblock the addition of GPU tests to Sandcastle.

We need to make sure that the CUDA memory leak checks are still working correctly in CI after this change.

cc @orionr

@yf225 requested review from ezyang and ssnl, June 20, 2018 00:46
test/test_nn.py Outdated
fullname='AdaptiveLogSoftmax'))


def disable_tests_not_in_shard(module, shard):


test/common.py Outdated
# This logic is similar to https://github.com/python/cpython/blob/master/Lib/unittest/case.py#L406-L413
try:
    test_method = getattr(self, method_name)
    setattr(self, method_name, self.wrap_with_cuda_memory_check(test_method))

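For context, a hedged sketch of what a CUDA-memory-leak-checking wrapper could look like. It only assumes torch.cuda.memory_allocated() and functools.wraps; it is not the actual wrap_with_cuda_memory_check from test/common.py.

import functools
import torch

def cuda_memory_check_sketch(test_method):
    # Compare allocated CUDA memory before and after the wrapped test runs
    # and fail loudly if the test left memory behind.
    @functools.wraps(test_method)
    def wrapper(*args, **kwargs):
        before = torch.cuda.memory_allocated()
        result = test_method(*args, **kwargs)
        torch.cuda.synchronize()
        after = torch.cuda.memory_allocated()
        assert after == before, 'CUDA memory leaked: {} bytes'.format(after - before)
        return result
    return wrapper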

@yf225 (Contributor, Author) commented Jun 20, 2018

@pytorchbot retest this please

test/common.py Outdated
parser = argparse.ArgumentParser(add_help=False)
parser.add_argument('--seed', type=int, default=1234)
parser.add_argument('--accept', action='store_true')
parser.add_argument('--shard', type=int, required=False)

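As a rough sketch of how this parser could be consumed from a test file; the parse_known_args handoff back to unittest below is an assumption, not necessarily what common.py does:

import argparse
import sys

def parse_args():
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument('--seed', type=int, default=1234)
    parser.add_argument('--accept', action='store_true')
    parser.add_argument('--shard', type=int, required=False)
    # Parse only the flags we know about and hand everything else back to
    # unittest, so standard unittest options keep working.
    options, remaining = parser.parse_known_args()
    sys.argv = [sys.argv[0]] + remaining
    return options

# Example invocation: python test/test_nn.py --shard 3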

test/test_nn.py Outdated
torch.double: 1e-5,
torch.half: 1e-2}

NUM_SHARDS = 30


if __name__ == '__main__':
    options = parse_args()
    if options.shard is not None:
        def load_tests(loader, tests, pattern):

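A minimal sketch of how the standard unittest load_tests hook can keep only the tests that belong to the requested shard; options and NUM_SHARDS come from the surrounding file, and the exact filtering in this PR (disable_tests_not_in_shard) may differ:

import unittest

def load_tests(loader, tests, pattern):
    # 'tests' is a TestSuite of per-class suites; flatten it and keep only the
    # cases whose running index falls into the requested shard.
    filtered = unittest.TestSuite()
    index = 0
    for class_suite in tests:
        for test in class_suite:
            if index % NUM_SHARDS == options.shard:
                filtered.addTest(test)
            index += 1
    return filtered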

@ezyang (Contributor) left a comment


Fix as many of the CR comments as you can do easily, and then merge this please.

@yf225 merged commit d6c873a into pytorch:master Jun 20, 2018
petrex pushed a commit to petrex/pytorch that referenced this pull request Jun 20, 2018
* upstream/master: (92 commits)
  more formatting (pytorch#8701)
  Fix pytorch#8692 (pytorch#8699)
  Create captured inputs recursively for loop to resolve loop-carried dependencies across nested blocks (pytorch#8345)
  Shard test_nn to reduce runtime for each test target (pytorch#8678)
  Create at::tensor (pytorch#8475)
  Clarify mp note about sharing a tensor's grad field. (pytorch#8688)
  Add owner rule for cpp_extension.py (pytorch#8700)
  fix formatting in :math: in fold docstring (pytorch#8696)
  Some 0-sized dimension support, port catArray away from resizeLegacy. (pytorch#8666)
  Implement flatten function (pytorch#8578)
  Created Tensor::to functions (pytorch#8643)
  Add a warning in gradcheck if inputs precision < float64 (pytorch#8663)
  Fix parsing of floating point defaults in python_arg_parser (pytorch#8681)
  Export ProcessGroupGloo options to Python (pytorch#8664)
  Fix build error in pybind_state_ideep (pytorch#8684)
  Compatibility: write nDimension/_nDimension corresponding to dim()/_dim(). (pytorch#8676)
  Improve win-build.sh for local build (pytorch#8674)
  don't do unnecessary copies for bernoulli_ (pytorch#8682)
  Use parallel if get_num_threads 0 (pytorch#8677)
  Fix serialization for Parameters (pytorch#8633)
  ...
petrex pushed a commit to petrex/pytorch that referenced this pull request Jun 21, 2018
* Shard test_nn to reduce runtime for each test target

* Use load_tests for selecting tests to enable

* fix lint

* Use arg parser from common.py