Conversation

@muthuArivoli
Contributor

Related to #38349

@muthuArivoli muthuArivoli marked this pull request as draft August 10, 2020 07:44
@dr-ci

dr-ci bot commented Aug 10, 2020

💊 CI failures summary and remediations

As of commit a03e866 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_macos_10_13_py3_test (1/1)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

Aug 14 11:06:56 [E request_callback_no_python.cpp:618] Received error while processing request type 2: RuntimeError: Can not pickle torch.futures.Future
Aug 14 11:06:56 At: 
Aug 14 11:06:56   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(93): serialize 
Aug 14 11:06:56   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(145): serialize 
Aug 14 11:06:56  
Aug 14 11:06:56 [E request_callback_no_python.cpp:618] Received error while processing request type 2: RuntimeError: Can not pickle torch.futures.Future 
Aug 14 11:06:56  
Aug 14 11:06:56 At: 
Aug 14 11:06:56   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(93): serialize 
Aug 14 11:06:56   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(145): serialize 
Aug 14 11:06:56  
Aug 14 11:06:56 [E request_callback_no_python.cpp:618] Received error while processing request type 2: RuntimeError: Can not pickle torch.futures.Future 
Aug 14 11:06:56  
Aug 14 11:06:56 At: 
Aug 14 11:06:56   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(93): serialize 
Aug 14 11:06:56   /Users/distiller/workspace/miniconda3/lib/python3.7/site-packages/torch/distributed/rpc/internal.py(145): serialize 
Aug 14 11:06:56  
Aug 14 11:06:56 ok (1.377s) 
Aug 14 11:06:58   test_return_future_remote (__main__.ProcessGroupRpcTestWithSpawn) ... ok (1.331s) 
Aug 14 11:06:59   test_return_local_rrefs (__main__.ProcessGroupRpcTestWithSpawn) ... ok (1.377s) 
Aug 14 11:07:00   test_rpc_return_rref (__main__.ProcessGroupRpcTestWithSpawn) ... ok (1.401s) 
Aug 14 11:07:08   test_rpc_timeouts (__main__.ProcessGroupRpcTestWithSpawn) ... ok (7.890s) 

This comment was automatically generated by Dr. CI.

@muthuArivoli muthuArivoli changed the title from "[WIP] Implement hstack, vstack, dstack" to "Implement hstack, vstack, dstack" on Aug 11, 2020
@muthuArivoli muthuArivoli marked this pull request as ready for review August 11, 2020 01:37
@muthuArivoli
Contributor Author

@mruberry PTAL

Collaborator

Nit: these examples are excellent but maybe

a = torch.tensor([[1],[2],[3]])
b = torch.tensor([[4],[5],[6]])

would be clearer?
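
For context, the excerpt doesn't show which docstring this nit refers to; taking torch.hstack (one of the functions added in this PR) as an example, the suggested pair would read roughly as below. This is a sketch, not the actual doc text:

import torch

a = torch.tensor([[1], [2], [3]])
b = torch.tensor([[4], [5], [6]])
torch.hstack((a, b))
# tensor([[1, 4],
#         [2, 5],
#         [3, 6]])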

Collaborator

See numbering suggestion below.

Collaborator

See numbering suggestion above.

Collaborator

These tests are good, but this test's case generation is limited to replicating the same tensor shape for each element of the input list. Here are some cases I was thinking about:

  • op(t)
    • the behavior of np.hstack(a) is strange and np.hstack(a) != np.hstack((a,)) (the same is true for np.dstack; see the snippet after this comment)
    • do we even support non-tuple arguments? if not we should validate this throws a runtime error
    • if we support single tensor arguments, is np.hstack(a)'s and np.dstack(a)'s behavior correct?
  • op((a, b, c, ...))
    • validating that if they differ on an unexpected dim an error is thrown (maybe _test_special_stacks should take a dim argument corresponding to the op?)
    • validating that if they differ only on the expected dim the result is equivalent to NumPy
  • are tensors with a size zero dim handled correctly? (if not that's OK, but let's assert it doesn't work)
  • np.hstack has special-handling of 1D tensors (as your implementation does), does test_hstack need a custom elaboration to test that behavior, in particular?
  • validating that tensors with different shapes, but the same post-atleast_Xd shapes, still work

For an example of the last bullet:

a = np.array([[[1],[2],[3]]])
b = np.array((4, 5, 6))
np.dstack((a, b))
# array([[[1, 4],
#         [2, 5],
#         [3, 6]]])

This is a good number of cases but validating each one by hand shouldn't be too laborious, I hope.

What are your thoughts? Are there other cases I missed?
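
For reference, a minimal NumPy session illustrating the np.hstack(a) vs np.hstack((a,)) point in the first bullet (a sketch; a is just an arbitrary example array):

import numpy as np

a = np.array([[1], [2], [3]])   # shape (3, 1)
np.hstack(a)                    # array([1, 2, 3]): the bare array is iterated over its first axis
np.hstack((a,))                 # array([[1], [2], [3]]): the one-element tuple keeps the shape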

Collaborator

@mruberry mruberry left a comment

Overall looks excellent. A couple minor nits about the doc examples and questions about test coverage.

@zou3519 zou3519 added the "triaged" label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) on Aug 11, 2020
@muthuArivoli
Contributor Author

@mruberry I have added tests that I think cover all of the cases. They cover:

  • Runtime Error for non-tuple argument
  • 0 dimension support
  • For vstack and dstack, test with tensors that have the same post-atleast_Xd shape but different original shapes
  • Test whether the function is accurate when the only dim that is different is the dim that will be concatenated
  • Test whether a runtime error is thrown when the changed dim is different than the one that is concatenated

For the last two, those are tested from 1 to 4 dimensions, so the special behavior for hstack is included with that. I have also added some autograd tests in a similar manner to the existing stack autograd test.

Does this sound good, or do I need more tests?
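
As a quick illustration of the "differ only on the concatenated dim" case mentioned above (a sketch assuming the torch.hstack added in this PR, not one of the actual test cases):

import numpy as np
import torch

a = torch.ones(2, 3)
b = torch.ones(2, 5)   # differs from a only on dim 1, the dim hstack concatenates over
torch.hstack((a, b)).shape                 # torch.Size([2, 8])
np.hstack((a.numpy(), b.numpy())).shape    # (2, 8), matching NumPy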

else:
    # Invalid dimensions, test for error
    with self.assertRaisesRegex(RuntimeError, "Sizes of tensors must match except in dimension"):
        torch_fn(torch_input)
Collaborator

Would you add an assert that NumPy also throws a runtime error in this case? You don't need to assert that a particular error string is thrown:

with self.assertRaises(RuntimeError):
  np_fn(np_input)

Collaborator

@mruberry mruberry left a comment

Nice work, @muthuArivoli!

Would you just fix that one minor nit on the tests and we'll get this merged?

Let me know if you're interested in working on a new problem.

@muthuArivoli
Contributor Author

@mruberry I added the numpy error check. Is it ok that numpy throws a ValueError, while we throw a RuntimeError?

Yes, I'm interested in working on a new problem, do you have any recommendations?
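
For reference, a small standalone check of the asymmetry being asked about (a sketch, not part of the PR's tests): NumPy raises ValueError for the shape mismatch while torch.hstack raises RuntimeError.

import numpy as np
import torch

a, b = np.ones((2, 3)), np.ones((4, 3))
try:
    np.hstack((a, b))           # NumPy rejects the dim-0 mismatch with ValueError
except ValueError as e:
    print("numpy:", type(e).__name__)

t, u = torch.ones(2, 3), torch.ones(4, 3)
try:
    torch.hstack((t, u))        # PyTorch rejects the same mismatch with RuntimeError
except RuntimeError as e:
    print("torch:", type(e).__name__)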

Contributor

@facebook-github-bot facebook-github-bot left a comment

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@mruberry
Collaborator

@mruberry I added the numpy error check. Is it ok that numpy throws a ValueError, while we throw a RuntimeError?

Absolutely OK. Nice work.

Yes, I'm interested in working on a new problem, do you have any recommendations?

For symmetry there are the split functions, hsplit, vsplit, and dsplit. A slightly more challenging binary function is divmod, because it returns two tensors. There are the "polynomial" functions, like polyadd and polyder, but I'm hoping someone will write all of them near simultaneously because they have a lot of common structure. There are also unary functions, like nan_to_num, that would be very helpful.

If you'd like something more exotic or especially numerically challenging, there are also functions like the Kaiser window function.

@muthuArivoli
Contributor Author

Two questions:

  1. For hsplit, vsplit, dsplit, do the functions need to be exactly like their numpy equivalents, or can they just wrap around the torch.split function? The torch.split function is different than the np.split function when the second argument is a list, with np using indices and torch using lengths, but they essentially perform the same role (see the quick sketch below).

  2. For the poly functions, it seems like the ones linked in the tracking issue have been superseded by np.polynomial, and that the main reason they are still around are for BC reasons. Should the implementation follow the functions linked in the tracking issue, or the ones linked here, and would a new namespace need to be created? Mainly asking since the representation of the degrees of the polynomial was reversed between the two.
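
A quick sketch of the np.split vs torch.split difference raised in question 1 (both calls produce the same three pieces, but the second argument means indices in NumPy and lengths in PyTorch):

import numpy as np
import torch

x = np.arange(9)
np.split(x, [2, 5])          # split *at indices* 2 and 5 -> pieces of length 2, 3, 4

t = torch.arange(9)
torch.split(t, [2, 3, 4])    # split into *lengths* 2, 3, 4 -> the same pieces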

@mruberry
Collaborator

Two questions:

  1. For hsplit, vsplit, dsplit, do the functions need to be exactly like their numpy equivalents, or can they just wrap around the torch.split function? The torch.split function is different than the np.split function when the second argument is a list, with np using indices and torch using lengths, but they essentially perform the same role.
  2. For the poly functions, it seems like the ones linked in the tracking issue have been superseded by np.polynomial, and that the main reason they are still around are for BC reasons. Should the implementation follow the functions linked in the tracking issue, or the ones linked here, and would a new namespace need to be created? Mainly asking since the representation of the degrees of the polynomial was reversed between the two.

Excellent questions.

  1. Analogous to torch.split's behavior (see the sketch below).
  2. You're correct and I should update the references to point to the newest ones in the new namespace, and you're right that would require adding a new namespace, which is kinda a pain to do. I should probably remove those until we decide to implement the polynomial namespace.
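
If the split functions do follow torch.split's lengths convention, the wrappers could stay very thin. The helper below is purely a hypothetical sketch (hsplit_sketch is not a real API, and the final signature may differ):

import torch

def hsplit_sketch(tensor, sections):
    # np.hsplit splits 1-D inputs along dim 0 and everything else along dim 1
    dim = 0 if tensor.dim() == 1 else 1
    return torch.split(tensor, sections, dim=dim)

t = torch.arange(12).reshape(3, 4)
[p.shape for p in hsplit_sketch(t, [1, 3])]   # [torch.Size([3, 1]), torch.Size([3, 3])]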

@facebook-github-bot
Contributor

@mruberry merged this pull request in 5bcf9b0.
