Better support for adding zero-filled sparse tensors #7479
Conversation
I think this patch leads to a pretty important question: should our sparse library always treat the number of dense and sparse dimensions as an implementation detail and allow automatic conversions between the types, or should it do hard checks for mismatches (so they effectively become part of the tensor type)? I'd slightly lean towards the first option, but if we pursue it, we should do so not only in the all-zero case but in all situations (not necessarily in this PR, but we should plan for that). Otherwise, it will be really inconsistent with the rest of the library.
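To make the distinction concrete, here is a small sketch (not from the PR) of two COO tensors whose overall size() matches while their sparse/dense dimension splits differ, using the old torch.sparse.FloatTensor constructor that the rest of this thread uses:
```
import torch

# Tensor a: one sparse dimension, one dense dimension (dimI=1, dimV=1).
i1 = torch.LongTensor([[0, 1]])                   # indices: 1 x nnz
v1 = torch.FloatTensor([[1.0, 2.0], [3.0, 4.0]])  # values: nnz x 2
a = torch.sparse.FloatTensor(i1, v1, torch.Size([2, 2]))

# Tensor b: two sparse dimensions, no dense dimensions (dimI=2, dimV=0).
i2 = torch.LongTensor([[0, 1], [0, 1]])           # indices: 2 x nnz
v2 = torch.FloatTensor([1.0, 4.0])                # values: nnz
b = torch.sparse.FloatTensor(i2, v2, torch.Size([2, 2]))

# a.size() == b.size() == [2, 2], yet a + b raises today; whether it
# should convert automatically or keep raising is the question above.
```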
I agree with your point, @apaszke, that if we go through with this, we should plan for it in the future. I'm not sure which approach (automatically allowing conversion between types, or treating them separately) is better. There are some operations (like transpose) for which allowing automatic conversion between types might not be desirable. Let's say we have a sparse …
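The comment is cut off above; as one hedged illustration of the transpose concern (assuming the tensors `a` and `b` from the previous sketch, not the original example):
```
# Transposing two sparse dimensions is just an index permutation:
bt = b.transpose(0, 1)  # fine: both dims of b are sparse (dimI=2)

# For a, dim 0 is sparse and dim 1 is dense; transposing them would have
# to move data between the indices and values tensors, i.e. change the
# (dimI, dimV) split. If that split is an invisible implementation
# detail, it is unclear what a.transpose(0, 1) should even return.
```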
Given that we support dense-sparse operations, this principle extends to varying levels of sparsity. But without a good strategy for actually handling this, you'll be playing whack-a-mole to get this working uniformly everywhere. Unfortunately, C10 will help with the dense-sparse case, but not the more-sparse/less-sparse case. Maybe the wrapper mechanism we're building might be able to help. Transpose seems like a different beast altogether.
The context behind this is that there's a bug in nn.Embedding backward (and any backward that returns sparse gradients) where, if a gradient is a zero-filled sparse tensor (from using the padding_idx argument in nn.Embedding), accumulating gradients doesn't work because a zero-filled sparse tensor always has …
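As a hedged repro sketch of that scenario (the shapes and the all-padding input are assumptions for illustration, not taken from the issue):
```
import torch
import torch.nn as nn

# Embedding that produces sparse gradients, with a padding index.
emb = nn.Embedding(10, 3, padding_idx=0, sparse=True)

# A batch consisting only of padding indices: the resulting sparse
# gradient for emb.weight carries no non-zero values.
out = emb(torch.LongTensor([[0, 0, 0]]))
out.sum().backward()

print(emb.weight.grad)  # a zero-filled sparse tensor

# Accumulating this gradient with a later, "denser" sparse gradient is
# where the (dimI, dimV) mismatch described above used to surface.
```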
@zou3519 needs a rebase, I think, for the ONNX test to pass
Right now, if we add a zero-filled sparse tensor with another sparse tensor, both tensors must have the same "density" (dimI, dimV) and size (tensor.size()) for them to be added successfully. This relaxes that constraint so that if both tensors have the same tensor.size() and at least one is zero-filled, they can be added successfully.

Before:
```
i = torch.LongTensor([[0, 1, 1], [2, 0, 2]])
v = torch.FloatTensor([3, 4, 5]).unsqueeze(1)
sparse_mat = torch.sparse.FloatTensor(i, v, torch.Size([2, 3, 1]))
zeros = torch.zeros(sparse_mat.size(), layout=torch.sparse_coo)
sparse_mat + zeros
# RuntimeError: cadd operands have incompatible sizes or dimension types
# at ../src/THS/generic/THSTensorMath.c:126
```
After: no error.
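For concreteness, a sketch of the expected post-patch behavior (the result's layout is my reading of the description above, not quoted from the PR):
```
# After this patch, the addition above succeeds:
result = sparse_mat + zeros
# `zeros` contributes nothing, so `result` matches `sparse_mat`,
# keeping sparse_mat's (dimI, dimV) split.
print(result.to_dense())
```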
This situation is somewhat similar to contiguity: some operations support both contiguous and non-contiguous inputs, while some only support contiguous inputs (with the obvious difference that here the property relates two tensors rather than describing a single one). We could just give this a name (congruent?) and issue error messages to that effect.
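To spell out the analogy (a sketch; "congruent" is a hypothetical name from the comment, not an existing API):
```
import torch

# Contiguity: a per-tensor property that some ops require, others tolerate.
x = torch.randn(3, 4).t()
print(x.is_contiguous())               # False
print(x.contiguous().is_contiguous())  # True

# The proposed "congruent" check would be the pairwise analogue for
# sparse tensors: same size() and a compatible (dimI, dimV) split,
# with zero-filled tensors treated as congruent to everything.
```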