Fix output_nr not incremented correctly #4812
Conversation
Sorry about miscommunicating; can we get a hard-coded test for this particular regression? (@apaszke, you think this would be good to have, right?)
Ouch. It would be good, but testing all configurations can make the tests a lot slower (2^{number of inputs} is quite a lot) 😕
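For a concrete sense of that blow-up, a sweep over every requires_grad combination could look roughly like the sketch below. This is only an illustration; `sweep_double_backward` and `make_loss` are hypothetical names, not part of the PyTorch test suite.

```python
import itertools
import torch

# Hedged sketch: exhaustively cover every requires_grad pattern of the inputs.
# With n tensors this is 2**n configurations, which is why testing all of them
# makes the suite noticeably slower.
def sweep_double_backward(make_loss, tensors):
    for flags in itertools.product([False, True], repeat=len(tensors)):
        if not any(flags):
            continue  # at least one tensor has to require grad
        leaves = [t.detach().clone().requires_grad_(f)
                  for t, f in zip(tensors, flags)]
        loss = make_loss(*leaves)
        grads = torch.autograd.grad(
            loss, [t for t in leaves if t.requires_grad], create_graph=True)
        total = sum(g.sum() for g in grads)
        if total.requires_grad:  # the second derivative may be identically zero
            total.backward()
```

With three inputs that is already seven gradient configurations per test case, and the count doubles with every additional tensor.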
While updating an existing test to cover this case, I found another bug: this is caused by this check here only covering whether weight requires grad. I attempted to fix it by including all params in the check, but …
Ick. THNN bindings: the gift that keeps giving X(
…meters if any param (not just weight) requires grad in parse_nn.py (force-pushed ee63332 to 5e81cdd)
…arameters with only bias requiring grad (force-pushed 4adf61b to 6f30587)
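Reading the two commit messages above together, the direction of the fix is to gate the parameter-gradient path on any parameter requiring grad instead of on weight alone. A rough sketch of that condition, using a hypothetical helper name rather than the actual parse_nn.py code:

```python
# Hypothetical illustration only; the real change lives in the THNN binding
# generation (parse_nn.py), not in a standalone helper like this.
def any_param_requires_grad(params):
    return any(p is not None and p.requires_grad for p in params)

# e.g. run accGradParameters whenever any_param_requires_grad([weight, bias])
# holds, not only when weight.requires_grad is True.
```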
This is ready for review now! There are changes to all THNN functions that update more than one parameter in …
| "bias tensor has to be contiguous"); | ||
| input = THCTensor_(newContiguous)(state, input); | ||
| weight = THCTensor_(newContiguous)(state, weight); | ||
| bias = bias ? THCTensor_(newContiguous)(state, bias) : bias; |
  } else if (gradBias != NULL) {
    nOutputPlane = THCTensor_(size)(state, gradBias, 0);
  } else {
    return;
| "expected for weight, but got: %s"); | ||
| THArgCheck(THCTensor_(isContiguous)(state, weight), 4, | ||
| "weight tensor has to be contiguous"); | ||
| THArgCheck(!bias || THCTensor_(isContiguous)(state, bias), 5, |
apaszke left a comment
Not a full review. Just noticed a few things.
  } else if (dim == 5) {
    return at::thnn_conv_transpose3d(
-       input, weight, bias,
+       input, weight, kernel_size, bias,
| "2D or 4D weight tensor expected, but got: %s"); | ||
| if (bias != NULL) { | ||
| THCUNN_check_dim_size(state, bias, 1, 0, weight->size[0]); | ||
| } |
  THCUNN_assertSameGPU(state, 5, input, gradOutput, gradWeight, gradBias, columns, ones);
  if (gradWeight) {
    THArgCheck(THCTensor_(isContiguous)(state, gradWeight), 4, "gradWeight needs to be contiguous");
  }
  int freeWeight = 0;
- if (weight->nDimension == 4) {
+ if (weight && weight->nDimension == 4) {
@pytorchbot retest this please.
@onnxbot add to whitelist
@pytorchbot retest this please
@pytorchbot retest this please
@pytorchbot retest this please
Would really appreciate it if someone could review this. There are a lot of lines changed, but many of them are just similar changes across different THNN files. I also split it into multiple commits to make it clearer. If needed, let me know how I can make this easier to review. :)
looking
Summary: Pull Request resolved: pytorch/glow#4812. If no compilation options are passed, default to c-step. Fixed the FC and batchmatmul implementations to match C-step. Fixed the fakelowp map calling to make sure we use the fp32 substitution of operators. Updated the accumulator test to make it pass with fp32.
Test Plan: fakelowp tests, glow/test/numerics, net_runner
Reviewed By: jfix71
Differential Revision: D23086534
fbshipit-source-id: 3fbb8c4055bb190becb39ce8cdff6671f8558734
output_nr is not incremented properly in rebase_history and set_history when some tensors are undefined. This causes the autograd engine to incorrectly put input tensors at the wrong indices in InputBuffer. See the following extremely simple double backward example that reproduces the bug:
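The original snippet did not survive here; a minimal sketch of the kind of case being described, written against the current tensor autograd API, might look like this (the layer, sizes, and loss are placeholders, not the PR's original reproducer):

```python
import torch

# Hedged reconstruction, not the original example from the PR.
# The input deliberately does NOT require grad, so the conv backward leaves its
# grad-input output undefined, which is the situation the fix targets.
conv = torch.nn.Conv2d(1, 1, kernel_size=3)
input = torch.randn(1, 1, 5, 5)          # requires_grad stays False

loss = conv(input).pow(2).sum()          # grad_output depends on the parameters
gW, gb = torch.autograd.grad(loss, (conv.weight, conv.bias), create_graph=True)

# Double backward: differentiate through the first backward pass.
(gW.sum() + gb.sum()).backward()
```

This is the shape of the situation in which the ggW and ggb terms end up at the wrong InputBuffer indices, as described below.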
Because input doesn't require a gradient, the ggW and ggb terms are set to incorrect indices, and it throws this weird error message:

Why is this not detected in our tests: our double backward tests usually set all parameters to requires_grad=True. Therefore the case where a backward function call returns an undefined tensor is not tested. An issue has been submitted on testing with more diverse configurations: #4813.
Thanks @ezyang for helping me find the cause.
Relevant forum post: https://discuss.pytorch.org/t/autograd-grad-dimension-error/12083/8