
Conversation

@bdhirsh
Contributor

bdhirsh commented Sep 9, 2020

PR for issue: #16357

Stack from ghstack:

Not entirely sure why, but changing the type of beta from `float` to `double` in autocast_mode.cpp and FunctionsManual.h fixes my compiler errors, failing instead at link time

fixing some type errors, updated fn signature in a few more files

removing my usage of Scalar, making beta a double everywhere instead

Differential Revision: [D23636720](https://our.internmc.facebook.com/intern/diff/D23636720)

bdhirsh added a commit that referenced this pull request Sep 9, 2020
Not entirely sure why, but changing the type of beta from `float` to `double` in autocast_mode.cpp and FunctionsManual.h fixes my compiler errors, failing instead at link time

fixing some type errors, updated fn signature in a few more files

removing my usage of Scalar, making beta a double everywhere instead

ghstack-source-id: 90a3b08
Pull Request resolved: #44433
@dr-ci

dr-ci bot commented Sep 9, 2020

💊 CI failures summary and remediations

As of commit fd6d110 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

XLA failure

Job pytorch_xla_linux_bionic_py3_6_clang9_build is failing. Please create an issue with title prefixed by [PT_BREAK] in pytorch/xla and link to this PR. If you have questions, please reach out to @ailzhang / @dlibenzi / @JackCaoG.


bdhirsh added a commit that referenced this pull request Sep 10, 2020
Not entirely sure why, but changing the type of beta from `float` to
`double` in autocast_mode.cpp and FunctionsManual.h fixes my compiler
errors, failing instead at link time

fixing some type errors, updated fn signature in a few more files

removing my usage of Scalar, making beta a double everywhere instead

updated the smooth_l1_loss signature in the torch api, added a torch
test

ghstack-source-id: ad6153e
Pull Request resolved: #44433
bdhirsh added a commit that referenced this pull request Sep 10, 2020
Not entirely sure why, but changing the type of beta from `float` to
`double` in autocast_mode.cpp and FunctionsManual.h fixes my compiler
errors, failing instead at link time

fixing some type errors, updated fn signature in a few more files

removing my usage of Scalar, making beta a double everywhere instead

updated the smooth_l1_loss signature in the torch api, added a torch
test

added a cpp_api_parity test to test the actual kernel calls

ghstack-source-id: 9686540
Pull Request resolved: #44433
bdhirsh requested a review from mrshenli September 10, 2020 15:43
    return norm_val * grad_output;
  else
-   return norm_val * x * grad_output;
+   return norm_val * x * grad_output / beta;
Contributor Author

the smooth l1 loss when |X| < beta now divides by the beta parameter, so the derivative does as well.

Contributor

I might be wrong. If it divides by beta in the forward pass, should we multiply the grad by beta in the backward pass?

Contributor

Chatted with @bdhirsh offline. I was wrong. Brian's current implementation is correct.
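
As an aside, a minimal Python sketch of the element-wise forward and its gradient with the new beta parameter (illustrative only, not the PR's kernel code; assumes beta > 0 and no reduction):

import torch

def smooth_l1_forward(x, y, beta):
    # quadratic branch 0.5 * d^2 / beta inside |d| < beta, linear branch d - 0.5 * beta outside
    d = (x - y).abs()
    return torch.where(d < beta, 0.5 * d ** 2 / beta, d - 0.5 * beta)

def smooth_l1_grad_wrt_x(x, y, beta):
    # derivative of the quadratic branch is d / beta, so the backward divides by beta as well
    d = x - y
    return torch.where(d.abs() < beta, d / beta, torch.sign(d))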

@bdhirsh
Contributor Author

bdhirsh commented Sep 10, 2020

Making beta a double rather than a Scalar, since it's used in comparisons, something that doesn't really make sense for complex types. See the comment here (sorry for the duplicate PR): #44197 (review)

namespace native {

using pointwise_fn = void (*)(TensorIterator&, Scalar scalar);
using pointwise_fn_beta = void (*)(TensorIterator&, Scalar scalar, double beta);
Contributor Author

Let me know if you think the name should be more general, or if I just shouldn't use an alias here. Right now smooth_l1_backwards looks like the only pointwise op here that takes in two scalar params.

  inline Tensor _smooth_l1_loss(const Tensor& input, const Tensor& target, double beta = 1.) {
    auto t = torch::abs(input - target);
-   return torch::where(t < 1, 0.5 * torch::pow(t, 2), t - 0.5);
+   return torch::where(t < beta, 0.5 * torch::pow(t, 2) / beta, t - 0.5 * beta);
Contributor Author

it's not 100% clear to me if I need to change this, since this seems independent of the kernel code


- Tensor smooth_l1_loss_double_backward(const Tensor & grad, const Tensor & input, const Tensor & target, int64_t reduction) {
+ Tensor smooth_l1_loss_double_backward(const Tensor & grad, const Tensor & input, const Tensor & target, int64_t reduction, double beta) {
+   // TODO: figure out where to use beta
Contributor Author

not 100% sure how beta should be used in the double_backwards of smooth_l1_loss

Contributor

mrshenli left a comment

Thanks for adding this!

We might also need to change the SmoothL1Loss class to use this new function API, and update its doc string as well. This can come in a followup PR.
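
For what it's worth, a rough sketch of what that follow-up could look like (the class name below is hypothetical, and the actual nn.SmoothL1Loss change may be structured differently):

import torch.nn as nn
import torch.nn.functional as F

class SmoothL1LossWithBeta(nn.Module):
    # hypothetical module that just threads beta through to the new functional API
    def __init__(self, reduction: str = 'mean', beta: float = 1.0) -> None:
        super().__init__()
        self.reduction = reduction
        self.beta = beta

    def forward(self, input, target):
        return F.smooth_l1_loss(input, target, reduction=self.reduction, beta=self.beta)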

bdhirsh added a commit that referenced this pull request Sep 10, 2020
Not entirely sure why, but changing the type of beta from `float` to
`double` in autocast_mode.cpp and FunctionsManual.h fixes my compiler
errors, failing instead at link time

fixing some type errors, updated fn signature in a few more files

removing my usage of Scalar, making beta a double everywhere instead

updated the smooth_l1_loss signature in the torch api, added a torch
test

added a cpp_api_parity test to test the actual kernel calls

Updating the python API + docs

ghstack-source-id: 06cc809
Pull Request resolved: #44433
bdhirsh added a commit that referenced this pull request Sep 11, 2020
Not entirely sure why, but changing the type of beta from `float` to
`double` in autocast_mode.cpp and FunctionsManual.h fixes my compiler
errors, failing instead at link time

fixing some type errors, updated fn signature in a few more files

removing my usage of Scalar, making beta a double everywhere instead

updated the smooth_l1_loss signature in the torch api, added a torch
test

added a cpp_api_parity test to test the actual kernel calls

Updating the python API + docs

some test fixes

ghstack-source-id: 48e59a6
Pull Request resolved: #44433
bdhirsh added a commit that referenced this pull request Sep 11, 2020
Not entirely sure why, but changing the type of beta from `float` to
`double` in autocast_mode.cpp and FunctionsManual.h fixes my compiler
errors, failing instead at link time

fixing some type errors, updated fn signature in a few more files

removing my usage of Scalar, making beta a double everywhere instead

updated the smooth_l1_loss signature in the torch api, added a torch
test

added a cpp_api_parity test to test the actual kernel calls

Updating the python API + docs

some test fixes

fix linter errors

ghstack-source-id: 0e2c279
Pull Request resolved: #44433
@codecov

codecov bot commented Sep 11, 2020

Codecov Report

Merging #44433 into gh/bdhirsh/8/base will decrease coverage by 0.06%.
The diff coverage is n/a.


@@                  Coverage Diff                  @@
##           gh/bdhirsh/8/base   #44433      +/-   ##
=====================================================
- Coverage              68.04%   67.98%   -0.07%     
=====================================================
  Files                    384      384              
  Lines                  49814    49596     -218     
=====================================================
- Hits                   33897    33718     -179     
+ Misses                 15917    15878      -39     
Impacted Files Coverage Δ
torch/_vmap_internals.py 20.98% <0.00%> (-77.78%) ⬇️
torch/quantization/quantize_fx.py 90.56% <0.00%> (-7.55%) ⬇️
torch/utils/_benchmark/utils/timer.py 85.18% <0.00%> (-3.71%) ⬇️
torch/fx/symbolic_trace.py 93.69% <0.00%> (-1.66%) ⬇️
torch/onnx/symbolic_registry.py 70.49% <0.00%> (-1.64%) ⬇️
torch/nn/utils/prune.py 92.26% <0.00%> (-1.18%) ⬇️
torch/quantization/fx/quantize.py 96.31% <0.00%> (-0.99%) ⬇️
torch/testing/_internal/common_distributed.py 67.71% <0.00%> (-0.43%) ⬇️
.../testing/_internal/distributed/rpc/jit/rpc_test.py 29.23% <0.00%> (-0.42%) ⬇️
torch/utils/_benchmark/utils/common.py 77.31% <0.00%> (-0.38%) ⬇️
... and 31 more

bdhirsh added a commit that referenced this pull request Sep 17, 2020
Not entirely sure why, but changing the type of beta from `float` to
`double` in autocast_mode.cpp and FunctionsManual.h fixes my compiler
errors, failing instead at link time

fixing some type errors, updated fn signature in a few more files

removing my usage of Scalar, making beta a double everywhere instead

updated the smooth_l1_loss signature in the torch api, added a torch
test

added a cpp_api_parity test to test the actual kernel calls

Updating the python API + docs

some test fixes

fix linter errors

fixing double backwards fn

fixing smooth_l1_loss_out to update memory in place correctly

removing TODOs

kernel fix- casting beta to the same scalar type as the tensor to
prevent unnecessary type conversions (e.g. when we have a tensor of
floats)

fixing test

ghstack-source-id: c05ede8
Pull Request resolved: #44433
@JackCaoG
Collaborator

> @JackCaoG Hi Jack, I think that this PR requires a change on the XLA side: I'm updating the existing smooth_l1_loss operator to take in an optional beta param, and I noticed a failing XLA build from the PR: https://app.circleci.com/pipelines/github/pytorch/pytorch/213829/workflows/0fb2f612-971c-4fbf-aca5-974a1d85a11b/jobs/7459944.
>
> AtenXlaType function missed override: Tensor smooth_l1_loss(const Tensor& self, const Tensor& target, int64_t reduction);
>
> Let me know if I can help!

Thanks for the heads-up, I will make the corresponding change on the XLA side.

Contributor

mrshenli left a comment

If the concern about division by 0 is real, let's add a test to check the case where |x - y| == 0.

  Tensor smooth_l1_loss_double_backward(const Tensor & grad, const Tensor & input, const Tensor & target, int64_t reduction, double beta) {
    auto d = (input - target).abs();
-   auto grad_input = grad * (d < 1).type_as(grad);
+   auto grad_input = grad * (d < beta).type_as(grad) / beta;
Contributor

can beta be 0 here?

Contributor Author

yep, good catch. I'll try rewriting it in a way to prevent divide by zeros.

    return norm_val * grad_output;
  else
-   return norm_val * x * grad_output;
+   return norm_val * x * grad_output / beta;
Contributor

looks like if x == beta == 0, we could reach here, and then trigger division by 0?

Contributor

I might be missing something, but it looks like this is different from the math given in the doc.

0.5 (x_i - y_i)^2 / beta, & \text{if } |x_i - y_i| < beta 

Should the three conditions be x <= -beta, x >= beta, and otherwise?

Contributor Author

yep you're right, thanks. And using x <= -beta and x >= beta should fix the division by 0 issue.
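
A rough Python analogue of the branch ordering being proposed (an illustrative sketch, not the actual CPU/CUDA kernel):

def smooth_l1_backward_elem(x, grad_output, norm_val, beta):
    # x is the element-wise difference (input - target)
    if x <= -beta:
        return -norm_val * grad_output
    elif x >= beta:
        return norm_val * grad_output
    else:
        # only this branch divides by beta, and it is unreachable when beta == 0
        return norm_val * x * grad_output / beta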

- output = ge_one_mask * (abs_diff - 0.5) + lt_one_mask * 0.5 * (abs_diff ** 2)
+ ge_beta_mask = (abs_diff >= beta).type_as(abs_diff)
+ lt_beta_mask = (abs_diff < beta).type_as(abs_diff)
+ output = ge_beta_mask * (abs_diff - 0.5 * beta) + lt_beta_mask * 0.5 * (abs_diff ** 2) / beta
Contributor

Does this also need to handle the divide-by-0 case?

__constants__ = ['reduction']

def __init__(self, size_average=None, reduce=None, reduction: str = 'mean') -> None:
beta: float
Contributor

this makes beta a class attribute instead of an instance attribute?

bdhirsh added a commit that referenced this pull request Sep 18, 2020
Not entirely sure why, but changing the type of beta from `float` to
`double` in autocast_mode.cpp and FunctionsManual.h fixes my compiler
errors, failing instead at link time

fixing some type errors, updated fn signature in a few more files

removing my usage of Scalar, making beta a double everywhere instead

updated the smooth_l1_loss signature in the torch api, added a torch
test

added a cpp_api_parity test to test the actual kernel calls

Updating the python API + docs

some test fixes

fix linter errors

fixing double backwards fn

fixing smooth_l1_loss_out to update memory in place correctly

removing TODOs

kernel fix- casting beta to the same scalar type as the tensor to
prevent unnecessary type conversions (e.g. when we have a tensor of
floats)

fixing test

fixing divide-by-zero issues

ghstack-source-id: 993e5df
Pull Request resolved: #44433
bdhirsh added a commit that referenced this pull request Sep 18, 2020
Not entirely sure why, but changing the type of beta from `float` to
`double` in autocast_mode.cpp and FunctionsManual.h fixes my compiler
errors, failing instead at link time

fixing some type errors, updated fn signature in a few more files

removing my usage of Scalar, making beta a double everywhere instead

updated the smooth_l1_loss signature in the torch api, added a torch
test

added a cpp_api_parity test to test the actual kernel calls

Updating the python API + docs

some test fixes

fix linter errors

fixing double backwards fn

fixing smooth_l1_loss_out to update memory in place correctly

removing TODOs

kernel fix- casting beta to the same scalar type as the tensor to
prevent unnecessary type conversions (e.g. when we have a tensor of
floats)

fixing test

fixing divide-by-zero issues

ghstack-source-id: f050591
Pull Request resolved: #44433
pickle=False)


def smoothl1loss_zero_beta_test():
Contributor

this needs to be added to new_module_tests?

lambda i: F.smooth_l1_loss(i, t.type_as(i), reduction='none', beta=0)),
cpp_function_call='''F::smooth_l1_loss(
i, t.to(i.options()), F::SmoothL1LossFuncOptions().reduction(torch::kNone), 0)''',
input_fn=lambda: torch.randn(2, 3, 4),
Contributor

this won't trigger zero loss, right? If we want to test zero loss, will using t.clone() as input work?

Contributor Author

Two things:

  1. I started trying to modify the backwards kernels to avoid/check for divide-by-zeros, but I realized that setting beta=0 is really just the same thing as using l1_loss. I updated the definitions in Loss.cpp to check for that case and just call out to their corresponding l1_loss functions directly.

  2. I added a test to the suite that uses beta=0, but I'm not testing the case where the two tensors are the same, because that doesn't look like it will play well with this test suite. The test suite asserts the logic by computing the analytical jacobian of the loss (using my backwards) as well as the numerical jacobian (using (f(x+e)-f(x-e))/(2*e)), but that logic breaks down when one of the functions isn't continuous. In this case, the second derivative of smooth_l1_loss when beta=0 isn't continuous: it's equal to infinity at x=0, and 0 everywhere else.
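
A minimal Python sketch of point 1 (the function name is illustrative; the actual dispatch lives in Loss.cpp and may be structured differently):

import torch.nn.functional as F

def smooth_l1_loss_with_fallback(input, target, reduction='mean', beta=1.0):
    # with beta == 0 the quadratic region vanishes, so smooth L1 degenerates to L1
    if beta == 0:
        return F.l1_loss(input, target, reduction=reduction)
    return F.smooth_l1_loss(input, target, reduction=reduction, beta=beta)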


lt_beta_mask = (abs_diff < beta).type_as(abs_diff)
# guard against divide-by-zero when beta is zero.
if beta == 0:
output = abs_diff - 0.5 * beta
Contributor

since beta is 0, we can remove the 0.5 * beta from the formula?

bdhirsh added a commit that referenced this pull request Sep 23, 2020
Not entirely sure why, but changing the type of beta from `float` to
`double` in autocast_mode.cpp and FunctionsManual.h fixes my compiler
errors, failing instead at link time

fixing some type errors, updated fn signature in a few more files

removing my usage of Scalar, making beta a double everywhere instead

updated the smooth_l1_loss signature in the torch api, added a torch
test

added a cpp_api_parity test to test the actual kernel calls

Updating the python API + docs

some test fixes

fix linter errors

fixing double backwards fn

fixing smooth_l1_loss_out to update memory in place correctly

removing TODOs

kernel fix- casting beta to the same scalar type as the tensor to
prevent unnecessary type conversions (e.g. when we have a tensor of
floats)

fixing test

fixing divide-by-zero issues - when beta <= 0 we should just call out
directly to l1_loss

ghstack-source-id: 81905de
Pull Request resolved: #44433
bdhirsh added a commit that referenced this pull request Sep 23, 2020
Not entirely sure why, but changing the type of beta from `float` to
`double` in autocast_mode.cpp and FunctionsManual.h fixes my compiler
errors, failing instead at link time

fixing some type errors, updated fn signature in a few more files

removing my usage of Scalar, making beta a double everywhere instead

updated the smooth_l1_loss signature in the torch api, added a torch
test

added a cpp_api_parity test to test the actual kernel calls

Updating the python API + docs

some test fixes

fix linter errors

fixing double backwards fn

fixing smooth_l1_loss_out to update memory in place correctly

removing TODOs

kernel fix- casting beta to the same scalar type as the tensor to
prevent unnecessary type conversions (e.g. when we have a tensor of
floats)

fixing test

fixing divide-by-zero issues - when beta <= 0 we should just call out
directly to l1_loss

ghstack-source-id: dd5afc7
Pull Request resolved: #44433
@bdhirsh
Contributor Author

bdhirsh commented Sep 23, 2020

Hey @JackCaoG, I see that the PR on the XLA side has approval, so I'm going to start trying to merge this soon. Thanks for helping!

@JackCaoG
Collaborator

> Hey @JackCaoG, I see that the PR on the XLA side has approval, so I'm going to start trying to merge this soon. Thanks for helping!

sounds good!

Contributor

mrshenli left a comment

LGTM! Let's land after the XLA PR, so that we won't break master.

@JackCaoG
Collaborator

> LGTM! Let's land after the XLA PR, so that we won't break master.

You guys can go first; on the XLA side I can simply merge the PR after this one is merged (I think the pytorch merge takes some time).

@facebook-github-bot
Contributor

@bdhirsh merged this pull request in 439930c.

facebook-github-bot deleted the gh/bdhirsh/8/head branch September 29, 2020 14:23
python_module: nn

- func: smooth_l1_loss_backward.grad_input(Tensor grad_output, Tensor self, Tensor target, int reduction, *, Tensor(a!) grad_input) -> Tensor(a!)
- func: smooth_l1_loss_backward.grad_input(Tensor grad_output, Tensor self, Tensor target, int reduction, float beta=1.0, *, Tensor(a!) grad_input) -> Tensor(a!)
Contributor

nit: you shouldn't need to provide a default parameter -- we don't expose these backwards functions to python, they should only be called via autograd. And you'll notice that reduction doesn't have a default parameter.

The parts I'm not sure about here are if TorchScript would let you serialize this function (it shouldn't) and/or if the backwards compatibility test would complain.

\begin{cases}
0.5 (x_i - y_i)^2, & \text{if } |x_i - y_i| < 1 \\
|x_i - y_i| - 0.5, & \text{otherwise }
0.5 (x_i - y_i)^2 / beta, & \text{if } |x_i - y_i| < beta \\
Contributor

the above isn't accurate anymore: "Creates a criterion that uses a squared term if the absolute element-wise error falls below 1 and an L1 term otherwise."

Contributor

we should also document the beta < 0 case.
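
For reference, a piecewise definition consistent with the new forward pass shown earlier in this PR (a sketch of what the updated docstring could say, not necessarily the exact wording that landed, and it doesn't cover the beta <= 0 fallback to l1_loss):

\begin{cases}
0.5 (x_i - y_i)^2 / beta, & \text{if } |x_i - y_i| < beta \\
|x_i - y_i| - 0.5 * beta, & \text{otherwise}
\end{cases}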
