🐛 Bug
When padding_mode='border' in grid_sample and a grid point falls exactly on the high boundary of the image (size - 1), the gradient should be based on the border padding scheme: either the gradient from just inside the boundary, or zero from just outside it (either could be valid, since it's a non-differentiable point). Instead, the gradient is currently calculated as if the image were zero-padded, which gives wacky results.
The same problem occurs with padding_mode='reflection' for 2D grid_sample on CPU.
Reflection mode in both the CUDA version and the 3D CPU version also has this problem, but there it is arguably worse, since the incorrect gradient is additionally negated. This also makes the CPU and CUDA kernels inconsistent with each other.
Example:
```python
import torch

image = torch.arange(0, 5, dtype=torch.float).expand((1, 1, 5, 5)).requires_grad_()
id_grid = torch.nn.functional.affine_grid(
    torch.tensor([[[1, 0, 0], [0, 1, 0.]]]), (1, 1, 5, 5), align_corners=True).requires_grad_()
torch.nn.functional.grid_sample(image, id_grid, padding_mode='border',
                                align_corners=True).sum().backward()
print(id_grid.grad.permute(0, 3, 1, 2))
```

```
tensor([[[[ 2.,  2.,  2.,  2., -8.],
          [ 2.,  2.,  2.,  2., -8.],
          [ 2.,  2.,  2.,  2., -8.],
          [ 2.,  2.,  2.,  2., -8.],
          [ 2.,  2.,  2.,  2., -8.]],

         [[ 0.,  0.,  0.,  0.,  0.],
          [ 0.,  0.,  0.,  0.,  0.],
          [ 0.,  0.,  0.,  0.,  0.],
          [ 0.,  0.,  0.,  0.,  0.],
          [ 0., -2., -4., -6., -8.]]]])
```

Notice the wacky last row and last column. This is because the gradient there is currently calculated as if the image were zero-padded: for example, the -8 in the last column is (0 - 4) * (W - 1)/2, i.e. a zero-padded out-of-bounds neighbor minus the in-bounds value 4, scaled by the grid-to-pixel factor 2.
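As a sanity check, one-sided finite differences of the forward pass at one of these boundary points give exactly the two values one could defend: the inner slope, or zero. A minimal sketch (the step size eps and the probed grid index are my choices, not part of the original repro):

```python
import torch
import torch.nn.functional as F

image = torch.arange(0, 5, dtype=torch.float).expand((1, 1, 5, 5))
grid = F.affine_grid(torch.tensor([[[1, 0, 0], [0, 1, 0.]]]),
                     (1, 1, 5, 5), align_corners=True)

def sampled_sum(g):
    return F.grid_sample(image, g, padding_mode='border',
                         align_corners=True).sum()

# Nudge the x coordinate of the top-right grid point, which lies exactly
# on the high boundary (normalized coordinate +1 -> pixel 4).
eps = 1e-3
g_in, g_out = grid.clone(), grid.clone()
g_in[0, 0, -1, 0] -= eps   # step just inside the image
g_out[0, 0, -1, 0] += eps  # step just outside, into the border padding

base = sampled_sum(grid)
print((base - sampled_sum(g_in)) / eps)   # ~2: slope from the inner side
print((sampled_sum(g_out) - base) / eps)  # ~0: border padding is flat outside
# The analytic backward instead reports -8 at this point.
```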
The result should ideally look like

```
tensor([[[[2., 2., 2., 2., 2.],
          [2., 2., 2., 2., 2.],
          [2., 2., 2., 2., 2.],
          [2., 2., 2., 2., 2.],
          [2., 2., 2., 2., 2.]],

         [[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.]]]])
```

which finds the gradient using the in-bounds neighbor.
A less ideal, but still palatable, result would be

```
tensor([[[[2., 2., 2., 2., 0.],
          [2., 2., 2., 2., 0.],
          [2., 2., 2., 2., 0.],
          [2., 2., 2., 2., 0.],
          [2., 2., 2., 2., 0.]],

         [[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.]]]])
```

which finds the gradient using the out-of-bounds, border-padded neighbor.
Reflection mode on CPU (for instance, running the same commands but with padding_mode='reflection') gives the exact same problematic result.
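Concretely, that is the repro above with only the padding mode changed (the grad reset line is mine, to avoid accumulating into id_grid.grad across runs):

```python
id_grid.grad = None  # clear the gradient left over from the border-mode run
torch.nn.functional.grid_sample(image, id_grid, padding_mode='reflection',
                                align_corners=True).sum().backward()
print(id_grid.grad.permute(0, 3, 1, 2))  # same problematic last row/column on CPU
```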
When using reflection mode on CUDA, however (as well as for 3D grid_sample on CPU), the problematic gradients are negated!

```
tensor([[[[2., 2., 2., 2., 8.],
          [2., 2., 2., 2., 8.],
          [2., 2., 2., 2., 8.],
          [2., 2., 2., 2., 8.],
          [2., 2., 2., 2., 8.]],

         [[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [-0., 2., 4., 6., 8.]]]])
```

This is also problematic, of course, but even more so because of the mismatch between the CPU and CUDA behaviors.
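One way to surface the mismatch directly is to run the identical backward pass on both devices and compare. A sketch, assuming a CUDA build is available:

```python
import torch
import torch.nn.functional as F

def reflection_grid_grad(device):
    image = torch.arange(0, 5, dtype=torch.float, device=device).expand((1, 1, 5, 5))
    grid = F.affine_grid(torch.tensor([[[1, 0, 0], [0, 1, 0.]]], device=device),
                         (1, 1, 5, 5), align_corners=True).requires_grad_()
    F.grid_sample(image, grid, padding_mode='reflection',
                  align_corners=True).sum().backward()
    return grid.grad.cpu()

if torch.cuda.is_available():
    # False on the buggy commit: the boundary gradients differ in sign.
    print(torch.allclose(reflection_grid_grad('cpu'), reflection_grid_grad('cuda')))
```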
For reflection mode, I think it makes sense to set the gradient in such cases to zero, since the point sits at the apex of a symmetric hill. But taking the gradient from one side or the other might also be acceptable for most practical purposes.
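The apex intuition can be checked numerically: under reflection padding the sampled signal is symmetric about the boundary, so the two one-sided slopes there have equal magnitude and opposite sign. A sketch (single-point sampling; eps is my choice):

```python
import torch
import torch.nn.functional as F

image = torch.arange(0, 5, dtype=torch.float).expand((1, 1, 5, 5))

def sample_at(x):
    # Sample one point at normalized coordinates (x, 0).
    g = torch.tensor([[[[x, 0.]]]])
    return F.grid_sample(image, g, padding_mode='reflection',
                         align_corners=True).item()

eps = 1e-3
left = (sample_at(1.0) - sample_at(1.0 - eps)) / eps   # ~ +2
right = (sample_at(1.0 + eps) - sample_at(1.0)) / eps  # ~ -2
print(left, right)  # equal magnitude, opposite sign: a symmetric peak
```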
For border mode, by contrast, I think it makes more sense to always take the non-zero gradient from the inner side, since the outer-side gradient is zero and would effectively stop training (see the related discussion for clamp at #7002 and #7049).
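For comparison, this matches where clamp landed after those discussions: the gradient passes through at the boundary value itself and is zeroed only strictly outside the range. A minimal check, assuming current clamp semantics:

```python
import torch

x = torch.tensor([3.0, 4.0, 5.0], requires_grad=True)
x.clamp(max=4.0).sum().backward()
# Gradient flows at the boundary point x == 4; only the strictly
# out-of-range element gets zero.
print(x.grad)  # tensor([1., 1., 0.])
```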
PyTorch Version: tested on commit 0539462