Issue description
If a tensor with requires_grad=True is passed to mse_loss, the loss is reduced to a scalar even when reduction='none' is specified.
Appeared in PyTorch 0.4.1.
Code example
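import torch
x = torch.zeros((4, 5, 2))
# Without gradients, reduction='none' keeps the input shape.
print('Good', torch.nn.functional.mse_loss(x, x, reduction='none').shape)
# With requires_grad=True, the same call returns a scalar instead.
x.requires_grad = True
print('Bad', torch.nn.functional.mse_loss(x, x, reduction='none').shape)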
Outputs:
Good torch.Size([4, 5, 2])
Bad torch.Size([])
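Possible workaround
A minimal sketch of one way to sidestep the problem, assuming the per-element squared difference is what is wanted from reduction='none'. The helper name elementwise_mse is hypothetical and not part of PyTorch; it computes the squared error with plain tensor arithmetic instead of calling mse_loss.

```python
import torch

def elementwise_mse(input, target):
    # Hypothetical helper (not a PyTorch API): per-element squared error,
    # computed with tensor arithmetic rather than mse_loss.
    return (input - target) ** 2

x = torch.zeros((4, 5, 2), requires_grad=True)
target = torch.zeros((4, 5, 2))
loss = elementwise_mse(x, target)
print(loss.shape)  # torch.Size([4, 5, 2])
```

The result keeps the input shape and still tracks gradients, so it can be reduced manually afterwards (for example with loss.mean() or loss.sum()).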
System Info
PyTorch version: 0.4.1
Is debug build: No
CUDA used to build PyTorch: 9.0.176
OS: Ubuntu 16.04.3 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.5) 5.4.0 20160609
CMake version: version 3.5.1
Python version: 3.6