
Conversation

@mruberry (Collaborator) commented Aug 12, 2020:

This PR:

  • updates div to perform true division
  • makes torch.true_divide an alias of torch.div

This follows on work in previous PyTorch releases that first deprecated div performing "integer" or "floor" division, then prevented it by throwing a runtime error.
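
A minimal sketch of the behavior change described above (a hedged illustration assuming a build that includes this PR; the tensor values are illustrative):

import torch

a = torch.tensor([5, 3])        # int64 tensor
b = torch.tensor([2, 2])        # int64 tensor

# div now always performs true division, so integer inputs produce a floating-point result
print(torch.div(a, b))          # tensor([2.5000, 1.5000])

# true_divide is an alias of div and returns the same result
print(torch.true_divide(a, b))  # tensor([2.5000, 1.5000])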

@mruberry mruberry requested a review from apaszke as a code owner August 12, 2020 08:52
@facebook-github-bot facebook-github-bot added the oncall: jit label Aug 12, 2020
@dr-ci bot commented Aug 12, 2020:

💊 CI failures summary and remediations

As of commit 7744f2b (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

XLA failure

Job pytorch_xla_linux_bionic_py3_6_clang9_build is failing. Please create an issue with title prefixed by [PT_BREAK] in pytorch/xla and link to this PR. If you have questions, please reach out to @ailzhang / @dlibenzi / @JackCaoG.



@mruberry mruberry removed the request for review from apaszke August 13, 2020 03:39
@mruberry (Collaborator Author):

@ailzhang Looks like this change is breaking XLA because it changes torch.true_divide to be an alias for torch.div. It will also change the behavior of torch.div to always perform true division.

@ailzhang (Contributor):

cc: @JackCaoG This requires updating torch.div behavior on the XLA side; would you mind taking a look?

@mruberry (Collaborator Author):

> cc: @JackCaoG This requires updating torch.div behavior on the XLA side; would you mind taking a look?

ping @JackCaoG. We'd like to get this fix in sooner rather than later.

@JackCaoG (Collaborator) commented Aug 21, 2020:

@mruberry Sorry, missed this email. I will take a look tomorrow. Feel free to ping me on Slack if I don't respond to these kinds of failures within a day 😄, or open an issue on pt/xla, which I check more frequently.

@mruberry (Collaborator Author):

> @mruberry Sorry, missed this email. I will take a look tomorrow. Feel free to ping me on Slack if I don't respond to these kinds of failures within a day 😄.

haha, thanks @JackCaoG, but it's not "next day" urgent! It's like "by next week" urgent, if that works for you? Should be a small fix.

@JackCaoG (Collaborator):

@mruberry Sounds good, I will work on it tomorrow.

@JackCaoG (Collaborator):

@mruberry Currently I am seeing

>>> t1
tensor(10)
>>> t1/2
tensor(5.)
>>> t1/=2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: result type Float can't be cast to the desired output type Long

I think the intent was to make div_ also perform true division, and that this runtime error will be removed in a follow-up PR?
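
For reference, a minimal sketch of why the in-place case errors while the out-of-place case works (assuming standard type-promotion rules; the dtypes shown are illustrative):

import torch

t1 = torch.tensor(10)                               # 0-dim int64 tensor
print((t1 / 2).dtype)                               # torch.float32: true division promotes to floating point

# An in-place op must write the result back into t1's int64 storage,
# and an implicit float -> long cast is not allowed:
print(torch.can_cast(torch.float32, torch.int64))   # False, hence the RuntimeError above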

@JackCaoG (Collaborator):

@mruberry Could you rebase this PR 😄? I pinned my PR to this PR, but I need #43047 to build PT/XLA.

@mruberry mruberry changed the title [WIP] Updates div to perform true division Updates div to perform true division Aug 23, 2020
@mruberry (Collaborator Author):

> @mruberry Could you rebase this PR 😄? I pinned my PR to this PR, but I need #43047 to build PT/XLA.

Thanks @JackCaoG - rebased!

@JackCaoG (Collaborator):

Hey @mruberry I am running into an issue where if I do

import torch, torch_xla
import torch_xla.core.xla_model as xm

device = xm.xla_device()
t1 = torch.tensor(10, device=device)
print(torch.div(t1, 2.0))

the two-tensor version of divide, at::Tensor AtenXlaType::div(const at::Tensor& self, const at::Tensor& other), is called instead of the scalar version, at::Tensor AtenXlaType::div(const at::Tensor& self, at::Scalar other). The const at::Tensor& other argument has dtype float64 instead of float32. Is this expected?

@mruberry (Collaborator Author):

> Hey @mruberry I am running into an issue where if I do
>
> import torch, torch_xla
> import torch_xla.core.xla_model as xm
>
> device = xm.xla_device()
> t1 = torch.tensor(10, device=device)
> print(torch.div(t1, 2.0))
>
> the two-tensor version of divide, at::Tensor AtenXlaType::div(const at::Tensor& self, const at::Tensor& other), is called instead of the scalar version, at::Tensor AtenXlaType::div(const at::Tensor& self, at::Scalar other). The const at::Tensor& other argument has dtype float64 instead of float32. Is this expected?

No it's not. Let me do some local testing. I thought we had tests for this. I also need to rebase this.
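
For comparison, a minimal sketch of the expected CPU behavior (the Python scalar should participate in type promotion as a "weak" float, so the result should be the default float32 rather than float64; values are illustrative):

import torch

t1 = torch.tensor(10)               # int64
print(torch.div(t1, 2.0).dtype)     # torch.float32 on CPU: the scalar does not force float64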

@mruberry (Collaborator Author) commented Sep 8, 2020:

@JackCaoG I believe I fixed the issue you were seeing. This PR is also rebased - sorry there was a holdup.

@mruberry mruberry requested a review from gchanan September 8, 2020 15:12
- func: true_divide.Tensor(Tensor self, Tensor other) -> Tensor
use_c10_dispatcher: full
variants: function, method
dispatch:
@mruberry (Collaborator Author):


Does setting dispatch override the gradient? Check what test would catch this.

Delete dispatches.

@mruberry (Collaborator Author):


Interesting follow-up: we probably want to add gradcheck to test_op_aliases; method_tests entries are also missing for true_divide and floor_divide.

@mruberry (Collaborator Author):


Explicitly dispatching was causing autograd to fail. Deleting the dispatches works. Unfortunately aliases can't be added to method_tests() without additional test changes because test_jit.py has a check where it looks for the node, by name, but the node doesn't appear in the JIT's graph (because it's been aliased to another op). In the near future aliases will be refactored into OpInfos and gradcheck'd automatically there.
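
A minimal sketch of the kind of gradcheck coverage mentioned above (a hedged example, not the actual test code; shapes and values are illustrative):

import torch
from torch.autograd import gradcheck

x = torch.rand(3, 3, dtype=torch.double, requires_grad=True)
y = torch.rand(3, 3, dtype=torch.double) + 0.1   # keep the divisor away from zero

# Compare analytical and numerical gradients of true_divide with respect to x
assert gradcheck(torch.true_divide, (x, y))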

Performs a "true" division like Python 3. See :func:`torch.floor_divide`
for floor division.
Supports :ref:`broadcasting to a common shape <broadcasting-semantics>`,
@mruberry (Collaborator Author):


TODO: check if scalar semantics are covered

@mruberry (Collaborator Author):


The broadcasting note DOES NOT describe this, and the note is in need of an update. I added a note to our Docs Hackathon planner.
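
A minimal sketch of the broadcasting and scalar semantics the docstring refers to (a hedged illustration; shapes and values are illustrative):

import torch

x = torch.rand(3, 4)
y = torch.rand(4)              # broadcasts against x along the leading dimension
print(torch.div(x, y).shape)   # torch.Size([3, 4])

print(torch.div(x, 2).dtype)   # scalar divisor: result keeps x's floating dtype (torch.float32)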

@JackCaoG (Collaborator) commented Sep 8, 2020:

@mruberry I am still seeing the same behavior, running

import torch, torch_xla
import torch_xla.core.xla_model as xm

device = xm.xla_device()
t1 = torch.tensor(10, device=device)
print(torch.div(t1, 2.0))

gives me

div two tensor
div two tensor
tensor(5., device='xla:0', dtype=torch.float64)

I am using the most recent div_true branch and I also rebased my pt/xla branch:

(pytorch2) jackcao@jackcao64:~/Desktop/gitProj2/pytorch$ git status
On branch div_true
Your branch is up to date with 'origin/div_true'.

Could you give it another look?

@mruberry (Collaborator Author) commented Sep 9, 2020:

> @mruberry I am using the pytorch head against the pytorch/xla head to run the same code and I am seeing the same behavior.
>
> torch.div(t1, 2.0)
>
> will call AtenXlaType::div(const at::Tensor& self, const at::Tensor& other) instead of AtenXlaType::div(const at::Tensor& self, at::Scalar other). This behavior is unrelated to this PR, but it does seem wrong.

I agree with you but I'm glad we've verified it's not a regression. Want to file a follow-up issue on the PyTorch/XLA side?

How does this PR look otherwise?

@JackCaoG (Collaborator) commented Sep 9, 2020:

> > @mruberry I am using the pytorch head against the pytorch/xla head to run the same code and I am seeing the same behavior.
> >
> > torch.div(t1, 2.0)
> >
> > will call AtenXlaType::div(const at::Tensor& self, const at::Tensor& other) instead of AtenXlaType::div(const at::Tensor& self, at::Scalar other). This behavior is unrelated to this PR, but it does seem wrong.
>
> I agree with you but I'm glad we've verified it's not a regression. Want to file a follow-up issue on the PyTorch/XLA side?
>
> How does this PR look otherwise?

This PR looks good. I need to fix or work around the type promotion issue caused by the scalar-to-tensor call.
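
A minimal sketch of the type-promotion difference at play when a Python scalar is materialized as a float64 tensor before dispatch (a hedged illustration; dtypes shown are illustrative):

import torch

t = torch.tensor(10)                                                  # int64
print(torch.result_type(t, 2.0))                                      # torch.float32: a Python scalar promotes "weakly"
print(torch.result_type(t, torch.tensor(2.0, dtype=torch.float64)))   # torch.float64: a real tensor's dtype wins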

@facebook-github-bot (Contributor) left a comment:


@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

}
}

TensorIterator TensorIterator::binary_op(Tensor& out, const Tensor& a,
@mruberry (Collaborator Author):


binary_op_float

('div', (S, S, S), (uniform_scalar(0.1),), 'scalar_broadcast_rhs', (True,)),
('div', (), (uniform_scalar(0.1),), 'scalar_broadcast_lhs', (True,)),
('div', torch.rand(S, S, S) + 1e-1, (3.14,), 'constant', (True,)),
('div', uniform_scalar(1e-1, requires_grad=True), (3.14,), 'scalar_constant', (True,)),
@mruberry (Collaborator Author):


add true_divide and ignore in jit (for now) with comment

self.assertEqual(div_result, dividend.clone().true_divide_(divisor))
self.assertEqual(dividend.clone().div_(2), dividend.clone().true_divide_(2))
def test_div_promotion_inplace(self, device, dtype):
for op in (torch.Tensor.div_, torch.Tensor.true_divide_):
Contributor:


would be nicer if you could just assert div_ is true_divide_ eh?

@mruberry (Collaborator Author):


It would be. We could still set the functions equal in Python and C++.
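
A minimal sketch of the comparison-based approach the test takes instead (a hedged illustration; tensor values are illustrative):

import torch

dividend = torch.tensor([4.0, 6.0])
divisor = torch.tensor([2.0, 3.0])

# div_ and true_divide_ are not currently the same Python object, so the test
# checks that both in-place ops produce identical results instead:
for op in (torch.Tensor.div_, torch.Tensor.true_divide_):
    result = op(dividend.clone(), divisor)
    assert torch.equal(result, dividend / divisor)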

@mruberry (Collaborator Author):

@JackCaoG I was planning to land this on Monday, that OK with you? I can start the land in the morning and I would expect it to land before noon.

@JackCaoG (Collaborator):

> @JackCaoG I was planning to land this on Monday, that OK with you? I can start the land in the morning and I would expect it to land before noon.

Sounds good

@mruberry (Collaborator Author):

> > @JackCaoG I was planning to land this on Monday, that OK with you? I can start the land in the morning and I would expect it to land before noon.
>
> Sounds good

Thanks @JackCaoG!

@facebook-github-bot (Contributor) left a comment:


@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@mruberry (Collaborator Author):

@JackCaoG This is merged (finally). Sorry it's later than expected due to some internal test slowdowns today.

@JackCaoG (Collaborator):

@mruberry thanks for the reminder, I merged the xla-side change too.

@facebook-github-bot (Contributor):

@mruberry merged this pull request in 686e281.

xuzhao9 pushed a commit that referenced this pull request Sep 18, 2020
Summary:
This PR:

- updates div to perform true division
- makes torch.true_divide an alias of torch.div

This follows on work in previous PyTorch releases that first deprecated div performing "integer" or "floor" division, then prevented it by throwing a runtime error.

Pull Request resolved: #42907

Reviewed By: ngimel

Differential Revision: D23622114

Pulled By: mruberry

fbshipit-source-id: 414c7e3c1a662a6c3c731ad99cc942507d843927
@facebook-github-bot facebook-github-bot deleted the div_true branch January 27, 2021 18:26