
Conversation

@mikaylagawarecki (Contributor) commented Mar 15, 2022

Stack from ghstack:

Update signature of scatter_reduce_ to match scatter_/scatter_add_

Tensor.scatter_reduce_(int64 dim, Tensor index, Tensor src, str reduce)

  • Add new reduction options in ScatterGatherKernel.cpp and update scatter_reduce to call into the cpu kernel for scatter.reduce
  • scatter_reduce now has the same shape constraints as scatter_ and scatter_add_ (see the usage sketch after this list)
  • Migrate test/test_torch.py:test_scatter_reduce to test/test_scatter_gather_ops.py
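
A minimal usage sketch of the updated in-place call, assuming a PyTorch build that includes this change; the reduce strings "sum" and "amax" are assumed to be among the reductions added in ScatterGatherKernel.cpp, and the tensor names are purely illustrative:

```python
import torch

# dst.scatter_reduce_(dim, index, src, reduce) reduces src values into dst
# at the slots selected by index along dim, with the same index/src shape
# constraints as scatter_ / scatter_add_.
dst = torch.zeros(3)                      # 0 is the identity for "sum"
index = torch.tensor([0, 1, 0, 1, 2])
src = torch.tensor([1., 2., 3., 4., 5.])

dst.scatter_reduce_(0, index, src, reduce="sum")
# slot 0 <- src[0] + src[2], slot 1 <- src[1] + src[3], slot 2 <- src[4]
# dst is now tensor([4., 6., 5.])

amax = torch.full((3,), float("-inf"))    # -inf is the identity for "amax"
amax.scatter_reduce_(0, index, src, reduce="amax")
# amax is now tensor([3., 4., 5.])
```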

Differential Revision: D35222842

pytorch-bot commented Mar 15, 2022

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/pytorch/pytorch/blob/4d1d5d9354f3aeaf58f245ba8477e3360dd07193/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default
Add ciflow labels to this PR to trigger more builds:

Workflow | Labels | Status
Triggered Workflows
linux-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
linux-binary-libtorch-cxx11-abi ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
linux-binary-libtorch-pre-cxx11 ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
linux-binary-manywheel ciflow/all, ciflow/binaries, ciflow/binaries_wheel, ciflow/default, ciflow/trunk ✅ triggered
linux-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/trunk ✅ triggered
linux-bionic-rocm4.5-py3.7 ciflow/all, ciflow/default, ciflow/linux, ciflow/rocm, ciflow/trunk ✅ triggered
linux-docs ciflow/all, ciflow/cpu, ciflow/default, ciflow/docs, ciflow/linux, ciflow/trunk ✅ triggered
linux-vulkan-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk, ciflow/vulkan ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-build ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-onnx ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4-mobile-lightweight-dispatch-build ciflow/all, ciflow/cpu, ciflow/default, ciflow/libtorch, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7-no-ops ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
macos-arm64-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-arm64-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
macos-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
windows-binary-libtorch-debug ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
windows-binary-libtorch-release ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
windows-binary-wheel ciflow/all, ciflow/binaries, ciflow/binaries_wheel, ciflow/default, ciflow/trunk ✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
docker-builds ciflow/all, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-custom-ops ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-metal ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-x86-64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk 🚫 skipped
linux-docs-push ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled 🚫 skipped
linux-xenial-cuda11.3-py3.7-gcc7-no-ops ciflow/all, ciflow/cuda, ciflow/linux, ciflow/trunk 🚫 skipped
macos-10-15-py3-arm64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-11-py3-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
parallelnative-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck 🚫 skipped
periodic-linux-xenial-cuda11.3-py3.7-gcc7-debug ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.5-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build ciflow/all, ciflow/android, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
pytorch-xla-linux-bionic-py3.7-clang8 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk, ciflow/xla 🚫 skipped

facebook-github-bot (Contributor) commented Mar 15, 2022


💊 CI failures summary and remediations

As of commit 241b898 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages

See GitHub Actions build trunk / linux-bionic-rocm4.5-py3.7-distributed / test (distributed, 1, 1, linux.rocm.gpu) (1/1)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-03-31T22:21:35.6736775Z AssertionError: 2 unit test(s) failed:
2022-03-31T22:21:34.6599020Z 
2022-03-31T22:21:34.6599263Z OK (skipped=1)
2022-03-31T22:21:34.6599642Z 
2022-03-31T22:21:34.6599926Z Generating XML reports...
2022-03-31T22:21:34.6687770Z Generated XML report: test-reports/dist-nccl/distributed.test_distributed_spawn/TEST-TestDistBackendWithSpawn-20220331222131.xml
2022-03-31T22:21:35.6729963Z Traceback (most recent call last):
2022-03-31T22:21:35.6730948Z   File "distributed/test_distributed_spawn.py", line 40, in <module>
2022-03-31T22:21:35.6731726Z     run_tests()
2022-03-31T22:21:35.6733198Z   File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_utils.py", line 634, in run_tests
2022-03-31T22:21:35.6735981Z     len(failed_tests), '\n\t'.join(failed_tests))
2022-03-31T22:21:35.6736775Z AssertionError: 2 unit test(s) failed:
2022-03-31T22:21:35.6738216Z 	TestDistBackendWithSpawn.test_post_localSGD_optimizer_parity_with_hierarchical_sgd
2022-03-31T22:21:35.6739575Z 	TestDistBackendWithSpawn.test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view
2022-03-31T22:21:36.7411275Z Traceback (most recent call last):
2022-03-31T22:21:36.7412561Z   File "test/run_test.py", line 1054, in <module>
2022-03-31T22:21:36.7422270Z     main()
2022-03-31T22:21:36.7422936Z   File "test/run_test.py", line 1032, in main
2022-03-31T22:21:36.7425457Z     raise RuntimeError(err_message)
2022-03-31T22:21:36.7427261Z RuntimeError: distributed/test_distributed_spawn failed!
2022-03-31T22:21:37.8954539Z 
2022-03-31T22:21:37.8955305Z real	62m37.143s

This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

mikaylagawarecki added a commit that referenced this pull request Mar 15, 2022
ghstack-source-id: 236567f
Pull Request resolved: #74226
mikaylagawarecki added a commit that referenced this pull request Mar 16, 2022
ghstack-source-id: 3a2bf75
Pull Request resolved: #74226
Update signature of `scatter_reduce_` to match `scatter_/scatter_add_`

`Tensor.scatter_reduce_(int64 dim, Tensor index, Tensor src, str reduce, *, bool include_input=True)`

- Update `scatter_reduce` to call into the cpu/cuda kernels for `scatter.reduce`
- `scatter_reduce` now has the same shape constraints as `scatter_` and `scatter_add_`
- Add an argument `include_input` which indicates whether the value in the `self` Tensor at a given position is included in the reduction with the elements from `src` scattered to that position. For
`I_self = {all indices of self}`
`I_src= {all indices of src}`
`S = {indices of self modified by scatter}`
`self_indices_to_src_indices : I_self --> I_src` maps indices in `self` to a tuple of indices in `src` scattered to that index of `self`
Then for `s ∈ S` and `t ∈ I_self \ S` when `include_input=False`
`self[s] = reduction_op(src[self_indices_to_src_indices[s]])`
`self[t] = self[t]`
and when `include_input=True` (regular scatter(reduce=op) behavior)
`self[s] = reduction_op(self[s], src[self_indices_to_src_indices[s]])`
`self[t] = self[t]`

The [`optional_out` case of pytorch_scatter.scatter](https://github.com/rusty1s/pytorch_scatter/blob/master/csrc/scatter.cpp#L32) can then be handled by 
`torch.zeros(shape).scatter_reduce_(dim, index, src, reduce, include_input=False)`
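
A worked sketch of the `include_input` semantics above, with the keyword spelled exactly as in this signature (the final API may rename or drop it, so treat the flag as an assumption here):

```python
import torch

index = torch.tensor([0, 0, 2])           # S = {0, 2}; slots 1 and 3 are untouched
src = torch.tensor([1., 2., 3.])
base = torch.full((4,), 10.)

# include_input=True (the default, regular scatter-with-reduce behavior):
# self[s] participates in the reduction at every touched slot s.
incl = base.clone().scatter_reduce_(0, index, src, reduce="sum")
# incl = [10 + 1 + 2, 10, 10 + 3, 10] = [13., 10., 13., 10.]

# include_input=False: touched slots reduce over the scattered src values only;
# untouched slots keep their original self values.
excl = base.clone().scatter_reduce_(0, index, src, reduce="sum", include_input=False)
# excl = [1 + 2, 10, 3, 10] = [3., 10., 3., 10.]

# The pytorch_scatter `optional_out` case mentioned above: start from zeros and
# exclude the input, so each touched slot holds the reduction over src alone.
like_scatter = torch.zeros(4).scatter_reduce_(0, index, src, reduce="sum", include_input=False)
# like_scatter = [3., 0., 3., 0.]
```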

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this pull request Mar 17, 2022
ghstack-source-id: 12c1ed8
Pull Request resolved: #74226
@mikaylagawarecki marked this pull request as ready for review March 17, 2022 08:21
auto index_stride = self_dim_stride;


AT_DISPATCH_FLOATING_TYPES_AND2(
Contributor

If the only difference between this kernel and cuda_scatter_gather_base_kernel is
AT_DISPATCH_FLOATING_TYPES_AND2 vs AT_DISPATCH_ALL_TYPES_AND_COMPLEX_AND3 it might be easier to explicitly check for a specific scalar type and error out in a unified kernel than copy-pasting the code and just changing this line.

The error message to throw can be found in the dispatch macro definitions.

@cpuhrsch (Contributor) left a comment

Looks pretty good, but I'd try to avoid the code duplication of copy-pasting cpu_scatter_gather_base_kernel to deal with the different set of dtypes.

@cpuhrsch changed the title from "Use ScatterGatherKernel for scatter_reduce (CPU-only)" to "[BC-breaking] Use ScatterGatherKernel for scatter_reduce (CPU-only)" on Mar 24, 2022
@cpuhrsch added the labels release notes: sparse, topic: bc breaking, and topic: new features on Mar 24, 2022
cpu_scatter_gather_base_kernel<>()(self, dim, index, value,
"scatter_scalar_reduce_multiply_", reduce_multiply);
break;
default:
Contributor

Why did you add this?

Contributor Author

Without this, the build on CI will fail. I think it's due to this:

With the -Werror compiler flag, a switch statement over a value of an enum type that has no default label will fail to compile if any enumerator of the enum lacks a corresponding case; this is sometimes called an exhaustive or defaultless switch statement.

@cpuhrsch (Contributor) left a comment

Looks great, just added two small comments

…CPU-only)"


Update signature of `scatter_reduce_` to match `scatter_/scatter_add_`

`Tensor.scatter_reduce_(int64 dim, Tensor index, Tensor src, str reduce)`

- Add new reduction options in ScatterGatherKernel.cpp and update `scatter_reduce` to call into the cpu kernel for `scatter.reduce`
- `scatter_reduce` now has the same shape constraints as `scatter_` and `scatter_add_`
- Migrate `test/test_torch.py:test_scatter_reduce` to `test/test_scatter_gather_ops.py`

[ghstack-poisoned]
@mikaylagawarecki removed the request for review from bdhirsh March 25, 2022 22:04
@mikaylagawarecki has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

…CPU-only)"


Update signature of `scatter_reduce_` to match `scatter_/scatter_add_`

`Tensor.scatter_reduce_(int64 dim, Tensor index, Tensor src, str reduce)`

- Add new reduction options in ScatterGatherKernel.cpp and update `scatter_reduce` to call into the cpu kernel for `scatter.reduce`
- `scatter_reduce` now has the same shape constraints as `scatter_` and `scatter_add_`
- Migrate `test/test_torch.py:test_scatter_reduce` to `test/test_scatter_gather_ops.py`

Differential Revision: [D35222842](https://our.internmc.facebook.com/intern/diff/D35222842)

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this pull request Mar 31, 2022
ghstack-source-id: c597755
Pull Request resolved: #74226

facebook-github-bot pushed a commit that referenced this pull request Apr 1, 2022
…74226)

Summary:
Pull Request resolved: #74226

Update signature of `scatter_reduce_` to match `scatter_/scatter_add_`

`Tensor.scatter_reduce_(int64 dim, Tensor index, Tensor src, str reduce)`

- Add new reduction options in ScatterGatherKernel.cpp and update `scatter_reduce` to call into the cpu kernel for `scatter.reduce`
- `scatter_reduce` now has the same shape constraints as `scatter_` and `scatter_add_`
- Migrate `test/test_torch.py:test_scatter_reduce` to `test/test_scatter_gather_ops.py`

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D35222842

Pulled By: mikaylagawarecki

fbshipit-source-id: 84930add2ad30baf872c495251373313cb7428bd
@facebook-github-bot deleted the gh/mikaylagawarecki/44/head branch April 4, 2022 14:17