Skip to content

Conversation

@v0dro
Copy link
Contributor

@v0dro v0dro commented Jul 24, 2020

Fixes #33394 .

This PR does two things:

  1. Implement CUDA scatter reductions with revamped GPU atomic operations.
  2. Remove support for divide and subtract for CPU reduction as was discussed with @ngimel .

I've also updated the docs to reflect the existence of only multiply and add.

@dr-ci
Copy link

dr-ci bot commented Jul 24, 2020

💊 CI failures summary and remediations

As of commit d6dd7b0 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 68 times.

@rgommers rgommers requested a review from aocsa July 24, 2020 10:02
Copy link
Contributor

@aocsa aocsa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job! @v0dro, I really enjoyed reading/reviewing this PR about scatter reduce on CUDA. I put a couple of general comments and questions.

@rgommers rgommers changed the title Scatter reduce cuda Implement scatter reductions (CUDA), remove divide/subtract Jul 25, 2020
@ngimel ngimel self-requested a review July 26, 2020 18:17
@ngimel ngimel added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Jul 26, 2020
@ngimel
Copy link
Collaborator

ngimel commented Sep 12, 2020

xla error is real, can you please skip part of the test checking for runtime error on xla? Also, rocm build interestingly times out on scatter test, so it's hard to say if it's related or not.

@ngimel
Copy link
Collaborator

ngimel commented Sep 16, 2020

Hm, rocm build times out on a scatter test, that's worrying. Can you disable it on rocm?

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@v0dro
Copy link
Contributor Author

v0dro commented Sep 17, 2020

OK everything passing except the facebook internal tests.

@facebook-github-bot
Copy link
Contributor

@ngimel merged this pull request in e18a221.

xuzhao9 pushed a commit that referenced this pull request Sep 18, 2020
Summary:
Fixes #33394 .

This PR does two things:
1. Implement CUDA scatter reductions with revamped GPU atomic operations.
2. Remove support for divide and subtract for CPU reduction as was discussed with ngimel .

I've also updated the docs to reflect the existence of only multiply and add.

Pull Request resolved: #41977

Reviewed By: mruberry

Differential Revision: D23748888

Pulled By: ngimel

fbshipit-source-id: ea643c0da03c9058e433de96db02b503514c4e9c
@malfet
Copy link
Contributor

malfet commented Sep 18, 2020

Why changes to submodules were pulled as part of this PR? (namely, it reverted #44706)

malfet pushed a commit to malfet/pytorch that referenced this pull request Sep 18, 2020
Summary:
- Bump oneDNN (mkl-dnn) to 1.6 for bug fixes
    - Fixes pytorch#42446. RuntimeError: label is redefined for convolutions with large filter size on Intel AVX512
    - Implemented workaround for internal compiler error when building oneDNN with Microsoft Visual Studio 2019 (pytorch#43169)

Restore pytorch#44706 which was reverted by pytorch#41977
@ngimel ngimel mentioned this pull request Sep 19, 2020
facebook-github-bot pushed a commit that referenced this pull request Sep 19, 2020
Summary:
Restore #44706, which should workaround VC compiler crash, which was reverted by #41977
Update configs to use ":stable" Windows images

Pull Request resolved: #44746

Reviewed By: walterddr

Differential Revision: D23793682

Pulled By: malfet

fbshipit-source-id: bfdc36c35b920f58798a18c15642ec7efc68f00e
facebook-github-bot pushed a commit that referenced this pull request Sep 19, 2020
Summary:
Revert accidental gloo submodule changes in #41977

Pull Request resolved: #45008

Reviewed By: malfet

Differential Revision: D23799892

Pulled By: ngimel

fbshipit-source-id: e8dab244c6abad32ed60efe3c26cab40837e57c8
@sbb-gh
Copy link

sbb-gh commented Nov 3, 2020

Is there an option to ignore/sum over several of the output dimensions?
https://discuss.pytorch.org/t/scatter-add-reduce-output-dimensions-shape/100427

mikaylagawarecki added a commit that referenced this pull request Mar 23, 2022
Noticed that `cuda_atomic_ops_test` wasn't added to `run_test.sh` and hence hasn't been running in CI since it was added in #41977.

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this pull request Mar 24, 2022
Noticed that `cuda_atomic_ops_test` wasn't added to `run_test.sh` and hence hasn't been running in CI since it was added in #41977.

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this pull request Mar 24, 2022
Noticed that `cuda_atomic_ops_test` wasn't added to `run_test.sh` and hence hasn't been running in CI since it was added in #41977.

[ghstack-poisoned]
mikaylagawarecki added a commit that referenced this pull request Mar 24, 2022
Noticed that `cuda_atomic_ops_test` wasn't added to `run_test.sh` and hence hasn't been running in CI since it was added in #41977.

[ghstack-poisoned]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Merged open source triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Non-deterministic scatter reduction algorithms for scatter operations for CUDA (sum, subtract, divide, multiply).

8 participants