Skip to content

Conversation

@pritamdamania87
Copy link
Contributor

@pritamdamania87 pritamdamania87 commented Aug 19, 2020

Stack from ghstack:

torch.cuda.nccl APIs didn't throw appropriate errors when called
with inputs/outputs that were of the wrong type and it resulted in some cryptic
errors instead.

Adding some error checks with explicit error messages for these APIs.

Differential Revision: D23206069

`torch.cuda.nccl` APIs didn't throw appropriate errors when called
with inputs/outputs that were of the wrong type and it resulted in some cryptic
errors instead.

Adding some error checks with explicit error messages for these APIs.

Differential Revision: [D23206069](https://our.internmc.facebook.com/intern/diff/D23206069/)

[ghstack-poisoned]
pritamdamania87 pushed a commit that referenced this pull request Aug 19, 2020
`torch.cuda.nccl` APIs didn't throw appropriate errors when called
with inputs/outputs that were of the wrong type and it resulted in some cryptic
errors instead.

Adding some error checks with explicit error messages for these APIs.

Differential Revision: [D23206069](https://our.internmc.facebook.com/intern/diff/D23206069/)

ghstack-source-id: 110214324
Pull Request resolved: #43247
@dr-ci
Copy link

dr-ci bot commented Aug 19, 2020

💊 CI failures summary and remediations

As of commit e2f1eff (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 10 times.

def _check_sequence_type(inputs):
if not isinstance(inputs, list) and not isinstance(inputs, tuple):
raise TypeError("inputs should be a list/tuple")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use 2 line spacing here?


def _check_sequence_type(inputs):
if not isinstance(inputs, list) and not isinstance(inputs, tuple):
raise TypeError("inputs should be a list/tuple")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it also be any python iterable/list-like object? I just tried the NCCL test with inputs = iter(inputs) (which makes it a list_iterable type) and inputs = set(inputs), and the current test_all_gather seems to pass. With this PR, would we stop supporting those types?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually tried this approach out first, but apparently a standalone tensor is Iterable as well:

>>> isinstance(torch.rand(10), collections.Iterable)
True

As a result, if we check for just Iterable we might include a single Tensor as a valid input. I'll actually update the check with a combination of both checks to incorporate this.

`torch.cuda.nccl` APIs didn't throw appropriate errors when called
with inputs/outputs that were of the wrong type and it resulted in some cryptic
errors instead.

Adding some error checks with explicit error messages for these APIs.

Differential Revision: [D23206069](https://our.internmc.facebook.com/intern/diff/D23206069/)

[ghstack-poisoned]
pritamdamania87 pushed a commit that referenced this pull request Aug 21, 2020
Pull Request resolved: #43247

`torch.cuda.nccl` APIs didn't throw appropriate errors when called
with inputs/outputs that were of the wrong type and it resulted in some cryptic
errors instead.

Adding some error checks with explicit error messages for these APIs.
ghstack-source-id: 110484323

Differential Revision: [D23206069](https://our.internmc.facebook.com/intern/diff/D23206069/)
Copy link
Contributor

@rohan-varma rohan-varma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

`torch.cuda.nccl` APIs didn't throw appropriate errors when called
with inputs/outputs that were of the wrong type and it resulted in some cryptic
errors instead.

Adding some error checks with explicit error messages for these APIs.

Differential Revision: [D23206069](https://our.internmc.facebook.com/intern/diff/D23206069/)

[ghstack-poisoned]
pritamdamania87 pushed a commit that referenced this pull request Aug 25, 2020
Pull Request resolved: #43247

`torch.cuda.nccl` APIs didn't throw appropriate errors when called
with inputs/outputs that were of the wrong type and it resulted in some cryptic
errors instead.

Adding some error checks with explicit error messages for these APIs.
ghstack-source-id: 110683546

Differential Revision: [D23206069](https://our.internmc.facebook.com/intern/diff/D23206069/)
@facebook-github-bot
Copy link
Contributor

This pull request has been merged in 306eb3d.

@facebook-github-bot facebook-github-bot deleted the gh/pritamdamania87/153/head branch August 30, 2020 14:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants