Fix broadcast copying device[0] tensor when not using NCCL #8222
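A minimal sketch of the behavior this PR targets, assuming torch.cuda.comm.broadcast and at least two visible GPUs (the checks below are an illustration, not the PR's own code): when the source tensor already lives on devices[0] and the non-NCCL path is taken, the first returned tensor should reuse the source's storage instead of being an extra copy.

```python
import torch
from torch.cuda import comm

# Illustration only: requires >= 2 CUDA devices.
src = torch.randn(8, device="cuda:0")
out = comm.broadcast(src, devices=[0, 1])

# With the fix, the entry for devices[0] should alias the source tensor
# rather than being a fresh copy on the same device.
assert out[0].data_ptr() == src.data_ptr()
# The other device still receives a genuine copy with equal values.
assert out[1].device == torch.device("cuda:1")
assert torch.equal(out[1].cpu(), src.cpu())
```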
Conversation
Review comments on torch/csrc/utils/tensor_flatten.h (outdated; marked as off-topic).
@pytorchbot retest this please
Can I get a review on this, please?
@apaszke can you take a look at this when you have time?
Review comments on torch/csrc/cuda/comm.cpp (outdated; marked as off-topic).
Commit: …tential extra copy in flatten_dense_tensors
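A hedged Python sketch of the idea behind that commit (the actual change is in the C++ helper in torch/csrc/utils/tensor_flatten.h; the function name and fast path here are illustrative): flattening a single already-contiguous tensor can return a view that shares storage, avoiding the extra copy a concatenation would make.

```python
import torch

def flatten_dense_tensors_sketch(tensors):
    # Fast path: one tensor can be flattened as a view; contiguous()
    # returns the tensor itself when it is already contiguous, so no
    # data is copied.
    if len(tensors) == 1:
        return tensors[0].contiguous().view(-1)
    # General case: torch.cat materializes one new flat buffer.
    return torch.cat([t.contiguous().view(-1) for t in tensors])

t = torch.randn(3, 4)
flat = flatten_dense_tensors_sketch([t])
assert flat.data_ptr() == t.data_ptr()  # a view, not a copy
```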
Let me know how this looks to you. @apaszke
I notice there was no test added for this.
Adding one soon, @ezyang.
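A sketch of the kind of regression test that follow-up could add (the test name and structure are hypothetical; the core check mirrors the data_ptr comparison above):

```python
import unittest
import torch
from torch.cuda import comm

@unittest.skipIf(torch.cuda.device_count() < 2, "needs at least 2 GPUs")
class TestBroadcastNoCopy(unittest.TestCase):
    def test_broadcast_reuses_device0_tensor(self):
        src = torch.randn(10, device="cuda:0")
        results = comm.broadcast(src, devices=[0, 1])
        # The tensor already on devices[0] should be reused, not copied.
        self.assertEqual(results[0].data_ptr(), src.data_ptr())
        # Other devices still get real copies with the same values.
        self.assertTrue(torch.equal(results[1].cpu(), src.cpu()))

if __name__ == "__main__":
    unittest.main()
```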