[cuBLAS] update cuBLAS determinism docs, remove workspace requirement checks #161749
eqy wants to merge 4 commits into pytorch:main
Dr. CI: see artifacts and rendered test results at hud.pytorch.org/pr/161749
✅ No failures as of commit 896198b with merge base ac7b4e7.
docs/source/notes/randomness.rst (outdated)
  [ 0.0333, -1.1444]]], device='cuda:0')

- Furthermore, if you are using CUDA tensors, and your CUDA version is 10.2 or greater, you
+ Furthermore, if you are using CUDA tensors, and your CUDA version is between 10.2 and 11.0 you
no longer supported, just remove this sentence
|
Can this one be landed? |
|
Sure, let's see CI signal after removing the old deterministic alert test |
|
Now there's a dtype error in _scaled_mm that shouldn't be related? |
|
H100 |
|
ciflow/H100 is still run on trunk (see on HUD), if it doesn't report existing failures that's a problem (and looks like it doesn't). |
3f484ec to 896198b
|
@pytorchmergebot merge |
Merge started: your change will be merged once all checks pass (ETA 0-4 hours). |
… checks (pytorch#161749)

Since CUDA 11.x (need to update the docs for this, current PR is saying 12.2 which is incorrect) we've been allocating cuBLAS workspaces explicitly per handle/stream combination pytorch#85447

According to the cuBLAS documentation, this appears to be sufficient for determinism without any explicit workspace requirements to e.g., `:4096:8` or `:16:8` as was previously expressed in PyTorch docs https://docs.nvidia.com/cuda/cublas/#results-reproducibility

Planning to add an explicit determinism test as well...

Pull Request resolved: pytorch#161749
Approved by: https://github.com/ngimel
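
For readers unfamiliar with the `:4096:8` and `:16:8` values: as far as I understand the cuBLAS documentation, `CUBLAS_WORKSPACE_CONFIG` is a sequence of `:SIZE:COUNT` pairs, with SIZE given in KiB. A minimal sketch of what those two settings would request under that assumption (the helper name `workspace_bytes` is made up for illustration; it is not a PyTorch or cuBLAS API):

```python
def workspace_bytes(config: str) -> int:
    """Total bytes described by a CUBLAS_WORKSPACE_CONFIG-style string.

    Assumption: the string is one or more ':SIZE:COUNT' pairs, SIZE in KiB.
    """
    fields = config.split(":")
    # A well-formed string starts with ':', so the first field is empty and
    # the remaining fields come in (SIZE, COUNT) pairs.
    if fields[0] != "" or len(fields) < 3 or len(fields) % 2 == 0:
        raise ValueError(f"expected ':SIZE:COUNT' pairs, got {config!r}")
    numbers = fields[1:]
    sizes_kib = numbers[0::2]
    counts = numbers[1::2]
    return sum(int(size) * 1024 * int(count) for size, count in zip(sizes_kib, counts))


print(workspace_bytes(":4096:8"))  # 8 buffers of 4096 KiB each
print(workspace_bytes(":16:8"))    # 8 buffers of 16 KiB each
```

Under that reading, `:4096:8` asks for 32 MiB in total and `:16:8` for 128 KiB.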
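
One possible shape for a determinism check, as a rough sketch only and not the test planned for this PR: run the same workload several times and compare the raw bytes. The names `is_bitwise_reproducible` and `cuda_matmul_bytes` are hypothetical, and the matmul workload assumes PyTorch, NumPy, and a CUDA device are available.

```python
from typing import Callable


def is_bitwise_reproducible(run: Callable[[], bytes], repeats: int = 3) -> bool:
    """Return True if `run` yields byte-identical output on every call."""
    reference = run()
    return all(run() == reference for _ in range(repeats - 1))


def cuda_matmul_bytes() -> bytes:
    """Hypothetical workload: a fixed-seed fp16 matmul on a CUDA device.

    torch is imported lazily so the helper above stays dependency-free.
    """
    import torch

    torch.manual_seed(0)  # same inputs on every call
    a = torch.randn(512, 512, device="cuda", dtype=torch.float16)
    b = torch.randn(512, 512, device="cuda", dtype=torch.float16)
    return (a @ b).cpu().numpy().tobytes()
```

On a GPU machine this could be exercised as `is_bitwise_reproducible(cuda_matmul_bytes)`, for example after calling `torch.use_deterministic_algorithms(True)` and without setting `CUBLAS_WORKSPACE_CONFIG`.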
cc @ptrblck @msaroufim @jerryzh168 @csarofeen @xwang233 @mruberry @kurtamohler