[CI] Update NVIDIA driver to 580.82.07
#163111
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/163111
Note: Links to docs will display an error until the docs builds have been completed. ⏳ No Failures, 102 Pending as of commit 5df64e3 with merge base cfc539f. This comment was automatically generated by Dr. CI and updates every 15 minutes.
- name: Get the workflow type for the current user
  id: set-condition
  run: |
    curr_branch="${{ inputs.curr_branch }}"
I guess this is just a temp change to bypass the recent issue with no-runner-experiment?
huydhn left a comment:
Stamped to unblock; the PR needs to be cleaned up before landing.
Update the NVIDIA driver to make CI machines capable of running CUDA-13 tests. Unfortunately, this upgrade regresses Numba integration, so live-patch it with NVIDIA/numba-cuda@6e08c9d. This fix was suggested in #162878 (comment).
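For context, below is a minimal sketch of what such a live-patch step could look like, assuming the fix is fetched from the NVIDIA/numba-cuda commit as a patch file and applied to the installed package. This is not the actual PyTorch CI script; the paths and the strip level are assumptions.

```bash
#!/usr/bin/env bash
# Illustrative sketch only -- not the actual PyTorch CI patching script.
# Assumes the upstream fix (NVIDIA/numba-cuda@6e08c9d) is fetched as a commit
# patch from GitHub and applied to the installed numba_cuda package in place.
set -euo pipefail

# Locate the installed package so the patch can be applied to its files.
NUMBA_CUDA_DIR=$(python -c "import numba_cuda, os; print(os.path.dirname(numba_cuda.__file__))")

# Fetch the upstream commit as a patch file.
curl -fsSL https://github.com/NVIDIA/numba-cuda/commit/6e08c9d.patch \
  -o /tmp/numba-cuda-fix.patch

# Apply it in place; the -p strip level depends on the path prefixes inside
# the patch and on where the package is installed, so treat it as a placeholder.
patch -p2 -d "${NUMBA_CUDA_DIR}" < /tmp/numba-cuda-fix.patch
```

Because a patch like this is applied against specific file contents (and, in the worst case, specific line offsets), it is sensitive to the installed numba/numba-cuda version, which is what the pinning question later in this conversation is about.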
@pytorchbot merge -f "Lint is green, signal has been green previously"

Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

@pytorchbot merge -f "Take two"

Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
I suppose we need to lock the numba version for a while for this patch to apply successfully? [A different numba version may have slight line-number changes in the driver.py file.]
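One way to do that, sketched below, is to pin the version in the CI install step. The version number shown is an arbitrary example for illustration, not a pin actually used in PyTorch CI.

```bash
# Illustrative only: pin numba to a single known-good version so the
# line-based patch sketched earlier keeps applying cleanly. The exact
# version here is an arbitrary example chosen for this sketch.
pip install "numba==0.61.2"
```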
To make CI machines capable of running CUDA-13 tests. Unfortunately, this upgrade regresses Numba integration, so live-patch it with NVIDIA/numba-cuda@6e08c9d. This fix was suggested in pytorch#162878 (comment). Pull Request resolved: pytorch#163111. Approved by: https://github.com/huydhn

This reverts commit 16475a8. Reverted pytorch#163111 on behalf of https://github.com/malfet due to: it started to fail now, but worked just fine in PR CI ([comment](pytorch#163111 (comment)))

To make CI machines capable of running CUDA-13 tests. Unfortunately, this upgrade regresses Numba integration, so live-patch it with NVIDIA/numba-cuda@6e08c9d. This fix was suggested in pytorch#162878 (comment). Pull Request resolved: pytorch#163111. Approved by: https://github.com/huydhn
@pytorchbot cherry-pick --onto release/2.9 --fixes "Critical CI fix" -c critical
Cherry picking #163111: the cherry-pick PR is at #163522 and it is linked with issue "Critical CI fix". The following tracker issues are updated. Details for Dev Infra team: raised by workflow job.
[CI] Update NVIDIA driver to `580.82.07` (#163111). To make CI machines capable of running CUDA-13 tests. Unfortunately, this upgrade regresses Numba integration, so live-patch it with NVIDIA/numba-cuda@6e08c9d. This fix was suggested in #162878 (comment). Pull Request resolved: #163111. Approved by: https://github.com/huydhn (cherry picked from commit 8dbac62). Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>
The live patch for numba.cuda introduced in #163111 causes issues in ROCm CI jobs, which do not use CUDA. This change restricts the patching logic to only run when $BUILD_ENVIRONMENT contains 'cuda', thus ameliorating test failures in ROCm environments. Pull Request resolved: #164607. Approved by: https://github.com/jeffdaily. Co-authored-by: Jeff Daily <jeff.daily@amd.com>
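A minimal sketch of the guard described in #164607, assuming the patching step is wrapped in a shell check on $BUILD_ENVIRONMENT; the function name below is a placeholder, not the PR's actual code.

```bash
# Only apply the numba-cuda live patch on CUDA builds; ROCm jobs (and any
# other non-CUDA environment) skip it entirely.
if [[ "${BUILD_ENVIRONMENT:-}" == *cuda* ]]; then
  # apply_numba_cuda_patch is a hypothetical wrapper around the patching
  # step sketched earlier in this conversation.
  apply_numba_cuda_patch
else
  echo "Skipping numba-cuda live patch: not a CUDA build environment"
fi
```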