
[CI] Add CUDA 13.0 inductor CI benchmarks #165029

Closed
tinglvv wants to merge 31 commits into pytorch:main from tinglvv:cu13-ci-test


@tinglvv
Collaborator

@tinglvv tinglvv commented Oct 9, 2025

Adding CUDA 13.0 to the inductor benchmarks, as it is the latest supported CUDA version.

@tinglvv tinglvv requested review from a team and jeffdaily as code owners October 9, 2025 03:58
@pytorch-bot

pytorch-bot bot commented Oct 9, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/165029

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 11 Unrelated Failures

As of commit 9537f04 with merge base 44ac693 (image):

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

UNSTABLE - The following jobs are marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Oct 9, 2025
@bdhirsh bdhirsh requested a review from eellison October 10, 2025 13:47
@bdhirsh bdhirsh added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Oct 10, 2025
@lakshayg
Collaborator

lakshayg commented Oct 13, 2025

@tinglvv You can try this for the cuSPARSE deprecation warnings. I have not tested it, so it might not work, but something like this should resolve the issue.

diff --git a/cmake/Modules/FindCUDAToolkit.cmake b/cmake/Modules/FindCUDAToolkit.cmake
index ec9ae530aa..0e0460451a 100644
--- a/cmake/Modules/FindCUDAToolkit.cmake
+++ b/cmake/Modules/FindCUDAToolkit.cmake
@@ -905,6 +905,7 @@ if(CUDAToolkit_FOUND)
         endif()
       endif()
       set_property(TARGET CUDA::${lib_name} PROPERTY IMPORTED_LOCATION "${CUDA_${lib_name}_LIBRARY}")
+      set_property(TARGET CUDA::${lib_name} PROPERTY SYSTEM TRUE)
       foreach(dep ${arg_DEPS})
         if(TARGET CUDA::${dep})
           set_property(TARGET CUDA::${lib_name} APPEND PROPERTY

@eellison eellison requested a review from seemethere October 13, 2025 22:30
@tinglvv
Collaborator Author

tinglvv commented Oct 17, 2025

The cuSPARSE documentation (https://docs.nvidia.com/cuda/pdf/CUSPARSE_Library.pdf) suggests passing
-DDISABLE_CUSPARSE_DEPRECATED to the compiler to suppress the deprecation warnings. Trying that.
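
As a rough illustration of "passing the define to the compiler", here is a minimal Python sketch that appends the flag to the usual compiler-flag environment variables before a build is launched. How this PR actually wired the flag in is not shown in this thread; the helper names `with_define` and `patched_env` are hypothetical, and using `CFLAGS`, `CXXFLAGS`, and `CUDAFLAGS` is an assumption about the build setup.

```python
import os

# The define named in the comment above.
CUSPARSE_DEFINE = "-DDISABLE_CUSPARSE_DEPRECATED"


def with_define(flags: str, define: str = CUSPARSE_DEFINE) -> str:
    """Return `flags` with `define` appended exactly once (idempotent)."""
    parts = flags.split()
    if define not in parts:
        parts.append(define)
    return " ".join(parts)


def patched_env(env=None, variables=("CFLAGS", "CXXFLAGS", "CUDAFLAGS")):
    """Return a copy of the environment with the define added to each variable.

    Hypothetical helper: which variables the real build reads is an assumption.
    """
    env = dict(os.environ if env is None else env)
    for name in variables:
        env[name] = with_define(env.get(name, ""))
    return env
```

The returned mapping could then be handed to something like `subprocess.run(..., env=patched_env())` so that the current process environment is left untouched.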

@tinglvv
Collaborator Author

tinglvv commented Oct 20, 2025

The previous flag resolved the deprecation warning. There is now a new build error:

2025-10-17T22:16:31.7451807Z [71/411] Building CUDA object CMakeFiles/fbgemm_gpu_py.dir/src/memory_utils/memory_utils.cu.o
2025-10-17T22:16:31.7452879Z FAILED: CMakeFiles/fbgemm_gpu_py.dir/src/memory_utils/memory_utils.cu.o
2025-10-17T22:16:31.7461457Z /usr/local/cuda-13.0/bin/nvcc -forward-unknown-to-host-compiler -DUSE_C10D_GLOO -DUSE_C10D_MPI -DUSE_C10D_NCCL -DUSE_DISTRIBUTED -DUSE_NVSHMEM -DUSE_RPC -DUSE_TENSORPIPE -Dfbgemm_gpu_py_EXPORTS -I/tmp/pip-req-build-4xl0yqmj/fbgemm_gpu -I/tmp/pip-req-build-4xl0yqmj/fbgemm_gpu/include -I/tmp/pip-req-build-4xl0yqmj/fbgemm_gpu/../include -I/tmp/pip-req-build-4xl0yqmj/fbgemm_gpu/../third_party/asmjit/src -I/tmp/pip-req-build-4xl0yqmj/fbgemm_gpu/../third_party/cpuinfo/include -isystem /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -isystem /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /usr/local/cuda-13.0/include -DONNX_NAMESPACE=onnx_c2 -gencode arch=compute_86,code=sm_86 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=bad_friend_decl --expt-relaxed-constexpr --expt-extended-lambda -O3 -DNDEBUG -std=c++17 -Xcompiler=-fPIC --expt-relaxed-constexpr -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -MD -MT CMakeFiles/fbgemm_gpu_py.dir/src/memory_utils/memory_utils.cu.o -MF CMakeFiles/fbgemm_gpu_py.dir/src/memory_utils/memory_utils.cu.o.d -x cu -c /tmp/pip-req-build-4xl0yqmj/fbgemm_gpu/src/memory_utils/memory_utils.cu -o CMakeFiles/fbgemm_gpu_py.dir/src/memory_utils/memory_utils.cu.o
2025-10-17T22:16:31.7469595Z /tmp/pip-req-build-4xl0yqmj/fbgemm_gpu/src/memory_utils/memory_utils.cu(165): error: no suitable constructor exists to convert from "int" to "cudaMemLocation"
2025-10-17T22:16:31.7470516Z    ((int)-1)

@tinglvv
Collaborator Author

tinglvv commented Oct 20, 2025

Disabled fbgemm for the CUDA 13 inductor build due to #165029 (comment). CUDA 13 support for FBGEMM is a work in progress: pytorch/FBGEMM#4783
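
The kind of version gate described here can be sketched as follows. This is only an illustration in Python: the CI scripts themselves are not shown in this thread, the function names `parse_cuda_version` and `should_build_fbgemm` are hypothetical, and the 13.0 threshold is taken from the comment above rather than from any FBGEMM release note.

```python
from typing import Tuple


def parse_cuda_version(version: str) -> Tuple[int, int]:
    """Parse a string such as "13.0", "12.8.1", or "13" into (major, minor)."""
    # Pad with "0" so a bare major version like "13" still yields a minor part.
    major, minor, *_ = version.strip().split(".") + ["0"]
    return int(major), int(minor)


def should_build_fbgemm(
    cuda_version: str, first_unsupported: Tuple[int, int] = (13, 0)
) -> bool:
    """Return True when fbgemm should be built for this CUDA version.

    Assumption: every version at or above `first_unsupported` is skipped
    until upstream support lands.
    """
    return parse_cuda_version(cuda_version) < first_unsupported
```

Comparing (major, minor) tuples rather than raw strings avoids lexicographic mistakes such as "9.2" sorting after "13.0".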

@tinglvv tinglvv changed the title Add 13.0 inductor benchmarks Add CUDA 13.0 inductor benchmarks Oct 20, 2025
@eellison eellison requested a review from desertfire October 27, 2025 15:45
@tinglvv
Collaborator Author

tinglvv commented Oct 27, 2025

Further debugging shows that the job calls into the install_torchrec_and_fbgemm function, based on the log line "Building wheels for collected packages: torchrec". It therefore needs to be disabled in the test file as well, not just in build.sh.

@tinglvv
Collaborator Author

tinglvv commented Oct 28, 2025

We should not disable torchrec, since doing so leads to a failure in:
inductor / inductor-test-cuda13 / test (inductor_torchbench, 1, 2, linux.g5.4xlarge.nvidia.gpu) (gh)
torchrec_dlrm

The previous fbgemm failure is gone now.

@tinglvv
Collaborator Author

tinglvv commented Oct 28, 2025

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased cu13-ci-test onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout cu13-ci-test && git pull --rebase)

@tinglvv
Collaborator Author

tinglvv commented Oct 28, 2025

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Tried to rebase and push PR #165029, but it was already up to date. Try rebasing against main by issuing:
@pytorchbot rebase -b main

@tinglvv
Collaborator Author

tinglvv commented Oct 28, 2025

@pytorchbot rebase -b main

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased cu13-ci-test onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout cu13-ci-test && git pull --rebase)

@tinglvv tinglvv mentioned this pull request Dec 3, 2025
2 tasks
@tinglvv
Collaborator Author

tinglvv commented Dec 4, 2025

@pytorchbot merge -i "failures are not related"

@pytorch-bot

pytorch-bot bot commented Dec 4, 2025

❌ 🤖 pytorchbot command failed:

@pytorchbot: error: unrecognized arguments: failures are not related

usage: @pytorchbot [-h] {merge,revert,rebase,label,drci,cherry-pick} ...

Try @pytorchbot --help for more info.
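
The error text above has the shape that Python's argparse produces when a boolean flag is followed by an extra positional value. The sketch below reproduces that behavior with a stand-in parser; it is not the bot's real implementation. The subcommand and the `-i` flag are modeled on the command in this thread, the long option name `--ignore-current` is an assumption, and `build_parser` and `split_known` are hypothetical names.

```python
import argparse
import shlex


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-in parser with a `merge` subcommand and a boolean -i flag."""
    parser = argparse.ArgumentParser(prog="@pytorchbot")
    sub = parser.add_subparsers(dest="command", required=True)
    merge = sub.add_parser("merge")
    # store_true means -i consumes no value, so a quoted message after it
    # is left over as an unrecognized argument.
    merge.add_argument("-i", "--ignore-current", action="store_true")
    return parser


def split_known(command_line: str):
    """Return (parsed namespace, leftover arguments) without exiting on extras."""
    return build_parser().parse_known_args(shlex.split(command_line))


if __name__ == "__main__":
    args, extras = split_known('merge -i "failures are not related"')
    print(args.ignore_current, extras)
```

With `parse_args` instead of `parse_known_args`, the same input makes argparse exit with an "unrecognized arguments" error, which is consistent with the bare `@pytorchbot merge -i` succeeding afterwards.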

@tinglvv
Collaborator Author

tinglvv commented Dec 4, 2025

@pytorchbot merge -i

@pytorchmergebot
Collaborator

Merge started

Your change will be merged while ignoring the following 14 checks: Build manywheel docker images / manylinuxaarch64-builder:cuda13.0, pull / linux-jammy-py3.14-clang12 / test (default, 5, 5, linux.4xlarge), pull / linux-jammy-py3.14-clang12 / test (dynamo_wrapped, 3, 3, linux.2xlarge), trunk / linux-jammy-rocm-py3.10 / test (default, 2, 6, linux.rocm.gpu.gfx942.1, unstable), inductor-periodic / rocm-periodic-dynamo-benchmarks-test / test (dynamic_inductor_timm, 1, 2, linux.rocm.gpu.gfx942.1), inductor-periodic / periodic-dynamo-benchmarks-test-cuda13 / test (dynamo_eager_huggingface, 1, 1, linux.g5.4xlarge.nvidia.gpu), inductor-periodic / periodic-dynamo-benchmarks-test-cuda13 / test (aot_eager_huggingface, 1, 1, linux.g5.4xlarge.nvidia.gpu), inductor-periodic / periodic-dynamo-benchmarks-test-cuda13 / test (dynamic_inductor_huggingface, 1, 1, linux.g5.4xlarge.nvidia.gpu), inductor-periodic / periodic-dynamo-benchmarks-test-cuda13 / test (dynamic_aot_eager_huggingface, 1, 1, linux.g5.4xlarge.nvidia.gpu), inductor-periodic / inductor-smoke-test / test (inductor_torchbench_smoketest_perf, 1, 1, linux.aws.a100, unstable), inductor-periodic / periodic-dynamo-benchmarks-test / test (dynamo_eager_huggingface, 1, 1, linux.g5.4xlarge.nvidia.gpu), inductor-periodic / periodic-dynamo-benchmarks-test / test (aot_eager_huggingface, 1, 1, linux.g5.4xlarge.nvidia.gpu), inductor-periodic / periodic-dynamo-benchmarks-test / test (dynamic_aot_eager_huggingface, 1, 1, linux.g5.4xlarge.nvidia.gpu), inductor-periodic / periodic-dynamo-benchmarks-test / test (dynamic_inductor_huggingface, 1, 1, linux.g5.4xlarge.nvidia.gpu)

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

umechand-amd pushed a commit to ROCm/pytorch that referenced this pull request Dec 8, 2025
Adding CUDA 13.0 to the inductor benchmarks as it is the latest supported CUDA version
Pull Request resolved: pytorch#165029
Approved by: https://github.com/atalman
JacobSzwejbka pushed a commit that referenced this pull request Dec 8, 2025
Adding CUDA 13.0 to the inductor benchmarks as it is the latest supported CUDA version
Pull Request resolved: #165029
Approved by: https://github.com/atalman

Labels

ci-no-td (Do not run TD on this PR)
ciflow/inductor
ciflow/inductor-perf-compare
ciflow/inductor-perf-test-nightly (Trigger nightly inductor perf tests)
ciflow/inductor-periodic
ciflow/trunk (Trigger trunk jobs on your pull request)
keep-going (Don't stop on first failure, keep running tests until the end)
Merged
open source
topic: not user facing (topic category)
triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)

7 participants