feat(benchmarks): Add torch profiler support via ENABLE_PROFILE environment variable #611

haofrank wants to merge 1 commit into InferenceMAX:main

Conversation
@haofrank thanks! any chance you or your AI assistant can add the ability to profile only a couple of fwd passes? profiling the whole run would create a very large trace
@Oseltamivir was working on some of this too, with output to GitHub Actions artifacts or a side repo, and even a perfetto relay for InferenceMAX (though that's on hold while he works on evals)
Thanks @haofrank, we already have a profiling branch. However, it is currently stale, as only Hopper had usable information (kernel input dims, shapes, etc.). Nevertheless, having the ability to run the profiler may be useful. We will consider merging the profiler choice with the perfetto relay.
@haofrank If you would like to contribute to the repo, we would appreciate it if you could help update that branch.
Hi @Oseltamivir, thanks for the note! I'd be happy to help with this. From a quick look, it seems the profiling branch already supports enabling profiling via an environment variable. I'll need a bit of time to go through and understand the intent. If I'm not able to make meaningful progress in time, it probably makes sense for you to proceed with it.
Hi @haofrank, yep, that branch is very different in its intentions. If you could help:
We value open source contributions and can definitely merge it in. The profiling branch works, so you can assume env vars/relays are working. However, I agree it might be difficult as there are quite a few parts involved. Let me know what you think; I'll work on this next Wednesday if you're unable to make progress.
Summary
Add torch profiler support to benchmark scripts, enabling collection of detailed GPU performance data during benchmark runs.
Resolves #610
Changes

`benchmarks/benchmark_lib.sh`
- Check for `ENABLE_PROFILE=true` when the library is sourced
- Default `VLLM_TORCH_PROFILER_DIR` to `/workspace/profiling` if not explicitly set
- Add the `--profile` flag to the `benchmark_serving.py` invocation in `run_benchmark_serving()`

Runner Scripts
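The `benchmark_lib.sh` changes can be sketched roughly as follows. This is an illustrative sketch, not the PR's literal diff; the function name `setup_profiling` and the `PROFILE_FLAG` variable are placeholders for wherever the script wires in the `--profile` option:

```shell
#!/usr/bin/env bash
# Illustrative sketch of the ENABLE_PROFILE handling; names are placeholders.
setup_profiling() {
  if [ "${ENABLE_PROFILE:-false}" = "true" ]; then
    # Default the trace output directory unless the caller set one explicitly
    export VLLM_TORCH_PROFILER_DIR="${VLLM_TORCH_PROFILER_DIR:-/workspace/profiling}"
    # Later appended to the benchmark_serving.py invocation
    PROFILE_FLAG="--profile"
  else
    PROFILE_FLAG=""
  fi
}

# Demo: enable profiling with no directory preset
unset VLLM_TORCH_PROFILER_DIR
ENABLE_PROFILE=true
setup_profiling
echo "dir=$VLLM_TORCH_PROFILER_DIR flag=$PROFILE_FLAG"
# prints: dir=/workspace/profiling flag=--profile
```

Using `${VAR:-default}` keeps an explicitly set `VLLM_TORCH_PROFILER_DIR` intact while still providing a sane default.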
Added `ENABLE_PROFILE` and `VLLM_TORCH_PROFILER_DIR` environment variable passthrough to the Docker containers:
- `runners/launch_b200-dgxc.sh`
- `runners/launch_h100-cr.sh`
- `runners/launch_mi300x-amd.sh`
- `runners/launch_mi300x-cr.sh`

Usage
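A minimal sketch of what the passthrough in those launch scripts looks like. The image name `inferencemax-bench` and the surrounding flags are placeholders, and the command is only echoed here rather than executed:

```shell
#!/usr/bin/env bash
# Illustrative: bare `-e VAR` forwards the variable's current value from the
# host into the container. Image name "inferencemax-bench" is a placeholder.
launch() {
  echo docker run --rm --gpus all \
    -e ENABLE_PROFILE \
    -e VLLM_TORCH_PROFILER_DIR \
    inferencemax-bench "$@"
}

launch ./benchmarks/dsr1_fp8_b200.sh
```

Passing `-e VAR` without a value means the container sees exactly what the user exported, so `ENABLE_PROFILE=true ./benchmarks/...` works unchanged inside Docker.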
Enable profiling with the default output directory (`/workspace/profiling`):

```shell
ENABLE_PROFILE=true ./benchmarks/dsr1_fp8_b200.sh
```

Enable profiling with a custom output directory:

```shell
ENABLE_PROFILE=true VLLM_TORCH_PROFILER_DIR=/custom/path ./benchmarks/dsr1_fp8_b200.sh
```

How It Works
1. When `benchmark_lib.sh` is sourced, it checks `ENABLE_PROFILE`
2. If enabled, it sets `VLLM_TORCH_PROFILER_DIR` (the server uses this to enable the `/start_profile` and `/stop_profile` endpoints)
3. `benchmark_serving.py` calls these endpoints to collect torch profiling data
4. Traces are written to `VLLM_TORCH_PROFILER_DIR`

Testing
- Verified with `ENABLE_PROFILE=true` on an AMD GPU node
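The endpoint flow from How It Works can also be exercised by hand. This sketch only constructs the endpoint URLs; the server address is an assumption, and the `curl` calls are left commented out since they need a live vLLM server started with `VLLM_TORCH_PROFILER_DIR` set:

```shell
#!/usr/bin/env bash
# Manual equivalent of what benchmark_serving.py's --profile flag does.
# The address below is an assumption; point it at the server under test.
SERVER="http://localhost:8000"

profile_url() {
  # vLLM exposes /start_profile and /stop_profile only when
  # VLLM_TORCH_PROFILER_DIR is set in the server's environment.
  echo "$SERVER/$1"
}

# Against a live server:
#   curl -X POST "$(profile_url start_profile)"
#   ...drive benchmark traffic...
#   curl -X POST "$(profile_url stop_profile)"
echo "$(profile_url start_profile)"
# prints: http://localhost:8000/start_profile
```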