[Perf] Tune configs for triton block fp8 gemm H100/H200 #23748

mgoin · 2025-08-27T13:42:27Z

Purpose

Retune the Triton FP8 block-wise dense GEMM configs for recent Triton versions. Also adds a simple benchmark script that benchmarks the kernel without tuning it.

The new configs mostly improve performance for smaller M, and crucially improve the N=576, K=7168 shape.
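For readers unfamiliar with tuned config files, the following is a minimal sketch of how such a file might be consumed. It assumes a JSON object keyed by batch size M, where each entry holds Triton launch parameters, and a nearest-M selection rule. The key names, the helper names `load_configs` and `select_config`, and every numeric value are illustrative assumptions, not taken from this PR's files.

```python
import json

# Hypothetical tuned config for a single (N, K) shape.
# All values are illustrative; they are NOT the values tuned in this PR.
EXAMPLE_CONFIG_JSON = """
{
  "1":    {"BLOCK_SIZE_M": 16,  "BLOCK_SIZE_N": 64,  "BLOCK_SIZE_K": 128,
           "GROUP_SIZE_M": 1,  "num_warps": 4, "num_stages": 3},
  "64":   {"BLOCK_SIZE_M": 64,  "BLOCK_SIZE_N": 64,  "BLOCK_SIZE_K": 128,
           "GROUP_SIZE_M": 8,  "num_warps": 4, "num_stages": 3},
  "1024": {"BLOCK_SIZE_M": 128, "BLOCK_SIZE_N": 128, "BLOCK_SIZE_K": 128,
           "GROUP_SIZE_M": 16, "num_warps": 8, "num_stages": 4}
}
"""


def load_configs(text):
    """Parse the JSON text and convert its string keys to integer M values."""
    return {int(m): cfg for m, cfg in json.loads(text).items()}


def select_config(configs, m):
    """Return the entry whose tuned M is closest to the runtime M."""
    nearest = min(configs, key=lambda tuned_m: abs(tuned_m - m))
    return configs[nearest]


configs = load_configs(EXAMPLE_CONFIG_JSON)
small_m_cfg = select_config(configs, 8)     # closest tuned M is 1
large_m_cfg = select_config(configs, 4096)  # closest tuned M is 1024
```

Under this assumed scheme, a runtime M that was never tuned falls back to its nearest tuned neighbour, which is why retuning the small-M entries can matter for decode-sized batches.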

Test Plan

Test Result

H100

H200

Signed-off-by: mgoin <mgoin64@gmail.com>

gemini-code-assist

Code Review

This pull request introduces performance tuning configurations for the FP8 block-wise GEMM kernel on H100/H200 GPUs, specifically for matrix shapes found in the DeepSeek-V3 model. It includes a new benchmark script to compare the performance of the w8a8-block-fp8 kernel against the standard bfloat16 GEMM, along with numerous new and updated JSON files containing the tuned kernel parameters. The changes appear to be correct and well-aligned with the goal of improving performance. The new benchmark script is well-implemented, and the configuration files are consistent with the output of a tuning process. I have not identified any issues of high or critical severity.
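As background on what "block-wise" means here, the sketch below is a pure-Python reference for a block-scaled GEMM. It assumes one scale per row and K-block for the activations and one scale per (N-block, K-block) tile for the weights, with the weight stored as N x K. That layout, the function name `block_scaled_matmul`, and the tiny block size are assumptions for illustration; plain floats stand in for FP8 values, and this is not the kernel touched by the PR.

```python
def block_scaled_matmul(a_q, a_scale, b_q, b_scale, block_n, block_k):
    """Reference block-scaled GEMM (assumed layout, for illustration only).

    a_q:     M x K quantized activations (floats stand in for FP8 values)
    a_scale: M x (K // block_k) per-row, per-K-block activation scales
    b_q:     N x K quantized weights
    b_scale: (N // block_n) x (K // block_k) per-tile weight scales
    Assumes K is divisible by block_k and N by block_n.
    """
    m, k = len(a_q), len(a_q[0])
    n = len(b_q)
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for kb in range(k // block_k):
                lo, hi = kb * block_k, (kb + 1) * block_k
                # Dot product over one K-block of the quantized values.
                partial = sum(a_q[i][t] * b_q[j][t] for t in range(lo, hi))
                # Apply both scales once per K-block, then accumulate.
                acc += partial * a_scale[i][kb] * b_scale[j // block_n][kb]
            out[i][j] = acc
    return out


# Tiny example: M=1, K=4, N=2 with 2x2 weight blocks.
a_q = [[1.0, 2.0, 3.0, 4.0]]
a_scale = [[0.5, 2.0]]
b_q = [[1.0, 0.0, 1.0, 1.0],
       [0.0, 1.0, 2.0, 0.0]]
b_scale = [[2.0, 0.25]]
result = block_scaled_matmul(a_q, a_scale, b_q, b_scale, block_n=2, block_k=2)
```

Because the scales are applied per K-block rather than once at the end, the tile sizes chosen for a kernel of this kind interact with the quantization block shape, which is one reason per-shape tuning can pay off.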

yewentao256

LGTM, thanks for the work!

Tune configs for triton block fp8 gemm H100/H200

aa1551f

Signed-off-by: mgoin <mgoin64@gmail.com>

mgoin requested review from robertgshaw2-redhat, tlrmchlsmth and yewentao256 as code owners August 27, 2025 13:42

mergify bot added the performance label (Performance-related issues) Aug 27, 2025

gemini-code-assist bot reviewed Aug 27, 2025

Add H200

ac2d5f9

Signed-off-by: mgoin <mgoin64@gmail.com>

yewentao256 approved these changes Aug 27, 2025

mgoin added the ready label (ONLY add when PR is ready to merge/full CI is needed) Aug 27, 2025

mgoin changed the title ~~Tune configs for triton block fp8 gemm H100/H200~~ [Perf] Tune configs for triton block fp8 gemm H100/H200 Aug 27, 2025

DarkLight1337 merged commit a781e84 into vllm-project:main Aug 28, 2025
56 checks passed

zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025

[Perf] Tune configs for triton block fp8 gemm H100/H200 (vllm-project…

df7d16e

…#23748) Signed-off-by: mgoin <mgoin64@gmail.com>

zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Sep 3, 2025

[Perf] Tune configs for triton block fp8 gemm H100/H200 (vllm-project…

163167d

…#23748) Signed-off-by: mgoin <mgoin64@gmail.com>

eicherseiji pushed a commit to eicherseiji/vllm that referenced this pull request Sep 9, 2025

[Perf] Tune configs for triton block fp8 gemm H100/H200 (vllm-project…

1817a73

…#23748) Signed-off-by: mgoin <mgoin64@gmail.com>

FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025

[Perf] Tune configs for triton block fp8 gemm H100/H200 (vllm-project…

b748e9b

…#23748) Signed-off-by: mgoin <mgoin64@gmail.com>
