[Misc] Removed force_fp8_e4m3fnuz from FP8LinearOp #23725

nvjullin · 2025-08-27T09:06:41Z

Purpose

Follow up on #22895.
Removed force_fp8_e4m3fnuz; the tests now use a monkey-patch to exercise the torch code path on cutlass_fp8_supported platforms.

Test Plan

Test Result

Signed-off-by: Julien Lin <jullin@nvidia.com>

gemini-code-assist

Code Review

This pull request refactors the Fp8LinearOp to remove the force_fp8_e4m3fnuz parameter, simplifying its interface. The logic to force a specific backend for testing is now handled in the tests themselves using a new override_cutlass_fp8_supported context manager, which is a good improvement. However, I've found a potential issue in the test parametrization for cases where cutlass is not supported, which could leave a code path untested. My review includes suggestions to fix this.
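As a rough illustration of the approach described in this review, a test-only override could look like the sketch below. This is not the code from the PR: `w8a8_utils` here is a hypothetical stand-in namespace, `pick_backend` is an invented toy dispatcher, and the body of `override_cutlass_fp8_supported` is an assumption built on the standard-library `unittest.mock.patch.object`.

```python
from contextlib import contextmanager
from types import SimpleNamespace
from unittest.mock import patch

# Hypothetical stand-in for the module that owns the capability check.
# In this sketch the "platform" reports that CUTLASS FP8 is supported.
w8a8_utils = SimpleNamespace(cutlass_fp8_supported=lambda: True)


def pick_backend() -> str:
    """Toy dispatcher: looks the check up at call time so a patch is visible."""
    return "cutlass" if w8a8_utils.cutlass_fp8_supported() else "torch"


@contextmanager
def override_cutlass_fp8_supported(value: bool):
    """Temporarily make the capability check return `value` (assumed design)."""
    with patch.object(w8a8_utils, "cutlass_fp8_supported", return_value=value):
        yield
```

Inside `with override_cutlass_fp8_supported(False):` the toy dispatcher takes the torch branch, and the original check is restored when the block exits. The patch only takes effect because the dispatcher resolves the check at call time; code that caches the result at import or construction time would need the override applied before that point.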

tests/compile/test_fusion.py

tests/compile/test_silu_mul_quant_fusion.py

ProExpertProg

LGTM, thanks for this follow up. Could you just add quick comments in the tests that we do this in order to test fusion for the non-cutlass path on cutlass platform?
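One way to picture the parametrization and the comment requested here is the sketch below. The names `force_torch_params` and `select_path` are hypothetical and are not taken from the PR; the sketch assumes the test flag means "force the torch path" and that a platform without CUTLASS FP8 support only has the torch path to test.

```python
def force_torch_params(cutlass_supported: bool) -> list[bool]:
    """Values a test could be parametrized over (assumed scheme).

    On a CUTLASS platform both values are needed: False covers the native
    CUTLASS path, True forces the torch path so that fusion is also tested
    for the non-CUTLASS path on that platform. Elsewhere only the torch
    path exists, so a single value is enough.
    """
    return [True, False] if cutlass_supported else [True]


def select_path(cutlass_supported: bool, force_torch: bool) -> str:
    """Toy model of which code path a test run would exercise."""
    return "torch" if force_torch or not cutlass_supported else "cutlass"
```

With this scheme every reachable path is covered on both kinds of platform, which is the property the review's parametrization concern is about.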

mergify · 2025-08-30T04:47:05Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @nvjullin.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

tests/compile/test_fusion.py

tests/compile/test_silu_mul_quant_fusion.py

ProExpertProg · 2025-09-02T18:46:43Z

@nvjullin it looks like the test failure is related, trying a fix

Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>

Signed-off-by: Julien Lin <jullin@nvidia.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>

removed force_fp8_e4m3fnuz

f190ee1

Signed-off-by: Julien Lin <jullin@nvidia.com>

nvjullin requested review from mgoin, robertgshaw2-redhat, tlrmchlsmth and yewentao256 as code owners August 27, 2025 09:06

gemini-code-assist bot reviewed Aug 27, 2025

tests/compile/test_fusion.py (outdated, resolved)

tests/compile/test_silu_mul_quant_fusion.py (outdated, resolved)

nvjullin mentioned this pull request Aug 27, 2025

[Kernel] Added flashinfer fp8 per-tensor gemms #22895

Merged

4 tasks

ProExpertProg approved these changes Aug 27, 2025

fixed test parameter and added comment

30c5069

Signed-off-by: Julien Lin <jullin@nvidia.com>

mergify bot added the needs-rebase label Aug 30, 2025

Merge branch 'main' into flashinfer-fp8-gemms

53c70ab

mergify bot removed the needs-rebase label Sep 2, 2025

ProExpertProg approved these changes Sep 2, 2025

ProExpertProg enabled auto-merge (squash) September 2, 2025 12:49

github-actions bot added the ready label Sep 2, 2025

ProExpertProg reviewed Sep 2, 2025

tests/compile/test_fusion.py (outdated, resolved)

ProExpertProg reviewed Sep 2, 2025

tests/compile/test_silu_mul_quant_fusion.py (outdated, resolved)

ProExpertProg and others added 4 commits September 2, 2025 14:46

Update tests/compile/test_silu_mul_quant_fusion.py

c735c6b

Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>

relative utils import

a6aaac4

Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>

Merge branch 'main' into flashinfer-fp8-gemms

f223100

fixed pre-commit

e6e4daf

Signed-off-by: Julien Lin <jullin@nvidia.com>

auto-merge was automatically disabled September 3, 2025 09:00
Head branch was pushed to by a user without write access

Merge branch 'main' into flashinfer-fp8-gemms

d4c6301

ProExpertProg merged commit 3724107 into vllm-project:main Sep 4, 2025
44 checks passed

elvischenv mentioned this pull request Sep 5, 2025

[Bugfix] Fix silu_mul+quant fusion test #24341

Merged

5 tasks
