Fix nvfp4 swizzling #23140

yiliu30 · 2025-08-19T02:00:09Z

@yewentao256 @mgoin could you please review this again? Thanks!

Signed-off-by: yiliu30 <yi4.liu@intel.com>

gemini-code-assist

Code Review

This pull request correctly fixes a critical bug in the nvfp4 swizzling logic. The issue was that after padding a tensor, the original dimensions were used for reshaping, which would cause a runtime error. The fix correctly uses the padded dimensions. The same fix is applied in two different files where the swizzling logic is duplicated. My review includes a comment pointing out this code duplication and suggesting a refactor to improve maintainability and prevent similar bugs in the future.
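To illustrate the shape arithmetic behind the bug described above, here is a minimal pure-Python sketch. It is not the vLLM implementation: the helper names (`round_up`, `padded_shape`, `can_reshape`) are hypothetical, and the tile sizes (128 rows, 4 columns) are assumptions made for this example. It only shows why a padded buffer cannot be reshaped back to the original dimensions, and why using the padded dimensions works.

```python
from typing import Tuple


def round_up(x: int, multiple: int) -> int:
    """Round x up to the nearest multiple of `multiple`."""
    return (x + multiple - 1) // multiple * multiple


def padded_shape(rows: int, cols: int,
                 row_tile: int = 128, col_tile: int = 4) -> Tuple[int, int]:
    """Shape of a scale tensor after zero-padding to whole tiles.

    The tile sizes are assumptions for this sketch, not values taken
    from the PR discussion.
    """
    return round_up(rows, row_tile), round_up(cols, col_tile)


def can_reshape(num_elements: int, shape: Tuple[int, int]) -> bool:
    """A reshape is only valid when the element counts match."""
    return num_elements == shape[0] * shape[1]


rows, cols = 100, 6                       # original (unpadded) scale shape
p_rows, p_cols = padded_shape(rows, cols)
num_padded = p_rows * p_cols              # elements held by the padded buffer

# Buggy pattern: reshape the padded data using the original dimensions.
buggy_ok = can_reshape(num_padded, (rows, cols))

# Fixed pattern: reshape using the padded dimensions.
fixed_ok = can_reshape(num_padded, (p_rows, p_cols))

print(f"padded shape: {(p_rows, p_cols)}")
print(f"reshape to original dims valid: {buggy_ok}")
print(f"reshape to padded dims valid:   {fixed_ok}")
```

When the input already fits whole tiles, the padded and original shapes coincide, which would explain why a bug of this kind only appears for shapes that need padding.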

...del_executor/layers/quantization/compressed_tensors/schemes/compressed_tensors_w4a4_nvfp4.py

github-actions · 2025-08-19T02:08:17Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

yewentao256

LGTM, thanks for the work!

Signed-off-by: yiliu30 <yi4.liu@intel.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> Signed-off-by: Duncan Moss <djm.moss@gmail.com>

Signed-off-by: yiliu30 <yi4.liu@intel.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> Signed-off-by: root <xwq391974@alibaba-inc.com>

Signed-off-by: yiliu30 <yi4.liu@intel.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>

Signed-off-by: yiliu30 <yi4.liu@intel.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> Signed-off-by: Xiao Yu <xiao.yu@amd.com>

fix padding for block scale swizzling

3f4805a

Signed-off-by: yiliu30 <yi4.liu@intel.com>

yiliu30 requested review from mgoin, robertgshaw2-redhat, tlrmchlsmth and yewentao256 as code owners August 19, 2025 02:00

gemini-code-assist bot reviewed Aug 19, 2025

...del_executor/layers/quantization/compressed_tensors/schemes/compressed_tensors_w4a4_nvfp4.py (outdated, resolved)

use same func

481cee2

Signed-off-by: yiliu30 <yi4.liu@intel.com>

yewentao256 approved these changes Aug 21, 2025

yewentao256 added the ready label (ONLY add when PR is ready to merge/full CI is needed) Aug 21, 2025

Merge branch 'main' into fix-swizzling-padding

4cde09e

yewentao256 enabled auto-merge (squash) August 21, 2025 14:45

yewentao256 merged commit 0278f1a into vllm-project:main Aug 21, 2025
45 checks passed

epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 28, 2025

Fix nvfp4 swizzling (vllm-project#23140)

3248cc9

Signed-off-by: yiliu30 <yi4.liu@intel.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>

zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025

Fix nvfp4 swizzling (vllm-project#23140)

d14b0d3

Signed-off-by: yiliu30 <yi4.liu@intel.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>

mengxingkongzhouhan pushed a commit to mengxingkongzhouhan/vllm that referenced this pull request Aug 30, 2025

Fix nvfp4 swizzling (vllm-project#23140)

4368091

Signed-off-by: yiliu30 <yi4.liu@intel.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>

zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Sep 3, 2025

Fix nvfp4 swizzling (vllm-project#23140)

e0d2685

Signed-off-by: yiliu30 <yi4.liu@intel.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>

FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025

Fix nvfp4 swizzling (vllm-project#23140)

61c6f2d

Signed-off-by: yiliu30 <yi4.liu@intel.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
