Skip to content

Conversation

@nvchenghaoz
Copy link
Collaborator

@nvchenghaoz nvchenghaoz commented Oct 28, 2025

Remove the reshape to the original shape logic since the custom op will so the padding.

Summary by CodeRabbit

  • Bug Fixes
    • Improved quantized model checkpoint loading by optimizing weight tensor shape handling during the interleaving process.

Signed-off-by: nvchenghaoz <211069071+nvchenghaoz@users.noreply.github.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 28, 2025

📝 Walkthrough

Walkthrough

This PR modifies the shape handling in NVFP4QuantizationFromConfig.load_hook to flatten weight_scale tensors to 1-D vectors after interleaving, removing the logic that previously restored their original multi-dimensional shape.

Changes

Cohort / File(s) Summary
Quantization weight_scale reshaping
tensorrt_llm/_torch/auto_deploy/transform/library/quantization.py
Removed ori_shape storage and restoration logic; interleaved weight_scale tensors are now flattened to 1-D vectors via .reshape(-1) instead of being reshaped back to their original multi-dimensional layout

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Single method modification with straightforward shape transformation logic
  • Key consideration: Verify that downstream consumers of state_dict properly handle the flattened 1-D weight_scale tensors and that this shape change doesn't break compatibility with model loading/inference pipelines

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Description Check ⚠️ Warning The pull request description is severely incomplete and fails to meet the template requirements. The author provided only a single sentence: "Remove the reshape to the original shape logic since the custom op will so the padding." The required sections are almost entirely missing: there is no expanded Description section explaining the issue and solution, no Test Coverage section identifying relevant tests, and no PR Checklist verification. The single sentence provided is also vague and contains unclear phrasing (e.g., "the custom op will so the padding" appears to be incomplete or has a typo). While the PR objectives indicate a proper title exists in the actual PR, the description content itself provides insufficient detail to understand the changes, their rationale, or their testing. The author should expand the pull request description to include all required template sections. A complete description should explain what changes are being made (removing the reshape to original shape logic), why this change is necessary (clarify how the custom op handles padding and why the previous logic is no longer needed), provide a clear Test Coverage section listing which tests validate this change, and complete the PR Checklist. The description should be more detailed than a single sentence to give reviewers sufficient context for evaluation.
✅ Passed checks (1 passed)
Check name Status Explanation
Title Check ✅ Passed The PR title "[TRTLLM-8734][feat] AutoDeploy: Enable the nvfp4 for Nemotron MOE" follows the correct format specified in the repository template, with a valid JIRA ticket identifier, feature type designation, and clear summary statement. The title accurately relates to the changeset, which involves modifications to NVFP4 quantization functionality (removing ori_shape storage and reshaping weight_scale to 1-D) in the quantization library. Although the specific change shown is a shape-oriented adjustment, it is a real aspect of the broader nvfp4 quantization feature for Nemotron MOE support.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@nvchenghaoz
Copy link
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #22803 [ run ] triggered by Bot. Commit: d2ec6a4

@Fridah-nv Fridah-nv requested a review from meenchen October 28, 2025 20:34
@tensorrt-cicd
Copy link
Collaborator

PR_Github #22803 [ run ] completed with state SUCCESS. Commit: d2ec6a4
/LLM/main/L0_MergeRequest_PR pipeline #17198 completed with status: 'FAILURE'

@github-project-automation github-project-automation bot moved this from Backlog to In review in AutoDeploy Board Oct 28, 2025
@suyoggupta
Copy link
Collaborator

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #22808 [ run ] triggered by Bot. Commit: a0e0de3

@tensorrt-cicd
Copy link
Collaborator

PR_Github #22808 [ run ] completed with state SUCCESS. Commit: a0e0de3
/LLM/main/L0_MergeRequest_PR pipeline #17203 completed with status: 'FAILURE'

@suyoggupta
Copy link
Collaborator

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #22855 [ run ] triggered by Bot. Commit: eebdb03

@tensorrt-cicd
Copy link
Collaborator

PR_Github #22855 [ run ] completed with state SUCCESS. Commit: eebdb03
/LLM/main/L0_MergeRequest_PR pipeline #17239 completed with status: 'FAILURE'

@nvchenghaoz
Copy link
Collaborator Author

/bot run

@nvchenghaoz nvchenghaoz enabled auto-merge (squash) October 29, 2025 17:10
@tensorrt-cicd
Copy link
Collaborator

PR_Github #22918 [ run ] triggered by Bot. Commit: eebdb03

@tensorrt-cicd
Copy link
Collaborator

PR_Github #22918 [ run ] completed with state SUCCESS. Commit: eebdb03
/LLM/main/L0_MergeRequest_PR pipeline #17286 completed with status: 'FAILURE'

@nvchenghaoz
Copy link
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #22932 [ run ] triggered by Bot. Commit: eebdb03

@tensorrt-cicd
Copy link
Collaborator

PR_Github #22932 [ run ] completed with state SUCCESS. Commit: eebdb03
/LLM/main/L0_MergeRequest_PR pipeline #17293 completed with status: 'FAILURE'

@nvchenghaoz
Copy link
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #22934 [ run ] triggered by Bot. Commit: eebdb03

@tensorrt-cicd
Copy link
Collaborator

PR_Github #22934 [ run ] completed with state SUCCESS. Commit: eebdb03
/LLM/main/L0_MergeRequest_PR pipeline #17296 completed with status: 'FAILURE'

@nvchenghaoz
Copy link
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #22945 [ run ] triggered by Bot. Commit: 892e3a4

@tensorrt-cicd
Copy link
Collaborator

PR_Github #22945 [ run ] completed with state SUCCESS. Commit: 892e3a4
/LLM/main/L0_MergeRequest_PR pipeline #17302 completed with status: 'FAILURE'

@nvchenghaoz
Copy link
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #22982 [ run ] triggered by Bot. Commit: 892e3a4

@tensorrt-cicd
Copy link
Collaborator

PR_Github #22982 [ run ] completed with state SUCCESS. Commit: 892e3a4
/LLM/main/L0_MergeRequest_PR pipeline #17325 completed with status: 'FAILURE'

@nvchenghaoz
Copy link
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #23073 [ run ] triggered by Bot. Commit: 1304330

@tensorrt-cicd
Copy link
Collaborator

PR_Github #23073 [ run ] completed with state SUCCESS. Commit: 1304330
/LLM/main/L0_MergeRequest_PR pipeline #17399 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

@nvchenghaoz nvchenghaoz merged commit 71c5576 into NVIDIA:main Oct 30, 2025
5 checks passed
@github-project-automation github-project-automation bot moved this from In review to Done in AutoDeploy Board Oct 30, 2025
fredricz-20070104 pushed a commit to fredricz-20070104/TensorRT-LLM that referenced this pull request Nov 5, 2025
…IDIA#8737)

Signed-off-by: nvchenghaoz <211069071+nvchenghaoz@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

3 participants