[TRTLLM-8734][feat] AutoDeploy: Enable the nvfp4 for Nemotron MOE #8737
Signed-off-by: nvchenghaoz <211069071+nvchenghaoz@users.noreply.github.com>
📝 Walkthrough
This PR modifies the shape handling in NVFP4QuantizationFromConfig.load_hook to flatten weight_scale tensors to 1-D vectors after interleaving, removing the logic that previously restored their original multi-dimensional shape.
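The shape change described in the walkthrough can be illustrated with a minimal, library-free sketch. It is not the code from this PR: `placeholder_interleave` and `flatten_scale` are hypothetical names, the interleaving shown is only a stand-in (it just reverses the rows), and plain nested lists stand in for tensors. The point is only that the result is left as a 1-D vector instead of being reshaped back to the original `(rows, cols)` layout.

```python
from typing import List


def placeholder_interleave(scale: List[List[float]]) -> List[List[float]]:
    """Hypothetical stand-in for the real interleaving step.

    The actual layout transformation is not shown in this PR page;
    here the rows are simply reversed so the reordering is visible.
    """
    return scale[::-1]


def flatten_scale(scale: List[List[float]]) -> List[float]:
    """Flatten a 2-D scale table into a 1-D vector in row-major order."""
    return [value for row in scale for value in row]


# A toy 3x2 "weight_scale" (assumed shape, for illustration only).
weight_scale = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# New behaviour as described: interleave, then keep the result 1-D.
flat = flatten_scale(placeholder_interleave(weight_scale))

# Old behaviour (removed by the PR) would have reshaped `flat`
# back to 3 rows x 2 columns at this point.
print(len(flat))
```

With tensors, the flattening step would presumably be a single flatten or reshape call on the interleaved scale; the exact call used in `load_hook` is not visible here.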
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches
❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
/bot run
PR_Github #22803 [ run ] triggered by Bot. Commit:
PR_Github #22803 [ run ] completed with state
/bot run
PR_Github #22808 [ run ] triggered by Bot. Commit:
PR_Github #22808 [ run ] completed with state
/bot run
PR_Github #22855 [ run ] triggered by Bot. Commit:
PR_Github #22855 [ run ] completed with state
/bot run
PR_Github #22918 [ run ] triggered by Bot. Commit:
PR_Github #22918 [ run ] completed with state
/bot run
PR_Github #22932 [ run ] triggered by Bot. Commit:
PR_Github #22932 [ run ] completed with state
/bot run
PR_Github #22934 [ run ] triggered by Bot. Commit:
PR_Github #22934 [ run ] completed with state
/bot run
PR_Github #22945 [ run ] triggered by Bot. Commit:
PR_Github #22945 [ run ] completed with state
/bot run
PR_Github #22982 [ run ] triggered by Bot. Commit:
PR_Github #22982 [ run ] completed with state
/bot run
PR_Github #23073 [ run ] triggered by Bot. Commit:
PR_Github #23073 [ run ] completed with state
…IDIA#8737)
Signed-off-by: nvchenghaoz <211069071+nvchenghaoz@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Remove the logic that reshapes weight_scale back to its original shape, since the custom op will do the padding.
Summary by CodeRabbit