-
-
Notifications
You must be signed in to change notification settings - Fork 12.1k
[Misc] Removed force_fp8_e4m3fnuz from FP8LinearOp #23725
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Misc] Removed force_fp8_e4m3fnuz from FP8LinearOp #23725
Conversation
Signed-off-by: Julien Lin <jullin@nvidia.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors the Fp8LinearOp to remove the force_fp8_e4m3fnuz parameter, simplifying its interface. The logic to force a specific backend for testing is now handled in the tests themselves using a new override_cutlass_fp8_supported context manager, which is a good improvement. However, I've found a potential issue in the test parametrization for cases where cutlass is not supported, which could leave a code path untested. My review includes suggestions to fix this.
ProExpertProg
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for this follow up. Could you just add quick comments in the tests that we do this in order to test fusion for the non-cutlass path on cutlass platform?
Signed-off-by: Julien Lin <jullin@nvidia.com>
|
This pull request has merge conflicts that must be resolved before it can be |
|
@nvjullin it looks like the test failure is related, trying a fix |
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: Julien Lin <jullin@nvidia.com>
Head branch was pushed to by a user without write access
Signed-off-by: Julien Lin <jullin@nvidia.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: Julien Lin <jullin@nvidia.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: Julien Lin <jullin@nvidia.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: Julien Lin <jullin@nvidia.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Signed-off-by: Julien Lin <jullin@nvidia.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Purpose
Follow up on #22895.
Removed
force_fp8_e4m3fnuzand monkey-patch to test for torch code path on cutlass_fp8_supported platforms.Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.