[Bug] Fix DeepGEMM Env Control #23591
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Code Review
This pull request introduces a fix to ensure that the VLLM_USE_DEEP_GEMM environment variable correctly controls the usage of DeepGEMM for FP8 linear layers. The change adds a check for this environment variable in the should_use_deepgemm_for_fp8_linear function, making its behavior consistent with other DeepGEMM-powered components in vLLM. The implementation is correct and effectively addresses the described bug. My review found no issues with the proposed changes.
Should we put it in the base check? I'm not sure where else these functions are called.
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Fixed, thanks!
Signed-off-by: yewentao256 <zhyanwentao@126.com> Signed-off-by: tc-mb <caitianchi@modelbest.cn>
Signed-off-by: yewentao256 <zhyanwentao@126.com> Signed-off-by: Xiao Yu <xiao.yu@amd.com>
Purpose
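Previously, the VLLM_USE_DEEP_GEMM environment variable did not control whether DeepGEMM was used for FP8 linear layers. This PR fixes that bug by adding the environment check to should_use_deepgemm_for_fp8_linear.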
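
For readers who want the shape of the fix, here is a minimal standalone sketch of gating a kernel choice on an environment variable. It is not the actual vLLM code: the helper `env_flag`, the function name `should_use_deepgemm_for_fp8_linear_sketch`, its two boolean parameters, and the default of "disabled when unset" are all assumptions made for illustration. Only the variable name VLLM_USE_DEEP_GEMM comes from this PR.

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Hypothetical helper: read a boolean flag from the environment.

    The default used when the variable is unset is an assumption of this
    sketch, not necessarily vLLM's real default.
    """
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")


def should_use_deepgemm_for_fp8_linear_sketch(
    is_fp8: bool, deepgemm_available: bool
) -> bool:
    """Illustrative stand-in for the check described in this PR.

    The environment variable is consulted first, so turning it off
    disables DeepGEMM regardless of the other conditions.
    """
    if not env_flag("VLLM_USE_DEEP_GEMM"):
        return False
    # Remaining conditions are placeholders for whatever the real
    # function checks (dtype, shapes, hardware support, and so on).
    return is_fp8 and deepgemm_available
```

With this ordering, setting VLLM_USE_DEEP_GEMM=0 makes the sketch return False even when the layer is FP8 and DeepGEMM is installed, which is the behavior the PR description asks for.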