Add LoRA support for DeepSeek models (V2, V3, R1-0528) #23971
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run `fastcheck` CI, which runs only a small and essential subset of CI tests to quickly catch errors. You can ask your reviewers to trigger select CI tests on top of `fastcheck` CI. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either add the `ready` label to the PR or enable auto-merge. If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀
Code Review
This pull request adds LoRA support for DeepSeek models by inheriting from the SupportsLoRA interface. While this is a good first step, it's incomplete. For LoRA to work on token embeddings and the language model head, the embedding_modules and embedding_padding_modules attributes must be defined on the DeepseekV2ForCausalLM class. Without these, LoRA for embeddings will not be enabled. I've added a suggestion to include these missing attributes.
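For reference, a minimal sketch of what the reviewer's suggestion could look like. The attribute values mirror the convention used by other vLLM models (e.g. Llama) and are assumptions here, not the merged code; a later commit in this PR removed these attributes per reviewer feedback.

```python
from torch import nn

# Import path assumed from vLLM's model interface layout.
from vllm.model_executor.models.interfaces import SupportsLoRA


class DeepseekV2ForCausalLM(nn.Module, SupportsLoRA):
    # Hypothetical values following the pattern of other vLLM models:
    # map the embedding layers LoRA can target, and mark lm_head as
    # needing padding to the LoRA vocabulary size.
    embedding_modules = {
        "embed_tokens": "input_embeddings",
        "lm_head": "output_embeddings",
    }
    embedding_padding_modules = ["lm_head"]
```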
- Add SupportsLoRA interface to DeepseekV2ForCausalLM
- Enable LoRA fine-tuning for DeepSeek-V2, DeepSeek-V3, and DeepSeek-R1-0528
- Add embedding_modules and embedding_padding_modules for complete LoRA support
- MoE detection already handled by existing is_moe_model() function
- Maintains compatibility with existing FusedMoE warning system

This change follows the same pattern as Qwen MoE LoRA support (PR vllm-project#20932). Addresses code review feedback by including proper embedding layer configuration.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: sadeghja1070 <sadegh.ja1070@gmail.com>
jeejeelee left a comment
Thank you, we also need to update the model info in the docs.
Would I need to do this or would you take care of it?
If it's convenient for you, please handle it, thank you.
Update supported_models.md to reflect LoRA support for:

- DeepseekForCausalLM (DeepSeek)
- DeepseekV2ForCausalLM (DeepSeek-V2)
- DeepseekV3ForCausalLM (DeepSeek-V3, includes R1 variants)

Also added DeepSeek-R1 as an example model for the DeepSeek-V3 architecture. This documentation update corresponds to the LoRA implementation added in the previous commit.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: sadeghja1070 <sadegh.ja1070@gmail.com>
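For illustration, the updated rows in supported_models.md might look roughly like this; the column layout (Architecture, Models, Example HF Models, LoRA, PP) and example model names are assumptions based on the usual table format, not a quote of the actual diff:

```markdown
| `DeepseekForCausalLM` | DeepSeek | `deepseek-ai/deepseek-llm-7b-chat` | ✅︎ | ✅︎ |
| `DeepseekV2ForCausalLM` | DeepSeek-V2 | `deepseek-ai/DeepSeek-V2` | ✅︎ | ✅︎ |
| `DeepseekV3ForCausalLM` | DeepSeek-V3 | `deepseek-ai/DeepSeek-V3`, `deepseek-ai/DeepSeek-R1` | ✅︎ | ✅︎ |
```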
Per reviewer feedback, remove embedding_modules and embedding_padding_modules as they are rarely used for LoRA fine-tuning. This simplifies the implementation while maintaining the core LoRA functionality for DeepSeek models.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: sadeghja1070 <sadegh.ja1070@gmail.com>
Both requests are done now.
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
| "qkv_proj": ["q_proj", "k_proj", "v_proj"], | ||
| "gate_up_proj": ["gate_proj", "up_proj"], | ||
| } | ||
|
|
Generally, for decoder-only models to support LoRA, you need to:

- Inherit from SupportsLoRA
- Implement packed_modules_mapping

You can take a look at those models that already support LoRA; a minimal sketch of the pattern follows below.
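As a rough sketch of that pattern, under the assumption that vLLM's interface lives at the path shown (the class name here is hypothetical):

```python
from torch import nn

from vllm.model_executor.models.interfaces import SupportsLoRA


class MyDecoderForCausalLM(nn.Module, SupportsLoRA):
    # Maps each fused vLLM projection to the original Hugging Face
    # sub-modules, so LoRA weights trained against the HF checkpoint
    # can be stacked onto the fused layers at load time.
    packed_modules_mapping = {
        "qkv_proj": ["q_proj", "k_proj", "v_proj"],
        "gate_up_proj": ["gate_proj", "up_proj"],
    }
```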
…23971)

Signed-off-by: sadeghja1070 <sadegh.ja1070@gmail.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
This change follows the same pattern as Qwen MoE LoRA support (PR #20932).
🤖 Generated with Claude Code
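As a usage illustration of what this PR enables (the model name and adapter path are placeholders, not taken from the PR):

```python
from vllm import LLM
from vllm.lora.request import LoRARequest

# Placeholder checkpoint and adapter paths; any DeepSeek V2/V3 model with
# a matching LoRA adapter should work once this change is in.
llm = LLM(model="deepseek-ai/DeepSeek-V2-Lite", enable_lora=True)
outputs = llm.generate(
    "What is the capital of France?",
    lora_request=LoRARequest("my_adapter", 1, "/path/to/lora_adapter"),
)
print(outputs[0].outputs[0].text)
```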
Purpose
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
Update supported_models.md and examples for a new model.