[Feature] add model aware kv ops helper#16020
Signed-off-by: billishyahao <bill.he@amd.com>
KuntaiDu
left a comment
I like this PR! Some comments on naming stuff, but functionality LGTM!
vllm/distributed/kv_transfer/kv_connector/mooncake_store_connector.py
vllm/distributed/kv_transfer/kv_connector/mooncake_store_connector.py
vllm/distributed/kv_transfer/kv_connector/model_aware_kv_ops.py
ShangmingCai
left a comment
I like that this PR modularizes this part of the code to reduce duplication and adapt to all connectors, but the filename model_aware_kv_ops.py seems a bit confusing; maybe the helper should be placed in a utils.py file. Otherwise, LGTM.
vllm/distributed/kv_transfer/kv_connector/mooncake_store_connector.py
ShangmingCai
left a comment
LGTM. But maybe a shorter name like "utils.py" will be better? So that we can put more util functions or helpers all in this file as well instead of creating so many files in the future. I suggest this because I see some "utils.py" in many sub-directories of vllm.
Yes, that makes sense. I renamed it in the latest commit 07c73ea. Thanks!
ShangmingCai
left a comment
@billishyahao LGTM now. You can ping @KuntaiDu to review it again.
Can you merge from main to fix the CI failures?
@DarkLight1337 This could probably use a force-merge, since it only changes files under the vllm/distributed/kv_transfer/kv_connector directory and the disaggregated serving feature doesn't have CI yet.
Signed-off-by: billishyahao <bill.he@amd.com> Signed-off-by: Yang Wang <elainewy@meta.com>
Signed-off-by: billishyahao <bill.he@amd.com> Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>
This patch provides
Tested on both AMD and NVIDIA DCGPUs to verify correctness in two cases: the simple connector in a 1P1D setup and the Mooncake store connector in an XPYD setup.
XPYD:
1P1D
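To make the modularization discussed in this thread concrete, here is a minimal sketch of what a model-aware KV helper shared by several connectors might look like. It is not the code from this PR: `ModelConfig`, `KVOpsHelper`, `kv_shape`, and all field names are hypothetical, and the assumption that MLA-style models store a single latent vector per token (while other models store per-head keys and values) is an illustration rather than something stated in the conversation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ModelConfig:
    """Hypothetical stand-in for a model config; field names are illustrative."""
    num_key_value_heads: int
    num_attention_heads: int
    hidden_size: int
    head_dim: Optional[int] = None
    # Only meaningful for MLA-style models (assumption for this sketch).
    kv_lora_rank: int = 0
    qk_rope_head_dim: int = 0
    use_mla: bool = False


class KVOpsHelper:
    """Hypothetical helper: one place that knows how a model lays out its KV cache."""

    def __init__(self, config: ModelConfig, tp_size: int = 1) -> None:
        self.config = config
        self.tp_size = tp_size

    def kv_shape(self) -> Tuple[int, int]:
        """Return (num_heads, head_size) for one token's cached KV entry."""
        cfg = self.config
        if cfg.use_mla:
            # Assumed MLA layout: a single compressed latent per token.
            return 1, cfg.kv_lora_rank + cfg.qk_rope_head_dim
        # Standard attention: KV heads are split across tensor-parallel ranks.
        num_heads = cfg.num_key_value_heads // self.tp_size
        head_size = cfg.head_dim or cfg.hidden_size // cfg.num_attention_heads
        return num_heads, head_size


# Each connector asks the helper instead of re-deriving the layout itself.
standard = ModelConfig(num_key_value_heads=8, num_attention_heads=32, hidden_size=4096)
mla = ModelConfig(num_key_value_heads=128, num_attention_heads=128, hidden_size=7168,
                  kv_lora_rank=512, qk_rope_head_dim=64, use_mla=True)
print(KVOpsHelper(standard, tp_size=2).kv_shape())
print(KVOpsHelper(mla).kv_shape())
```

Keeping the model-specific branching in one class means a new connector only needs to call the helper, which is the duplication-reducing benefit the reviewers point to.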