[Bugfix]: fix issue with n>1 sampling on v1 requests overriding each other #16863
DarkLight1337 merged 2 commits into vllm-project:main
Signed-off-by: Jeffrey Li <jeffrey.dot.li@gmail.com>
Thanks for catching this @jeffrey-dot-li!
Would you be willing to also update the unit test to cover this case?
vllm/tests/v1/engine/test_output_processor.py, line 842 in 3d3ab36
Thanks @jeffrey-dot-li! Could you revert the unrelated formatting changes?
Signed-off-by: Jeffrey Li <jeffrey.dot.li@gmail.com>
Fixed, sorry about that. I thought it would be caught by pre-commit.
…other (vllm-project#16863) Signed-off-by: Jeffrey Li <jeffrey.dot.li@gmail.com> Signed-off-by: Frieda (Jingying) Huang <jingyingfhuang@gmail.com>
…other (vllm-project#16863) Signed-off-by: Jeffrey Li <jeffrey.dot.li@gmail.com> Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
…other (vllm-project#16863) Signed-off-by: Jeffrey Li <jeffrey.dot.li@gmail.com> Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>
Fixes issue with n>1 sampling on v1 requests overriding each other
FIX #12584 #14280
MRE:
Before: `CompletionOutput.index` is always 1 after the first token.
After: both index=0 and index=1 appear all the way to the end.
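For readers unfamiliar with this class of bug, here is a toy sketch of one common way fanned-out child requests can end up overriding each other: every child aliases the same mutable object. This is an assumption made for illustration only. `ToyRequest`, `fan_out_shared`, and `fan_out_copied` are hypothetical names, not vLLM classes or functions, and the sketch does not claim to reproduce the actual change in this PR.

```python
from copy import copy
from dataclasses import dataclass


@dataclass
class ToyRequest:
    """Hypothetical stand-in for an engine request; not a vLLM class."""
    request_id: str
    index: int = 0


def fan_out_shared(request: ToyRequest, n: int) -> dict[str, ToyRequest]:
    """Buggy pattern: all n children are the same object, mutated in place."""
    children = {}
    for idx in range(n):
        request.request_id = f"{idx}_parent"
        request.index = idx
        # Every key points at the one shared object, so the last write wins.
        children[request.request_id] = request
    return children


def fan_out_copied(request: ToyRequest, n: int) -> dict[str, ToyRequest]:
    """Safer pattern: each child gets its own shallow copy before mutation."""
    children = {}
    for idx in range(n):
        child = copy(request)
        child.request_id = f"{idx}_parent"
        child.index = idx
        children[child.request_id] = child
    return children


shared_indices = sorted(
    r.index for r in fan_out_shared(ToyRequest("parent"), 2).values()
)
copied_indices = sorted(
    r.index for r in fan_out_copied(ToyRequest("parent"), 2).values()
)
print(shared_indices)  # both entries report the last index written
print(copied_indices)  # each child keeps its own index
```

In the shared version both children report the last index written, which resembles the "index is always 1" symptom described above for n=2; in the copied version each child keeps its own index.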