[Bugfix] Add support for <tool_call> format in streaming mode for XLAM Tool Parser
#22769
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can add 🚀
Code Review
This pull request correctly adds support for the <tool_call> format in streaming mode by leveraging the existing preprocess_model_output method. The changes to the parser logic are sound. The accompanying tests have been updated, but the new streaming test is confusing and could be improved for clarity and correctness. I've provided a suggestion to refactor the test to be more straightforward and representative of a streaming scenario.
Force-pushed from b394d10 to 2c12b4d
@aarnphm Is there anything else you need from me? Or do I just sit tight?

Can u check the CI failure here?
…ming mode with the XLAM tool parser Signed-off-by: Devon Peroutky <devon@kindo.ai>
Head branch was pushed to by a user without write access
Force-pushed from 0e2a07f to d199a97
Signed-off-by: Devon Peroutky <devon@kindo.ai>
Force-pushed from d199a97 to ec50893
…ling Signed-off-by: Devon Peroutky <devon@kindo.ai>
Force-pushed from ec50893 to 26f11d8
All the tests are passing now! FYI, I modified the implementation slightly since you first approved it. However, I've added much more comprehensive tests for each possible XLAM format to account for the changes.
DarkLight1337 left a comment
Thanks and sorry for the delay
…LAM Tool Parser (vllm-project#22769) Signed-off-by: Devon Peroutky <devon@kindo.ai>
Issue
The XLAM Tool Parser documentation states that it supports the following formats:

However, the <tool_call>...</tool_call> tag format is broken/not supported in streaming mode; the only output format supported in streaming mode is direct JSON arrays. Additionally, none of the formats support any preamble before the tool call. Example:

This PR fixes that as well, for all formats.
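To make the two output shapes concrete, here is a minimal, hypothetical extraction helper (not vLLM's actual parser code) that tolerates both a <tool_call> wrapper and an optional natural-language preamble; the names and strategy here are illustrative assumptions only:

```python
# Illustrative sketch only: a hypothetical helper, NOT the real XLAM parser.
import json
import re


def extract_tool_calls(model_output: str) -> list:
    """Parse tool calls from either supported output shape:
    1. A <tool_call>...</tool_call> wrapped JSON array (optionally preceded
       by a natural-language preamble).
    2. A direct JSON array (optionally preceded by a preamble).
    """
    # Prefer content wrapped in <tool_call>...</tool_call> tags.
    match = re.search(r"<tool_call>(.*?)</tool_call>", model_output, re.DOTALL)
    if match:
        payload = match.group(1).strip()
    else:
        # Fall back to a direct JSON array, skipping any preamble text
        # before the first '['.
        start = model_output.find("[")
        payload = model_output[start:] if start != -1 else ""
    try:
        return json.loads(payload)
    except json.JSONDecodeError:
        return []


calls = extract_tool_calls(
    'Let me check the weather. <tool_call>'
    '[{"name": "get_weather", "arguments": {"city": "Paris"}}]'
    '</tool_call>'
)
```

The same helper also handles the direct-array form, e.g. `extract_tool_calls('Sure! [{"name": "f", "arguments": {}}]')`, which is the shape that already worked in streaming mode before this PR.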
Proposed Plan
This PR adds support for the
<tool_call>...</tool_call>tag output format in streaming mode for the XLAM Tool Parser, while maintaining the existing functionality for every other output format.Test Plan
I've added additional unit test cases, for both synchronous and streaming mode, to verify that all possible XLAM formats are parsed correctly. The existing tests should catch any regressions.
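A streaming-mode test can be sketched as follows. This is an assumed accumulate-and-parse harness, not vLLM's real test API: it feeds the model output in small deltas to mimic token-by-token streaming and checks that the tool call is recovered once the closing tag arrives.

```python
# Sketch of a streaming-style test under an assumed accumulate-and-parse
# strategy; the helper below is hypothetical, not the real vLLM parser.
import json
import re


def parse_accumulated(buffer: str):
    """Return parsed tool calls once a complete <tool_call> block is seen."""
    match = re.search(r"<tool_call>(.*?)</tool_call>", buffer, re.DOTALL)
    return json.loads(match.group(1)) if match else None


full_output = (
    'Sure, calling the tool now. <tool_call>'
    '[{"name": "sum", "arguments": {"a": 1, "b": 2}}]</tool_call>'
)
# Split into 5-character deltas to mimic incremental streaming chunks.
deltas = [full_output[i:i + 5] for i in range(0, len(full_output), 5)]

buffer = ""
result = None
for delta in deltas:
    buffer += delta
    parsed = parse_accumulated(buffer)
    if parsed is not None:
        result = parsed  # the closing </tool_call> tag has arrived
```

Note that a real streaming parser must also emit partial deltas as they arrive rather than waiting for the closing tag; this sketch only checks the end-to-end result of incremental input.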
Test Results
Local
BuildKite
TBD