[None][doc] Add K2 tool calling examples #6667
Conversation
📝 Walkthrough

A new README and example Python script have been added for the Kimi-K2-Instruct model. The README details the model's tool calling capabilities and usage instructions, while the script demonstrates how to interact with the model for tool (API) calling, including parsing model outputs and invoking local functions.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant ExampleScript
    participant KimiK2Model
    participant LocalTool
    User->>ExampleScript: Provide prompt and tool specs
    ExampleScript->>KimiK2Model: Send prompt (with tool info)
    KimiK2Model-->>ExampleScript: Generate tool call request
    ExampleScript->>ExampleScript: Parse tool call from output
    ExampleScript->>LocalTool: Invoke tool with parsed arguments
    LocalTool-->>ExampleScript: Return tool result
    ExampleScript->>User: Output tool call and result
```
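As a rough sketch of the first half of this flow, a request against a locally served, OpenAI-compatible endpoint could look like the snippet below. The base URL, API key, model name, and prompts are illustrative assumptions rather than values taken from the example script:

```python
from openai import OpenAI

# Assumed local OpenAI-compatible endpoint (for example, one exposed by trtllm-serve);
# adjust base_url and model to match your deployment.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used")

response = client.chat.completions.create(
    model="Kimi-K2-Instruct",
    messages=[
        {"role": "system", "content": "You may call tools; reply only with tool call requests."},
        {"role": "user", "content": "What is the weather like in Paris today?"},
    ],
)

# The raw text is what the example script later parses for tool calls before
# dispatching them to local functions.
print(response.choices[0].message.content)
```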
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
Actionable comments posted: 8
🔭 Outside diff range comments (1)
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py (1)
160-171: Add error handling for unknown tools. The tool execution loop could fail with a `KeyError` if an unknown tool name is returned by the model.

Add error handling:

```diff
     for tool_call in tool_calls:
         tool_name = tool_call['function']['name']
+        if tool_name not in tool_map:
+            print(f"[Error]: Unknown tool '{tool_name}' requested")
+            continue
+
         if args.specify_output_format:
             tool_arguments = tool_call['function']['arguments']
         else:
             tool_arguments = json.loads(tool_call['function']['arguments'])
         tool_function = tool_map[tool_name]
```
🧹 Nitpick comments (8)
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py (8)
8-18: Consider breaking long prompt strings for better readability. The multi-line prompt constants exceed the 120-character line limit. While the content is important, consider breaking them into shorter lines for better maintainability. Here's a suggestion for the first prompt: split the 580-character opening line of `SPECIFY_OUTPUT_FORMAT_PROMPT` across several shorter source lines (for example with backslash continuations, as shown in the fuller diff for lines 8-22 below). Similar treatment can be applied to the second prompt and other long lines.
116-117: Fix formatting issue in system prompt construction. Line 116 has a formatting issue that makes it hard to read.

```diff
-    system_prompt = SPECIFY_OUTPUT_FORMAT_PROMPT if args.specify_output_format else NOT_SPECIFY_OUTPUT_FORMAT_PROMPT.format(
-        tools=tools)
+    if args.specify_output_format:
+        system_prompt = SPECIFY_OUTPUT_FORMAT_PROMPT
+    else:
+        system_prompt = NOT_SPECIFY_OUTPUT_FORMAT_PROMPT.format(tools=tools)
```
8-22: Fix line length violations in string constants. Multiple lines exceed the 120-character limit. Consider breaking long strings into multiple lines for better readability.

Apply this diff to fix line length violations:

```diff
-SPECIFY_OUTPUT_FORMAT_PROMPT = """You are an AI assistant with the role name "assistant." Based on the provided API specifications and conversation history from steps 1 to t, generate the API requests that the assistant should call in step t+1. The API requests should be output in the format [api_name(key1='value1', key2='value2', ...)], replacing api_name with the actual API name, key1, key2, etc., with the actual parameter names, and value1, value2, etc., with the actual parameter values. The output should start with a square bracket "[" and end with a square bracket "]".
-If there are multiple API requests, separate them with commas, for example: [api_name(key1='value1', key2='value2', ...), api_name(key1='value1', key2='value2', ...), ...]. Do not include any other explanations, prompts, or API call results in the output.
-If the API parameter description does not specify otherwise, the parameter is optional (parameters mentioned in the user input need to be included in the output; if not mentioned, they do not need to be included).
-If the API parameter description does not specify the required format for the value, use the user's original text for the parameter value.
-If the API requires no parameters, output the API request directly in the format [api_name()], and do not invent any nonexistent parameter names.
+SPECIFY_OUTPUT_FORMAT_PROMPT = """You are an AI assistant with the role name "assistant." Based on the provided API \
+specifications and conversation history from steps 1 to t, generate the API requests that the assistant should call in \
+step t+1. The API requests should be output in the format [api_name(key1='value1', key2='value2', ...)], replacing \
+api_name with the actual API name, key1, key2, etc., with the actual parameter names, and value1, value2, etc., with \
+the actual parameter values. The output should start with a square bracket "[" and end with a square bracket "]".
+If there are multiple API requests, separate them with commas, for example: \
+[api_name(key1='value1', key2='value2', ...), api_name(key1='value1', key2='value2', ...), ...]. \
+Do not include any other explanations, prompts, or API call results in the output.
+If the API parameter description does not specify otherwise, the parameter is optional (parameters mentioned in the \
+user input need to be included in the output; if not mentioned, they do not need to be included).
+If the API parameter description does not specify the required format for the value, use the user's original text for \
+the parameter value.
+If the API requires no parameters, output the API request directly in the format [api_name()], and do not invent any \
+nonexistent parameter names.

-NOT_SPECIFY_OUTPUT_FORMAT_PROMPT = """Important: Only give the tool call requests, do not include any other explanations, prompts, or API call results in the output.
-The tool call requests generated by you are wrapped by <|tool_calls_section_begin|> and <|tool_calls_section_end|>, with each tool call wrapped by <|tool_call_begin|> and <|tool_call_end|>. The tool ID and arguments are separated by <|tool_call_argument_begin|>. The format of the tool ID is functions.func_name:idx, from which we can parse the function name.
+NOT_SPECIFY_OUTPUT_FORMAT_PROMPT = """Important: Only give the tool call requests, do not include any other \
+explanations, prompts, or API call results in the output.
+The tool call requests generated by you are wrapped by <|tool_calls_section_begin|> and \
+<|tool_calls_section_end|>, with each tool call wrapped by <|tool_call_begin|> and <|tool_call_end|>. \
+The tool ID and arguments are separated by <|tool_call_argument_begin|>. The format of the tool ID is \
+functions.func_name:idx, from which we can parse the function name.
```
37-62: Fix line length violation in regex pattern. The function logic is correct and properly handles the parsing of tool call information from the model output.

Fix the line length violation on line 47:

```diff
-    func_call_pattern = r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*?)\s*<\|tool_call_end\|>"
+    func_call_pattern = (
+        r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*"
+        r"<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*?)\s*<\|tool_call_end\|>"
+    )
```
113-137: Fix line length violation and approve the API logic. The function correctly handles API communication and tool call parsing.

Fix the line length violation on line 116:

```diff
-    system_prompt = SPECIFY_OUTPUT_FORMAT_PROMPT if args.specify_output_format else NOT_SPECIFY_OUTPUT_FORMAT_PROMPT.format(
-        tools=tools)
+    system_prompt = (
+        SPECIFY_OUTPUT_FORMAT_PROMPT
+        if args.specify_output_format
+        else NOT_SPECIFY_OUTPUT_FORMAT_PROMPT.format(tools=tools)
+    )
```
37-63: Fix line length violation in regex pattern. The function logic is solid and correctly parses the K2 tool call format. However, line 47 exceeds the 120-character limit.

```diff
-    func_call_pattern = r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*?)\s*<\|tool_call_end\|>"
+    func_call_pattern = (
+        r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*"
+        r"<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*?)\s*<\|tool_call_end\|>"
+    )
```
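For context, the pattern above can be exercised standalone as in the sketch below; the sample model output is fabricated to match the documented K2 delimiter format:

```python
import re

func_call_pattern = (
    r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*"
    r"<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*?)\s*<\|tool_call_end\|>"
)

# Made-up model output following the K2 delimiter format described in the prompt.
sample_output = (
    "<|tool_calls_section_begin|>"
    "<|tool_call_begin|>functions.get_weather:0<|tool_call_argument_begin|>"
    '{"location": "Paris"}<|tool_call_end|>'
    "<|tool_calls_section_end|>"
)

for match in re.finditer(func_call_pattern, sample_output, re.DOTALL):
    tool_call_id = match.group("tool_call_id")        # e.g. "functions.get_weather:0"
    arguments = match.group("function_arguments")     # e.g. '{"location": "Paris"}'
    func_name = tool_call_id.split(".")[-1].split(":")[0]  # recover "get_weather" from the id
    print(func_name, arguments)
```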
65-89: Consider more specific exception handling. The function correctly parses the specified format and uses `ast.literal_eval` for safe argument evaluation. However, the broad `except Exception` could be more specific.

```diff
             try:
                 kwargs[k] = ast.literal_eval(v.strip())
-            except Exception:
+            except (ValueError, SyntaxError):
                 kwargs[k] = v.strip()
```
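A standalone sketch of this parsing approach is shown below; the regexes here are simplified stand-ins for the script's actual patterns (which are not reproduced in this comment), and the sample string is made up:

```python
import ast
import re

# Illustrative input in the [api_name(key='value', ...)] output format.
sample = "[get_weather(location='Paris', unit=celsius)]"

call_pattern = r"(?P<name>\w+)\((?P<args>.*?)\)"
arg_pattern = r"(?P<key>\w+)\s*=\s*(?P<value>[^,]+)"

for call in re.finditer(call_pattern, sample):
    kwargs = {}
    for arg in re.finditer(arg_pattern, call.group("args")):
        k, v = arg.group("key"), arg.group("value")
        try:
            # literal_eval turns "'Paris'" into the str Paris, "3" into int 3, etc.
            kwargs[k] = ast.literal_eval(v.strip())
        except (ValueError, SyntaxError):
            # Fall back to the raw text for non-literal values such as celsius.
            kwargs[k] = v.strip()
    print(call.group("name"), kwargs)
```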
113-138: Fix line length violation and improve readability. The function correctly handles both output formats and makes appropriate API calls. However, line 116 exceeds the 120-character limit.

```diff
-    system_prompt = SPECIFY_OUTPUT_FORMAT_PROMPT if args.specify_output_format else NOT_SPECIFY_OUTPUT_FORMAT_PROMPT.format(
-        tools=tools)
+    if args.specify_output_format:
+        system_prompt = SPECIFY_OUTPUT_FORMAT_PROMPT
+    else:
+        system_prompt = NOT_SPECIFY_OUTPUT_FORMAT_PROMPT.format(tools=tools)
```
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- examples/models/core/kimi_k2/README.md (1 hunks)
- examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py (1 hunks)
👮 Files not reviewed due to content moderation or server errors (1)
- examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
**/*.py: The code developed for TensorRT-LLM should conform to Python 3.8+.
Indent Python code with 4 spaces. Do not use tabs.
Always maintain the namespace when importing in Python, even if only one class or function from a module is used.
Python filenames should use snake_case (e.g., some_file.py).
Python classes should use PascalCase (e.g., class SomeClass).
Python functions and methods should use snake_case (e.g., def my_awesome_function():).
Python local variables should use snake_case. Prefix k for variable names that start with a number (e.g., k_99th_percentile = ...).
Python global variables should use upper snake_case and prefix G (e.g., G_MY_GLOBAL = ...).
Python constants should use upper snake_case (e.g., MY_CONSTANT = ...).
Avoid shadowing variables declared in an outer scope in Python.
Initialize all externally visible members of a class in the constructor in Python.
For interfaces that may be used outside a file, prefer docstrings over comments in Python.
Comments in Python should be reserved for code within a function, or interfaces that are local to a file.
Use Google style docstrings for classes and functions in Python, which can be parsed by Sphinx.
Attributes and variables in Python can be documented inline; attribute docstrings will be rendered under the docstring for the class.
Avoid using reflection in Python when functionality can be easily achieved without it.
When using try-except blocks in Python, limit the except to the smallest set of errors possible.
When using try-except blocks to handle multiple possible variable types in Python, keep the body of the try as small as possible, using the else block to implement the logic.
Files:
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
**/*.{cpp,h,hpp,cc,cxx,cu,py}
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. This includes .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.
Files:
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
🧠 Learnings (1)
📚 Learning: in tensorrt-llm testing, it's common to have both cli flow tests (test_cli_flow.py) and pytorch api ...
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.
Applied to files:
examples/models/core/kimi_k2/README.md
🪛 Ruff (0.12.2)
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
8-8: Line too long (580 > 120)
(E501)
9-9: Line too long (255 > 120)
(E501)
10-10: Line too long (213 > 120)
(E501)
11-11: Line too long (138 > 120)
(E501)
12-12: Line too long (145 > 120)
(E501)
17-17: Line too long (165 > 120)
(E501)
18-18: Line too long (359 > 120)
(E501)
47-47: Line too long (165 > 120)
(E501)
116-116: Line too long (124 > 120)
(E501)
🔇 Additional comments (27)
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py (13)
24-35: LGTM! Clean tool implementation and mapping. The `get_weather` function provides a good example for demonstration purposes, and the `tool_map` dictionary provides a clean mechanism for dynamic function lookup.

37-63: LGTM! Robust parsing with good documentation reference. The `extract_tool_call_info` function handles the K2-specific delimiter format correctly, with proper regex parsing and structured output. The reference to the HuggingFace documentation is helpful.

65-89: Good use of `ast.literal_eval` for safe argument parsing. The function safely parses arguments using `ast.literal_eval` with appropriate fallback to string parsing. The regex patterns correctly handle the function call format.

91-111: LGTM! Proper tool schema definition. The function correctly defines the tool schema in OpenAI-compatible format with proper parameter specifications and type definitions.
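For reference, an OpenAI-compatible tool definition for a weather function generally takes the following shape; the concrete names and descriptions below are illustrative rather than copied from the script:

```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a given location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name, e.g. Paris",
                    },
                },
                "required": ["location"],
            },
        },
    },
]
```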
113-138: LGTM! Well-structured request handling with good debugging output. The function properly constructs chat completion requests and handles both output formats correctly. The debug output will be valuable for users understanding the tool calling process.

140-158: LGTM! Well-configured argument parsing and client setup. The argument parsing provides good defaults for testing, and the OpenAI client configuration is appropriate for the local TensorRT-LLM server setup described in the README.

24-34: LGTM! The `get_weather` function and tool mapping follow proper naming conventions and provide a clean implementation for the demonstration.

65-88: LGTM! Excellent implementation with proper regex parsing, safe argument evaluation using `ast.literal_eval`, and robust exception handling.

91-110: LGTM! The tool definitions follow the correct schema format and include all required fields for proper function calling integration.

158-171: LGTM! The tool execution logic properly handles both output formats and correctly calls the mapped functions with parsed arguments.

24-35: LGTM! Clean tool function implementation. The weather function follows proper naming conventions and includes case-insensitive location handling. The tool mapping dictionary provides a clean approach for dynamic function calling.

91-111: LGTM! Well-structured tool definitions. The function returns properly formatted tool definitions that follow the OpenAI standard schema with clear descriptions and parameter specifications.

140-171: LGTM! Well-structured main execution flow. The main logic correctly implements the tool calling workflow with proper argument parsing, client setup, and tool execution. The conditional handling of different argument formats between parsing modes is appropriate.
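As a self-contained illustration of that execution step (including the unknown-tool guard suggested earlier in this review), a sketch could look like this, with the parsed tool calls hard-coded for demonstration; `book_flight` is a deliberately unknown tool:

```python
import json

def get_weather(location: str) -> str:
    # Toy stand-in for the example script's weather helper.
    return f"Sunny in {location}"

tool_map = {"get_weather": get_weather}

# Pretend these were parsed from the model output.
tool_calls = [
    {"function": {"name": "get_weather", "arguments": json.dumps({"location": "Paris"})}},
    {"function": {"name": "book_flight", "arguments": json.dumps({"to": "Paris"})}},
]

for tool_call in tool_calls:
    tool_name = tool_call["function"]["name"]
    if tool_name not in tool_map:
        print(f"[Error]: Unknown tool '{tool_name}' requested")
        continue
    tool_arguments = json.loads(tool_call["function"]["arguments"])
    print(tool_name, "->", tool_map[tool_name](**tool_arguments))
```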
examples/models/core/kimi_k2/README.md (14)
1-21: LGTM! Clear and comprehensive overview. The overview provides excellent context about the K2 model's capabilities, and the tool calling process steps are clearly explained and align with the implementation in the example script.

44-51: LGTM! Accurate explanation of tool calling approaches. The section correctly explains both approaches for tool calling and accurately notes the TensorRT-LLM limitation, which aligns with the manual parsing implementation in the example script.

52-98: LGTM! Excellent practical examples and clear workflow. The example workflow provides clear step-by-step instructions with realistic command-line examples and expected outputs that align perfectly with the companion Python script.

99-120: LGTM! Important warnings and technical context provided. The second example demonstrates the formatted output mode well, and the warning about output format deviations provides crucial context for users working with TensorRT-LLM deployments.

1-6: LGTM! Clear and informative overview that provides essential context about the Kimi K2 model's capabilities and architecture.

7-21: LGTM! The prerequisites section clearly outlines the tool calling process and aligns perfectly with the implementation in the example script.

44-52: LGTM! Excellent explanation of the two approaches and the important limitation note about TensorRT-LLM's current capabilities.

54-98: LGTM! The example usage section provides clear, accurate commands and realistic expected outputs that align perfectly with the example script's implementation.

99-120: LGTM! Important warnings about output format deviation and parsing challenges provide crucial context for users working with TensorRT-LLM deployments.

1-6: LGTM! Clear and informative overview. The overview effectively introduces the Kimi-K2 model with key specifications and capabilities, providing good context for users.

44-53: LGTM! Clear explanation of tool calling approaches. This section effectively communicates the available approaches and the important limitation that TensorRT-LLM requires manual parsing. The information is crucial for users to understand the workflow.

54-67: LGTM! Standard server deployment instructions. The server launch instructions follow the expected TensorRT-LLM deployment pattern with appropriate configuration options.

68-98: LGTM! Comprehensive and practical examples. The usage examples provide clear command-line instructions with expected outputs, effectively demonstrating the tool calling workflow. The step-by-step explanations help users understand the process.

99-120: LGTM! Important warnings about format limitations. The second example demonstrates the alternative output format effectively, and the note about potential format deviations is crucial information for users deploying with TensorRT-LLM.
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Force-pushed from 5041666 to 1dd0ab3
Actionable comments posted: 0
♻️ Duplicate comments (1)
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py (1)
177-177: Fix argparse boolean argument handling. Using `type=bool` with argparse doesn't work as expected - it will always be `True` if any value is provided, even "False".

Apply this diff to fix the boolean argument:

```diff
-    parser.add_argument("--specify_output_format", type=bool, default=False)
+    parser.add_argument("--specify_output_format", action="store_true", default=False)
```
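A quick demonstration of why `type=bool` misbehaves and why `action="store_true"` is the usual fix:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--broken_flag", type=bool, default=False)
parser.add_argument("--fixed_flag", action="store_true")

# bool("False") is True, so the "broken" flag is True for any non-empty value.
args = parser.parse_args(["--broken_flag", "False", "--fixed_flag"])
print(args.broken_flag)  # True, even though "False" was passed
print(args.fixed_flag)   # True only because the flag was present
```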
🧹 Nitpick comments (3)
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py (3)
25-50: Fix remaining line length violations. Several lines still exceed the 120-character limit despite the line continuation approach. Lines 30, 35, 36, 37, 45, and 46 need additional breaks.

Apply these fixes to resolve the remaining line length violations:

```diff
-and value1, value2, etc., with the actual parameter values. The output should start with a square bracket "[" and end with a square bracket "]".
+and value1, value2, etc., with the actual parameter values. \
+The output should start with a square bracket "[" and end with a square bracket "]".

-(parameters mentioned in the user input need to be included in the output; if not mentioned, they do not need to be included).
+(parameters mentioned in the user input need to be included in the output; \
+if not mentioned, they do not need to be included).

-If the API parameter description does not specify the required format for the value, use the user's original text for the parameter value. \
+If the API parameter description does not specify the required format for the value, \
+use the user's original text for the parameter value. \

-If the API requires no parameters, output the API request directly in the format [api_name()], and do not invent any nonexistent parameter names.
+If the API requires no parameters, output the API request directly in the format [api_name()], \
+and do not invent any nonexistent parameter names.

-<|tool_calls_section_begin|> and <|tool_calls_section_end|>, with each tool call wrapped by <|tool_call_begin|> and <|tool_call_end|>. \
+<|tool_calls_section_begin|> and <|tool_calls_section_end|>, \
+with each tool call wrapped by <|tool_call_begin|> and <|tool_call_end|>. \

-The tool ID and arguments are separated by <|tool_call_argument_begin|>. The format of the tool ID is functions.func_name:idx, \
+The tool ID and arguments are separated by <|tool_call_argument_begin|>. \
+The format of the tool ID is functions.func_name:idx, \
```
76-76: Fix line length violation. Line 76 exceeds the 120-character limit.

Apply this fix:

```diff
-    func_call_pattern = r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*?)\s*<\|tool_call_end\|>"
+    func_call_pattern = (r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*"
+                         r"<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*?)\s*"
+                         r"<\|tool_call_end\|>")
```
145-146: Fix line length violation. Line 145 exceeds the 120-character limit.

Apply this fix:

```diff
-    system_prompt = SPECIFY_OUTPUT_FORMAT_PROMPT if args.specify_output_format else NOT_SPECIFY_OUTPUT_FORMAT_PROMPT.format(
-        tools=tools)
+    system_prompt = (SPECIFY_OUTPUT_FORMAT_PROMPT if args.specify_output_format
+                     else NOT_SPECIFY_OUTPUT_FORMAT_PROMPT.format(tools=tools))
```
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- examples/models/core/kimi_k2/README.md (1 hunks)
- examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- examples/models/core/kimi_k2/README.md
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
**/*.py: The code developed for TensorRT-LLM should conform to Python 3.8+.
Indent Python code with 4 spaces. Do not use tabs.
Always maintain the namespace when importing in Python, even if only one class or function from a module is used.
Python filenames should use snake_case (e.g., some_file.py).
Python classes should use PascalCase (e.g., class SomeClass).
Python functions and methods should use snake_case (e.g., def my_awesome_function():).
Python local variables should use snake_case. Prefix k for variable names that start with a number (e.g., k_99th_percentile = ...).
Python global variables should use upper snake_case and prefix G (e.g., G_MY_GLOBAL = ...).
Python constants should use upper snake_case (e.g., MY_CONSTANT = ...).
Avoid shadowing variables declared in an outer scope in Python.
Initialize all externally visible members of a class in the constructor in Python.
For interfaces that may be used outside a file, prefer docstrings over comments in Python.
Comments in Python should be reserved for code within a function, or interfaces that are local to a file.
Use Google style docstrings for classes and functions in Python, which can be parsed by Sphinx.
Attributes and variables in Python can be documented inline; attribute docstrings will be rendered under the docstring for the class.
Avoid using reflection in Python when functionality can be easily achieved without it.
When using try-except blocks in Python, limit the except to the smallest set of errors possible.
When using try-except blocks to handle multiple possible variable types in Python, keep the body of the try as small as possible, using the else block to implement the logic.
Files:
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
**/*.{cpp,h,hpp,cc,cxx,cu,py}
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. This includes .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.
Files:
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
🧠 Learnings (4)
📚 Learning: applies to **/*.{cpp,h,hpp,cc,cxx,cu,py} : all tensorrt-llm open source software code should contain...
Learnt from: CR
PR: NVIDIA/TensorRT-LLM#0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-08-06T08:45:40.701Z
Learning: Applies to **/*.{cpp,h,hpp,cc,cxx,cu,py} : All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. This includes .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.
Applied to files:
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
📚 Learning: applies to **/*.py : the code developed for tensorrt-llm should conform to python 3.8+....
Learnt from: CR
PR: NVIDIA/TensorRT-LLM#0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-08-06T08:45:40.701Z
Learning: Applies to **/*.py : The code developed for TensorRT-LLM should conform to Python 3.8+.
Applied to files:
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
📚 Learning: applies to **/*.{h,hpp} : use a preprocessor guard in header files. the guard name must have prefix ...
Learnt from: CR
PR: NVIDIA/TensorRT-LLM#0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-08-06T08:45:40.701Z
Learning: Applies to **/*.{h,hpp} : Use a preprocessor guard in header files. The guard name must have prefix TRTLLM_ followed by the filename, all in caps, and no trailing underscore.
Applied to files:
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
📚 Learning: in tensorrt-llm testing, it's common to have both cli flow tests (test_cli_flow.py) and pytorch api ...
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.
Applied to files:
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
🪛 Ruff (0.12.2)
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
30-30: Line too long (144 > 120)
(E501)
35-35: Line too long (126 > 120)
(E501)
36-36: Line too long (140 > 120)
(E501)
37-37: Line too long (145 > 120)
(E501)
45-45: Line too long (136 > 120)
(E501)
46-46: Line too long (128 > 120)
(E501)
76-76: Line too long (165 > 120)
(E501)
145-145: Line too long (124 > 120)
(E501)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (8)
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py (8)
1-17: LGTM! Copyright header is correctly included. The required NVIDIA copyright header with the current year is properly included at the top of the file, complying with the coding guidelines.

18-23: LGTM! Imports follow Python conventions. The imports maintain proper namespace usage and follow the coding guidelines for TensorRT-LLM.

53-63: LGTM! Tool function and mapping are well-implemented. The weather function provides a good example with simple hardcoded responses, and the tool mapping enables dynamic function calls as intended.

67-91: LGTM! Tool call extraction logic is well-implemented. The function correctly parses the custom delimiter format and handles multiple tool calls appropriately. The regex patterns and parsing logic are sound.

94-117: LGTM! Specified format parsing is robust. The function correctly parses function call syntax using regex and safely evaluates arguments with AST literal evaluation, including proper fallback error handling.

120-139: LGTM! Tool specification follows OpenAI format correctly. The tool specification is properly structured with all required fields for function calling.

142-166: LGTM! Request orchestration is well-implemented. The function correctly handles both output formats, constructs appropriate messages, and parses responses using the right parsing function based on the format.

169-176: LGTM! Main execution logic is well-structured. The argument parsing, client setup, and tool execution flow are properly implemented. The script correctly handles both output formats and provides clear output for debugging.
Also applies to: 178-199
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Actionable comments posted: 0
🧹 Nitpick comments (3)
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py (3)
25-50: Address remaining line length violations in prompt strings. Several lines still exceed the 120-character limit despite previous improvements. Consider further breaking down the longer lines:

```diff
 and value1, value2, etc., with the actual parameter values. The output should start with a square bracket "[" and end with a square bracket "]".
-If there are multiple API requests, separate them with commas, for example: \
+If there are multiple API requests, separate them with commas, for example: \
 [api_name(key1='value1', key2='value2', ...), api_name(key1='value1', key2='value2', ...), ...]. \
-Do not include any other explanations, prompts, or API call results in the output.
-If the API parameter description does not specify otherwise, the parameter is optional \
-(parameters mentioned in the user input need to be included in the output; if not mentioned, they do not need to be included).
-If the API parameter description does not specify the required format for the value, use the user's original text for the parameter value. \
-If the API requires no parameters, output the API request directly in the format [api_name()], and do not invent any nonexistent parameter names.
+Do not include any other explanations, prompts, or API call results in the output.
+If the API parameter description does not specify otherwise, the parameter is optional \
+(parameters mentioned in the user input need to be included in the output; \
+if not mentioned, they do not need to be included).
+If the API parameter description does not specify the required format for the value, \
+use the user's original text for the parameter value. \
+If the API requires no parameters, output the API request directly in the format [api_name()], \
+and do not invent any nonexistent parameter names.
```
76-76: Fix line length violation in regex pattern. The regex pattern exceeds the 120-character limit.

```diff
-    func_call_pattern = r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*?)\s*<\|tool_call_end\|>"
+    func_call_pattern = (
+        r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*"
+        r"<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*?)\s*<\|tool_call_end\|>"
+    )
```
145-146: Fix line length violation in system prompt assignment. The conditional assignment exceeds the line limit.

```diff
-    system_prompt = SPECIFY_OUTPUT_FORMAT_PROMPT if args.specify_output_format else NOT_SPECIFY_OUTPUT_FORMAT_PROMPT.format(
-        tools=tools)
+    system_prompt = (
+        SPECIFY_OUTPUT_FORMAT_PROMPT if args.specify_output_format
+        else NOT_SPECIFY_OUTPUT_FORMAT_PROMPT.format(tools=tools)
+    )
```
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- examples/models/core/kimi_k2/README.md (1 hunks)
- examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- examples/models/core/kimi_k2/README.md
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
**/*.py: The code developed for TensorRT-LLM should conform to Python 3.8+.
Indent Python code with 4 spaces. Do not use tabs.
Always maintain the namespace when importing in Python, even if only one class or function from a module is used.
Python filenames should use snake_case (e.g., some_file.py).
Python classes should use PascalCase (e.g., class SomeClass).
Python functions and methods should use snake_case (e.g., def my_awesome_function():).
Python local variables should use snake_case. Prefix k for variable names that start with a number (e.g., k_99th_percentile = ...).
Python global variables should use upper snake_case and prefix G (e.g., G_MY_GLOBAL = ...).
Python constants should use upper snake_case (e.g., MY_CONSTANT = ...).
Avoid shadowing variables declared in an outer scope in Python.
Initialize all externally visible members of a class in the constructor in Python.
For interfaces that may be used outside a file, prefer docstrings over comments in Python.
Comments in Python should be reserved for code within a function, or interfaces that are local to a file.
Use Google style docstrings for classes and functions in Python, which can be parsed by Sphinx.
Attributes and variables in Python can be documented inline; attribute docstrings will be rendered under the docstring for the class.
Avoid using reflection in Python when functionality can be easily achieved without it.
When using try-except blocks in Python, limit the except to the smallest set of errors possible.
When using try-except blocks to handle multiple possible variable types in Python, keep the body of the try as small as possible, using the else block to implement the logic.
Files:
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
**/*.{cpp,h,hpp,cc,cxx,cu,py}
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. This includes .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.
Files:
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
🧠 Learnings (5)
📚 Learning: applies to **/*.{cpp,h,hpp,cc,cxx,cu,py} : all tensorrt-llm open source software code should contain...
Learnt from: CR
PR: NVIDIA/TensorRT-LLM#0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-08-06T08:45:40.701Z
Learning: Applies to **/*.{cpp,h,hpp,cc,cxx,cu,py} : All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. This includes .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.
Applied to files:
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
📚 Learning: in tensorrt-llm, test files (files under tests/ directories) do not require nvidia copyright headers...
Learnt from: galagam
PR: NVIDIA/TensorRT-LLM#6487
File: tests/unittest/_torch/auto_deploy/unit/singlegpu/test_ad_trtllm_bench.py:1-12
Timestamp: 2025-08-06T13:58:07.495Z
Learning: In TensorRT-LLM, test files (files under tests/ directories) do not require NVIDIA copyright headers, unlike production source code files. Test files typically start directly with imports, docstrings, or code.
Applied to files:
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
📚 Learning: applies to **/*.py : the code developed for tensorrt-llm should conform to python 3.8+....
Learnt from: CR
PR: NVIDIA/TensorRT-LLM#0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-08-06T08:45:40.701Z
Learning: Applies to **/*.py : The code developed for TensorRT-LLM should conform to Python 3.8+.
Applied to files:
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
📚 Learning: applies to **/*.{h,hpp} : use a preprocessor guard in header files. the guard name must have prefix ...
Learnt from: CR
PR: NVIDIA/TensorRT-LLM#0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-08-06T08:45:40.701Z
Learning: Applies to **/*.{h,hpp} : Use a preprocessor guard in header files. The guard name must have prefix TRTLLM_ followed by the filename, all in caps, and no trailing underscore.
Applied to files:
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
📚 Learning: in tensorrt-llm testing, it's common to have both cli flow tests (test_cli_flow.py) and pytorch api ...
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.
Applied to files:
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
🪛 Ruff (0.12.2)
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py
30-30: Line too long (144 > 120)
(E501)
35-35: Line too long (126 > 120)
(E501)
36-36: Line too long (140 > 120)
(E501)
37-37: Line too long (145 > 120)
(E501)
45-45: Line too long (136 > 120)
(E501)
46-46: Line too long (128 > 120)
(E501)
76-76: Line too long (165 > 120)
(E501)
145-145: Line too long (124 > 120)
(E501)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (8)
examples/models/core/kimi_k2/kimi_k2_tool_calling_example.py (8)
1-17: LGTM! Copyright header properly implemented. The NVIDIA copyright header is correctly formatted with the current year and Apache 2.0 license, meeting the coding guidelines requirement.

18-23: LGTM! Import organization follows best practices. The imports are properly organized with standard library imports first, followed by third-party imports, and maintain proper namespace usage as required by the coding guidelines.

53-63: LGTM! Clean example tool implementation. The `get_weather` function and tool mapping follow proper Python conventions with snake_case naming, type hints, and clear logic that serves well as a demonstration tool for the example.

67-91: LGTM! Well-structured parsing for K2 model format. The `extract_tool_call_info` function correctly handles the custom delimiter format specific to the K2 model with proper regex parsing and structured output.

94-117: LGTM! Robust parsing with proper error handling. The `parse_specified_format_tool_calls` function implements solid regex parsing with appropriate error handling for malformed arguments using `ast.literal_eval` with fallback.

120-139: LGTM! Proper tool schema definition. The `get_tools` function correctly defines the tool schema in OpenAI format with appropriate type definitions, required parameters, and clear descriptions that match the actual `get_weather` implementation.

142-166: LGTM! Well-orchestrated tool calling workflow. The function properly handles both output formats, makes appropriate API calls, and provides good debugging visibility. The logic flow is clear and correct.

169-201: LGTM! Clean main execution with proper argument handling. The main block demonstrates proper usage patterns with correct argparse boolean handling (fixed from previous review), appropriate client configuration, and clear result processing for both output formats.
/bot run

PR_Github #14350 [ run ] triggered by Bot

PR_Github #14350 [ run ] completed with state

/bot run

PR_Github #14355 [ run ] triggered by Bot

PR_Github #14355 [ run ] completed with state

/bot run

PR_Github #14375 [ run ] triggered by Bot

PR_Github #14375 [ run ] completed with state

/bot run

PR_Github #14536 [ run ] triggered by Bot

PR_Github #14536 [ run ] completed with state

Overall LGTM. Added a few comments.

Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>

/bot run

PR_Github #14619 [ run ] triggered by Bot

PR_Github #14619 [ run ] completed with state
This reverts commit a2e9153.


This pull request introduces comprehensive documentation and an example script for tool calling with the Kimi-K2 model, focusing on TensorRT-LLM deployments. The changes provide practical guidance and code for parsing and handling tool-call requests, including output format handling and manual parsing when guided decoding is unavailable.
Summary by CodeRabbit
New Features
Documentation