[router][grpc] Implement tool_choice support for Responses API #12668

CatherineSue · 2025-11-05T04:08:48Z

Motivation

The Responses API (/v1/responses) did not support the tool_choice parameter, which is a critical feature for controlling tool calling behavior according to OpenAI's Responses API specification. This limitation prevented users from:

Forcing the model to call specific tools (tool_choice: {type: "function", function: {name: "..."}})
Requiring at least one tool call (tool_choice: "required")
Restricting tool calls to a subset of available tools (tool_choice: {type: "allowed_tools", ...})
Preventing tool calls entirely (tool_choice: "none")

This PR implements full tool_choice support for the Responses API in both the Harmony and Regular routers, following the same pattern as the existing Chat Completion API implementation.

Modifications

1. Harmony Router - Responses API tool_choice Support

File: src/routers/grpc/harmony/stages/preparation.rs

Added tool_choice constraint generation for Responses API requests
Extracts both Function and MCP tools from ResponseTools (schemas populated before pipeline)
Implements AllowedTools filtering to restrict tools based on user's tool_choice
Generates Harmony structural tags using triggered_tags format for tool constraints
Supports all tool_choice modes: none, auto, required, specific function, allowed_tools
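
To illustrate how the tool_choice modes listed above could narrow the set of callable tools before a constraint is generated, here is a minimal sketch. The `ToolChoice` and `Tool` types and both function names are hypothetical simplifications made for this example and are not the router's actual definitions. In particular, `AllowedTools` here carries only tool names.

```rust
// Hypothetical, simplified types for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum ToolChoice {
    None,
    Auto,
    Required,
    Function { name: String },
    AllowedTools { names: Vec<String> },
}

#[derive(Debug, Clone, PartialEq)]
pub struct Tool {
    pub name: String,
}

/// Returns the subset of `tools` the model may call under `choice`.
pub fn filter_tools_sketch(tools: &[Tool], choice: &ToolChoice) -> Vec<Tool> {
    match choice {
        // "none": no tool may be called.
        ToolChoice::None => Vec::new(),
        // "auto" and "required" leave the full tool set available.
        ToolChoice::Auto | ToolChoice::Required => tools.to_vec(),
        // A specific function: keep only the tool with that name.
        ToolChoice::Function { name } => tools
            .iter()
            .filter(|t| &t.name == name)
            .cloned()
            .collect(),
        // allowed_tools: keep only the listed tools.
        ToolChoice::AllowedTools { names } => tools
            .iter()
            .filter(|t| names.contains(&t.name))
            .cloned()
            .collect(),
    }
}

/// True when the mode forces at least one tool call (assumed semantics).
pub fn must_call_tool_sketch(choice: &ToolChoice) -> bool {
    matches!(choice, ToolChoice::Required | ToolChoice::Function { .. })
}
```

In this sketch, filtering happens first and the constraint (for example, a structural tag) would be built only from the tools that survive it.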

2. Regular Router - Responses API tool_choice Support

Files:

src/routers/grpc/regular/responses/conversions.rs
src/routers/grpc/regular/responses/tool_loop.rs

Flow:

conversions.rs: Extracts function tools and passes through tool_choice unchanged
Without MCP: Function tools → chat pipeline → tool_choice constraints applied
With MCP: Tool loop merges function + MCP tools → chat pipeline → constraints applied to ALL tools
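Loop iteration logic: Iteration 0 uses the user's tool_choice; iterations 1+ use "auto", so that a forcing choice (such as "required" or a specific function) is not re-applied on every iteration, which would cause an infinite loop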

3. ToolReference Protocol Enhancement

File: src/protocols/common.rs

Converted ToolReference from a simple struct to a tagged enum to properly support different tool types:

// Before (struct)
pub struct ToolReference {
    pub tool_type: String,
    pub name: String,
}

// After (enum)
pub enum ToolReference {
    Function { name: String },
    Mcp { server_label: String, name: Option<String> },
    FileSearch,
    WebSearchPreview,
    ComputerUsePreview,
    CodeInterpreter,
    ImageGeneration,
}

Benefits:

Type-safe representation of different tool types
Supports MCP and hosted tools (not just functions)
Helper methods: identifier(), function_name()
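
The helper methods are only named above, so the following is a sketch of what they might look like on the enum. The variants are copied from this PR description, but the method bodies, return types, and the identifier string format are assumptions made for illustration.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum ToolReference {
    Function { name: String },
    Mcp { server_label: String, name: Option<String> },
    FileSearch,
    WebSearchPreview,
    ComputerUsePreview,
    CodeInterpreter,
    ImageGeneration,
}

impl ToolReference {
    /// Assumed behaviour: a string that identifies the referenced tool.
    /// The "kind:name" format is made up for this sketch.
    pub fn identifier(&self) -> String {
        match self {
            ToolReference::Function { name } => format!("function:{name}"),
            ToolReference::Mcp { server_label, name: Some(n) } => {
                format!("mcp:{server_label}:{n}")
            }
            ToolReference::Mcp { server_label, name: None } => {
                format!("mcp:{server_label}")
            }
            ToolReference::FileSearch => "file_search".to_string(),
            ToolReference::WebSearchPreview => "web_search_preview".to_string(),
            ToolReference::ComputerUsePreview => "computer_use_preview".to_string(),
            ToolReference::CodeInterpreter => "code_interpreter".to_string(),
            ToolReference::ImageGeneration => "image_generation".to_string(),
        }
    }

    /// Assumed behaviour: the function name for `Function`, `None` otherwise.
    pub fn function_name(&self) -> Option<&str> {
        match self {
            ToolReference::Function { name } => Some(name.as_str()),
            _ => None,
        }
    }
}
```

Because the enum is exhaustive, adding a new tool type forces every `match` over `ToolReference` to handle it, which the old `tool_type: String` field could not guarantee.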

4. Chat API Validation

File: src/protocols/chat.rs

Added validation so that the Chat Completion API accepts only Function-type ToolReference values in tool_choice. MCP and hosted tools are rejected with clear error messages, enforcing the API contract.
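
A minimal sketch of such a check is shown below. The enum is abbreviated to three variants for brevity, and the function name, error type, and message text are assumptions for this example rather than the code in src/protocols/chat.rs.

```rust
// Abbreviated copy of the enum; only three variants are shown.
#[derive(Debug, Clone, PartialEq)]
pub enum ToolReference {
    Function { name: String },
    Mcp { server_label: String, name: Option<String> },
    FileSearch,
}

/// Hypothetical validator: accepts a list of tool references only if
/// every entry is a `Function`; otherwise returns a descriptive error.
pub fn validate_chat_tool_refs_sketch(refs: &[ToolReference]) -> Result<(), String> {
    for r in refs {
        match r {
            ToolReference::Function { .. } => {}
            other => {
                return Err(format!(
                    "Chat Completion API only supports function tools in tool_choice, got {other:?}"
                ));
            }
        }
    }
    Ok(())
}
```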

5. Code Deduplication

File: src/routers/grpc/common/responses/utils.rs

Created shared utility function extract_tools_from_response_tools() to eliminate duplication:

Used by both Harmony preparation stage and Regular conversions
include_mcp parameter controls whether to extract MCP tools
Comprehensive documentation explains usage differences between routers
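
The role of the include_mcp flag can be sketched as follows. The `ResponseTool` and `ExtractedTool` types, the function name, and the signature are hypothetical stand-ins, not the real signature of extract_tools_from_response_tools().

```rust
// Hypothetical, simplified tool types for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum ResponseTool {
    Function { name: String },
    Mcp { name: String },
    Hosted, // stands in for any other hosted tool type
}

#[derive(Debug, Clone, PartialEq)]
pub struct ExtractedTool {
    pub name: String,
}

/// Extracts function tools always, and MCP tools only when `include_mcp` is set.
/// Hosted tools are skipped in this sketch.
pub fn extract_tools_sketch(tools: &[ResponseTool], include_mcp: bool) -> Vec<ExtractedTool> {
    tools
        .iter()
        .filter_map(|t| match t {
            ResponseTool::Function { name } => Some(ExtractedTool { name: name.clone() }),
            ResponseTool::Mcp { name } if include_mcp => {
                Some(ExtractedTool { name: name.clone() })
            }
            _ => None,
        })
        .collect()
}
```

Going by the descriptions in sections 1 and 2, the Harmony preparation stage would presumably call this with include_mcp set to true, and the Regular conversions with false, since there the tool loop merges MCP tools later.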

6. Helper Function for Chat Requests

File: src/routers/grpc/regular/responses/tool_loop.rs

Created prepare_chat_tools_and_choice() helper to:

Merge function tools from request with MCP tools
Set tool_choice based on iteration (user's choice on iteration 0, "auto" on iteration 1+)
Reduces duplication in tool loop
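
The behaviour described for this helper can be sketched as below. The types, the function name, and the signature are assumptions for illustration; only the merge step and the iteration rule come from the description above.

```rust
// Hypothetical, simplified types for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum ToolChoice {
    None,
    Auto,
    Required,
    Function { name: String },
}

#[derive(Debug, Clone, PartialEq)]
pub struct Tool {
    pub name: String,
}

/// Merges request function tools with MCP tools and picks the tool_choice
/// for the given loop iteration: the user's choice on iteration 0,
/// `Auto` on every later iteration.
pub fn prepare_tools_and_choice_sketch(
    function_tools: &[Tool],
    mcp_tools: &[Tool],
    user_choice: &ToolChoice,
    iteration: usize,
) -> (Vec<Tool>, ToolChoice) {
    let mut merged = function_tools.to_vec();
    merged.extend_from_slice(mcp_tools);

    let choice = if iteration == 0 {
        user_choice.clone()
    } else {
        // Falling back to Auto lets the model stop calling tools.
        ToolChoice::Auto
    };

    (merged, choice)
}
```

If a forcing choice such as `Required` were kept on later iterations, the model would have to emit a tool call every time and the loop could never end with a final answer.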

7. Bug Fixes

Fixed unnecessary pub use: Changed pub use crate::tokenizer::StopSequenceDecoder to regular use in utils.rs
Updated test cases: Fixed 6 test functions in tests/spec/chat_completion.rs to use new ToolReference enum syntax

8. Documentation

Added comprehensive comments explaining:

Tool extraction flow in conversions.rs (why MCP path "wastes" initial extraction)
Differences between Harmony and Regular router tool handling
Tool loop iteration behavior with tool_choice

Accuracy Tests

Benchmarking and Profiling

Checklist

Format your code according to the "Format code with pre-commit" guide.
Add unit tests according to the "Run and add unit tests" guide.
Update documentation according to the "Write documentations" guide.
Provide accuracy and speed benchmark results according to the "Test the accuracy" and "Benchmark the speed" guides.

- Further simplify filtering by adding a `filter_tools_by_tool_choice`

CatherineSue added 3 commits November 4, 2025 18:55

Support ToolChoice for responses API

9ef3771

Pass tool call constraint to responses in sglang_scheduler client

c3fdbe8

Add validation for tool choice in Responses

bb7307f

- Further simplify filtering by adding a `filter_tools_by_tool_choice`

CatherineSue added the high priority label Nov 5, 2025

CatherineSue requested a review from key4ng as a code owner November 5, 2025 04:08

CatherineSue added the router label Nov 5, 2025

CatherineSue requested review from ByronHsu and slin1237 as code owners November 5, 2025 04:08

sglang-bot added the run-ci label Nov 5, 2025

CatherineSue added the enhancement New feature or request label Nov 5, 2025

slin1237 approved these changes Nov 5, 2025

slin1237 merged commit 9f5e701 into main Nov 5, 2025
99 of 127 checks passed

slin1237 deleted the chang/responses-fix branch November 5, 2025 05:43
