@CatherineSue CatherineSue commented Nov 5, 2025

Motivation

The Responses API (/v1/responses) did not support the tool_choice parameter, which OpenAI's Responses API specification defines for controlling tool-calling behavior. This limitation prevented users from:

  • Forcing the model to call specific tools (tool_choice: {type: "function", function: {name: "..."}})
  • Requiring at least one tool call (tool_choice: "required")
  • Restricting tool calls to a subset of available tools (tool_choice: {type: "allowed_tools", ...})
  • Preventing tool calls entirely (tool_choice: "none")

This PR implements full tool_choice support for the Responses API in both the Harmony and Regular routers, following the same pattern as the existing Chat Completion API implementation.

Modifications

1. Harmony Router - Responses API tool_choice Support

File: src/routers/grpc/harmony/stages/preparation.rs

  • Added tool_choice constraint generation for Responses API requests
  • Extracts both Function and MCP tools from ResponseTools (schemas populated before pipeline)
  • Implements AllowedTools filtering to restrict tools based on user's tool_choice
  • Generates Harmony structural tags using triggered_tags format for tool constraints
  • Supports all tool_choice modes: none, auto, required, specific function, allowed_tools
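
The mode handling and AllowedTools filtering described above can be sketched roughly as follows. This is a minimal illustration, not the code in preparation.rs: the `ToolChoice` and `Tool` types and the `constrain_tools` function are hypothetical stand-ins, and the sketch does not attempt to generate Harmony structural tags.

```rust
/// Hypothetical, simplified stand-in for the router's tool_choice type.
#[derive(Debug, Clone, PartialEq)]
pub enum ToolChoice {
    None,
    Auto,
    Required,
    /// Force one specific function by name.
    Function { name: String },
    /// Restrict calls to a named subset; `required` says whether a call is mandatory.
    AllowedTools { required: bool, names: Vec<String> },
}

/// Hypothetical tool description: a name plus its JSON schema as a string.
#[derive(Debug, Clone, PartialEq)]
pub struct Tool {
    pub name: String,
    pub parameters_schema: String,
}

/// Returns the tools the model may call under `choice`, and whether
/// at least one tool call must be produced.
pub fn constrain_tools(tools: &[Tool], choice: &ToolChoice) -> (Vec<Tool>, bool) {
    match choice {
        ToolChoice::None => (Vec::new(), false),
        ToolChoice::Auto => (tools.to_vec(), false),
        ToolChoice::Required => (tools.to_vec(), true),
        ToolChoice::Function { name } => (
            tools.iter().filter(|t| &t.name == name).cloned().collect(),
            true,
        ),
        ToolChoice::AllowedTools { required, names } => (
            tools
                .iter()
                .filter(|t| names.contains(&t.name))
                .cloned()
                .collect(),
            *required,
        ),
    }
}
```

In this sketch the filtered tool list and the "must call" flag are what a constraint generator would consume; how the real stage turns them into triggered_tags is outside the scope of the example.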

2. Regular Router - Responses API tool_choice Support

Files:

  • src/routers/grpc/regular/responses/conversions.rs
  • src/routers/grpc/regular/responses/tool_loop.rs

Flow:

  • conversions.rs: Extracts function tools and passes through tool_choice unchanged
  • Without MCP: Function tools → chat pipeline → tool_choice constraints applied
  • With MCP: Tool loop merges function + MCP tools → chat pipeline → constraints applied to ALL tools
  • Loop iteration logic: iteration 0 uses the user's tool_choice; iterations 1+ use "auto" to prevent infinite loops (a forced choice such as "required" would otherwise demand another tool call on every pass)

3. ToolReference Protocol Enhancement

File: src/protocols/common.rs

Converted ToolReference from a simple struct to a tagged enum to properly support different tool types:

// Before (struct)
pub struct ToolReference {
    pub tool_type: String,
    pub name: String,
}

// After (enum)
pub enum ToolReference {
    Function { name: String },
    Mcp { server_label: String, name: Option<String> },
    FileSearch,
    WebSearchPreview,
    ComputerUsePreview,
    CodeInterpreter,
    ImageGeneration,
}

Benefits:

  • Type-safe representation of different tool types
  • Supports MCP and hosted tools (not just functions)
  • Helper methods: identifier(), function_name()
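
As an illustration of the helper methods, here is one way identifier() and function_name() could look on the enum above. The return types and the identifier string format are assumptions made for this sketch; the PR description does not specify them.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum ToolReference {
    Function { name: String },
    Mcp { server_label: String, name: Option<String> },
    FileSearch,
    WebSearchPreview,
    ComputerUsePreview,
    CodeInterpreter,
    ImageGeneration,
}

impl ToolReference {
    /// Assumed format: "function:<name>", "mcp:<server_label>[:<name>]",
    /// or the hosted tool's type name.
    pub fn identifier(&self) -> String {
        match self {
            ToolReference::Function { name } => format!("function:{}", name),
            ToolReference::Mcp { server_label, name: Some(name) } => {
                format!("mcp:{}:{}", server_label, name)
            }
            ToolReference::Mcp { server_label, name: None } => {
                format!("mcp:{}", server_label)
            }
            ToolReference::FileSearch => "file_search".to_string(),
            ToolReference::WebSearchPreview => "web_search_preview".to_string(),
            ToolReference::ComputerUsePreview => "computer_use_preview".to_string(),
            ToolReference::CodeInterpreter => "code_interpreter".to_string(),
            ToolReference::ImageGeneration => "image_generation".to_string(),
        }
    }

    /// Returns the function name only for the Function variant.
    pub fn function_name(&self) -> Option<&str> {
        match self {
            ToolReference::Function { name } => Some(name.as_str()),
            _ => None,
        }
    }
}
```

Because every variant is matched explicitly in identifier(), adding a new tool type to the enum becomes a compile error until the helper is updated, which is the type-safety benefit listed above.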

4. Chat API Validation

File: src/protocols/chat.rs

Added validation so that the Chat Completion API accepts only Function-type ToolReference entries in tool_choice. MCP and hosted tool references are rejected with clear error messages, enforcing the API contract.
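
A validation of this kind might look like the sketch below. The enum is abridged to three variants, and `validate_chat_tool_references`, `kind`, and the error wording are hypothetical; the actual check in chat.rs may be structured differently.

```rust
/// Abridged copy of the enum, enough to show the check.
#[derive(Debug, Clone, PartialEq)]
pub enum ToolReference {
    Function { name: String },
    Mcp { server_label: String, name: Option<String> },
    FileSearch,
}

impl ToolReference {
    /// Short label used in error messages (hypothetical helper).
    fn kind(&self) -> &'static str {
        match self {
            ToolReference::Function { .. } => "function",
            ToolReference::Mcp { .. } => "mcp",
            ToolReference::FileSearch => "file_search",
        }
    }
}

/// Rejects any reference that is not a function tool.
pub fn validate_chat_tool_references(refs: &[ToolReference]) -> Result<(), String> {
    for r in refs {
        if !matches!(r, ToolReference::Function { .. }) {
            return Err(format!(
                "Chat Completion API only supports function tools in tool_choice, got '{}'",
                r.kind()
            ));
        }
    }
    Ok(())
}
```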

5. Code Deduplication

File: src/routers/grpc/common/responses/utils.rs

Created shared utility function extract_tools_from_response_tools() to eliminate duplication:

  • Used by both Harmony preparation stage and Regular conversions
  • include_mcp parameter controls whether to extract MCP tools
  • Comprehensive documentation explains usage differences between routers
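
To make the role of include_mcp concrete, here is a rough sketch of a shared extraction function. The `ResponseTool` and `Tool` types, the function name `extract_tools_sketch`, and its signature are all assumptions for illustration; only the idea of an include_mcp switch comes from the description above.

```rust
/// Hypothetical, simplified view of a Responses API tool entry.
#[derive(Debug, Clone, PartialEq)]
pub enum ResponseTool {
    Function { name: String, parameters: String },
    /// MCP tool whose schema was populated before the pipeline runs.
    Mcp { server_label: String, name: String, parameters: String },
    WebSearchPreview,
}

/// Hypothetical chat-side tool: a name plus its JSON schema as a string.
#[derive(Debug, Clone, PartialEq)]
pub struct Tool {
    pub name: String,
    pub parameters: String,
}

/// Extracts callable tools. Function tools are always kept; MCP tools are
/// kept only when `include_mcp` is true; hosted tools are skipped here.
pub fn extract_tools_sketch(tools: &[ResponseTool], include_mcp: bool) -> Vec<Tool> {
    tools
        .iter()
        .filter_map(|t| match t {
            ResponseTool::Function { name, parameters } => Some(Tool {
                name: name.clone(),
                parameters: parameters.clone(),
            }),
            ResponseTool::Mcp { name, parameters, .. } if include_mcp => Some(Tool {
                name: name.clone(),
                parameters: parameters.clone(),
            }),
            _ => None,
        })
        .collect()
}
```

Under this reading, the Harmony preparation stage would pass `true` (it extracts both Function and MCP tools), while the Regular conversions would pass `false` and leave MCP tools to be merged later by the tool loop.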

6. Helper Function for Chat Requests

File: src/routers/grpc/regular/responses/tool_loop.rs

Created prepare_chat_tools_and_choice() helper to:

  • Merge function tools from request with MCP tools
  • Set tool_choice based on iteration (user's choice on iteration 0, "auto" on iteration 1+)
  • Reduce duplication in the tool loop
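
The behavior of this helper can be sketched as follows. The types, the name `prepare_tools_and_choice_sketch`, and the signature are hypothetical; in particular, the sketch simply concatenates the two tool lists and defaults to "auto" when the user gave no tool_choice, which may differ from the real helper.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Tool {
    pub name: String,
}

/// Hypothetical, simplified tool_choice type.
#[derive(Debug, Clone, PartialEq)]
pub enum ToolChoice {
    None,
    Auto,
    Required,
    Function { name: String },
}

/// Merges request function tools with MCP tools and picks the tool_choice
/// for this loop iteration.
pub fn prepare_tools_and_choice_sketch(
    function_tools: &[Tool],
    mcp_tools: &[Tool],
    user_choice: Option<&ToolChoice>,
    iteration: usize,
) -> (Vec<Tool>, ToolChoice) {
    let mut tools = Vec::with_capacity(function_tools.len() + mcp_tools.len());
    tools.extend_from_slice(function_tools);
    tools.extend_from_slice(mcp_tools);

    // Only the first pass honors the user's choice; later passes fall back
    // to Auto so the model is free to stop calling tools and answer.
    let choice = if iteration == 0 {
        user_choice.cloned().unwrap_or(ToolChoice::Auto)
    } else {
        ToolChoice::Auto
    };
    (tools, choice)
}
```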

7. Bug Fixes

  • Fixed unnecessary pub use: Changed pub use crate::tokenizer::StopSequenceDecoder to regular use in utils.rs
  • Updated test cases: Fixed 6 test functions in tests/spec/chat_completion.rs to use new ToolReference enum syntax

8. Documentation

Added comprehensive comments explaining:

  • Tool extraction flow in conversions.rs (why the MCP path "wastes" the initial extraction)
  • Differences between Harmony and Regular router tool handling
  • Tool loop iteration behavior with tool_choice

Accuracy Tests

[Image: Screenshot 2025-11-04 at 7 54 07 PM]

@CatherineSue CatherineSue added the enhancement New feature or request label Nov 5, 2025
@slin1237 slin1237 merged commit 9f5e701 into main Nov 5, 2025
99 of 127 checks passed
@slin1237 slin1237 deleted the chang/responses-fix branch November 5, 2025 05:43