fix: balance callback lifecycle for hallucinated tool calls by giulio-leone · Pull Request #4998 · google/adk-python

giulio-leone · 2026-03-25T19:22:02Z

Summary

When an LLM hallucinates a tool name that doesn't exist in tools_dict, _get_tool() raises ValueError. Previously, on_tool_error_callback fired immediately — before before_tool_callback and outside the OpenTelemetry tracer span. This caused plugins that push/pop spans (e.g. BigQueryAgentAnalyticsPlugin's TraceManager) to pop the parent agent span, corrupting the trace stack for all subsequent tool calls in the session.

Root Cause

The callback lifecycle contract is:

before_tool_callback → (tool execution OR on_tool_error_callback) → after_tool_callback

For hallucinated tools, the old code path was:

on_tool_error_callback → (return or raise)  # before_tool_callback never called!

This violated the lifecycle invariant: a plugin that pushes a span in before_tool_callback never gets to push, yet on_tool_error_callback still pops, so the pop removes the parent agent span instead of a tool span and corrupts the stack.
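
To make the imbalance concrete, here is a minimal sketch of the old ordering. `SpanStackPlugin` and `old_flow` are hypothetical stand-ins written for illustration only. They are not the actual BigQueryAgentAnalyticsPlugin, TraceManager, or ADK code, and the span stack is modelled as a plain list.

```python
class SpanStackPlugin:
    """Hypothetical plugin: pushes a span before a tool call, pops it afterwards."""

    def __init__(self):
        # The parent agent span is assumed to be on the stack already.
        self.stack = ["agent_span"]

    def before_tool_callback(self, tool_name):
        self.stack.append(f"tool:{tool_name}")

    def after_tool_callback(self, tool_name):
        self.stack.pop()

    def on_tool_error_callback(self, tool_name, error):
        # Assumes a matching push happened in before_tool_callback.
        self.stack.pop()


def old_flow(plugin, tools_dict, tool_name):
    """Illustrative old ordering: the lookup fails before before_tool_callback runs."""
    tool = tools_dict.get(tool_name)
    if tool is None:
        # Error callback fires with no matching push.
        plugin.on_tool_error_callback(tool_name, ValueError(f"unknown tool: {tool_name}"))
        return None
    plugin.before_tool_callback(tool_name)
    result = tool()
    plugin.after_tool_callback(tool_name)
    return result


plugin = SpanStackPlugin()
old_flow(plugin, {"search": lambda: "ok"}, "serach")  # hallucinated name
print(plugin.stack)  # [] : the parent agent span was popped
```

Under this model, every later tool call in the session runs without its parent span on the stack.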

Fix

Move the ValueError handling inside _run_with_trace() so that:

1. before_tool_callback always fires first (balanced push)
2. The error is surfaced within the OTel span context
3. on_tool_error_callback fires after before_tool_callback

Applied to both handle_function_calls_async and handle_function_calls_live code paths.
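
The intended ordering can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: `get_tool`, `run_with_trace`, the `events` list, and the `on_tool_error` parameter are hypothetical names, and the tracer span is simulated by recording `span_open` / `span_close` markers rather than using the OpenTelemetry API.

```python
import asyncio


def get_tool(name, tools_dict):
    """Stand-in for the lookup step: raises ValueError for unknown names."""
    if name not in tools_dict:
        raise ValueError(f"Tool '{name}' not found")
    return tools_dict[name]


async def run_with_trace(name, tools_dict, events, on_tool_error=None):
    """Record the order of lifecycle events for one tool call."""
    events.append("span_open")  # stands in for entering the tracer span
    try:
        # The before callback runs first, so any push is balanced.
        events.append("before_tool_callback")
        try:
            tool = get_tool(name, tools_dict)
        except ValueError as err:
            # The lookup error is handled inside the span, after the before callback.
            events.append("on_tool_error_callback")
            if on_tool_error is None:
                raise  # no error callback: the ValueError propagates
            return on_tool_error(err)
        return await tool()
    finally:
        events.append("span_close")


events = []
result = asyncio.run(
    run_with_trace("serach", {}, events, on_tool_error=lambda e: {"error": str(e)})
)
print(events)
# ['span_open', 'before_tool_callback', 'on_tool_error_callback', 'span_close']
```

When no error callback is supplied, the same sketch re-raises the ValueError while still closing the simulated span.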

Testing

2 new tests in test_plugin_tool_callbacks.py:
- test_hallucinated_tool_fires_before_and_error_callbacks: verifies callback order (before → error)
- test_hallucinated_tool_raises_when_no_error_callback: verifies ValueError propagates correctly

Results:
- All 12 callback tests pass
- Full unit test suite: 4727 passed, 0 regressions

⚠️ This reopens #4808 which was accidentally closed due to fork deletion.

When an LLM hallucinates a tool name, _get_tool() raises ValueError. Previously, on_tool_error_callback fired immediately, before before_tool_callback and outside the OTel tracer span. This caused plugins that push/pop spans (e.g. BigQueryAgentAnalyticsPlugin's TraceManager) to pop the parent agent span, corrupting the trace stack for all subsequent tool calls.

Move the ValueError handling inside _run_with_trace() so that:

1. before_tool_callback always fires first (balanced push)
2. The error is surfaced within the OTel span context
3. on_tool_error_callback fires after before_tool_callback

Applied to both handle_function_calls_async and handle_function_calls_live code paths.

Fixes google#4775

giulio-leone added 2 commits March 21, 2026 02:56

Balance hallucinated tool callback lifecycle

b868aee

adk-bot added the core [Component] This issue is related to the core interface and implementation label Mar 25, 2026

rohityan self-assigned this Mar 26, 2026

giulio-leone wants to merge 2 commits into google:main from giulio-leone:fix/hallucinated-tool-callback-lifecycle
