Description
Is your feature request related to a specific problem?
ADK currently has no Redis-backed session service and lacks runtime orchestration primitives for production agent deployments. Specifically:
- No Redis session backend. Existing options are InMemory (lost on restart), SQLite (single-node), Database (heavy), and VertexAI (vendor-locked). Redis is a common choice for distributed, low-latency session state, but it is missing. (Add support for additional Memory Bank services: DatabaseMemoryService/RedisMemoryService #2524)
- No external abort/kill mechanism. There is no way to stop a running agent mid-execution from outside. Users have been requesting this since Feature Request: Add an Endpoint to Explicitly Stop/Terminate a Conversation #1621 and again in "Stop generating" — ability to stop run_async() from outside the agent #4796. In production, when an agent goes off the rails, you need an immediate kill switch, not a graceful timeout.
- No crash recovery. If a process dies mid-task, the task state is lost. There is no mechanism to detect orphaned tasks on restart and recover or fail them cleanly.
- No task lifecycle state tracking. There is no built-in way to track whether a task is running, completed, failed, timed out, or aborted.
Describe the Solution You'd Like
A RedisSessionService implementing BaseSessionService and a RedisOrchestrationPlugin extending BasePlugin that together provide:
- Redis session service: create_session, get_session, list_sessions, delete_session, append_event backed by Redis with configurable TTL
- Task state machine: running → completed / failed / timed_out / aborted, tracked in Redis with Pub/Sub notifications
- External abort: publish to a Redis channel to kill a running agent mid-execution. Not a polite stop — an immediate process-level kill
- Crash recovery: on startup, scan for tasks stuck in "running" state and mark them failed with a recovery message
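The task state machine and crash-recovery bullets above could be sketched roughly as follows. This is a minimal, stdlib-only illustration under stated assumptions: a plain dict stands in for Redis, and `TaskState`, `ALLOWED`, `transition`, and `recover_orphans` are hypothetical names, not part of ADK or of the implementation being proposed.

```python
from enum import Enum


class TaskState(str, Enum):
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    TIMED_OUT = "timed_out"
    ABORTED = "aborted"


# Only a running task may move on; every other state is terminal.
ALLOWED = {
    TaskState.RUNNING: {
        TaskState.COMPLETED,
        TaskState.FAILED,
        TaskState.TIMED_OUT,
        TaskState.ABORTED,
    },
}


def transition(store: dict, task_id: str, new_state: TaskState) -> None:
    """Move a task to new_state, rejecting transitions out of terminal states."""
    current = TaskState(store[task_id]["state"])
    if new_state not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition: {current.value} -> {new_state.value}")
    store[task_id]["state"] = new_state.value


def recover_orphans(store: dict, message: str = "recovered after restart") -> list:
    """On startup, mark every task still 'running' as failed; return their ids."""
    orphans = [
        tid for tid, rec in store.items()
        if rec["state"] == TaskState.RUNNING.value
    ]
    for tid in orphans:
        transition(store, tid, TaskState.FAILED)
        store[tid]["error"] = message
    return orphans


# In-memory stand-in for the Redis keyspace (assumption for illustration).
tasks = {
    "task-1": {"state": "running"},
    "task-2": {"state": "completed"},
}
recovered = recover_orphans(tasks)
```

In a multi-node deployment, the startup scan would probably also need a heartbeat or owner check so that one node does not fail tasks that are still running on another; that is outside the scope of this sketch.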
Plugin hooks mapping:
- before_run_callback → register task as RUNNING
- after_run_callback → mark COMPLETED/FAILED
- before_agent_callback → check abort signal
- on_model_error_callback → handle crash recovery
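One way to read the hook mapping above is sketched below. The class is a stand-in that deliberately does not subclass ADK's BasePlugin, and its hook signatures are simplified to a bare `task_id` (the real callbacks receive ADK context objects). `OrchestrationHooksSketch`, `AbortedError`, `run_once`, and the in-memory `states` / `abort_flags` attributes are all hypothetical; here the model-error hook simply marks the task failed.

```python
import asyncio


class AbortedError(RuntimeError):
    """Raised when an abort flag is found for the current task (hypothetical)."""


class OrchestrationHooksSketch:
    """Stand-in for the proposed plugin; it does not subclass ADK's BasePlugin."""

    def __init__(self):
        self.states = {}          # task_id -> state (stand-in for Redis keys)
        self.abort_flags = set()  # task ids with a pending abort signal

    async def before_run_callback(self, task_id):
        self.states[task_id] = "running"

    async def before_agent_callback(self, task_id):
        # Check for an abort signal before the agent does any work.
        if task_id in self.abort_flags:
            self.states[task_id] = "aborted"
            raise AbortedError(task_id)

    async def on_model_error_callback(self, task_id, error):
        self.states[task_id] = "failed"

    async def after_run_callback(self, task_id):
        # Do not overwrite a terminal state set by an earlier hook.
        if self.states.get(task_id) == "running":
            self.states[task_id] = "completed"


async def run_once(plugin, task_id, fail=False):
    """Drive the hooks in the order a single invocation might trigger them."""
    await plugin.before_run_callback(task_id)
    try:
        await plugin.before_agent_callback(task_id)
        if fail:
            await plugin.on_model_error_callback(task_id, RuntimeError("model error"))
    except AbortedError:
        pass
    await plugin.after_run_callback(task_id)
    return plugin.states[task_id]


plugin = OrchestrationHooksSketch()
plugin.abort_flags.add("task-b")
results = [
    asyncio.run(run_once(plugin, "task-a")),
    asyncio.run(run_once(plugin, "task-b")),
    asyncio.run(run_once(plugin, "task-c", fail=True)),
]
```

The guard in `after_run_callback` is the main design point: because that hook runs last, it should only promote a task that is still running, otherwise an aborted or failed task would be reported as completed.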
Impact on your work
I run multi-agent systems that manage infrastructure via Telegram, Discord, Slack, and web interfaces. Without these primitives, every production ADK deployment needs to build them from scratch. I've already built and battle-tested this with different frameworks, with 100 concurrent agents across 10,000 rounds. See my GitHub.
The agent safety model (dual-gate firewall with deterministic denylist + independent LLM judge) is documented in a separate IETF Internet-Draft: https://datatracker.ietf.org/doc/draft-baysal-asimov-safety-architecture/
Willingness to contribute
Yes. I have a working implementation ready to port to adk-python. This addresses #2524, #4796, and #1621.
Describe Alternatives You've Considered
- SQLite/Database session service — exists but doesn't provide the low-latency Pub/Sub needed for real-time abort signals
- InMemory session service — lost on restart, not suitable for production
- Building it outside ADK — works but fragments the ecosystem. This belongs in the framework.
Proposed API / Implementation
Session service
from google.adk.sessions import RedisSessionService

session_service = RedisSessionService(
    redis_url="redis://localhost:6379",
    key_prefix="adk:",
    session_ttl=3600,
)
Orchestration plugin
from google.adk.plugins import RedisOrchestrationPlugin

plugin = RedisOrchestrationPlugin(
    redis_url="redis://localhost:6379",
    enable_state_tracking=True,
    enable_abort=True,
    enable_crash_recovery=True,
)
App with both
from google.adk.apps import App

app = App(
    name="my_app",
    agent=my_agent,
    plugins=[plugin],
)
Runner
from google.adk.runners import Runner

runner = Runner(
    app=app,
    session_service=session_service,
)
External abort from anywhere
import redis
r = redis.from_url("redis://localhost:6379")
r.publish("adk:abort:task-123", '{"action": "abort"}')
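The publish call above only shows the sending side. A rough sketch of what a receiver might do is given below, under stated assumptions: an `asyncio.Queue` stands in for a Redis Pub/Sub subscription, and `parse_abort`, `abort_listener`, and `demo` are hypothetical names. It also uses cooperative `asyncio` task cancellation, which is gentler than the process-level kill the proposal describes.

```python
import asyncio
import json

ABORT_PREFIX = "adk:abort:"  # matches the channel used in the publish example


def parse_abort(channel: str, payload: str):
    """Return the task id if this message is an abort request, else None."""
    if not channel.startswith(ABORT_PREFIX):
        return None
    try:
        body = json.loads(payload)
    except json.JSONDecodeError:
        return None
    if not isinstance(body, dict) or body.get("action") != "abort":
        return None
    return channel[len(ABORT_PREFIX):]


async def abort_listener(messages: asyncio.Queue, running: dict) -> None:
    """Cancel the matching task for each abort message; stop on a None sentinel."""
    while True:
        item = await messages.get()
        if item is None:
            return
        task_id = parse_abort(*item)
        task = running.get(task_id)
        if task is not None:
            task.cancel()


async def demo() -> str:
    async def long_running_agent():
        await asyncio.sleep(60)  # stands in for an agent run
        return "finished"

    running = {"task-123": asyncio.create_task(long_running_agent())}
    messages = asyncio.Queue()  # stand-in for a Pub/Sub subscription
    listener = asyncio.create_task(abort_listener(messages, running))

    await messages.put(("adk:abort:task-123", '{"action": "abort"}'))
    await messages.put(None)
    await listener

    try:
        await running["task-123"]
        return "finished"
    except asyncio.CancelledError:
        return "aborted"


outcome = asyncio.run(demo())
```

Validating both the channel prefix and the JSON payload before acting means a malformed or unrelated message is ignored rather than cancelling a task by accident.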
Additional Context
- Addresses three open issues: Add support for additional Memory Bank services: DatabaseMemoryService/RedisMemoryService #2524 (Redis memory/session), "Stop generating" — ability to stop run_async() from outside the agent #4796 (stop run_async externally), and Feature Request: Add an Endpoint to Explicitly Stop/Terminate a Conversation #1621 (terminate conversation)
- Battle-tested at scale: 100 concurrent agents, 10,000 rounds, with HMAC-signed results
- Zero new dependencies beyond redis-py (optional install)