Redis-backed session service and orchestration plugin (state tracking, abort, crash recovery)

Is your feature request related to a specific problem?

ADK currently has no Redis-backed session service and lacks runtime orchestration primitives for production agent deployments. Specifically:

1. No Redis session backend: existing options are InMemory (lost on restart), SQLite (single-node), Database (heavy), and VertexAI (vendor-locked). Redis is the standard for distributed, low-latency session state but is missing. (#2524)
2. No external abort/kill mechanism — there is no way to stop a running agent mid-execution from outside. Users have been requesting this since #1621 and again in #4796. In production, when an agent goes off-rails, you need an immediate kill switch, not a graceful timeout.
3. No crash recovery: if a process dies mid-task, the task state is lost. There's no mechanism to detect orphaned tasks on restart and recover or fail them cleanly.
4. No task lifecycle state tracking: there is no built-in way to track whether a task is running, completed, failed, timed out, or aborted.

Describe the Solution You'd Like

A RedisSessionService implementing BaseSessionService and a RedisOrchestrationPlugin extending BasePlugin that together provide:

- Redis session service: create_session, get_session, list_sessions, delete_session, append_event backed by Redis with configurable TTL
- Task state machine: running → completed / failed / timed_out / aborted, tracked in Redis with Pub/Sub notifications
- External abort: publish to a Redis channel to kill a running agent mid-execution. Not a polite stop — an immediate process-level kill
- Crash recovery: on startup, scan for tasks stuck in "running" state and mark them failed with a recovery message
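
The state machine and crash-recovery scan described above could be sketched roughly as follows. This is a minimal illustration, not the proposed implementation: `TaskState`, `TaskStore`, and `recover_orphans` are hypothetical names, and a plain dict stands in for Redis so the sketch runs without a server.

```python
from enum import Enum


class TaskState(str, Enum):
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    TIMED_OUT = "timed_out"
    ABORTED = "aborted"


# Only "running" has outgoing transitions; every other state is terminal.
ALLOWED = {
    TaskState.RUNNING: {
        TaskState.COMPLETED,
        TaskState.FAILED,
        TaskState.TIMED_OUT,
        TaskState.ABORTED,
    },
}


class TaskStore:
    """In-memory stand-in (assumption) for the Redis keys holding task state."""

    def __init__(self):
        self._tasks = {}  # task_id -> {"state": TaskState, "message": str}

    def start(self, task_id):
        self._tasks[task_id] = {"state": TaskState.RUNNING, "message": ""}

    def transition(self, task_id, new_state, message=""):
        current = self._tasks[task_id]["state"]
        if new_state not in ALLOWED.get(current, set()):
            raise ValueError(
                f"illegal transition {current.value} -> {new_state.value}"
            )
        self._tasks[task_id] = {"state": new_state, "message": message}

    def state(self, task_id):
        return self._tasks[task_id]["state"]

    def recover_orphans(self):
        """Mark every task still in 'running' as failed; return their ids."""
        orphans = [
            task_id
            for task_id, record in self._tasks.items()
            if record["state"] is TaskState.RUNNING
        ]
        for task_id in orphans:
            self.transition(task_id, TaskState.FAILED, "recovered after crash")
        return orphans
```

One design point the sketch leaves open: with several worker processes sharing one Redis, a startup scan would also need some notion of task ownership (for example a per-process id or heartbeat) so that it does not fail tasks still running on a healthy process.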

Plugin hooks mapping:
- before_run_callback → register task as RUNNING
- after_run_callback → mark COMPLETED/FAILED
- before_agent_callback → check abort signal
- on_model_error_callback → handle crash recovery
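
To make the hook mapping concrete, here is a rough sketch of how the four callbacks might drive task state. It is a stand-in only: the class does not inherit from BasePlugin, the `task_id` parameter is a simplification (the real callbacks receive context objects), and `OrchestrationHooksSketch`, `TaskAborted`, and `abort_requested` are hypothetical names.

```python
import asyncio


class TaskAborted(Exception):
    """Raised (in this sketch) when an abort signal is seen for a task."""


class OrchestrationHooksSketch:
    """Stand-in for the proposed plugin; state lives in memory, not Redis."""

    def __init__(self):
        self.states = {}              # task_id -> state string
        self.abort_requested = set()  # task ids with a pending abort signal

    async def before_run_callback(self, task_id):
        # Register the task as running when the run starts.
        self.states[task_id] = "running"

    async def before_agent_callback(self, task_id):
        # Check the abort signal before each agent starts.
        if task_id in self.abort_requested:
            self.states[task_id] = "aborted"
            raise TaskAborted(task_id)

    async def on_model_error_callback(self, task_id, error):
        # Record the failure so after_run_callback does not overwrite it.
        self.states[task_id] = "failed"

    async def after_run_callback(self, task_id):
        # Only a task that is still running counts as a clean completion.
        if self.states.get(task_id) == "running":
            self.states[task_id] = "completed"
```

Because the abort check runs in before_agent_callback, this sketch only notices an abort at agent boundaries; stopping a run mid-model-call would need an additional mechanism.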

Impact on your work

I run multi-agent systems that manage infrastructure via Telegram, Discord, Slack, and web interfaces. Without these primitives, every production ADK deployment needs to build them from scratch. I've already built and battle-tested this with different frameworks, running 100 concurrent agents across 10,000 rounds. See my GitHub.

The agent safety model (dual-gate firewall with deterministic denylist + independent LLM judge) is documented in a separate IETF Internet-Draft: https://datatracker.ietf.org/doc/draft-baysal-asimov-safety-architecture/

Willingness to contribute

Yes. I have a working implementation ready to port to adk-python. This addresses #2524, #4796, and #1621.

---
Describe Alternatives You've Considered

- SQLite/Database session service — exists but doesn't provide the low-latency Pub/Sub needed for real-time abort signals
- InMemory session service — lost on restart, not suitable for production
- Building it outside ADK — works but fragments the ecosystem. This belongs in the framework.

Proposed API / Implementation

# Session service
from google.adk.sessions import RedisSessionService

session_service = RedisSessionService(
    redis_url="redis://localhost:6379",
    key_prefix="adk:",
    session_ttl=3600,
)

# Orchestration plugin
from google.adk.plugins import RedisOrchestrationPlugin

plugin = RedisOrchestrationPlugin(
    redis_url="redis://localhost:6379",
    enable_state_tracking=True,
    enable_abort=True,
    enable_crash_recovery=True,
)

# App with both
from google.adk.apps import App

app = App(
    name="my_app",
    agent=my_agent,
    plugins=[plugin],
)

# Runner
from google.adk.runners import Runner

runner = Runner(
    app=app,
    session_service=session_service,
)

# External abort from anywhere
import redis
r = redis.from_url("redis://localhost:6379")
r.publish("adk:abort:task-123", '{"action": "abort"}')
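
The publish call above is the sending side. A rough sketch of the receiving side follows, assuming the `adk:abort:<task_id>` channel layout and JSON payload shown in the example. `parse_abort` and `abort_listener` are hypothetical helpers, and an `asyncio.Queue` stands in for a Redis Pub/Sub subscription so the sketch runs without a server.

```python
import asyncio
import json

ABORT_PREFIX = "adk:abort:"


def parse_abort(channel, payload):
    """Return the task id if this message is an abort request, else None."""
    if not channel.startswith(ABORT_PREFIX):
        return None
    try:
        message = json.loads(payload)
    except json.JSONDecodeError:
        return None
    if not isinstance(message, dict) or message.get("action") != "abort":
        return None
    return channel[len(ABORT_PREFIX):]


async def abort_listener(messages, running):
    """Cancel the matching asyncio task for each abort message received.

    `messages` is an asyncio.Queue of (channel, payload) tuples standing in
    for a Pub/Sub subscription; `running` maps task id -> asyncio.Task.
    """
    while True:
        channel, payload = await messages.get()
        task = running.get(parse_abort(channel, payload))
        if task is not None and not task.done():
            task.cancel()


async def demo():
    messages = asyncio.Queue()
    # A long sleep stands in for a running agent.
    running = {"task-123": asyncio.create_task(asyncio.sleep(3600))}
    listener = asyncio.create_task(abort_listener(messages, running))
    await messages.put(("adk:abort:task-123", '{"action": "abort"}'))
    await asyncio.wait([running["task-123"]], timeout=1)
    listener.cancel()
    return running["task-123"].cancelled()
```

Note that `Task.cancel()` is cooperative: it takes effect at the task's next await point. The immediate process-level kill described in this proposal would need something stronger, such as running the agent in a separate process.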


Additional Context

- Addresses three open issues: #2524 (Redis memory/session), #4796 (stop run_async externally), #1621 (terminate conversation)
- Battle-tested at scale: 100 concurrent agents, 10,000 rounds, with HMAC-signed results
- Zero new dependencies beyond redis-py (optional install)


Redis-backed session service and orchestration plugin (state tracking, abort, crash recovery) #5048