Summarization middleware for automatic and tool-based conversation compaction.
This module provides two middleware classes and a convenience factory:
SummarizationMiddleware — automatically compacts the conversation when token
usage exceeds a configurable threshold.
Older messages are summarized via an LLM call and the full history is offloaded to a backend for later retrieval.
SummarizationToolMiddleware — exposes a compact_conversation tool that
lets the agent (or a human-in-the-loop approval flow) trigger compaction on
demand.
Composes with a SummarizationMiddleware instance and reuses its
summarization engine.
create_summarization_tool_middleware — convenience factory that creates both
middleware layers with model-aware defaults.
from deepagents import create_deep_agent
from deepagents.middleware.summarization import (
    SummarizationMiddleware,
    SummarizationToolMiddleware,
)
from deepagents.backends import FilesystemBackend

backend = FilesystemBackend(root_dir="/data")
summ = SummarizationMiddleware(
    model="gpt-4o-mini",
    backend=backend,
    trigger=("fraction", 0.85),
    keep=("fraction", 0.10),
)
tool_mw = SummarizationToolMiddleware(summ)
agent = create_deep_agent(middleware=[summ, tool_mw])
Offloaded messages are stored as markdown at /conversation_history/{thread_id}.md.
Each summarization event appends a new section to this file, creating a running log of all evicted messages.
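The append-only log described above can be sketched as follows. This is a minimal illustration that uses a plain dict as a stand-in for the backend; the helper names (`history_path`, `append_section`) and the section heading format are assumptions made for the example, not the middleware's actual implementation.

```python
from datetime import datetime, timezone


def history_path(thread_id: str) -> str:
    # Path pattern taken from the text above.
    return f"/conversation_history/{thread_id}.md"


def append_section(store: dict, thread_id: str, messages: list, when: datetime) -> str:
    """Append one markdown section of evicted messages to the thread's log.

    `store` is a hypothetical in-memory stand-in for a backend (path -> text).
    `messages` is a list of (role, text) pairs.
    """
    path = history_path(thread_id)
    lines = [f"## Summarized at {when.isoformat()}", ""]
    for role, text in messages:
        lines.append(f"**{role}**: {text}")
        lines.append("")
    # Existing content is preserved; the new section is added at the end.
    store[path] = store.get(path, "") + "\n".join(lines) + "\n"
    return path


store = {}
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
append_section(store, "t1", [("human", "hello"), ("ai", "hi there")], now)
append_section(store, "t1", [("human", "next question")], now)
```

After two summarization events, the single file for thread `t1` holds two sections, so earlier evicted messages remain retrievable.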
Public alias for _DeepAgentsSummarizationMiddleware.
This is the name external callers should import and reference.
Append text to a system message.
Compute default summarization settings based on model profile.
Create a SummarizationMiddleware with model-aware defaults.
Computes trigger, keep, and truncation settings from the model's profile (or uses fixed-token fallbacks) and returns a configured middleware.
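The choice between profile-based settings and fixed-token fallbacks might look roughly like the sketch below. The function name, the `max_input_tokens` profile key, and the fallback numbers are placeholders chosen for illustration; only the `("fraction", 0.85)` and `("fraction", 0.10)` values come from the usage example in this module's documentation.

```python
from typing import Optional


def sketch_compute_defaults(profile: Optional[dict]) -> dict:
    """Pick trigger/keep settings from a model profile, or fall back.

    `profile` is a hypothetical dict; a known context size is assumed to be
    stored under "max_input_tokens".
    """
    if profile and profile.get("max_input_tokens"):
        # Context size is known, so thresholds can be relative to it.
        return {"trigger": ("fraction", 0.85), "keep": ("fraction", 0.10)}
    # Context size is unknown: use absolute budgets (placeholder values).
    return {"trigger": ("tokens", 100_000), "keep": ("messages", 6)}


with_profile = sketch_compute_defaults({"max_input_tokens": 128_000})
without_profile = sketch_compute_defaults(None)
```

Fraction-based settings scale with the model's context window, whereas the fallback has to assume a budget that is safe for models whose limits are not known.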
Create a SummarizationToolMiddleware with model-aware defaults.
Convenience factory that creates a SummarizationMiddleware via
create_summarization_middleware and wraps it in a
SummarizationToolMiddleware.
Routes file operations to different backends by path prefix.
Matches paths against route prefixes (longest first) and delegates to the corresponding backend. Unmatched paths use the default backend.
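Longest-prefix routing, as described above, can be sketched in a few lines. Backends are represented here by plain strings, and the function name and route table are illustrative assumptions rather than the class's real interface.

```python
def pick_backend(path: str, routes: dict, default: str) -> str:
    """Return the backend whose route prefix matches `path`, longest prefix first."""
    for prefix in sorted(routes, key=len, reverse=True):
        if path.startswith(prefix):
            return routes[prefix]
    # No prefix matched: fall through to the default backend.
    return default


routes = {
    "/memories/": "store_backend",
    "/memories/archive/": "cold_backend",
}
archived = pick_backend("/memories/archive/2024.md", routes, "state_backend")
recent = pick_backend("/memories/notes.md", routes, "state_backend")
other = pick_backend("/tmp/scratch.txt", routes, "state_backend")
```

Sorting by prefix length ensures that a more specific route such as `/memories/archive/` wins over the broader `/memories/` route.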
Single, unified protocol for pluggable memory backends.
Backends can store files in different locations (state, filesystem, database, etc.) and provide a uniform interface for file operations.
All file data is represented as dicts with the following structure::
    {
        "content": str,      # Text content (utf-8) or base64-encoded binary
        "encoding": str,     # "utf-8" for text, "base64" for binary data
        "created_at": str,   # ISO format timestamp
        "modified_at": str,  # ISO format timestamp
    }
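As an illustration of the structure above, the following sketch builds a file-data dict for text and binary payloads and decodes it again. The helper names `make_file_data` and `read_bytes` are hypothetical and are not part of the protocol.

```python
import base64
from datetime import datetime, timezone
from typing import Optional, Union


def make_file_data(payload: Union[str, bytes], now: Optional[datetime] = None) -> dict:
    """Build a dict with the four keys described above."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    if isinstance(payload, bytes):
        # Binary data is stored as base64 text.
        content, encoding = base64.b64encode(payload).decode("ascii"), "base64"
    else:
        content, encoding = payload, "utf-8"
    return {
        "content": content,
        "encoding": encoding,
        "created_at": timestamp,
        "modified_at": timestamp,
    }


def read_bytes(file_data: dict) -> bytes:
    """Recover the raw bytes regardless of how the content was encoded."""
    if file_data["encoding"] == "base64":
        return base64.b64decode(file_data["content"])
    return file_data["content"].encode("utf-8")


text_file = make_file_data("hello")
binary_file = make_file_data(b"\x00\x01\x02")
```

Keeping the encoding alongside the content lets every backend store the same JSON-friendly shape for both text and binary files.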
Input schema for the compact_conversation tool.
Represents a summarization event.
Settings for truncating large tool-call arguments in older messages.
This is a lightweight, pre-summarization optimization that fires at a lower
token threshold than full conversation compaction. When triggered, only the
args values on AIMessage.tool_calls in messages before the keep window
are shortened — recent messages are left intact.
Typical large arguments include write_file content, edit_file patches,
and verbose execute outputs.
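The truncation pass described above might be sketched like this. Messages are modeled as plain dicts instead of real message objects, and the parameter names (`keep_last`, `max_length`, `keep_chars`, `suffix`) are assumptions for the example rather than the actual settings fields.

```python
import copy


def truncate_old_tool_args(messages, keep_last, max_length=2000,
                           keep_chars=20, suffix="...(argument truncated)"):
    """Shorten long string args on AI tool calls that fall before the keep window."""
    cutoff = max(len(messages) - keep_last, 0)
    result = []
    for index, message in enumerate(messages):
        in_keep_window = index >= cutoff
        if in_keep_window or message.get("type") != "ai" or not message.get("tool_calls"):
            result.append(message)  # recent or non-AI messages are left intact
            continue
        shortened = copy.deepcopy(message)  # never mutate the caller's messages
        for call in shortened["tool_calls"]:
            call["args"] = {
                key: value[:keep_chars] + suffix
                if isinstance(value, str) and len(value) > max_length
                else value
                for key, value in call["args"].items()
            }
        result.append(shortened)
    return result


messages = [
    {"type": "ai", "tool_calls": [{"name": "write_file",
                                   "args": {"file_path": "/a.txt", "content": "x" * 50}}]},
    {"type": "tool", "content": "ok"},
    {"type": "ai", "tool_calls": [{"name": "write_file",
                                   "args": {"file_path": "/b.txt", "content": "y" * 50}}]},
]
compacted = truncate_old_tool_args(messages, keep_last=1, max_length=10)
```

Only the first message's `content` argument is shortened here; the last message sits inside the keep window and is returned unchanged.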
State for the summarization middleware.
Extends AgentState with a private field for tracking summarization events.
Default settings computed from model profile.
Middleware that provides a compact_conversation tool for manual compaction.
This middleware composes with a SummarizationMiddleware instance, reusing
its summarization engine (model, backend, trigger thresholds) to let the
agent compact its own context window.
This middleware never compacts automatically. Compaction only occurs when
compact_conversation is called as a normal tool call (by the model or by
an explicit user action, e.g. as implemented in the deepagents-cli).
To avoid compacting too early, execution of the compact_conversation tool is
gated by _is_eligible_for_compaction, which requires reported token usage to
reach about 50% of the configured auto-summarization trigger.
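The gating rule can be sketched as a simple threshold check. This is an illustrative stand-in, not the body of `_is_eligible_for_compaction`; the function name, the `max_input_tokens` argument, and the handling of unknown trigger kinds are assumptions.

```python
def sketch_is_eligible(reported_tokens, trigger, max_input_tokens=None, ratio=0.5):
    """Return True once usage reaches `ratio` of the auto-summarization trigger.

    `trigger` is a (kind, value) pair such as ("fraction", 0.85) or ("tokens", 1000).
    """
    kind, value = trigger
    if kind == "fraction":
        if max_input_tokens is None:
            return False  # a fraction is meaningless without a known context size
        threshold = value * max_input_tokens
    elif kind == "tokens":
        threshold = value
    else:
        return False  # other trigger kinds are out of scope for this sketch
    return reported_tokens >= ratio * threshold


# With an 85% trigger on a 100,000-token window, the gate opens near 42,500 tokens.
early = sketch_is_eligible(42_000, ("fraction", 0.85), max_input_tokens=100_000)
ready = sketch_is_eligible(43_000, ("fraction", 0.85), max_input_tokens=100_000)
```

Tying the gate to the auto-summarization trigger keeps manual compaction from firing on short conversations where there is little to summarize.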
The tool and auto-summarization share the same _summarization_event state
key, so they interoperate correctly.
For a simpler setup, use create_summarization_tool_middleware, which creates
the SummarizationMiddleware and wraps it in a SummarizationToolMiddleware in
one call.