
feat(bedrock): add automatic prompt caching support#1438

Merged
Unshure merged 4 commits into strands-agents:main from kevmyung:feat/prompt-caching on Jan 27, 2026

Conversation

@kevmyung (Contributor) commented Jan 8, 2026

Summary

  • Add CacheConfig with strategy="auto" for automatic prompt caching in BedrockModel
  • Cache points are injected at the end of the last assistant message before each model call
  • Supports all Claude models on Bedrock that have prompt caching capability

Usage

from strands import Agent
from strands.models import BedrockModel, CacheConfig

model = BedrockModel(
    model_id="us.anthropic.claude-sonnet-4-5-20250929-v1:0",
    cache_config=CacheConfig(strategy="auto"),  # enable automatic cache point injection
)
agent = Agent(model=model)
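
For illustration, after a first turn the conversation history would look roughly like this. This is a sketch based on the PR description; only the cachePoint block shape is taken from the implementation, and the message texts are invented:

# Sketch: conversation history after automatic injection.
messages = [
    {"role": "user", "content": [{"text": "Summarize this document..."}]},
    {
        "role": "assistant",
        "content": [
            {"text": "Here is a summary..."},
            # Injected at the end of the last assistant message before the next call
            {"cachePoint": {"type": "default"}},
        ],
    },
    {"role": "user", "content": [{"text": "Now translate it."}]},
]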

Test plan

  • Unit tests for cache point injection logic
  • Integration test with Claude models on Bedrock confirming cache hits

Closes #1432


azaylamba commented Jan 12, 2026

This is an important change, thanks for this.

@strands-agent (Contributor) commented

🎯 Review - Automatic Prompt Caching

Excellent implementation of automatic prompt caching for Bedrock! This addresses #1432 nicely and will provide significant performance and cost benefits for multi-turn conversations.

What I Really Like ✅

  1. Smart Cache Point Strategy: The _inject_cache_point() logic is elegant - automatically moving the cache point to the last assistant message ensures optimal cache utilization without manual management (see the sketch after this list).

  2. Comprehensive Tests: 155 lines of new tests in test_bedrock.py covering edge cases like:

    • Cache point injection
    • Cache point movement across turns
    • Cleanup of stale cache points
  3. Backward Compatibility: Deprecating cache_prompt while still supporting it shows good API stewardship.

  4. Documentation: Clear docstrings and usage examples in the PR description.
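
To make item 1 concrete, here is a simplified sketch of the auto-injection strategy. It is not the actual implementation; the helper name comes from the PR, while the message handling is assumed from the description (stale cache points removed, one fresh point appended to the last assistant message):

def _inject_cache_point(messages: list[dict]) -> None:
    """Sketch: keep a single cache point at the end of the last assistant message."""
    # Drop stale cache points from earlier turns; reversed iteration makes
    # in-place deletion safe (as the follow-up commit notes).
    for message in messages:
        content = message.get("content", [])
        for i in reversed(range(len(content))):
            if "cachePoint" in content[i]:
                del content[i]
    # Append a fresh cache point to the most recent assistant message, if any.
    for message in reversed(messages):
        if message["role"] == "assistant" and message.get("content"):
            message["content"].append({"cachePoint": {"type": "default"}})
            return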

Minor Suggestions 💡

1. Cache Point Detection Could Be More Explicit

In _inject_cache_point() around line 344, the logic for detecting an existing cache point at the right position relies on the loop continuing. Consider making this more explicit:

# Check if cache point was already found at the right position
last_assistant_content = messages[last_assistant_idx]["content"]
if last_assistant_content and "cachePoint" in last_assistant_content[-1]:
    logger.debug("Cache point already exists at end of last assistant message %d", last_assistant_idx)
    return

# Add cache point at the end of the last assistant message
last_assistant_content.append({"cachePoint": {"type": "default"}})
logger.debug("Added cache point at end of assistant message %d", last_assistant_idx)

2. Model Support Detection

The supports_caching property checks for "claude" or "anthropic" in the model ID. Consider whether future Bedrock models might also support caching:

@property
def supports_caching(self) -> bool:
    """Whether this model supports prompt caching.
    
    Returns True for Claude models on Bedrock that support caching.
    Add other models as they become available.
    """
    model_id = self.config.get("model_id", "").lower()
    # Claude 3 Opus, Sonnet, and Haiku on Bedrock support caching
    # Claude 3.5 Sonnet and later also support caching
    return "claude" in model_id or "anthropic" in model_id

3. Integration Test Clarity

The integration tests are great! One suggestion - add a comment explaining the cache hit verification:

# After second call, verify cache hit (cache_read_input_tokens > 0)
# This confirms the cache point strategy is working
assert result.metadata["converse_metrics"]["cache_read_input_tokens"] > 0

Questions for Discussion 🤔

  1. Cache Invalidation: What happens if the system prompt or tools change between calls? Does the cache automatically invalidate, or should there be explicit cache busting?

  2. Multiple Cache Points: The current strategy uses a single cache point. Are there scenarios where multiple cache points (e.g., one for system prompt, one for conversation) would be beneficial? (A sketch follows this list.)

  3. Performance Metrics: Would it be valuable to expose cache hit/miss metrics in the AgentResult.metadata for users to monitor cache effectiveness?
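
On question 2, a two-cache-point layout could look roughly like this; a sketch assuming the Converse-style system and message block formats, not anything the PR currently does:

# Sketch: one cache point for the static system prompt, one for the
# conversation prefix.
long_system_prompt = "..."  # imagine several thousand tokens of instructions
system = [
    {"text": long_system_prompt},
    {"cachePoint": {"type": "default"}},  # caches the system prompt
]
messages = [
    {"role": "user", "content": [{"text": "First question"}]},
    {
        "role": "assistant",
        "content": [
            {"text": "First answer"},
            {"cachePoint": {"type": "default"}},  # caches the conversation so far
        ],
    },
]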

CI Status

I see CI is still pending. Once it passes, this looks ready for maintainer review!

Overall Assessment

This is a high-quality PR that will be valuable for the community. The automatic cache management removes complexity from users while providing real performance benefits. Great work, @kevmyung! 🎉

🦆


🤖 This is an experimental AI agent response from the Strands team, powered by Strands Agents. We're exploring how AI agents can help with community support and development. Your feedback helps us improve! If you'd prefer human assistance, please let us know.

@Unshure (Member) left a comment


This looks good to me, just need to rebase and address the nit.

Add CacheConfig with strategy="auto" for BedrockModel to automatically
inject cache points at the end of assistant messages in multi-turn
conversations.

- Add CacheConfig dataclass in model.py with strategy field
- Add supports_caching property to check Claude model compatibility
- Implement _inject_cache_point() for automatic cache point management
- Export CacheConfig from models/__init__.py

Closes strands-agents#1432
- Add warning when cache_config enabled but model doesn't support caching
- Make supports_caching private (_supports_caching)
- Fix log formatting to follow style guide
- Clean up tests and imports
- Inject into cleaned_messages instead of original messages to avoid mutating agent.messages
- Use reversed iteration for safe in-place deletion
- Consolidate redundant cache point tests
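
The commits above describe a CacheConfig dataclass with a strategy field; a minimal sketch of that shape (the default value and docstring are assumptions; only strategy="auto" is confirmed by the PR):

from dataclasses import dataclass

@dataclass
class CacheConfig:
    """Sketch of the caching configuration; actual fields may differ."""
    strategy: str = "auto"  # "auto" injects cache points automatically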
@kevmyung force-pushed the feat/prompt-caching branch from 9e23a68 to 2a498bc on January 25, 2026 20:28
@github-actions bot added size/m and removed size/m labels Jan 25, 2026

codecov bot commented Jan 26, 2026

Codecov Report

❌ Patch coverage is 93.75000% with 2 lines in your changes missing coverage. Please review.

Files with missing lines: src/strands/models/bedrock.py (patch coverage 92.30%; 1 line missing and 1 partial ⚠️)


@Unshure merged commit 138750c into strands-agents:main, Jan 27, 2026
29 of 67 checks passed
minorun365 added a commit to minorun365/sdk-python that referenced this pull request Jan 28, 2026
Add test to verify cachePoint blocks are formatted as standalone blocks
and not merged into previous content blocks. This confirms the fix from
PR strands-agents#1438 works correctly for the scenario reported in Issue strands-agents#1219.

Closes strands-agents#1219
gaurav71531 pushed a commit to gaurav71531/sdk-python that referenced this pull request Feb 4, 2026
manoj-selvakumar5 pushed a commit to manoj-selvakumar5/strands-sdk-python that referenced this pull request Feb 5, 2026


Development

Successfully merging this pull request may close these issues.

[FEATURE] Add cache_strategy="auto" for automatic prompt caching

5 participants