feat(agent): add configurable retry_strategy for model calls#1424
Merged
zastrowm merged 17 commits into strands-agents:main on Jan 21, 2026
Refactored hardcoded retry logic in event_loop into a flexible, hook-based retry system that allows users to customize retry behavior.
Codecov Report: ✅ All modified and coverable lines are covered by tests.
zastrowm added a commit to zastrowm/docs that referenced this pull request on Jan 8, 2026
…igh-Level constructs In doing api bar raising for strands-agents/sdk-python/pull/1424, we determined that HookProvider is a too-low-level interface for exposing directly to integrators. This captures that decision & reasoning in log format and sets us up to record future decisions in a similar way going forward. See DECISIONS.md on the decision & the format
zastrowm added a commit to zastrowm/docs that referenced this pull request on Jan 14, 2026
Enforces that Agent only accepts ModelRetryStrategy instances (not subclasses) for the retry_strategy parameter to prevent API confusion before a base RetryStrategy class is introduced.
pgrayy reviewed on Jan 15, 2026 (two reviews)
# Conflicts: # src/strands/agent/agent.py
Unshure reviewed on Jan 15, 2026
strands-agent (Contributor) left a comment:
The type check for the `retry_strategy` parameter is too restrictive and will break for subclasses:

    if retry_strategy and type(retry_strategy) is not ModelRetryStrategy:
        raise ValueError("retry_strategy must be an instance of ModelRetryStrategy")

This uses `type()` with `is not`, which fails for subclasses. Consider:

    class MyCustomRetry(ModelRetryStrategy):
        # Custom retry logic
        pass

    agent = Agent(retry_strategy=MyCustomRetry())  # ❌ Raises ValueError!

Recommendation: Use an `isinstance()` check instead:

    if retry_strategy is not None and not isinstance(retry_strategy, HookProvider):
        raise TypeError(f"retry_strategy must implement HookProvider, got {type(retry_strategy).__name__}")

This allows:
- Subclasses of ModelRetryStrategy ✅
- Custom HookProvider implementations ✅
- Better error message with actual type ✅
- Type safety with proper inheritance check ✅
🤖 This is an experimental AI agent response from the Strands team, powered by Strands Agents. We're exploring how AI agents can help with community support and development. Your feedback helps us improve! If you'd prefer human assistance, please let us know.
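The difference between the two checks discussed in the comment above can be reproduced with stand-in classes. This is a minimal sketch, not the SDK's code: `ModelRetryStrategy` and `MyCustomRetry` here are empty placeholders, the helper names `exact_type_check` and `isinstance_check` are hypothetical, and the looser check tests against the stand-in class rather than `HookProvider` to keep the example self-contained.

```python
class ModelRetryStrategy:
    """Stand-in for the SDK class; only the class identity matters here."""


class MyCustomRetry(ModelRetryStrategy):
    """Subclass, as in the review comment's example."""


def exact_type_check(retry_strategy):
    # Mirrors the check quoted in the comment: only the exact class passes.
    if retry_strategy and type(retry_strategy) is not ModelRetryStrategy:
        raise ValueError("retry_strategy must be an instance of ModelRetryStrategy")
    return retry_strategy


def isinstance_check(retry_strategy):
    # Looser alternative: subclasses pass as well.
    if retry_strategy is not None and not isinstance(retry_strategy, ModelRetryStrategy):
        raise TypeError(
            f"retry_strategy must be a ModelRetryStrategy, got {type(retry_strategy).__name__}"
        )
    return retry_strategy


exact_type_check(ModelRetryStrategy())   # accepted
isinstance_check(MyCustomRetry())        # accepted
try:
    exact_type_check(MyCustomRetry())    # rejected: a subclass is not the exact type
except ValueError as err:
    print(err)
```

Note that the PR description states the exact-type check is intentional: subclasses are rejected on purpose until a `RetryStrategy` base class is designed.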
pgrayy previously approved these changes on Jan 20, 2026
mkmeral reviewed on Jan 20, 2026
Unshure previously approved these changes on Jan 20, 2026
mkmeral approved these changes on Jan 20, 2026
pgrayy approved these changes on Jan 21, 2026
manoj-selvakumar5 pushed a commit to manoj-selvakumar5/strands-sdk-python that referenced this pull request on Feb 5, 2026
…-agents#1424)

The current retry logic for handling ModelThrottledException is hardcoded in event_loop.py with fixed values (6 attempts, exponential backoff starting at 4s). This makes it impossible for users to customize retry behavior for their specific use cases.

This refactors the hardcoded retry logic into a `ModelRetryStrategy` class so that folks can customize the parameters.

**Not Included**: This does not introduce a `RetryStrategy` base class. I started to do so, but am deferring it because:

1. It requires some additional design work to accommodate tool retries, which I anticipate should be accounted for in the design
2. It simplifies this review, which refactors how the default retries work internally
3. `ModelRetryStrategy` provides enough benefit to allow folks to customize the agent loop without blocking on a more extensible design

----

Co-authored-by: Strands Agent <strands-agent@users.noreply.github.com>
Co-authored-by: Mackenzie Zastrow <zastrowm@users.noreply.github.com>
Description
The current retry logic for handling `ModelThrottledException` is hardcoded in `event_loop.py` with fixed values (6 attempts, exponential backoff starting at 4s). This makes it impossible for users to customize retry behavior for their specific use cases.
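This PR refactors the hardcoded retry logic into a `ModelRetryStrategy` class so that folks can customize the parameters.
Not Included:
The PR does not introduce a `RetryStrategy` base class. I started to do so, but am deferring it because:

1. It requires some additional design work to accommodate tool retries, which I anticipate should be accounted for in the design
2. It simplifies this review, which refactors how the default retries work internally
3. `ModelRetryStrategy` provides enough benefit to allow folks to customize the agent loop without blocking on a more extensible design

Public API Changes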
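As a rough illustration of the hook-based mechanism this PR describes, where a hook requests another attempt by setting `event.retry = True` after a failed model call, the sketch below uses stand-in classes rather than the real SDK types. The names `ThrottleRetryHook`, `on_after_model_call`, `call_model`, the constructor parameters, and the injected `sleep` are all assumptions made for the example. Only `ModelThrottledException`, `AfterModelCallEvent`, and `event.retry` come from the PR text, and their real definitions are not shown on this page.

```python
import time
from dataclasses import dataclass
from typing import Optional


class ModelThrottledException(Exception):
    """Stand-in for the SDK exception of the same name."""


@dataclass
class AfterModelCallEvent:
    """Stand-in event: a hook sets retry=True to request another attempt."""
    exception: Optional[Exception] = None
    retry: bool = False


class ThrottleRetryHook:
    """Hypothetical hook: retries throttled calls with capped exponential backoff."""

    def __init__(self, max_attempts=3, initial_delay=2.0, max_delay=60.0, sleep=None):
        self.max_attempts = max_attempts
        self.initial_delay = initial_delay
        self.max_delay = max_delay
        self.sleep = sleep or time.sleep  # injectable so tests need not wait
        self.attempt = 0

    def on_after_model_call(self, event):
        if not isinstance(event.exception, ModelThrottledException):
            self.attempt = 0  # success or unrelated error: reset state
            return
        self.attempt += 1
        if self.attempt >= self.max_attempts:
            return  # out of attempts: leave retry False so the error propagates
        delay = min(self.initial_delay * 2 ** (self.attempt - 1), self.max_delay)
        self.sleep(delay)
        event.retry = True


def call_model(model, hook):
    """Toy loop standing in for the event loop's model call."""
    while True:
        event = AfterModelCallEvent()
        result = None
        try:
            result = model()
        except Exception as exc:
            event.exception = exc
        hook.on_after_model_call(event)
        if event.retry:
            continue
        if event.exception is not None:
            raise event.exception
        return result
```

In this sketch the retry decision lives entirely in the hook; the loop only inspects `event.retry`, which is the separation the PR summary calls a hook-based retry system.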
Added a new `retry_strategy` parameter to `Agent.__init__()`.
The `retry_strategy` parameter only accepts `ModelRetryStrategy`, no derived classes. We've been discussing a `RetryStrategy` base class that is more abstract and supports additional exception types, but I'm punting on that as it requires additional design work, whereas this provides immediate benefit to callers attempting to customize the current agent-loop retry behavior.
For now, alternative retry strategies can be implemented by creating a hook provider that sets `event.retry = True` on the `AfterModelCallEvent` when a retry should occur.
Backwards Compatibility
Retry delay
The general default behavior is unchanged: agents still retry up to 5 times (6 attempts in total) with the same exponential backoff. The `EventLoopThrottleEvent` and `ForceStopEvent` are still emitted during retries, maintaining backwards compatibility with existing hooks that listen for these events.
The exact delay times have changed, however. Because of a bug in the original logic, the initial delay was actually doubled the first time it executed (see `test_agent_events.py` for the test changes to accommodate this). Prior to these changes, the delay(s) were:
After these changes, the delays are:
I think these are okay changes to make, however.
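The effect of doubling the delay before the first wait, as opposed to after it, can be sketched as follows. This is an illustration only: the 6 attempts and the 4s starting delay come from the description, while the 240s cap, the function names, and the loop structure are assumptions, so the printed lists are outputs of this sketch rather than the values recorded in the PR.

```python
def delays_sleep_then_double(initial_delay, max_delay, max_attempts):
    """Delay before each retry when the wait is applied first, then doubled."""
    delays, delay = [], initial_delay
    for _ in range(max_attempts - 1):  # the final attempt is not followed by a wait
        delays.append(delay)
        delay = min(delay * 2, max_delay)
    return delays


def delays_double_then_sleep(initial_delay, max_delay, max_attempts):
    """Same schedule, but the delay is doubled before the first wait."""
    delays, delay = [], initial_delay
    for _ in range(max_attempts - 1):
        delay = min(delay * 2, max_delay)
        delays.append(delay)
    return delays


# 4s start and 6 attempts are from the description; the 240s cap is assumed.
print(delays_double_then_sleep(4, 240, 6))  # [8, 16, 32, 64, 128]
print(delays_sleep_then_double(4, 240, 6))  # [4, 8, 16, 32, 64]
```

Both variants make the same number of retries; only the point at which the doubling happens differs, which shifts every delay by one step.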
Default retry behavior
The default retry behavior also reads from `event_loop.MAX_ATTEMPTS` etc., so anyone who was previously modifying those constants can continue to do so.
Implementation Decisions
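One way to keep honoring patched module constants, as the note on default retry behavior describes, is to look them up when the strategy is constructed instead of binding them as default argument values. The sketch below assumes that approach. `RetrySettings` is a hypothetical class, the `event_loop` namespace is a stand-in for the real module, and of the constant names only `MAX_ATTEMPTS` appears in the PR text (`INITIAL_DELAY`, `MAX_DELAY`, and the value 240 are assumptions).

```python
from types import SimpleNamespace

# Stand-in for the event_loop module. Only MAX_ATTEMPTS is named in the PR text;
# INITIAL_DELAY, MAX_DELAY, and the value 240 are assumptions for illustration.
event_loop = SimpleNamespace(MAX_ATTEMPTS=6, INITIAL_DELAY=4, MAX_DELAY=240)


class RetrySettings:
    """Hypothetical holder that resolves unset values from the module constants."""

    def __init__(self, max_attempts=None, initial_delay=None, max_delay=None):
        # Look the constants up at construction time, not in the signature,
        # so that a later `event_loop.MAX_ATTEMPTS = 3` is still honored.
        self.max_attempts = event_loop.MAX_ATTEMPTS if max_attempts is None else max_attempts
        self.initial_delay = event_loop.INITIAL_DELAY if initial_delay is None else initial_delay
        self.max_delay = event_loop.MAX_DELAY if max_delay is None else max_delay


before = RetrySettings()
event_loop.MAX_ATTEMPTS = 3          # caller patches the module constant
after = RetrySettings()
explicit = RetrySettings(max_attempts=10)
print(before.max_attempts, after.max_attempts, explicit.max_attempts)  # 6 3 10
```

Had the constants been used directly as default argument values, they would be captured once when the class is defined, and the later patch would have no effect.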
- … `EventLoopThrottleEvent` and `ForceStopEvent` events as we used to.
- … `ForceStopEvent` whenever an exception bubbles out of the model invocation
- … `retry_strategy` so that as hooks are expanded to allow retrying tools, we can also enable tool retry strategies
- … `ModelRetryStrategy` since it's only focused on model retries; in the future we might vend other strategies, but we can add a new strategy rather than attempting to fit it all into this one.

Related Issues
Documentation PR
https://github.com/strands-agents/docs/pull/455/changes
Type of Change
New feature
Testing
How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli
`hatch run prepare`
Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.