feat(chatd): add LLM stream retry with exponential backoff #22418
Merged
Adds automatic retry with exponential backoff for transient LLM errors during chat streaming and title generation. Inspired by coder/mux's retry mechanism.

Key behaviors:
- Infinite retries with exponential backoff: 1s, 2s, 4s, ..., 60s cap
- Deterministic delays (no jitter)
- Error classification: retryable (429, 5xx, overloaded, rate limit, network errors) vs non-retryable (auth, quota, context exceeded, model not found, canceled)
- Retry status published to SSE stream so frontend can show "Retrying in Xs..." UI
- Title generation retries silently (best-effort)

New package: coderd/chatd/chatretry/
- classify.go: IsRetryable() and StatusCodeRetryable()
- backoff.go: Delay() with exponential doubling and 60s cap
- retry.go: Retry() infinite loop with context-aware timer

Test helpers: coderd/chatd/chattest/errors.go
- Anthropic and OpenAI error response builders for testing

42 tests covering classification, backoff, and retry scenarios.
Consumes the 'retry' SSE event in the ChatContext store and displays 'Thinking... attempt N' in the streaming placeholder when the server is retrying a failed LLM call. The attempt indicator uses a muted style next to the shimmer text.

Changes:
- ChatContext.ts: add retryState to store, handle 'retry' SSE events, clear retry state on status transitions
- AgentDetail.tsx: thread retryState to ConversationTimeline
- ConversationTimeline.tsx: export StreamingOutput, add retryState prop, render 'attempt N' next to shimmer
- StreamingOutput.stories.tsx: 6 stories covering placeholder, retry attempts 1/3/12, streaming text, and post-retry states
Summary
Adds automatic retry with exponential backoff for transient LLM errors during chat streaming and title generation. Inspired by coder/mux's retry mechanism.
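Key Behaviors
- Infinite retries with exponential backoff: 1s, 2s, 4s, ..., 60s cap
- Deterministic delays (no jitter)
- Error classification: retryable (429, 5xx, overloaded, rate limit, network errors) vs non-retryable (auth, quota, context exceeded, model not found, canceled)
- Retry status published to SSE stream so frontend can show "Retrying in Xs..." UI
- Title generation retries silently (best-effort)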
New Package: coderd/chatd/chatretry/
- classify.go: IsRetryable(err) and StatusCodeRetryable(code)
- backoff.go: Delay(attempt), exponential doubling with 60s cap
- retry.go: Retry(ctx, fn, onRetry), infinite loop with context-aware timer

Test Helpers: coderd/chatd/chattest/errors.go
Anthropic and OpenAI error response builders for use in chattest providers:
- AnthropicErrorResponse(), AnthropicOverloadedResponse(), AnthropicRateLimitResponse()
- OpenAIErrorResponse(), OpenAIRateLimitResponse(), OpenAIServerErrorResponse()

SDK Changes: codersdk/chats.go
- ChatStreamEventType: "retry"
- ChatStreamRetry struct with Attempt, DelayMs, Error, RetryingAt fields

Changed Files
- coderd/chatd/chatloop/chatloop.go: wraps agent.Stream() in chatretry.Retry()
- coderd/chatd/chatd.go: publishes retry events to SSE stream with logging
- coderd/chatd/title.go: wraps model.Generate() in silent retry
- coderd/chatd/chattest/anthropic.go / openai.go: error injection support

Tests
42 tests covering classification (33), backoff (9), and retry scenarios (8).