-
-
Notifications
You must be signed in to change notification settings - Fork 763
feat(guardrails): add error taxonomy and provider-agnostic guardrails #5963
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat(guardrails): add error taxonomy and provider-agnostic guardrails #5963
Conversation
|
Hey @mldangelo , this is ready for review! |
📝 WalkthroughWalkthroughThis PR introduces a comprehensive error handling and retry infrastructure. It defines structured error types (ErrorCategory, ErrorType, EvalErrorInfo) and result types (GradingResult, CaseResult, AssertionDetail) for standardized evaluation output. A new AppError class provides structured error information with helper constructors for common error categories. A guardrails module implements a retry mechanism with jittered exponential backoff, timeout control via AbortController, and special handling for rate limits. Minor dependency updates include removing peerDependencies and updating prettier version spec. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~30 minutes The review requires understanding retry logic with exponential backoff and timeout handling in the guardrails module, verifying type consistency across multiple new files, and validating error propagation patterns through helper constructors. While the changes are heterogeneous—spanning types, errors, and guardrails—each component is self-contained with no existing code modifications, keeping the effort in the moderate range. Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (7)
src/types/errors.ts (1)
1-8: Consider typing code with ErrorCategory for stronger guarantees.If ErrorCategory represents standardized error codes, wire it into EvalErrorInfo.code (in src/types/result.ts) to prevent free-form strings. Keep this file as a thin re-export hub to avoid duplication.
src/types/result.ts (1)
5-11: Type EvalErrorInfo.code with ErrorCategory (instead of string).Improves safety and keeps taxonomy consistent with src/types/errors.ts.
Apply:
+import type { ErrorCategory } from './errors'; export interface EvalErrorInfo { type: ErrorType; - code?: string; + code?: ErrorCategory; message: string; hint?: string; provider?: string;src/lib/errors.ts (1)
22-32: Include stack/cause in toInfo for parity with EvalErrorInfo.Restores fidelity for CLI/JSON outputs and debugging.
Apply:
toInfo(): EvalErrorInfo { return { type: this.type, code: this.code, message: this.message, hint: this.hint, provider: this.provider, requestId: this.requestId, raw: this.raw, + stack: this.stack, + // Avoid leaking objects; only surface string causes + cause: typeof this.cause === 'string' ? this.cause : undefined, }; }src/lib/guardrails (4)
42-49: Use unknown in catch and clear the timer immediately.Avoid any; clear the timer before heavy work to prevent leaks.
Apply:
- try { + try { const out = await call(args, controller.signal); clearTimeout(timeout); return out; - } catch (err: any) { + } catch (err: unknown) { + clearTimeout(timeout); attempt++;
51-61: Harden error extraction and retryability checks.Broaden network error codes and parse status from common SDK shapes.
Apply:
- const code = err?.code ?? err?.status ?? err?.name; - const status = err?.status as number | undefined; - const retryAfterHeader = err?.retryAfterSeconds; - const retryAfter = Number(retryAfterHeader); + const anyErr = err as Record<string, any> | undefined; + const code = anyErr?.code ?? anyErr?.name ?? anyErr?.status ?? anyErr?.cause?.code; + const status: number | undefined = + anyErr?.status ?? + anyErr?.statusCode ?? + anyErr?.response?.status ?? + anyErr?.response?.statusCode; + // Retry-After: seconds or HTTP-date; look in common locations + const raHeader = + anyErr?.retryAfterSeconds ?? + anyErr?.response?.headers?.['retry-after'] ?? + anyErr?.headers?.['retry-after']; + const retryAfter = + typeof raHeader === 'string' && /^\d+$/.test(raHeader) + ? Number(raHeader) + : Number(raHeader); // best effort fallback
78-85: Respect Retry-After more robustly and avoid over-sleeping.Use parsed retryAfter when present; otherwise exponential backoff with jitter.
Apply:
- let delay = Math.min(max, base * 2 ** (attempt - 1)); - if (respectRetryAfter && Number.isFinite(retryAfter) && retryAfter! > 0) { - delay = Math.max(delay, retryAfter! * 1000); - } - await sleep(jitter(delay)); - clearTimeout(timeout); + let delay = Math.min(max, base * 2 ** (attempt - 1)); + if (respectRetryAfter && Number.isFinite(retryAfter) && retryAfter > 0) { + delay = Math.max(delay, retryAfter * 1000); + } + await sleep(jitter(delay));
65-73: Sanitize raw error details before attaching.Error messages can leak tokens/PII; consider redaction before adding to raw.
Example:
const msg = String((anyErr as any)?.message ?? ''); const safeMsg = msg.replace(/(sk-[A-Za-z0-9]{20,})/g, '[REDACTED]'); raw: { status, code, message: safeMsg }As per coding guidelines.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (5)
package.json(1 hunks)src/lib/errors.ts(1 hunks)src/lib/guardrails(1 hunks)src/types/errors.ts(1 hunks)src/types/result.ts(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/gh-cli-workflow.mdc)
Prefer not to introduce new TypeScript types; use existing interfaces whenever possible
**/*.{ts,tsx}: Use TypeScript with strict type checking
Follow consistent import order (Biome will handle import sorting)
Use consistent curly braces for all control statements
Prefer const over let; avoid var
Use object shorthand syntax whenever possible
Use async/await for asynchronous code
Use consistent error handling with proper type checks
Always sanitize sensitive data before logging
Use structured logger methods (debug/info/warn/error) with a context object instead of interpolating secrets into log strings
Use sanitizeObject for manual sanitization in non-logging contexts before persisting or further processing data
Files:
src/types/errors.tssrc/lib/errors.tssrc/types/result.ts
{package.json,src/app/package.json,site/package.json,examples/**/package.json}
📄 CodeRabbit inference engine (CLAUDE.md)
PeerDependencies must match devDependencies versions
Files:
package.json
{package.json,src/app/package.json,site/package.json}
📄 CodeRabbit inference engine (CLAUDE.md)
Use CommonJS modules (set "type": "commonjs" in package.json)
Files:
package.json
🧬 Code graph analysis (3)
src/types/errors.ts (1)
src/types/result.ts (1)
EvalErrorInfo(5-15)
src/lib/errors.ts (2)
src/types/result.ts (2)
ErrorType(2-2)EvalErrorInfo(5-15)src/types/errors.ts (1)
EvalErrorInfo(10-15)
src/types/result.ts (1)
src/types/errors.ts (1)
EvalErrorInfo(10-15)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
- GitHub Check: Redteam (Production API)
- GitHub Check: Run Integration Tests
- GitHub Check: webui tests
- GitHub Check: Test on Node 24.x and windows-latest
- GitHub Check: Generate Assets
- GitHub Check: Test on Node 22.x and ubuntu-latest
- GitHub Check: Test on Node 22.x and windows-latest
- GitHub Check: Build Docs
- GitHub Check: Test on Node 22.x and macOS-latest
- GitHub Check: Test on Node 24.x and ubuntu-latest
- GitHub Check: Test on Node 20.x and ubuntu-latest
- GitHub Check: Test on Node 20.x and macOS-latest
- GitHub Check: Test on Node 20.x and windows-latest
- GitHub Check: Build on Node 24.x
- GitHub Check: Build on Node 20.x
- GitHub Check: Build on Node 22.x
- GitHub Check: Style Check
🔇 Additional comments (2)
src/types/result.ts (1)
25-51: Result/grading shapes look coherent and extensible.No issues spotted. The structure aligns with Promptfoo’s output schema.
package.json (1)
153-153: Prettier bump verified; fix CommonJS type in site and src/app packages.Workspace alignment is sound—Prettier only declared at root (no sub-package conflicts). The peerDependencies concern is moot; none exist across the workspace.
However, the coding guidelines require
"type": "commonjs"insite/package.jsonandsrc/app/package.json. Currently:
site/package.json: missing"type"fieldsrc/app/package.json: has"type": "module"(should be"commonjs")Add or correct these fields in both files.
⛔ Skipped due to learnings
Learnt from: CR PR: promptfoo/promptfoo#0 File: CLAUDE.md:0-0 Timestamp: 2025-10-11T15:29:11.784Z Learning: Applies to {package.json,src/app/package.json,site/package.json} : Use CommonJS modules (set "type": "commonjs" in package.json)
| constructor(info: EvalErrorInfo) { | ||
| super(info.message); | ||
| this.name = 'AppError'; | ||
| this.type = info.type; | ||
| this.code = info.code; | ||
| this.hint = info.hint; | ||
| this.provider = info.provider; | ||
| this.requestId = info.requestId; | ||
| this.raw = info.raw; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Fix Error subclassing and propagate stack/cause.
Ensure correct instanceof and preserve diagnostic fields.
Apply:
export class AppError extends Error {
type: ErrorType;
code?: string;
hint?: string;
provider?: string;
requestId?: string;
raw?: unknown;
+ // Node supports Error.cause; keep typed as unknown
+ cause?: unknown;
constructor(info: EvalErrorInfo) {
- super(info.message);
+ super(info.message);
+ // Ensure correct prototype chain for Error subclassing
+ Object.setPrototypeOf(this, new.target.prototype);
this.name = 'AppError';
this.type = info.type;
this.code = info.code;
this.hint = info.hint;
this.provider = info.provider;
this.requestId = info.requestId;
this.raw = info.raw;
+ // Preserve diagnostics when provided
+ if (info.stack) this.stack = info.stack;
+ if (info.cause !== undefined) this.cause = info.cause;
}Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/lib/errors.ts around lines 11 to 20, the Error subclass constructor must
ensure proper instanceof behavior and propagate stack and cause; update the
constructor to call super(info.message, { cause: info.cause }) when available,
set this.name, assign diagnostic fields (type, code, hint, provider, requestId,
raw) as now, and explicitly restore the prototype with
Object.setPrototypeOf(this, new.target.prototype) (or AppError.prototype) so
instanceof works; if info.raw or info.raw.stack exists, copy that stack to
this.stack to preserve original stack trace.
src/lib/guardrails
Outdated
| import { providerError } from './errors'; | ||
| import type { EvalErrorInfo } from '../types/result'; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Import timeoutError for proper taxonomy.
Timeouts should map to type 'timeout', not 'provider_error'.
Apply:
-import { providerError } from './errors';
+import { providerError, timeoutError } from './errors';
-import type { EvalErrorInfo } from '../types/result';
+import type { EvalErrorInfo } from '../types/result';📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| import { providerError } from './errors'; | |
| import type { EvalErrorInfo } from '../types/result'; | |
| import { providerError, timeoutError } from './errors'; | |
| import type { EvalErrorInfo } from '../types/result'; | |
🤖 Prompt for AI Agents
In src/lib/guardrails around lines 1 to 3, the file imports providerError but
does not import timeoutError, so timeout conditions are being classified as
'provider_error' instead of 'timeout'; add an import for timeoutError from
'./errors' and update any places that currently map timeout conditions to
providerError to use timeoutError instead so timeouts are reported with the
'timeout' taxonomy.
| const aborted = controller.signal.aborted; | ||
| const retryable = | ||
| aborted || | ||
| code === 'ECONNRESET' || code === 'ETIMEDOUT' || | ||
| status === 429 || (status && status >= 500 && status < 600); | ||
|
|
||
| if (!retryable || attempt > retryBudget) { | ||
| clearTimeout(timeout); | ||
| // map final failure to provider_error | ||
| throw providerError( | ||
| `Provider request failed after ${attempt} attempt(s)`, | ||
| { | ||
| provider: providerName, | ||
| code: typeof code === 'string' ? code : String(code ?? status ?? 'UNKNOWN'), | ||
| raw: { status, code, message: err?.message }, | ||
| hint: status === 429 | ||
| ? 'Hit rate limit. Lower concurrency or enable caching.' | ||
| : 'Check network/credentials or retry later.', | ||
| } as Partial<EvalErrorInfo> | ||
| ); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Classify final timeouts as timeoutError (not provider_error).
Aligns with the stated taxonomy and improves UX/metrics.
Apply:
- const aborted = controller.signal.aborted;
+ const aborted = controller.signal.aborted;
+ const isTimeoutLike =
+ aborted ||
+ code === 'ETIMEDOUT' ||
+ code === 'UND_ERR_CONNECT_TIMEOUT' ||
+ anyErr?.name === 'AbortError';
const retryable =
- aborted ||
- code === 'ECONNRESET' || code === 'ETIMEDOUT' ||
+ isTimeoutLike ||
+ code === 'ECONNRESET' || code === 'ECONNREFUSED' || code === 'EAI_AGAIN' || code === 'ENOTFOUND' ||
status === 429 || (status && status >= 500 && status < 600);
if (!retryable || attempt > retryBudget) {
- clearTimeout(timeout);
- // map final failure to provider_error
- throw providerError(
- `Provider request failed after ${attempt} attempt(s)`,
- {
- provider: providerName,
- code: typeof code === 'string' ? code : String(code ?? status ?? 'UNKNOWN'),
- raw: { status, code, message: err?.message },
- hint: status === 429
- ? 'Hit rate limit. Lower concurrency or enable caching.'
- : 'Check network/credentials or retry later.',
- } as Partial<EvalErrorInfo>
- );
+ // Map final failure to the correct category
+ if (isTimeoutLike) {
+ throw timeoutError(
+ `Provider request timed out after ${attempt} attempt(s)`,
+ {
+ provider: providerName,
+ code: typeof code === 'string' ? code : String(code ?? status ?? 'TIMEOUT'),
+ raw: { status, code, message: (anyErr as any)?.message },
+ hint: 'Increase timeoutMs or reduce payload/concurrency.',
+ } as Partial<EvalErrorInfo>,
+ );
+ }
+ throw providerError(
+ `Provider request failed after ${attempt} attempt(s)`,
+ {
+ provider: providerName,
+ code: typeof code === 'string' ? code : String(code ?? status ?? 'UNKNOWN'),
+ raw: { status, code, message: (anyErr as any)?.message },
+ hint: status === 429
+ ? 'Hit rate limit. Lower concurrency or enable caching.'
+ : 'Check network/credentials or retry later.',
+ } as Partial<EvalErrorInfo>,
+ );
}Committable suggestion skipped: line range outside the PR's diff.
src/types/errors.ts
Outdated
| export interface EvalErrorInfo { | ||
| type: string; | ||
| message: string; | ||
| code?: string; | ||
| providerId?: string; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate and conflicting EvalErrorInfo — unify to a single source of truth.
This interface conflicts with src/types/result.ts (different fields: providerId vs provider, missing hint/requestId/raw/stack/cause). This will cause drift and accidental misuse.
Apply:
export type ErrorCategory =
| 'invalid_api_key'
| 'quota_exceeded'
| 'rate_limited'
| 'not_found'
| 'timeout'
| 'internal'
| 'network';
-export interface EvalErrorInfo {
- type: string;
- message: string;
- code?: string;
- providerId?: string;
-}
+// Re-export the canonical shape to avoid divergence
+export type { EvalErrorInfo } from './result';📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| export interface EvalErrorInfo { | |
| type: string; | |
| message: string; | |
| code?: string; | |
| providerId?: string; | |
| } | |
| export type ErrorCategory = | |
| | 'invalid_api_key' | |
| | 'quota_exceeded' | |
| | 'rate_limited' | |
| | 'not_found' | |
| | 'timeout' | |
| | 'internal' | |
| | 'network'; | |
| // Re-export the canonical shape to avoid divergence | |
| export type { EvalErrorInfo } from './result'; |
🤖 Prompt for AI Agents
In src/types/errors.ts around lines 10-15, the EvalErrorInfo interface
duplicates and conflicts with the one in src/types/result.ts (mismatched field
names and missing fields); remove this duplicate and unify to a single source of
truth by importing and re-exporting the canonical interface from
src/types/result.ts (or move the canonical definition into a common file used by
both modules), update all local references to use the unified type, and ensure
the final interface includes the superset of fields (e.g., type, message, code,
provider/providerId, hint, requestId, raw, stack, cause) so no callers break.
src/lib/guardrails
Outdated
| @@ -0,0 +1,88 @@ | |||
| import { providerError } from './errors'; | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is missing an extension.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@will thanks for catching that, I cleaned it up and it’s ready for another look whenever you get a chance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no problem, I do what I can
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@will ha, sorry about that. I meant to tag @will-holley. thanks for catching that, I cleaned it up and it’s ready for another look whenever you get a chance
19a99f2 to
46498c5
Compare
46498c5 to
d861896
Compare
…rrel; align AppError.code
feat(errors): add structured error taxonomy & provider guardrails
Summary
Implements an error taxonomy and provider-agnostic guardrail utilities.
Provides consistent error handling and retry logic across all providers.
Details
ErrorCategoryandClassifiedErrortypes.withRetrieshelper and configurableGuardrailConfig.Key changes
provider_error— e.g., invalid API keyvalidation_error— bad YAML/config shapetimeout— request/provider timeouttool_error— local runtime/tooling issueAppErrorhelpers; errors now carry{ type, code, message, provider }used by CLI/JSON outputs.Tests
Verified locally on Node 20:
Cleanup & Follow-ups
Notes
npm rebuild better-sqlite3 --force).--strict-validateflag to fail on coercions or 0-test configs.