Conversation

@will-holley
Contributor

@will-holley will-holley commented Oct 23, 2025

Features

  • Regeneration
  • Handles single turn strategies
  • For strategies requiring config, validates config has been provided
  • Renders previews for image, audio, and video strategies
  • Handles multi-turn strategies

- Removes unreferenced values that the provider returned unnecessarily
- Adds support for generating w/ strategies
- Request body validation
- Fixes types
- Removes unreachable logic
- Adds type-support for nonce
@will-holley will-holley marked this pull request as ready for review October 23, 2025 18:47
@coderabbitai
Contributor

coderabbitai bot commented Oct 23, 2025

📝 Walkthrough

This pull request refactors the red team test case generation system to use structured plugin and strategy objects. It introduces a TestCaseGenerationProvider context wrapper around the main Strategies component, replaces loose string/config parameters with explicit {id, config} object shapes for plugins and strategies, converts type definitions in redteam/types.ts to zod-based schemas with validation, updates the /generate-test API endpoint to validate and process the new object formats, and integrates test case generation capabilities into individual strategy items with dedicated UI buttons. TestCaseDialog is updated to accept and display both plugin and strategy information. The backend applies strategy factories post-generation to transform test cases based on selected strategies.
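
As a rough illustration of the `{id, config}` shape described in the walkthrough, here is a minimal, dependency-free sketch. The PR itself is described as using zod-based schemas; the hand-rolled guard below is only a stand-in, and `KNOWN_STRATEGY_IDS`, `StrategyRef`, and `parseStrategyRef` are hypothetical names that do not come from the PR.

```typescript
// Hypothetical subset of strategy IDs, for illustration only.
const KNOWN_STRATEGY_IDS = ['base64', 'jailbreak', 'prompt-injection'] as const;
type KnownStrategyId = (typeof KNOWN_STRATEGY_IDS)[number];

interface StrategyRef {
  id: KnownStrategyId;
  config: Record<string, unknown>;
}

type ParseResult<T> = { ok: true; value: T } | { ok: false; error: string };

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Validate an untrusted request-body fragment into a structured { id, config } object.
function parseStrategyRef(input: unknown): ParseResult<StrategyRef> {
  if (!isPlainObject(input)) {
    return { ok: false, error: 'strategy must be an object' };
  }
  const knownId = KNOWN_STRATEGY_IDS.find((known) => known === input['id']);
  if (knownId === undefined) {
    return { ok: false, error: 'unknown strategy id' };
  }
  // A missing config is normalized to an empty object.
  let normalizedConfig: Record<string, unknown> = {};
  const rawConfig = input['config'];
  if (rawConfig !== undefined) {
    if (!isPlainObject(rawConfig)) {
      return { ok: false, error: 'strategy config must be an object' };
    }
    normalizedConfig = rawConfig;
  }
  return { ok: true, value: { id: knownId, config: normalizedConfig } };
}
```

Returning a result object instead of throwing is one possible design; it lets a route handler map `ok: false` to a 400 response without a try/catch around validation.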

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

This change involves heterogeneous modifications across multiple architectural layers. The PR introduces new public API signatures (TestCaseDialogProps, TestCaseGenerationContextValue, StrategyCardData.id), converts type definitions to zod-based schemas requiring validation logic review, substantially modifies the /generate-test endpoint to handle structured payloads and strategy application, and refactors component integration patterns with the new provider wrapper. While individual changes follow consistent patterns (structured objects replacing loose configs), the breadth of affected files, density of type system changes, and logic modifications in the backend endpoint require careful integration verification across the UI, context, type, and server layers.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage.
Description Check | ❓ Inconclusive | The pull request description is empty and therefore does not provide any information about the changeset. While the check is lenient and does not require extensive detail, a completely blank description neither confirms relatedness to the changeset nor provides any meaningful context. The vagueness of a missing description makes it impossible to definitively assess whether the author intended to document their changes, resulting in insufficient information to conclusively determine if this meets the pass criteria. |
✅ Passed checks (1 passed)
Check name | Status | Explanation
Title Check | ✅ Passed | The pull request title "feat(app): Red Team Strategy Test Generation" directly and clearly describes the main objective of this changeset. The PR introduces comprehensive test case generation capabilities for red team strategies across multiple components, including a new TestCaseGenerationProvider, updates to the /generate-test backend endpoint, and integration with individual strategy items. The title uses a conventional commit format, is concise (6 words), and avoids vague terms or noise. A teammate reviewing the repository history would immediately understand that this PR adds a feature for generating test cases within the red team strategy workflow.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
src/app/src/pages/redteam/setup/components/Targets/CustomPoliciesSection.tsx (1)

364-376: Missing dependency in useCallback

handleGenerateTestCase uses generateTestCase but doesn’t list it as a dependency. Add it to avoid stale closures.

   );
-    [toast],
+    [toast, generateTestCase],
   );
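
To illustrate why the missing dependency matters, here is a small sketch that does not use React at all. `createCallbackCache` is a hypothetical stand-in that mimics only the dependency-comparison behavior of `useCallback`; `simulateRenders` and `toastLike` are likewise invented for the example.

```typescript
type Deps = readonly unknown[];

// Minimal stand-in for React's useCallback (NOT React itself): it returns the
// previously cached function unless some dependency changed under Object.is.
function createCallbackCache<F>(): (fn: F, deps: Deps) => F {
  let cached: { fn: F; deps: Deps } | undefined;
  return (fn, deps) => {
    const previous = cached;
    const unchanged =
      previous !== undefined &&
      previous.deps.length === deps.length &&
      previous.deps.every((dep, i) => Object.is(dep, deps[i]));
    const entry = unchanged && previous !== undefined ? previous : { fn, deps };
    cached = entry;
    return entry.fn;
  };
}

type Handler = (pluginId: string) => string;

// Simulate two renders in which generateTestCase receives a new identity.
function simulateRenders(includeGenerateInDeps: boolean): string {
  const useCallbackLike = createCallbackCache<Handler>();
  const toastLike = (): void => {}; // stable across renders, like `toast`
  const render = (generateTestCase: Handler): Handler =>
    useCallbackLike(
      (pluginId) => generateTestCase(pluginId),
      includeGenerateInDeps ? [toastLike, generateTestCase] : [toastLike],
    );
  render((id) => `v1:${id}`); // first render
  const handler = render((id) => `v2:${id}`); // second render, new generateTestCase
  return handler('bola');
}
```

With only `[toastLike]` in the dependency list, the second render reuses the first closure and still calls the old `generateTestCase`; adding it to the list produces a fresh closure.
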
src/app/src/pages/redteam/setup/components/PluginsTab.tsx (1)

43-43: Use app ErrorBoundary component per guidelines

Swap react-error-boundary for @app/components/ErrorBoundary.

-import { ErrorBoundary } from 'react-error-boundary';
+import ErrorBoundary from '@app/components/ErrorBoundary';
-const ErrorFallback = ({ error }: { error: Error }) => (
-  <div role="alert">
-    <p>Something went wrong:</p>
-    <pre>{error.message}</pre>
-  </div>
-);
+// Removed: use app ErrorBoundary wrapper instead
-  return (
-    <ErrorBoundary FallbackComponent={ErrorFallback}>
+  return (
+    <ErrorBoundary>
       <Box sx={{ display: 'flex', gap: 3, alignItems: 'flex-start' }}>

As per coding guidelines.

Also applies to: 58-64, 396-399

src/server/routes/redteam.ts (1)

301-307: Sanitize logging in proxy route and avoid JSON.stringify of bodies.

Don't log raw request/response payloads. Use structured, minimal fields.

-logger.debug(
-  `Received ${task} task request: ${JSON.stringify({
-    method: req.method,
-    url: req.url,
-    body: req.body,
-  })}`,
-);
+logger.debug('Received task request', { task, method: req.method, url: req.url });
...
-const data = await response.json();
-logger.debug(`Received response from cloud function: ${JSON.stringify(data)}`);
-res.json(data);
+const data = await response.json();
+logger.debug('Received response from cloud function', { task, status: response.status });
+res.json({ success: true, data });

As per coding guidelines

Also applies to: 327-333
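
A minimal sketch of the "structured, minimal fields" idea follows. It assumes a logger whose `debug` accepts a message plus a context object, as in the suggested diff; `TaskRequestLike`, `StructuredLoggerLike`, and `buildTaskLogContext` are hypothetical names used only for illustration.

```typescript
interface TaskRequestLike {
  method: string;
  url: string;
  body: unknown; // may contain prompts, keys, or other sensitive payloads
}

interface TaskLogContext {
  task: string;
  method: string;
  url: string;
}

// Assumed logger shape: a message plus a context object (second argument).
interface StructuredLoggerLike {
  debug(message: string, context: object): void;
}

// Build the minimal context for the log line. The request body is deliberately
// left out so it can never end up serialized into the log output.
function buildTaskLogContext(task: string, req: TaskRequestLike): TaskLogContext {
  return { task, method: req.method, url: req.url };
}

function logTaskRequest(logger: StructuredLoggerLike, task: string, req: TaskRequestLike): void {
  logger.debug('Received task request', buildTaskLogContext(task, req));
}
```

Copying named fields into a fresh object, rather than deleting fields from the request, means a newly added sensitive field is excluded by default.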

src/app/src/pages/redteam/setup/components/TestCaseGenerationProvider.tsx (1)

92-108: Update response parsing to match ApiResponse standard in two locations.

The /redteam/generate-test endpoint (and /providers/test as noted) currently return non-standard response formats. Per the guideline, all server responses must use the ApiResponse shape: { success: boolean, data?: T, error?: string }. Client code must be updated to parse this format.

-const data = await response.json();
-
-if (data.error) {
-  throw new Error(data.error);
-}
-
-const testCase: GeneratedTestCase = {
-  prompt: data.prompt,
-  context: data.context,
-  metadata: data.metadata,
-};
+const api = await response.json();
+if (!api.success) {
+  throw new Error(api.error || 'Failed to generate test case');
+}
+const { prompt, context, metadata } = api.data || {};
+const testCase: GeneratedTestCase = { prompt, context, metadata };

Apply the same change to lines 117-123 (/providers/test call).
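
The parsing step could also be factored into a small helper. This is a sketch under the `ApiResponse` shape quoted above (`{ success, data?, error? }`); `unwrapApiResponse` and `GeneratedTestCaseLike` are hypothetical names, not part of the PR.

```typescript
// Shape taken from the guideline quoted in the comment above.
interface ApiResponse<T> {
  success: boolean;
  data?: T;
  error?: string;
}

// Hypothetical stand-in for the client's GeneratedTestCase type.
interface GeneratedTestCaseLike {
  prompt: string;
  context?: string;
  metadata?: Record<string, unknown>;
}

// Return the payload of a successful response, or throw with the server's
// error message (falling back to a caller-supplied message).
function unwrapApiResponse<T>(api: ApiResponse<T>, fallbackMessage: string): T {
  if (!api.success) {
    throw new Error(api.error || fallbackMessage);
  }
  if (api.data === undefined) {
    throw new Error('Response was marked successful but carried no data');
  }
  return api.data;
}
```

Unlike the `api.data || {}` fallback in the suggested diff, this variant throws when a successful response has no payload, so a test case with an undefined `prompt` is never constructed. Whether that stricter behavior is wanted is a judgment call.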

🧹 Nitpick comments (13)
test/index.test.ts (1)

71-75: Add global mock cleanup after each test

Ensure isolation across suites. Add a top‑level afterEach to reset/restore all mocks.

 jest.mock('../src/util/file');
 
+// Global mock cleanup for this suite
+afterEach(() => {
+  jest.resetAllMocks();
+  jest.restoreAllMocks();
+});
+
 describe('index.ts exports', () => {

As per coding guidelines.

src/app/src/pages/redteam/setup/components/strategies/types.ts (2)

8-11: Align selectedStrategy type with Strategy

Use Strategy | null for consistency with StrategyCardData.id and downstream usage.

 export interface ConfigDialogState {
   isOpen: boolean;
-  selectedStrategy: string | null;
+  selectedStrategy: Strategy | null;
 }

21-31: Tighten preset strategy typings

Prefer Strategy over string for preset definitions to catch typos at compile time.

 export interface StrategyPreset {
   name: string;
   description: string;
-  strategies: readonly string[];
+  strategies: readonly Strategy[];
   options?: {
     multiTurn?: {
       label: string;
-      strategies: readonly string[];
+      strategies: readonly Strategy[];
     };
   };
 }
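
For context, a short sketch of how a literal-union type catches preset typos at compile time. The ID list here is a hypothetical subset, and `StrategyId`, `StrategyPresetSketch`, and `isStrategyId` are illustrative names rather than the repository's actual `Strategy` type.

```typescript
// Hypothetical subset of strategy IDs, for illustration only.
const STRATEGY_ID_LIST = ['base64', 'jailbreak', 'prompt-injection'] as const;
type StrategyId = (typeof STRATEGY_ID_LIST)[number];

interface StrategyPresetSketch {
  name: string;
  description: string;
  strategies: readonly StrategyId[];
}

const quickPreset: StrategyPresetSketch = {
  name: 'Quick',
  description: 'A small illustrative preset',
  // Writing 'jailbrake' here would be a compile-time error, because it is
  // not a member of StrategyId. With `readonly string[]` it would compile.
  strategies: ['base64', 'jailbreak'],
};

// Runtime guard for values that arrive as plain strings (e.g. parsed JSON),
// where the compiler cannot help.
function isStrategyId(value: string): value is StrategyId {
  return STRATEGY_ID_LIST.some((known) => known === value);
}
```
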
src/app/src/pages/redteam/setup/components/PluginsTab.tsx (2)

367-388: Add generateTestCase to deps

handleGenerateTestCase references generateTestCase; include it in dependencies.

-  }, [pluginConfig]);
+  }, [pluginConfig, generateTestCase]);

656-661: Fix tooltip typo

“reqiures” → “requires”.

-                          ? 'This plugin reqiures remote generation'
+                          ? 'This plugin requires remote generation'
src/app/src/pages/redteam/setup/components/Strategies.tsx (1)

451-457: Use anchor for external link

Prefer a plain anchor (or MUI Link with component="a") for external URLs to avoid router overhead.

-            <RouterLink
-              style={{ textDecoration: 'underline' }}
-              to="https://www.promptfoo.dev/docs/red-team/strategies/"
-              target="_blank"
-            >
+            <a
+              href="https://www.promptfoo.dev/docs/red-team/strategies/"
+              target="_blank"
+              rel="noreferrer"
+              style={{ textDecoration: 'underline' }}
+            >
               Learn More
-            </RouterLink>
+            </a>
src/app/src/pages/redteam/setup/components/strategies/StrategyItem.tsx (2)

25-27: Add display-name fallback for tooltip

pluginDisplayNames may be undefined; fall back to plugin id to avoid “undefined”.

-const TEST_GENERATION_PLUGIN_DISPLAY_NAME = pluginDisplayNames[TEST_GENERATION_PLUGIN as Plugin];
+const TEST_GENERATION_PLUGIN_DISPLAY_NAME =
+  pluginDisplayNames[TEST_GENERATION_PLUGIN as Plugin] ?? TEST_GENERATION_PLUGIN;

15-17: Disable generate when Cloud not connected; limit disable to current item

Match PluginsTab/CustomPolicies behavior: disable when apiHealthStatus !== 'connected', and only disable the current item during generation.

-import { type StrategyConfig, type RedteamStrategyObject } from '@promptfoo/redteam/types';
+import { type StrategyConfig, type RedteamStrategyObject } from '@promptfoo/redteam/types';
+import { useApiHealth } from '@app/hooks/useApiHealth';
 export function StrategyItem({
@@
 }: StrategyItemProps) {
   const { config } = useRedTeamConfig();
+  const {
+    data: { status: apiHealthStatus },
+  } = useApiHealth();
         <TestCaseGenerateButton
           onClick={handleTestCaseGeneration}
-          disabled={isDisabled || generatingTestCase}
-          isGenerating={generatingTestCase && currentStrategy === strategy.id}
+          disabled={
+            isDisabled ||
+            apiHealthStatus !== 'connected' ||
+            (generatingTestCase && currentStrategy === strategy.id)
+          }
+          isGenerating={generatingTestCase && currentStrategy === strategy.id}

Also applies to: 22-24, 192-201
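
The button-state logic in the suggestion can be read as a pure function, sketched below. The field names mirror the identifiers in the diff, but `GenerateButtonInputs` and `deriveGenerateButtonFlags` are hypothetical and the component itself is not reproduced.

```typescript
interface GenerateButtonInputs {
  isDisabled: boolean;
  apiHealthStatus: string | undefined;
  generatingTestCase: boolean;
  currentStrategy: string | null;
  strategyId: string;
}

interface GenerateButtonFlags {
  disabled: boolean;
  isGenerating: boolean;
}

// Disable when the item is disabled, when the API is not connected, or when
// this specific item is the one currently generating. Other items stay
// enabled while a different strategy is generating.
function deriveGenerateButtonFlags(inputs: GenerateButtonInputs): GenerateButtonFlags {
  const isGenerating = inputs.generatingTestCase && inputs.currentStrategy === inputs.strategyId;
  const disabled = inputs.isDisabled || inputs.apiHealthStatus !== 'connected' || isGenerating;
  return { disabled, isGenerating };
}
```

Keeping the derivation outside the component makes the three disable conditions easy to unit test without rendering.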

src/server/routes/redteam.ts (1)

107-110: Make injectVar configurable; avoid hardcoding 'query'.

Let plugin.config.indirectInjectionVar override injectVar and use the same for prompt extraction.

-// TODO: Add support for this? Was previously misconfigured such that the no value would ever
-// be passed in as a configuration option.
-const injectVar = 'query';
+// Prefer plugin-specified variable when provided (e.g., indirect injection targets)
+const injectVar = plugin.config.indirectInjectionVar || 'query';
...
-const generatedPrompt = testCase.vars?.[injectVar] || 'Unable to extract test prompt';
+const generatedPrompt = testCase?.vars?.[injectVar] ?? 'Unable to extract test prompt';

Also applies to: 160-162
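
Note that the suggested diff uses `||` for the variable name but `??` for the extracted prompt, and the two operators differ on empty strings. A minimal sketch, with hypothetical `PluginConfigLike`, `TestCaseLike`, `resolveInjectVar`, and `extractPrompt` names:

```typescript
interface PluginConfigLike {
  indirectInjectionVar?: string;
}

interface TestCaseLike {
  vars?: Record<string, string | undefined>;
}

const DEFAULT_INJECT_VAR = 'query';
const MISSING_PROMPT_MESSAGE = 'Unable to extract test prompt';

// `||` treats an empty string as "not provided", so '' falls back to 'query'.
function resolveInjectVar(config: PluginConfigLike | undefined): string {
  return config?.indirectInjectionVar || DEFAULT_INJECT_VAR;
}

// `??` only falls back on null/undefined, so an empty-string prompt is kept.
function extractPrompt(testCase: TestCaseLike | undefined, injectVar: string): string {
  return testCase?.vars?.[injectVar] ?? MISSING_PROMPT_MESSAGE;
}
```

If an empty generated prompt should also count as a failure, `||` would be the operator to keep on the extraction line.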

src/app/src/pages/redteam/setup/components/TestCaseDialog.tsx (2)

68-73: Simplify plugin/strategy name derivation.

Types are already Plugin | null and Strategy | null; the typeof checks are redundant.

-const pluginName = typeof plugin === 'string' ? plugin : plugin || '';
+const pluginName = plugin ?? '';
...
-const strategyName = typeof strategy === 'string' ? strategy : strategy || '';
+const strategyName = strategy ?? '';

Also applies to: 74-76


83-86: Optional: handle absent strategy gracefully in title.

If strategy is null, omit the “/ …” suffix.

-? `Generated Test Case for ${pluginDisplayName} / ${strategyDisplayName}`
+? `Generated Test Case for ${pluginDisplayName}${strategy ? ` / ${strategyDisplayName}` : ''}`
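
The same title logic, pulled out as a standalone function for readability. `buildTestCaseTitle` is a hypothetical helper; the dialog's surrounding ternary is not shown in the diff and is not reproduced here.

```typescript
// Build the dialog title, omitting the " / strategy" suffix when no
// strategy display name is available.
function buildTestCaseTitle(pluginDisplayName: string, strategyDisplayName: string | null): string {
  const suffix = strategyDisplayName ? ` / ${strategyDisplayName}` : '';
  return `Generated Test Case for ${pluginDisplayName}${suffix}`;
}
```
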
src/redteam/types.ts (1)

50-56: Fix swapped BFLA/BOLA comments to avoid confusion.

Comments appear reversed relative to usage in UI/server.

-// BOLA
-targetIdentifiers: z.array(z.string()).optional(),
-// BFLA
-targetSystems: z.array(z.string()).optional(),
+// BFLA (function names/endpoints)
+targetIdentifiers: z.array(z.string()).optional(),
+// BOLA (object/system identifiers)
+targetSystems: z.array(z.string()).optional(),
src/app/src/pages/redteam/setup/components/TestCaseGenerationProvider.tsx (1)

29-36: Reuse existing public types instead of introducing new ones.

Prefer RedteamPluginObject/RedteamStrategyObject to avoid parallel “Target*” types.

-import type { PluginConfig, StrategyConfig } from '@promptfoo/redteam/types';
+import type {
+  RedteamPluginObject as TargetPlugin,
+  RedteamStrategyObject as TargetStrategy,
+} from '@promptfoo/redteam/types';

Adjust the generateTestCase signature accordingly (no behavior change).

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eb8d7af and 06290a1.

📒 Files selected for processing (10)
  • src/app/src/pages/redteam/setup/components/PluginsTab.tsx (1 hunks)
  • src/app/src/pages/redteam/setup/components/Strategies.tsx (2 hunks)
  • src/app/src/pages/redteam/setup/components/Targets/CustomPoliciesSection.tsx (1 hunks)
  • src/app/src/pages/redteam/setup/components/TestCaseDialog.tsx (7 hunks)
  • src/app/src/pages/redteam/setup/components/TestCaseGenerationProvider.tsx (7 hunks)
  • src/app/src/pages/redteam/setup/components/strategies/StrategyItem.tsx (5 hunks)
  • src/app/src/pages/redteam/setup/components/strategies/types.ts (1 hunks)
  • src/redteam/types.ts (3 hunks)
  • src/server/routes/redteam.ts (2 hunks)
  • test/index.test.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (13)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/gh-cli-workflow.mdc)

Prefer not to introduce new TypeScript types; use existing interfaces whenever possible

**/*.{ts,tsx}: Use TypeScript with strict type checking
Follow consistent import order (Biome will handle import sorting)
Use consistent curly braces for all control statements
Prefer const over let; avoid var
Use object shorthand syntax whenever possible
Use async/await for asynchronous code
Use consistent error handling with proper type checks
Always sanitize sensitive data before logging
Use structured logger methods (debug/info/warn/error) with a context object instead of interpolating secrets into log strings
Use sanitizeObject for manual sanitization in non-logging contexts before persisting or further processing data

Files:

  • src/app/src/pages/redteam/setup/components/Targets/CustomPoliciesSection.tsx
  • test/index.test.ts
  • src/server/routes/redteam.ts
  • src/app/src/pages/redteam/setup/components/strategies/StrategyItem.tsx
  • src/app/src/pages/redteam/setup/components/TestCaseGenerationProvider.tsx
  • src/app/src/pages/redteam/setup/components/strategies/types.ts
  • src/app/src/pages/redteam/setup/components/TestCaseDialog.tsx
  • src/app/src/pages/redteam/setup/components/PluginsTab.tsx
  • src/app/src/pages/redteam/setup/components/Strategies.tsx
  • src/redteam/types.ts
src/app/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (src/app/CLAUDE.md)

src/app/src/**/*.{ts,tsx}: Never use fetch() directly; always use callApi() from @app/utils/api for all HTTP requests
Access Zustand state outside React components via store.getState(); do not call hooks outside components
Use the @app/* path alias for internal imports as configured in Vite

Files:

  • src/app/src/pages/redteam/setup/components/Targets/CustomPoliciesSection.tsx
  • src/app/src/pages/redteam/setup/components/strategies/StrategyItem.tsx
  • src/app/src/pages/redteam/setup/components/TestCaseGenerationProvider.tsx
  • src/app/src/pages/redteam/setup/components/strategies/types.ts
  • src/app/src/pages/redteam/setup/components/TestCaseDialog.tsx
  • src/app/src/pages/redteam/setup/components/PluginsTab.tsx
  • src/app/src/pages/redteam/setup/components/Strategies.tsx
src/app/src/{components,pages}/**/*.tsx

📄 CodeRabbit inference engine (src/app/CLAUDE.md)

src/app/src/{components,pages}/**/*.tsx: Use the class-based ErrorBoundary component (@app/components/ErrorBoundary) to wrap error-prone UI
Access theme via useTheme() from @mui/material/styles instead of hardcoding theme values
Use useMemo/useCallback only when profiling indicates benefit; avoid unnecessary memoization
Implement explicit loading and error states for components performing async operations
Prefer MUI composition and the sx prop for styling over ad-hoc inline styles

Files:

  • src/app/src/pages/redteam/setup/components/Targets/CustomPoliciesSection.tsx
  • src/app/src/pages/redteam/setup/components/strategies/StrategyItem.tsx
  • src/app/src/pages/redteam/setup/components/TestCaseGenerationProvider.tsx
  • src/app/src/pages/redteam/setup/components/TestCaseDialog.tsx
  • src/app/src/pages/redteam/setup/components/PluginsTab.tsx
  • src/app/src/pages/redteam/setup/components/Strategies.tsx
**/*.{tsx,jsx}

📄 CodeRabbit inference engine (.cursor/rules/react-components.mdc)

**/*.{tsx,jsx}: Use icons from @mui/icons-material
Prefer commonly used icons from @mui/icons-material for intuitive experience

Files:

  • src/app/src/pages/redteam/setup/components/Targets/CustomPoliciesSection.tsx
  • src/app/src/pages/redteam/setup/components/strategies/StrategyItem.tsx
  • src/app/src/pages/redteam/setup/components/TestCaseGenerationProvider.tsx
  • src/app/src/pages/redteam/setup/components/TestCaseDialog.tsx
  • src/app/src/pages/redteam/setup/components/PluginsTab.tsx
  • src/app/src/pages/redteam/setup/components/Strategies.tsx
src/app/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

src/app/**/*.{ts,tsx}: In the React app (src/app), always use callApi from @app/utils/api for API calls instead of fetch()
React hooks: use useMemo for computed values and useCallback for functions that accept arguments

Files:

  • src/app/src/pages/redteam/setup/components/Targets/CustomPoliciesSection.tsx
  • src/app/src/pages/redteam/setup/components/strategies/StrategyItem.tsx
  • src/app/src/pages/redteam/setup/components/TestCaseGenerationProvider.tsx
  • src/app/src/pages/redteam/setup/components/strategies/types.ts
  • src/app/src/pages/redteam/setup/components/TestCaseDialog.tsx
  • src/app/src/pages/redteam/setup/components/PluginsTab.tsx
  • src/app/src/pages/redteam/setup/components/Strategies.tsx
**/*.{test,spec}.{js,ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/gh-cli-workflow.mdc)

Avoid disabling or skipping tests unless absolutely necessary and documented

Files:

  • test/index.test.ts
test/**/*.{test,spec}.ts

📄 CodeRabbit inference engine (.cursor/rules/jest.mdc)

test/**/*.{test,spec}.ts: Mock as few functions as possible to keep tests realistic
Never increase the function timeout - fix the test instead
Organize tests in descriptive describe and it blocks
Prefer assertions on entire objects rather than individual keys when writing expectations
Clean up after tests to prevent side effects (e.g., use afterEach(() => { jest.resetAllMocks(); }))
Run tests with --randomize flag to ensure your mocks setup and teardown don't affect other tests
Use Jest's mocking utilities rather than complex custom mocks
Prefer shallow mocking over deep mocking
Mock external dependencies but not the code being tested
Reset mocks between tests to prevent test pollution
For database tests, use in-memory instances or proper test fixtures
Test both success and error cases for each provider
Mock API responses to avoid external dependencies in tests
Validate that provider options are properly passed to the underlying service
Test error handling and edge cases (rate limits, timeouts, etc.)
Ensure provider caching behaves as expected
Always include both --coverage and --randomize flags when running tests
Run tests in a single pass (no watch mode for CI)
Ensure all tests are independent and can run in any order
Clean up any test data or mocks after each test

Files:

  • test/index.test.ts
test/**/*.test.ts

📄 CodeRabbit inference engine (test/CLAUDE.md)

test/**/*.test.ts: Never increase Jest test timeouts; fix slow tests instead (avoid jest.setTimeout or large timeouts in tests)
Do not use .only() or .skip() in committed tests
Add afterEach(() => { jest.resetAllMocks(); }) to ensure mock cleanup
Prefer asserting entire objects (toEqual on whole result) rather than individual fields
Mock minimally: only external dependencies (APIs, databases), not code under test
Use Jest (not Vitest) APIs in this suite; avoid importing vitest
Import from @jest/globals in tests

Files:

  • test/index.test.ts
test/**

📄 CodeRabbit inference engine (test/CLAUDE.md)

Organize tests to mirror src/ structure (e.g., test/providers → src/providers, test/redteam → src/redteam)

Files:

  • test/index.test.ts
test/**/*.{test,spec}.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

test/**/*.{test,spec}.{ts,tsx}: Follow Jest best practices with describe/it blocks
Write tests that cover both success and error cases for all functionality

Files:

  • test/index.test.ts
src/server/**/*.ts

📄 CodeRabbit inference engine (src/server/CLAUDE.md)

src/server/**/*.ts: Sanitize all logged request/response data. Do not stringify or interpolate req/res directly; pass structured objects to the logger so sensitive fields are auto-redacted.
Use Drizzle ORM (with schema from src/database/schema.ts and helpers like eq) for database access instead of raw SQL.
Implement the server using Express 5 APIs for HTTP handling.

Files:

  • src/server/routes/redteam.ts
src/server/routes/**/*.ts

📄 CodeRabbit inference engine (src/server/CLAUDE.md)

src/server/routes/**/*.ts: Always validate request bodies with Zod schemas before processing route handlers.
Wrap all HTTP responses in the standard ApiResponse shape: { success, data? } on success and { success: false, error } on failure.
Use try/catch in route handlers; log errors and return 400 for validation errors and 500 for unexpected errors.
Use appropriate HTTP status codes (200, 201, 400, 404, 500) for API responses.
Organize API endpoint handlers in src/server/routes (e.g., routes/eval.ts, routes/config.ts, routes/results.ts, routes/share.ts).

Files:

  • src/server/routes/redteam.ts
src/redteam/**/*.ts

📄 CodeRabbit inference engine (src/redteam/CLAUDE.md)

src/redteam/**/*.ts: Always sanitize when logging test prompts or model outputs by passing them via the structured metadata parameter (second argument) to the logger, not raw string interpolation
Use the standardized risk severity levels: critical, high, medium, low when reporting results

Files:

  • src/redteam/types.ts
🧠 Learnings (4)
📚 Learning: 2025-10-05T16:59:20.507Z
Learnt from: CR
PR: promptfoo/promptfoo#0
File: src/redteam/CLAUDE.md:0-0
Timestamp: 2025-10-05T16:59:20.507Z
Learning: Applies to src/redteam/test/redteam/**/*.ts : Add tests for new plugins under test/redteam/

Applied to files:

  • src/server/routes/redteam.ts
  • src/app/src/pages/redteam/setup/components/PluginsTab.tsx
📚 Learning: 2025-10-05T16:59:20.507Z
Learnt from: CR
PR: promptfoo/promptfoo#0
File: src/redteam/CLAUDE.md:0-0
Timestamp: 2025-10-05T16:59:20.507Z
Learning: Applies to src/redteam/plugins/**/*.ts : Place vulnerability-specific test generators as plugins under src/redteam/plugins/ (e.g., pii.ts, harmful.ts, sql-injection.ts)

Applied to files:

  • src/server/routes/redteam.ts
  • src/app/src/pages/redteam/setup/components/PluginsTab.tsx
📚 Learning: 2025-10-05T16:59:20.507Z
Learnt from: CR
PR: promptfoo/promptfoo#0
File: src/redteam/CLAUDE.md:0-0
Timestamp: 2025-10-05T16:59:20.507Z
Learning: Applies to src/redteam/plugins/**/*.ts : New plugins must implement the RedteamPluginObject interface

Applied to files:

  • src/app/src/pages/redteam/setup/components/PluginsTab.tsx
  • src/redteam/types.ts
📚 Learning: 2025-10-05T16:59:20.507Z
Learnt from: CR
PR: promptfoo/promptfoo#0
File: src/redteam/CLAUDE.md:0-0
Timestamp: 2025-10-05T16:59:20.507Z
Learning: Applies to src/redteam/strategies/**/*.ts : Store attack transformation strategies under src/redteam/strategies/ (e.g., jailbreak.ts, prompt-injection.ts, base64.ts)

Applied to files:

  • src/app/src/pages/redteam/setup/components/Strategies.tsx
🧬 Code graph analysis (5)
src/server/routes/redteam.ts (5)
src/redteam/constants/plugins.ts (2)
  • ALL_PLUGINS (355-362)
  • REDTEAM_MODEL (9-9)
src/redteam/types.ts (2)
  • PluginConfigSchema (33-72)
  • StrategyConfigSchema (76-84)
src/redteam/constants/strategies.ts (1)
  • ALL_STRATEGIES (93-93)
src/redteam/strategies/index.ts (2)
  • Strategy (38-38)
  • Strategies (40-320)
src/redteam/plugins/index.ts (2)
  • Plugins (345-350)
  • PluginFactory (61-65)
src/app/src/pages/redteam/setup/components/strategies/StrategyItem.tsx (5)
site/docs/_shared/data/plugins.ts (1)
  • Plugin (70-80)
src/app/src/pages/redteam/setup/hooks/useRedTeamConfig.ts (1)
  • useRedTeamConfig (260-341)
src/app/src/pages/redteam/setup/components/TestCaseGenerationProvider.tsx (1)
  • useTestCaseGeneration (218-224)
src/redteam/types.ts (2)
  • RedteamStrategyObject (114-117)
  • StrategyConfig (86-86)
src/app/src/pages/redteam/setup/components/TestCaseDialog.tsx (1)
  • TestCaseGenerateButton (424-449)
src/app/src/pages/redteam/setup/components/TestCaseGenerationProvider.tsx (3)
src/redteam/types.ts (2)
  • PluginConfig (74-74)
  • StrategyConfig (86-86)
src/app/src/pages/redteam/setup/types.ts (1)
  • Config (27-41)
src/app/src/pages/redteam/setup/components/TestCaseDialog.tsx (1)
  • TestCaseDialog (49-422)
src/app/src/pages/redteam/setup/components/TestCaseDialog.tsx (1)
src/redteam/constants/metadata.ts (2)
  • displayNameOverrides (160-295)
  • categoryAliases (574-679)
src/app/src/pages/redteam/setup/components/Strategies.tsx (1)
src/app/src/pages/redteam/setup/components/TestCaseGenerationProvider.tsx (1)
  • TestCaseGenerationProvider (60-216)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Test on Node 24.x and ubuntu-latest
  • GitHub Check: Test on Node 22.x and macOS-latest
  • GitHub Check: Test on Node 24.x and windows-latest
  • GitHub Check: Test on Node 22.x and windows-latest
  • GitHub Check: Test on Node 20.x and windows-latest
  • GitHub Check: webui tests
  • GitHub Check: Tusk Tester
🔇 Additional comments (5)
test/index.test.ts (1)

101-116: LGTM: new schema exports covered

Adding PluginConfigSchema and StrategyConfigSchema to expectedSchemaExports looks correct and aligned with the new public API.

src/app/src/pages/redteam/setup/components/strategies/types.ts (1)

1-6: Type-narrowing for id is good

Using Strategy for StrategyCardData.id improves safety and autocomplete. No issues.

src/app/src/pages/redteam/setup/components/PluginsTab.tsx (1)

377-384: LGTM: structured generation call

Passing { id, config } for plugin and strategy matches the new provider API.

src/app/src/pages/redteam/setup/components/Strategies.tsx (1)

36-37: LGTM: Provider integration

Wrapping the Strategies UI with TestCaseGenerationProvider is correct and keeps generation dialog in scope.

Also applies to: 531-647

src/app/src/pages/redteam/setup/components/TestCaseGenerationProvider.tsx (1)

77-90: Include strategy in telemetry and keep IDs only (good).

This change looks correct and helpful for analytics.

@will-holley will-holley self-assigned this Oct 23, 2025
@will-holley will-holley requested a review from mldangelo October 30, 2025 19:01
@use-tusk
Contributor

use-tusk bot commented Oct 31, 2025

❌ Generated 52 tests - 42 passed, 10 failed (4160685) View tests ↗

Test Summary

  • PluginsTab - 10 ✅
  • Strategies - 2 ✅, 3 ❌
  • TestCaseDialog - 11 ✅, 2 ❌
  • TestCaseGenerateButton - 4 ✅, 1 ❌
  • TestCaseGenerationProvider - 11 ✅, 3 ❌
  • StrategyItem - 4 ✅, 1 ❌

Results

Tusk's tests show solid coverage of the core red team strategy test generation features, with 42 passing tests validating critical workflows like plugin selection, strategy configuration, test case generation, and error handling. However, 10 failing tests reveal gaps in edge case handling and error scenarios that need attention before this PR is production-ready. The failures cluster around three areas: incomplete null/undefined checks in the Strategies and TestCaseGenerationProvider components, missing error display logic in TestCaseDialog, and a disabled state issue in StrategyItem. These aren't blockers for the main feature but represent robustness issues that could surface in real usage.

Key Points

  • TestCaseGenerationProvider missing null guards: The generation effect doesn't validate that plugin and strategy are defined before making API calls, risking runtime errors or invalid requests. Add stricter precondition checks before triggering generation.

  • TestCaseDialog error handling incomplete: When targetResponse.error exists, the component still displays output instead of the error message. Users won't see failure reasons when tests fail. Prioritize error display over output in the Target Response section.

  • Strategies component crashes on null config.target: The handleStatefulChange function spreads config.target without null checks, causing TypeErrors during initialization or state transitions. Add optional chaining to safely handle missing target configurations.

  • Strategies doesn't validate strategy IDs: Invalid or null strategy IDs in config aren't filtered before determining checked state, potentially breaking the UI. Filter out invalid IDs before rendering checkboxes.

  • Strategies missing stateful mode validation: No warning when stateful mode is enabled with zero strategies selected, allowing misconfiguration. Add validation to warn users about this invalid state.

  • TestCaseGenerateButton unsafe API health access: Destructuring useApiHealth data without null checks causes crashes when the hook returns undefined. Use optional chaining to safely access data?.status.

  • StrategyItem test generation button not disabled for remote-disabled strategies: The button doesn't respect isRemoteGenerationDisabled, allowing users to attempt generation when remote is unavailable. Disable the button and show appropriate tooltip when remote generation is disabled.

  • TestCaseGenerationProvider missing error toast on target execution failure: When test execution fails due to invalid target config, no error toast is shown to the user. Add error notification in the catch block for test execution.
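
Two of the guards named above (safe access to the API health status and a null-safe update of `config.target`) might look roughly like this. It is a sketch only: `ApiHealthResultLike`, `RedTeamConfigLike`, `isCloudConnected`, and `withStateful` are hypothetical names, and the real hook and config types may differ.

```typescript
interface ApiHealthResultLike {
  data?: { status?: string };
}

// Optional chaining: an undefined hook result or missing data yields false
// instead of throwing.
function isCloudConnected(health: ApiHealthResultLike | undefined): boolean {
  return health?.data?.status === 'connected';
}

interface TargetLike {
  id?: string;
  config?: Record<string, unknown>;
}

interface RedTeamConfigLike {
  target?: TargetLike | null;
}

// Null-safe version of a "set stateful" update: a missing target or a missing
// target.config is treated as an empty object rather than dereferenced.
function withStateful(config: RedTeamConfigLike, stateful: boolean): RedTeamConfigLike {
  const target: TargetLike = config.target ?? {};
  return {
    ...config,
    target: { ...target, config: { ...(target.config ?? {}), stateful } },
  };
}
```
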


Symbols Not Exported

The following symbols were skipped because they are not exported and therefore cannot be tested. Export these symbols if you want them to be tested:

  • src/app/src/pages/redteam/setup/components/TestCaseDialog.tsx - (Section)
  • src/app/src/pages/redteam/setup/components/strategies/StrategyItem.tsx - (isRequiredConfigMissing)
Existing test issues

While performing the check, Tusk fixed the following test files:
  • src/app/src/pages/redteam/setup/components/strategies/StrategyItem.test.tsx: The test file was missing a QueryClientProvider which caused react-query hooks to fail; the test setup was updated to include the provider, resolving the error.

Tests generated in these files will include these fixes. If you wish to only commit these fixes, you can do so in the Tusk app.

Please contact Support if you have questions.

View check history

Commit | Status | Output | Created (UTC)
f749e75 | ⏩ No tests generated | Output | Oct 23, 2025 5:52PM
06290a1 | ⏩ Skipped due to new commit on branch | Output | Oct 23, 2025 6:46PM
1d20935 | ⏩ Skipped due to new commit on branch | Output | Oct 23, 2025 6:54PM
456e20c | ⏩ Skipped due to new commit on branch | Output | Oct 23, 2025 6:56PM
fc981ad | ⏩ Skipped due to new commit on branch | Output | Oct 23, 2025 7:20PM
3d698e7 | ❌ Generated 23 tests - 22 passed, 1 failed | Tests | Oct 23, 2025 7:23PM
33577c3 | ⏩ Skipped due to new commit on branch | Output | Oct 24, 2025 2:22PM
3e72e4d | ✅ Generated 20 tests - 20 passed | Tests | Oct 24, 2025 2:28PM
c9a91a1 | ✅ Generated 22 tests - 22 passed | Tests | Oct 24, 2025 5:00PM
2d8347a | ✅ Generated 29 tests - 29 passed | Tests | Oct 27, 2025 5:53PM
ef0c43b | ❌ Generated 37 tests - 36 passed, 1 failed | Tests | Oct 30, 2025 5:13PM
b653061 | ⏩ Skipped due to new commit on branch | Output | Oct 30, 2025 7:00PM
e8ca6a5 | ⏩ Skipped due to new commit on branch | Output | Oct 30, 2025 7:03PM
6ae7f48 | 🔄 Running Tusk Tester | Output | Oct 30, 2025 7:05PM
53e7ec5 | ⏩ Skipped due to new commit on branch | Output | Oct 30, 2025 8:02PM
6ec161c | 🔄 Running Tusk Tester | Output | Oct 30, 2025 8:09PM
4160685 | ❌ Generated 52 tests - 42 passed, 10 failed | Tests | Oct 31, 2025 4:11PM

View output in GitHub ↗

2 participants