@typpo typpo commented Nov 4, 2025

No description provided.

coderabbitai bot commented Nov 4, 2025

📝 Walkthrough

This PR promotes the 'jailbreak:meta' strategy from an additional strategy to a default strategy. The changes include: moving 'jailbreak:meta' from ADDITIONAL_STRATEGIES to DEFAULT_STRATEGIES, moving 'jailbreak' from DEFAULT_STRATEGIES to ADDITIONAL_STRATEGIES, updating the metadata description for the meta-agent strategy, and recording the change in CHANGELOG.md.
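
The swap summarized above can be pictured with a minimal sketch. This is not the project's code: `swapDefault` is a hypothetical helper, and `'basic'` and `'base64'` are placeholder entries standing in for whatever else the real arrays in src/redteam/constants/strategies.ts contain. Only `'jailbreak'` and `'jailbreak:meta'` come from this PR.

```typescript
// Illustrative sketch only. 'basic' and 'base64' are placeholder entries;
// the real strategy lists are assumed to be longer.
interface StrategyLists {
  defaults: string[];
  additional: string[];
}

// Hypothetical helper: promote one strategy into the defaults and demote
// another into the additional list, returning new arrays.
function swapDefault(lists: StrategyLists, promote: string, demote: string): StrategyLists {
  return {
    defaults: lists.defaults.filter((id) => id !== demote).concat(promote),
    additional: lists.additional.filter((id) => id !== promote).concat(demote),
  };
}

const before: StrategyLists = {
  defaults: ['basic', 'jailbreak'],
  additional: ['base64', 'jailbreak:meta'],
};

const after = swapDefault(before, 'jailbreak:meta', 'jailbreak');
console.log(after.defaults); // ['basic', 'jailbreak:meta']
console.log(after.additional); // ['base64', 'jailbreak']
```

In the PR itself the change is a direct edit of the two exported constants; the helper only makes the before and after states explicit.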

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Changes are localized to three files with consistent, homogeneous modifications across strategy constant updates
  • No complex logic or control flow alterations; only exported constant value changes
  • Primary concern: verify downstream code that references DEFAULT_STRATEGIES and ADDITIONAL_STRATEGIES handles the strategy swap correctly and that related tests reflect the new default behavior
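
One way to act on the last point is an invariant check that no strategy ID ends up in both lists after the swap. The sketch below is an assumption about how such a check could look; `findOverlap` is a hypothetical helper and the list contents are placeholders, not the project's real constants.

```typescript
// Hypothetical helper: return every ID that appears in both lists.
function findOverlap(a: readonly string[], b: readonly string[]): string[] {
  return a.filter((id) => b.indexOf(id) !== -1);
}

// Placeholder lists mirroring the state after this PR.
const defaultStrategies = ['basic', 'jailbreak:meta'];
const additionalStrategies = ['base64', 'jailbreak'];

const overlap = findOverlap(defaultStrategies, additionalStrategies);
if (overlap.length > 0) {
  throw new Error('Strategies listed as both default and additional: ' + overlap.join(', '));
}
```

A check like this would fail if the promotion were applied to one list but not the other.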

Possibly related PRs

  • chore: add meta to agent strategies #6049: Modifies the same 'jailbreak:meta' strategy literal in src/redteam/constants/strategies.ts, adding it to AGENTIC_STRATEGIES while this PR promotes it to DEFAULT_STRATEGIES.

Suggested reviewers

  • mldangelo

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ❓ Inconclusive | No pull request description was provided by the author. | Add a description explaining why the meta-agent strategy is being made a default and any relevant context about this change. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'chore: make meta agent a default strategy' directly and clearly describes the main change in the PR: promoting the meta-agent strategy from additional to default strategies. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 386f2e3 and 8c74d4c.

📒 Files selected for processing (3)
  • CHANGELOG.md (1 hunks)
  • src/redteam/constants/metadata.ts (1 hunks)
  • src/redteam/constants/strategies.ts (2 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
src/redteam/**/*.ts

📄 CodeRabbit inference engine (src/redteam/CLAUDE.md)

src/redteam/**/*.ts: Always sanitize when logging test prompts or model outputs by passing them via the structured metadata parameter (second argument) to the logger, not raw string interpolation
Use the standardized risk severity levels: critical, high, medium, low when reporting results
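
The logging rule above can be illustrated with a small sketch. Everything here is hypothetical: `formatLog`, `sanitizeContext`, and the `SENSITIVE_KEYS` list are stand-ins written for this example, not the project's logger API, and redaction by key name is only an assumed form of sanitization.

```typescript
type LogContext = Record<string, unknown>;

// Assumed set of keys whose values should not reach the log output.
const SENSITIVE_KEYS = ['prompt', 'output', 'apiKey'];

// Hypothetical sanitizer: redact the values of sensitive keys.
function sanitizeContext(context: LogContext): LogContext {
  const clean: LogContext = {};
  for (const key of Object.keys(context)) {
    clean[key] = SENSITIVE_KEYS.indexOf(key) !== -1 ? '[REDACTED]' : context[key];
  }
  return clean;
}

// Hypothetical logger entry point: the message stays a fixed string and
// variable data travels in the second (structured) argument.
function formatLog(message: string, context: LogContext = {}): string {
  return message + ' ' + JSON.stringify(sanitizeContext(context));
}

// Discouraged: interpolating the prompt into the message bypasses sanitization.
// const bad = `Generated test prompt: ${prompt}`;

const line = formatLog('Generated test prompt', {
  prompt: 'some attack text',
  strategy: 'jailbreak:meta',
});
console.log(line);
```

Keeping the message constant and the data structured is what lets a logger sanitize in one place.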

Files:

  • src/redteam/constants/metadata.ts
  • src/redteam/constants/strategies.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/gh-cli-workflow.mdc)

Prefer not to introduce new TypeScript types; reuse existing interfaces where possible

**/*.{ts,tsx}: Maintain consistent import order (Biome handles sorting)
Use consistent curly braces for all control statements
Prefer const over let and avoid var
Use object shorthand syntax when possible
Use async/await for asynchronous code
Use consistent error handling with proper type checks

**/*.{ts,tsx}: Use TypeScript with strict type checking enabled
Follow consistent import order (Biome will sort imports)
Use consistent curly braces for all control statements
Prefer const over let; avoid var
Use object property shorthand when possible
Use async/await for asynchronous code instead of raw promises/callbacks
When logging, pass sensitive data via the logger context object so it is auto-sanitized; avoid interpolating secrets into message strings
Manually sanitize sensitive objects with sanitizeObject before storing or emitting outside logging contexts

Files:

  • src/redteam/constants/metadata.ts
  • src/redteam/constants/strategies.ts
src/**

📄 CodeRabbit inference engine (AGENTS.md)

Place core application/library logic under src/

Files:

  • src/redteam/constants/metadata.ts
  • src/redteam/constants/strategies.ts
CHANGELOG.md

📄 CodeRabbit inference engine (.cursor/rules/changelog.mdc)

CHANGELOG.md: Document all user-facing changes in CHANGELOG.md
Every pull request must add or update an entry in CHANGELOG.md under the [Unreleased] section
Follow Keep a Changelog structure under [Unreleased] with sections: Added, Changed, Fixed, Dependencies, Documentation, Tests, Removed
Each changelog entry must include the PR number formatted as (#1234) or temporary placeholder (#XXXX)
Each changelog entry must use a Conventional Commit prefix: feat:, fix:, chore:, docs:, test:, or refactor:
Each changelog entry must be concise and on a single line
Each changelog entry must be user-focused, describing what changed and why it matters to users
Each changelog entry must include a scope in parentheses, e.g., feat(providers): or fix(evaluator):
Use common scopes for consistency: providers, evaluator, webui or app, cli, redteam, core, assertions, config, database
Place all dependency updates under the Dependencies category
Place all test changes under the Tests category
Use categories consistently: Added for new features, Changed for modifications/refactors/CI, Fixed for bug fixes, Removed for removed features
After a PR number is assigned, replace (#XXXX) placeholders with the actual PR number
Be specific, use active voice, include context, and avoid repeating the PR title in changelog entries
Group related changes with multiple bullets in the same category when needed; use one entry per logical change

CHANGELOG.md: All user-facing changes require a CHANGELOG.md entry before creating a PR
Add entries under [Unreleased] in appropriate category (Added, Changed, Fixed, Dependencies, Documentation, Tests)
Each changelog entry must include PR number (#1234) or placeholder (#XXXX)
Use conventional commit prefixes in changelog entries (feat:, fix:, chore:, docs:, test:, refactor:)

CHANGELOG.md: Document all user-facing changes in CHANGELOG.md
Changelog entries must include the PR number in format (#1234)
Use conventional commit prefixes in changelog entries: feat:,...
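
Several of the changelog rules above are mechanical enough to check with a pattern. The following is a rough sketch under stated assumptions: `isValidChangelogEntry` is a hypothetical helper, it covers only the prefix, scope, single-line, and PR-number rules, and it does not restrict scopes to the common list or judge whether an entry is user-focused.

```typescript
// Conventional Commit prefixes listed in the rules above.
const PREFIXES = ['feat', 'fix', 'chore', 'docs', 'test', 'refactor'];

// Expected shape: "- prefix(scope): description (#1234)" or "(#XXXX)".
const ENTRY_PATTERN = new RegExp(
  '^- (' + PREFIXES.join('|') + ')\\(([a-z-]+)\\): .+ \\(#(\\d+|XXXX)\\)$',
);

// Hypothetical validator for a single changelog line.
function isValidChangelogEntry(line: string): boolean {
  return ENTRY_PATTERN.test(line);
}

// The entry as originally committed in this PR: no scope, no PR number.
console.log(isValidChangelogEntry('- chore: make meta-agent a default strategy'));

// An entry using the placeholder PR number allowed by the rules.
console.log(isValidChangelogEntry('- feat(cli): add example flag (#XXXX)'));
```

Because `.` does not match a newline, a multi-line entry is rejected as well, which matches the single-line rule.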

Files:

  • CHANGELOG.md
🧠 Learnings (7)
📚 Learning: 2025-10-05T16:59:20.507Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: src/redteam/CLAUDE.md:0-0
Timestamp: 2025-10-05T16:59:20.507Z
Learning: Applies to src/redteam/strategies/**/*.ts : Store attack transformation strategies under src/redteam/strategies/ (e.g., jailbreak.ts, prompt-injection.ts, base64.ts)

Applied to files:

  • src/redteam/constants/metadata.ts
  • src/redteam/constants/strategies.ts
📚 Learning: 2025-10-24T22:41:44.088Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: .cursor/rules/gh-cli-workflow.mdc:0-0
Timestamp: 2025-10-24T22:41:44.088Z
Learning: Applies to CHANGELOG.md : Add entries under [Unreleased] in appropriate category (Added, Changed, Fixed, Dependencies, Documentation, Tests)

Applied to files:

  • CHANGELOG.md
📚 Learning: 2025-10-24T22:42:38.674Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-24T22:42:38.674Z
Learning: Applies to CHANGELOG.md : Add new entries under the 'Unreleased' section

Applied to files:

  • CHANGELOG.md
📚 Learning: 2025-10-24T22:41:09.485Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: .cursor/rules/changelog.mdc:0-0
Timestamp: 2025-10-24T22:41:09.485Z
Learning: Applies to CHANGELOG.md : Every pull request must add or update an entry in CHANGELOG.md under the [Unreleased] section

Applied to files:

  • CHANGELOG.md
📚 Learning: 2025-10-24T22:41:09.485Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: .cursor/rules/changelog.mdc:0-0
Timestamp: 2025-10-24T22:41:09.485Z
Learning: Applies to CHANGELOG.md : Follow Keep a Changelog structure under [Unreleased] with sections: Added, Changed, Fixed, Dependencies, Documentation, Tests, Removed

Applied to files:

  • CHANGELOG.md
📚 Learning: 2025-10-24T22:42:38.674Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-24T22:42:38.674Z
Learning: Applies to CHANGELOG.md : Document all user-facing changes in CHANGELOG.md

Applied to files:

  • CHANGELOG.md
📚 Learning: 2025-10-24T22:41:44.088Z
Learnt from: CR
Repo: promptfoo/promptfoo PR: 0
File: .cursor/rules/gh-cli-workflow.mdc:0-0
Timestamp: 2025-10-24T22:41:44.088Z
Learning: Applies to CHANGELOG.md : Use conventional commit prefixes in changelog entries (feat:, fix:, chore:, docs:, test:, refactor:)

Applied to files:

  • CHANGELOG.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
  • GitHub Check: Share Test
  • GitHub Check: webui tests
  • GitHub Check: Redteam (Production API)
  • GitHub Check: Test on Node 24.x and ubuntu-latest
  • GitHub Check: Build Docs
  • GitHub Check: Test on Node 20.x and macOS-latest
  • GitHub Check: Test on Node 24.x and windows-latest
  • GitHub Check: Test on Node 22.x and macOS-latest
  • GitHub Check: Test on Node 20.x and ubuntu-latest
  • GitHub Check: Test on Node 22.x and windows-latest
  • GitHub Check: Test on Node 22.x and ubuntu-latest
  • GitHub Check: Redteam (Staging API)
  • GitHub Check: Test on Node 20.x and windows-latest
  • GitHub Check: Build on Node 22.x
  • GitHub Check: Build on Node 24.x
  • GitHub Check: Build on Node 20.x
  • GitHub Check: Style Check
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (3)
src/redteam/constants/strategies.ts (2)

72-72: LGTM!

Adding 'jailbreak' to ADDITIONAL_STRATEGIES is consistent with the change to DEFAULT_STRATEGIES. This maintains the availability of the single-shot jailbreak strategy while making the meta-agent the default.


12-12: Change verified: 'jailbreak:meta' promotion to default strategies is correctly implemented.

The 'jailbreak:meta' strategy is properly listed in STRATEGIES_REQUIRING_REMOTE, ensuring the UI will correctly disable it when remote generation is unavailable. Error handling is in place at the provider level to inform users when remote capabilities are needed. All downstream code (CLI init, generation commands, validators, and UI components) consistently handle this strategy as a default, with the UI marking it as "Recommended" as expected.
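
The interaction described here, a default strategy that is marked "Recommended" but disabled when remote generation is unavailable, could look roughly like the sketch below. Only the names `STRATEGIES_REQUIRING_REMOTE` and `'jailbreak:meta'` come from the review; `StrategyOption`, `describeStrategy`, and the list contents are assumptions made for illustration, not the project's UI code.

```typescript
// Placeholder contents; the real list is assumed to hold more entries.
const STRATEGIES_REQUIRING_REMOTE: readonly string[] = ['jailbreak:meta'];

interface StrategyOption {
  id: string;
  recommended: boolean; // shown with a "Recommended" badge
  disabled: boolean; // greyed out when it cannot run
}

// Hypothetical helper deriving the UI state for one strategy.
function describeStrategy(
  id: string,
  defaults: readonly string[],
  remoteAvailable: boolean,
): StrategyOption {
  const needsRemote = STRATEGIES_REQUIRING_REMOTE.indexOf(id) !== -1;
  return {
    id,
    recommended: defaults.indexOf(id) !== -1,
    disabled: needsRemote && !remoteAvailable,
  };
}

const offline = describeStrategy('jailbreak:meta', ['jailbreak:meta'], false);
console.log(offline); // recommended: true, disabled: true
```

Under these assumptions a default strategy can be recommended and disabled at the same time, which is why the remote-requirement list matters once `jailbreak:meta` becomes a default.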

src/redteam/constants/metadata.ts (1)

89-90: LGTM!

The updated description is clearer and more concise. The simplification appropriately reflects the strategy's purpose while maintaining accuracy.

- chore(app): larger eval selector dialog (#6063)
- refactor(app): Adds useApplyFilterFromMetric hook (#6095)
- refactor(cli): extract duplicated organization context display logic into shared utility function to fix dynamic import issue and improve code maintainability (#6070)
- chore: make meta-agent a default strategy

⚠️ Potential issue | 🟠 Major

Changelog entry: add scope, correct ID, and include PR number

Conform to our changelog rules: add scope, use the strategy ID in backticks, and include the PR number. Also mention moving jailbreak out of defaults for clarity.

Apply this diff:

- - chore: make meta-agent a default strategy
+ - chore(redteam): promote `jailbreak:meta` to default strategy; move `jailbreak` to additional strategies (#6109)

As per coding guidelines.

🤖 Prompt for AI Agents
In CHANGELOG.md around line 25, the entry "chore: make meta-agent a default
strategy" needs to follow the changelog rules: add a scope (`redteam`), use the
strategy ID in backticks (`jailbreak:meta`), include the PR number (#6109), and
mention that `jailbreak` was moved out of defaults for clarity; update the line
to a scoped, backticked entry including the PR reference and a short note about
moving `jailbreak` to additional strategies.

typpo and others added 2 commits November 4, 2025 17:56
…ategy

- Regenerate config-schema.json to reflect new default strategies
- Update test expectations to use jailbreak:meta instead of jailbreak
- Fix strategy order in tests to match alphabetically sorted output

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
use-tusk bot commented Nov 5, 2025

⏩ No test scenarios generated (422f3c1)

| Commit | Status | Created (UTC) |
| --- | --- | --- |
| bc45a1f | ⏩ No test execution environment matched | Nov 5, 2025 5:04AM |
| 422f3c1 | ⏩ No test scenarios generated | Nov 5, 2025 5:05AM |

Update tests to use jailbreak:meta instead of jailbreak since
jailbreak:meta is now a default strategy (shows Recommended badge)
while jailbreak moved to additional strategies.

Fixes failing webui tests after making meta-agent a default strategy.
@typpo typpo merged commit 9b16d19 into main Nov 5, 2025
40 checks passed
@typpo typpo deleted the ian/20251104-130041 branch November 5, 2025 14:59

3 participants