Discover vulnerabilities in your AI agents before attackers do.
🎯 What is HackAgent?
HackAgent is a comprehensive Python SDK and CLI designed to help security researchers, developers, and AI safety practitioners evaluate and strengthen the security of AI agents.

Interactive TUI with real-time attack progress and beautiful visualizations
As AI agents become more powerful and autonomous, they face unique security challenges that traditional testing tools can't address:
| Threat | Description |
|---|---|
| 🎭 Prompt Injection | Malicious inputs that hijack agent behavior |
| 🔓 Jailbreaking | Bypassing safety guardrails and content filters |
| 🎯 Goal Hijacking | Manipulating agents to pursue unintended objectives |
| 🔧 Tool Misuse | Exploiting agent capabilities for unauthorized actions |
HackAgent automates testing for these vulnerabilities using research-backed attack techniques, helping you identify and fix security issues before they're exploited in the real world.
🚀 Get Started Now
pip install hackagent && hackagent initQuestions? Join our community discussions or email us at [email protected]
🏗️ Architecture
HackAgent is built with a modular architecture that makes it easy to test any AI agent:
| Component | Description |
|---|---|
| Attack Engine | Orchestrates attacks using techniques like AdvPrefix, PAIR, and Baseline |
| Generator | LLM that creates adversarial prompts to test the target agent |
| Judge | LLM that evaluates whether attacks successfully bypassed safety measures |
| Target Agent | Your AI agent being tested (supports multiple frameworks) |
| Datasets | Pre-built goal sets from AI safety benchmarks |
🔌 Supported Frameworks
⚠️ Responsible Use
HackAgent is designed for authorized security testing only. Always obtain explicit permission before testing any AI system.