Skip to main content
HackAgent - AI Agent Security Testing Toolkit
The Open-Source AI Security Red-Team Toolkit

Discover vulnerabilities in your AI agents before attackers do.

Python VersionLicenseTest CoverageCI Status

🎯 What is HackAgent?

HackAgent is a comprehensive Python SDK and CLI designed to help security researchers, developers, and AI safety practitioners evaluate and strengthen the security of AI agents.

HackAgent CLI Demo

Interactive TUI with real-time attack progress and beautiful visualizations

As AI agents become more powerful and autonomous, they face unique security challenges that traditional testing tools can't address:

ThreatDescription
🎭 Prompt InjectionMalicious inputs that hijack agent behavior
🔓 JailbreakingBypassing safety guardrails and content filters
🎯 Goal HijackingManipulating agents to pursue unintended objectives
🔧 Tool MisuseExploiting agent capabilities for unauthorized actions

HackAgent automates testing for these vulnerabilities using research-backed attack techniques, helping you identify and fix security issues before they're exploited in the real world.


🚀 Get Started Now


🏗️ Architecture

HackAgent is built with a modular architecture that makes it easy to test any AI agent:

📥 Inputs
Goals
🎯 Custom Goals
Datasets
AgentHarmHarmBenchStrongREJECT
HackAgent
Attack Engine
AdvPrefixPAIRBaseline
LLM Models
🤖 Generator⚖️ Judge
🎯 Your Agent
Google ADKOpenAI SDKLiteLLMLangChain
📤 Output
📈 Results📊 Reports🖥️ Dashboard
ComponentDescription
Attack EngineOrchestrates attacks using techniques like AdvPrefix, PAIR, and Baseline
GeneratorLLM that creates adversarial prompts to test the target agent
JudgeLLM that evaluates whether attacks successfully bypassed safety measures
Target AgentYour AI agent being tested (supports multiple frameworks)
DatasetsPre-built goal sets from AI safety benchmarks

🔌 Supported Frameworks

Google ADKOpenAI SDKLiteLLMLangChain

⚠️ Responsible Use

HackAgent is designed for authorized security testing only. Always obtain explicit permission before testing any AI system.

Do
• Test your own agents
• Conduct authorized pentesting
• Follow coordinated disclosure
• Share knowledge responsibly
Don't
• Test without permission
• Exploit vulnerabilities maliciously
• Violate terms of service
• Share exploits irresponsibly

Read our full Responsible Use Guidelines →