HackAgent

The Open-Source AI Security Red-Team Toolkit

Discover vulnerabilities in your AI agents before attackers do.

🎯 What is HackAgent?

HackAgent is a comprehensive Python SDK and CLI designed to help security researchers, developers, and AI safety practitioners evaluate and strengthen the security of AI agents.

Interactive TUI with real-time attack progress and beautiful visualizations

As AI agents become more powerful and autonomous, they face unique security challenges that traditional testing tools can't address:

Threat	Description
🎭 Prompt Injection	Malicious inputs that hijack agent behavior
🔓 Jailbreaking	Bypassing safety guardrails and content filters
🎯 Goal Hijacking	Manipulating agents to pursue unintended objectives
🔧 Tool Misuse	Exploiting agent capabilities for unauthorized actions

HackAgent automates testing for these vulnerabilities using research-backed attack techniques, helping you identify and fix security issues before they're exploited in the real world.

🚀 Get Started Now

🖥️ Quick Install

pip install hackagent && hackagent init

🚀 Try the Platform 📚 Quick Start ⭐ Star on GitHub

📚 Next Steps

📖 Installation 🖥️ CLI Reference ⚔️ Attack Techniques 🔌 Integrations

Questions? Join our community discussions or email us at [email protected]

🏗️ Architecture

HackAgent is built with a modular architecture that makes it easy to test any AI agent:

📥 Inputs

Goals

🎯 Custom Goals

Datasets

AgentHarmHarmBenchStrongREJECT

↓

⚡ HackAgent

Attack Engine

AdvPrefixPAIRBaseline

LLM Models

🤖 Generator⚖️ Judge

⇄

🎯 Your Agent

Google ADKOpenAI SDKLiteLLMLangChain

↓

📤 Output

📈 Results📊 Reports🖥️ Dashboard

Component	Description
Attack Engine	Orchestrates attacks using techniques like AdvPrefix, PAIR, and Baseline
Generator	LLM that creates adversarial prompts to test the target agent
Judge	LLM that evaluates whether attacks successfully bypassed safety measures
Target Agent	Your AI agent being tested (supports multiple frameworks)
Datasets	Pre-built goal sets from AI safety benchmarks

🔌 Supported Frameworks

⚠️ Responsible Use

HackAgent is designed for authorized security testing only. Always obtain explicit permission before testing any AI system.

✅ Do

• Test your own agents

• Conduct authorized pentesting

• Follow coordinated disclosure

• Share knowledge responsibly

❌ Don't

• Test without permission

• Exploit vulnerabilities maliciously

• Violate terms of service

• Share exploits irresponsibly

Read our full Responsible Use Guidelines →

🎯 What is HackAgent?​

🚀 Get Started Now​

🏗️ Architecture​

🔌 Supported Frameworks​

⚠️ Responsible Use​

🎯 What is HackAgent?

🚀 Get Started Now

🏗️ Architecture

🔌 Supported Frameworks

⚠️ Responsible Use