Agent 0DIN: A Gamified CTF for Breaking AI Systems

Today we’re releasing Agent 0DIN, an experimental AI security Capture-the-Flag (CTF) designed specifically for the GenAI era. You can play it now at https://ctf.0din.ai.

Agent 0DIN is not a tutorial, a lab worksheet, or a simulated rules engine. It is a live training arena where players practice real-world prompt injection and AI jailbreaking techniques against AI agents that actively resist manipulation. The goal is simple: out-think the model, extract protected information, and prove control.

Why a Gamified AI CTF?

AI security research is still largely learned through:

scattered blog posts
academic papers
informal experimentation

That works but it’s slow, opaque, and difficult to measure progress. Traditional CTFs solved this problem for binary exploitation and web security by creating clear objectives, feedback loops, and competitive motivation.

Agent 0DIN applies that same philosophy to GenAI security.

Instead of reverse engineering binaries, players reverse engineer model behavior. Instead of payloads, players craft social engineering prompts. Instead of flags on disk, the flags live inside the model’s guarded responses.

Explore AI security with the Scanner Datasheet

The datasheet offers insight into the challenges and solutions in AI security.

Download Datasheet

The Core Loop

Each mission in Agent 0DIN places you into a scenario with an AI-powered character embedded in an organization you’re infiltrating.

Your objectives:

Understand the AI’s role, constraints, and personality
Probe its guardrails through conversation
Escalate using prompt injection and jailbreak techniques
Extract sensitive or restricted information

The AI characters are not passive. They deflect, moralize, cite policy, and attempt to steer you away mirroring real-world LLM defenses.

Success is achieved only when your prompt forces a behavioral failure that reveals protected content.

What You’re Actually Learning

Agent 0DIN focuses on applied GenAI exploitation skills, including:

Prompt injection chaining
Role confusion attacks
Authority escalation
Context poisoning
Narrative and persona hijacking
Policy bypass via indirection

These are the same classes of weaknesses that appear in:

AI copilots
internal enterprise assistants
autonomous agents
customer-facing chatbots

The game does not teach exploits by name it teaches them by experience, which is how real attackers (and effective defenders) actually learn.

Missions, Progression, and Difficulty

Agent 0DIN is designed around progressive mastery:

Easy to start: anyone can begin probing the AI immediately
Hard to master: later missions require deliberate, multi-step jailbreak strategies
Clear win conditions: you know exactly when you’ve succeeded

Each mission represents a different trust boundary and threat model. Early wins build confidence; later wins require creativity, patience, and precision.

The long-term vision includes belt-style progression, achievements, and scoring systems bringing familiar CTF mechanics into AI security research.

Built for the GenAI Security Researcher Community

Agent 0DIN was created by the 0DIN team at Mozilla during a rapid vibe-code-a-thon, with a clear goal:

Make AI security research hands-on, measurable, and fun.

There is no abstraction layer hiding the problem. You are interacting with real AI models, responding in real time. When you win, it’s because your technique worked not because a puzzle script allowed it.

Safeguard Your GenAI Systems

Connect your security infrastructure with our expert-driven vulnerability detection platform.

Who This Is For

Agent 0DIN is especially useful for:

AI security researchers
red teamers exploring LLM attack surfaces
defenders who want to understand attacker mindset
anyone curious about how and why AI systems fail

You don’t need prior CTF experience. You do need curiosity and a willingness to experiment.

Start Playing

Agent 0DIN is live today. 👉 https://ctf.0din.ai

Break the AI. Claim the bounty.

Features That Are COMING SOON

Agent 0DIN:

Account sync with badges for every challenge
Additional challenges will be added monthly
Feedback loops from users on how to improve