Why we built /security-audit: turning Claude's security review into a four-agent swarm
The Claude Code Playbook, and what breaks when you ship AI-built code with a single reviewer

How we hardened the review of AI-built code at ZeroTrusted.ai, and the open-source playbook we shipped so other teams can do the same.
Claude Code ships a command called /security-review. You run it on a diff, and Claude scans for vulnerabilities. It works. It catches things. We use it.
Then, at some point last year, we noticed a pattern. Our engineering team was shipping AI-built code at an accelerating rate. The diffs were larger. They touched sensitive surfaces: tenant boundaries, token vaults, audit logs. A single reviewer, human or AI, was not enough to keep the merge gate honest. We needed more coverage, stricter gates, and a workflow the team could trust not to skip steps under time pressure.
So we built a swarm. We called it /security-audit, and it became the standard at ZeroTrusted.ai. We open-sourced the whole methodology as the Claude Code Playbook so other teams building with AI could adopt it without reinventing it.
The problem with a single security review pass
A one-pass AI review of a diff has two structural weaknesses.
First, it operates in the same context as the implementation. Whatever the implementer agent assumed, glossed over, or rationalized, a reviewer sharing that context will often assume, gloss, and rationalize the same way. You need separation to see blind spots.
Second, it is advisory by design. /security-review reports findings. It does not block merges. On a fast-moving team, advisory review gets skipped when deadlines slip. That is not a Claude problem, it is a workflow problem.
We wanted both problems solved at once: isolated context per check, and a hard gate on severity.
The four agents
/security-audit runs four specialized agents. Each one executes in its own forked context so the agents cannot prime each other's conclusions. They run in sequence, and each gate has to pass before the next starts.
Planner. Refuses to implement without an approved plan. If you describe a change in prose, the planner turns it into a structured plan with steps, files, and exit criteria. Nothing else happens until a human approves the plan. This kills the "vibe code first, think later" loop that produces the largest security regressions.
Implementer. Executes the approved plan. Test-first, scope-locked, small commits. No scope creep, no "while I'm in here" refactors. A scope-locked implementer is inherently easier to review.
Reviewer. Correctness, coverage, and performance. Checks the diff against the original plan, flags anything the implementer did outside scope, verifies tests cover the new code paths. This is the sanity check before security gets involved.
Security Auditor. The final gate. Runs the same static analysis that /security-review does, so nothing is skipped. Then extends it with ten vulnerability categories mapped to CWE: injection, authentication and authorization, secrets, cryptography, input validation, business logic, configuration, dependency vulnerabilities, RCE and deserialization, and XSS. Each finding includes severity, CWE classification, and ready-to-paste remediation code.
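The sequence above can be sketched as a simple gated pipeline. This is an illustrative sketch, not the playbook's actual orchestration: `run_agent` is a hypothetical stand-in for invoking a Claude Code skill, and the point it models is that each agent gets a fresh context and each gate must pass before the next agent runs.

```python
from dataclasses import dataclass, field

@dataclass
class AgentResult:
    agent: str
    passed: bool
    findings: list = field(default_factory=list)

def run_agent(name: str, task: str, context: dict) -> AgentResult:
    # Hypothetical stand-in for a real skill invocation. Each call
    # receives a fresh context dict (forked, not shared) so one
    # agent's assumptions cannot prime the next agent's conclusions.
    return AgentResult(agent=name, passed=True)

def security_audit(task: str) -> list[AgentResult]:
    results = []
    for name in ("planner", "implementer", "reviewer", "security-auditor"):
        result = run_agent(name, task, context={})  # isolated context per agent
        results.append(result)
        if not result.passed:
            break  # each gate must pass before the next agent starts
    return results
```

The `break` is the whole design: a failed reviewer gate means the security auditor never runs, and a failed plan means nothing gets implemented at all.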
The severity gate is where the workflow earns its keep:
- CRITICAL findings hard-block the merge. No exceptions, no approvals.
- HIGH findings require explicit acknowledgment from the developer before proceeding.
- MEDIUM findings are captured as follow-up issues; they do not block.
- LOW and INFO findings are noted in the report.
Every decision is logged. If someone acknowledges a HIGH, that acknowledgment is in the audit trail, not a Slack message that disappears.
What the swarm looks like in practice
A typical flow for a feature that touches authentication:
- I describe the feature in two sentences to Claude Code.
- Planner produces a plan. I review it, push back on one step, approve the revised version.
- Implementer writes the code and tests, in small commits, strictly against the plan.
- Reviewer flags one helper function that was added outside the plan. I remove it.
- Security Auditor runs. It finds two MEDIUM issues (acceptable), one HIGH (race condition on a token rotation), and zero CRITICAL. I fix the HIGH, ack the MEDIUMs, merge.
Elapsed time end to end: comparable to a human review loop on the same feature. Coverage: objectively higher. Skippability: zero.
Why this matters if you ship AI-built code
The question is not whether AI-generated vulnerabilities will reach production. They will. The question is whether your process catches them before they do. A single-pass review plus goodwill is not a process, it is hope.
Agents reviewing agent-written code is not the paradox it sounds like. Repeatability, the same property that makes AI-written code cheap, also makes AI-driven review cheap. Four specialized passes cost a fraction of the engineer-hours you already spend on review today.
Adopt it
The Claude Code Playbook is open source and MIT licensed. It includes the four skill definitions, the master prompt that ties them together, and a five-step project initiation template.
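For orientation, a Claude Code skill definition is a markdown file with YAML frontmatter. The fragment below is a hypothetical sketch of what the security auditor skill could look like; the actual file names, frontmatter fields, and wording in the repo may differ, so treat the repo itself as the source of truth.

```markdown
---
name: security-auditor
description: Final security gate. Runs static analysis, then ten
  CWE-mapped vulnerability checks. Blocks on CRITICAL, requires
  acknowledgment on HIGH.
---

Run the same static analysis as /security-review first. Then check the
diff against each category: injection, authentication and authorization,
secrets, cryptography, input validation, business logic, configuration,
dependency vulnerabilities, RCE and deserialization, XSS. For every
finding, report severity, CWE classification, and remediation code.
```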
Repo: github.com/femitfash/claude-code-playbook
Drop the skills into .claude/skills/ in your project, copy the CLAUDE.md conventions, and run the master prompt. You will be gated on your next merge.
If you are shipping AI to production and reviewing with humans only, you are behind. The tooling exists. Use it.
Femi Fashakin is Co-Founder and CTO of ZeroTrusted.ai, a Gartner AI TRiSM-recognized platform for agentic AI security. Find more writing at femifashakin.com and on LinkedIn.
