AI agents can't reliably verify their own work. Immunity fixes that.
The Problem
AI agents write code fast. But they're bad at catching their own mistakes.
Ask the same agent to review code it just wrote, and it says "looks good" — because it already knows why it wrote it that way. This isn't a prompting problem. It's a structural one.
This manifests in 4 ways:
| Symptom | What happens |
|---|---|
| Self-confirmation bias | Agent can't objectively critique code it just created |
| Memory loss | Decisions from session A are unknown in session B |
| Tunnel vision | Agent fixes one file without checking what else is affected |
| Missing perspective | Code "works" but isn't production-ready |
Immunity provides 4 skills, each targeting one symptom:
- `/immunity:critic` → Separate agent reviews code without knowing why it was written
- `/immunity:contracts` → Important decisions persist as verifiable rules across sessions
- `/immunity:ripple` → Changes are traced to find everything else that's affected
- `/immunity:prodlens` → Code is analyzed from a production operations perspective

`/critic` is the most impactful — try it first.
```
/plugin marketplace add hohre12/immunity
/plugin install immunity@hohre12-immunity
```

Or for local development:

```shell
git clone https://github.com/hohre12/immunity.git
claude --plugin-dir ./immunity
```

No servers. No databases. No runtime. Just markdown files.
The core skill. Context isolation is the mechanism.
When you ask the same agent to review its own code:

```
Agent: "This code handles the user's requirements well."
(knows why it was built → rationalizes)
```

When `/critic` launches a separate sub-agent with code only:

```
Critic: "The retry loop reuses the same transactionId.
PG may reject it as a duplicate."
(doesn't know why → judges the code alone)
```
This isn't a prompt trick. The sub-agent is launched in a separate context and receives only file paths — no conversation history, no intent, no background. The bias is eliminated by what the sub-agent never receives.
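The isolation idea can be sketched in a few lines. This is a hypothetical illustration, not the plugin's real internals: the critic's input is built from file contents alone, so authoring context structurally cannot leak in.

```typescript
// Hypothetical sketch: the critic prompt is assembled only from files.
// There is no session, history, or intent parameter -- the signature
// itself guarantees the reviewer judges the code alone.
function buildCriticPrompt(files: Record<string, string>): string {
  const sections = Object.entries(files).map(
    ([path, code]) => `### ${path}\n${code}`
  );
  return ["Review the following code on its own merits:", ...sections].join(
    "\n\n"
  );
}
```

Whatever the original author was thinking never reaches the reviewer, because there is simply no channel for it.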
For detailed setup, see Installation Guide.
```
/immunity:critic src/services/payment/charge.ts
```
The sub-agent runs its review and returns only high-confidence findings:
```
── Critic Report ──

Critical (fix immediately)
┌─────────────────────────────────────────────┐
│ [BUG] charge.ts:47                          │
│ Retry reuses same transactionId.            │
│ Fix: Generate new transactionId per retry   │
└─────────────────────────────────────────────┘

Warning (review recommended)
┌─────────────────────────────────────────────┐
│ [PRODUCTION] charge.ts:23                   │
│ No timeout on external PG API call.         │
│ Fix: Set axios timeout to 5000ms            │
└─────────────────────────────────────────────┘

Score: critical 1 / warning 1
```
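The Critical finding above has a simple shape of fix: mint a fresh id on every attempt. A hedged sketch, where the `attempt` callback stands in for a real PG client call:

```typescript
import { randomUUID } from "node:crypto";

// Retry wrapper that generates a NEW transactionId per attempt, so the
// payment gateway never sees a duplicate id on retry.
async function chargeWithRetry(
  attempt: (txId: string) => Promise<"ok" | "fail">,
  maxRetries = 3
): Promise<string> {
  let lastError: unknown;
  for (let i = 0; i < maxRetries; i++) {
    const txId = randomUUID(); // fresh id each time, never reused
    try {
      if ((await attempt(txId)) === "ok") return txId;
    } catch (e) {
      lastError = e;
    }
  }
  throw lastError ?? new Error("all retries failed");
}
```

The buggy version the critic flagged would hoist `randomUUID()` above the loop; moving it inside is the entire fix.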
Reporting criteria:
- Only `certain` (95%+) or `likely` (75-95%) confidence
- No code style, naming, or comment preferences
- If unsure, don't report — one false positive destroys trust
Writing "please do X" in CLAUDE.md is a request. A contract catches violations.
```
/immunity:contracts src/services/payment/
```
Claude analyzes the code and proposes contracts. You choose which to keep:
```yaml
# .contracts/payment-idempotency.yml
name: payment-idempotency
assertion: "All payment requests must include an idempotency key"
scope:
  - "src/services/payment/**"
checks:
  - type: pattern_present
    pattern: "idempotencyKey|idempotency_key"
    in: "src/services/payment/**"
```

Next session, a different agent modifies payment code → contract violation detected → immediate alert.
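A `pattern_present` check is conceptually just a scoped regex scan. A minimal sketch, assuming files are handed in as an in-memory map (the real skill has Claude read the repository; this code is only illustrative):

```typescript
type Check = { pattern: string; in: string };

// Returns true if any file under the check's scope matches the pattern.
// Naive glob handling: "src/services/payment/**" becomes a prefix match.
function patternPresent(
  check: Check,
  files: Record<string, string>
): boolean {
  const re = new RegExp(check.pattern);
  const prefix = check.in.replace(/\*\*$/, "");
  return Object.entries(files)
    .filter(([path]) => path.startsWith(prefix))
    .some(([, code]) => re.test(code));
}
```

If the check returns false after an edit, the contract's assertion no longer holds and the agent is alerted.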
Commit `.contracts/` to git and the whole team's agents respect the same rules.
```
/immunity:ripple src/services/UserService.ts
```
Import graphs aren't enough. They miss code using the same patterns.
```
UserService.update() changed
├── Direct: UserController (import relationship)
├── Behavioral: OrderService.update() (same pattern: optimistic lock + event emission)
└── Contract: user-data-integrity (needs re-verification)
```
Heuristic, not precise — but far better than checking nothing.
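The behavioral-match idea reduces to: which other files exhibit the same code signatures as the changed one? A hedged sketch under that assumption (the signature regexes here are illustrative, not what the skill actually uses):

```typescript
// Flag files that share ALL of the changed file's active pattern
// signatures -- e.g. optimistic locking plus event emission.
function behavioralMatches(
  changed: string,
  others: Record<string, string>,
  signatures: RegExp[]
): string[] {
  const active = signatures.filter((re) => re.test(changed));
  if (active.length === 0) return []; // nothing to match against
  return Object.entries(others)
    .filter(([, code]) => active.every((re) => re.test(code)))
    .map(([path]) => path);
}
```

Regex signatures over-approximate (no AST, no type information), which is exactly why the output is a checklist to verify rather than a guarantee.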
```
/immunity:prodlens src/services/payment/
```
6 fixed axes from a production operations perspective:
| Axis | What it checks |
|---|---|
| Concurrency | Race conditions, deadlocks, isolation levels |
| Error Recovery | Failure recovery paths, transaction consistency |
| Observability | Logging, metrics, alerting |
| Security | Injection, auth bypass, information exposure |
| Rate Limiting | External API limits, user request limits |
| Input Validation | Boundary values, types, sizes |
```
Production Readiness: 4/6
```
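As an illustration of the Input Validation axis, this is the kind of boundary check a prodlens gap might call for. The function name and bounds are assumptions for the sketch, not Immunity output:

```typescript
// Boundary-check a charge amount (minor currency units) before it ever
// reaches the payment gateway: finite, integral, positive, capped.
function validateChargeAmount(amount: number): void {
  if (!Number.isFinite(amount)) throw new Error("amount must be a finite number");
  if (!Number.isInteger(amount)) throw new Error("amount must be integer minor units");
  if (amount <= 0) throw new Error("amount must be positive");
  if (amount > 10_000_000) throw new Error("amount exceeds maximum"); // assumed cap
}
```

Each axis reduces to concrete checks like this; prodlens reports which axes have none.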
Before:

```
Session 1: Claude writes payment logic. Includes idempotency key.
Session 2: Different Claude modifies it. Removes idempotency key. Nobody notices.
Session 3: "Review this" → "Well-written code." (self-confirmation bias)
Session 4: Deploy. Duplicate payments. Incident.
```

After:

```
Session 1: Claude writes payment logic. /contracts → idempotency contract created.
Session 2: Different Claude tries to modify → contract violation detected.
Session 2: /critic → finds "missing timeout". /ripple → "check OrderService too".
Session 2: /prodlens → production readiness 4/6 → fixes 2 gaps.
Session 3: Deploy. No incidents.
```
- File-system based — No DB, no servers. `.contracts/` and markdown files are everything
- Incremental adoption — `/critic` alone provides value. Add more skills as needed
- Git-friendly — Commit `.contracts/` and the whole team shares them
- Minimize false positives — If unsure, don't report
- Protocol first — Optimized for Claude Code, but the convention works with any AI agent
- Depends on Claude's judgment. If Claude misses it, the tool misses it.
- Uses more API tokens. Each skill reads and analyzes code.
- Not runtime verification. `/prodlens` is code review, not a load test.
- `/ripple` is heuristic. Pattern-based, not precise AST analysis.
These are limitations of AI code review as a category. Immunity structures the maximum value possible within these constraints.
No project in the world is perfect. Any project, when evaluated from a critical perspective, will have gaps. This is true for Immunity too.
When you receive a critical review — whether from /immunity:critic, a peer, or an AI — you must evaluate the feedback, not just accept it.
A critic's job is to find problems. It will always find something. But not every finding warrants action. Some findings conflict with your project's core identity. Some optimize for concerns that don't matter in your context. Some are technically correct but strategically wrong.
The owner of the project must be the final judge. Read the critique, understand it, then decide:
- Does this finding align with what this project is trying to be?
- Does fixing this make the project better at its core purpose, or does it dilute it?
- Is this a real risk, or a theoretical concern that doesn't apply here?
Immunity exists to help you find problems. It does not exist to make decisions for you. The immune system detects threats — but the body decides how to respond.
MIT