[NEW AGENT] claude-code-audit-stack — 3 adversarial verification subagents (bot-deploy-verifier, claim-auditor, remote-agent-dispatcher)

## Preliminary Checks

- [x] Read Code of Conduct
- [x] Reviewed existing subagents — this collection is not a duplicate (no equivalent post-deploy verifier, quantitative claim auditor, or PID-correct remote agent dispatcher in the current set)
- [x] Proposal is for legitimate, constructive use cases only
- [x] I have domain expertise (3+ years building & operating production trading bots and quant research pipelines, where these specific failure modes occur)

## Proposal scope

This is a proposal for a **collection of three tightly-coupled subagents** that together form an adversarial-verification stack, plus one PostToolUse hook. They are submitted as one collection because they share a unifying design principle (assume the agent above lied unless every check independently passes) and because at least one — `claim-auditor` — is auto-fired by a hook in the collection.

If maintainers prefer, this could be filed as 3 separate proposals; let me know which fits your queue better. The collection is already public, MIT-licensed, schema-aligned, and production-tested at https://github.com/LaterKidsXD/claude-code-audit-stack.

## Proposed Agent Names

1. `bot-deploy-verifier`
2. `claim-auditor`
3. `remote-agent-dispatcher`

Plus a `PostToolUse` hook (`audit-on-report-write`) that auto-fires `claim-auditor` on every `*.report.md` write.
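
The hook's trigger condition can be sketched as a small predicate. This is a minimal illustration, not the shipped hook: it assumes the hook receives a JSON payload carrying `tool_name` and `tool_input.file_path`, and that `Write` and `Edit` are the tool names of interest. Both assumptions should be checked against the Claude Code hooks reference. `should_audit` and `REPORT_GLOB` are hypothetical names.

```python
import fnmatch
import os

# Glob named in the proposal; matched against the file's basename only.
REPORT_GLOB = "*.report.md"

# ASSUMPTION: tool names that count as a "write" for this hook.
WRITE_TOOLS = {"Write", "Edit"}


def should_audit(payload: dict) -> bool:
    """Return True when the tool call wrote a file matching REPORT_GLOB.

    ASSUMPTION: payload looks like
    {"tool_name": "...", "tool_input": {"file_path": "..."}}.
    """
    if payload.get("tool_name") not in WRITE_TOOLS:
        return False
    file_path = (payload.get("tool_input") or {}).get("file_path", "")
    # fnmatchcase keeps matching case-sensitive on every platform.
    return fnmatch.fnmatchcase(os.path.basename(file_path), REPORT_GLOB)
```

A real hook would read the payload from stdin with `json.load(sys.stdin)` and invoke `claim-auditor` only when the predicate returns `True`.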

## Domain Expertise

The collection specializes in **catching silent failures that other agents miss** — failures where the upstream agent reports `DONE` or `PASS` but the actual end-state of the system is wrong:

1. **`bot-deploy-verifier`** — systemd / journalctl / git-driven verification. Catches the silent-skip pattern where an agent edits a config file but never restarts the service (the file is correct on disk, the running process is on old in-memory config), plus accidental cascade restarts of `BindsTo=`/`Requires=` siblings.

2. **`claim-auditor`** — adversarial quantitative review of Markdown reports. Catches probability stacking (`1−(1−p)^N` vs `N×p`), conditional vs marginal pass-rate confusion, percentage vs percentage-points mixups, best-of-N selection bias, bootstrap-with-replacement implications, sample-size red flags. Read-only (`tools: Read`).

3. **`remote-agent-dispatcher`** — mechanical scp+spawn for autonomous Claude Code agents on remote hosts via SSH. Captures the actual `claude` binary PID via `pgrep` (not the bash-wrapper PID via `$!` — the common trap when wrapping over SSH).

## Unique Value Proposition

Unlike the existing breadth-focused subagents in the current set (which excel at domain expertise: Go, security, frontend, DBA, etc.), this stack is **narrow and adversarial-by-design**. Each agent assumes the agent above it might have skipped a step, and re-checks from scratch via `systemctl`/`journalctl`/`git`/`grep`. They surface an explicit `BLOCKED` (not `DONE`) when any gate fails. Agents are trained to drive toward DONE, and that is exactly the failure mode a separate verification process should catch.

Specifically, no agent in the current set covers:
- Post-deploy state verification of running systemd services (the deployment-engineer/devops-engineer agents do deploy planning, not adversarial post-deploy verification)
- Quantitative auditing of Markdown reports with structured P1/P2/P3 severity
- Remote agent spawning with PID-correct wrapper-aware capture

The "blocked, not done" distinction is the load-bearing design property: it lets these agents compose with other agents without a false DONE propagating downstream.

## Primary Use Cases

**`bot-deploy-verifier`:**
1. Verify any `systemctl restart` actually landed (config edit went through, service reloaded, running journal shows new values)
2. Catch accidental cascade restarts of unrelated services (e.g., `BindsTo=`/`Requires=` dependencies firing as side effects)
3. Auto-rollback to a known-good config when verification fails (when caller provides `backup_path`)
4. Refuse to declare a deploy DONE until all gates green; explicit BLOCKED otherwise
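
The cascade-restart check in item 2 can be sketched as a before/after comparison. This is one possible approach under stated assumptions, not necessarily how the agent does it: the inputs are assumed to be per-unit activation timestamps, for example captured with `systemctl show -p ActiveEnterTimestampMonotonic --value <unit>` before and after the restart. `restarted_siblings` is a hypothetical helper name.

```python
def restarted_siblings(before: dict, after: dict, target: str) -> list:
    """Return units other than `target` whose activation timestamp changed.

    `before` and `after` map unit name -> activation timestamp (ASSUMED to
    come from ActiveEnterTimestampMonotonic snapshots). A changed timestamp
    on a sibling means it was restarted as a side effect.
    """
    return sorted(
        unit
        for unit, timestamp in after.items()
        if unit != target and before.get(unit) != timestamp
    )


# Illustrative snapshots (made-up values): feed.service restarted alongside
# the intended target, db.service did not.
before = {"trading-bot.service": 100, "feed.service": 50, "db.service": 10}
after = {"trading-bot.service": 900, "feed.service": 905, "db.service": 10}
print(restarted_siblings(before, after, "trading-bot.service"))
```

A non-empty result would fail the "untouched siblings" gate and yield BLOCKED rather than DONE.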

**`claim-auditor`:**
1. Audit autonomous research output (eval reports, MC results, backtest summaries) before acting on probability/EV claims
2. Surface decision-shaping math errors that human reviewers glide past on the 14th report of the day
3. Run as a post-processing step on any agent-generated quantitative analysis
4. Return structured JSON for CI integration / pipe-into-other-agents

**`remote-agent-dispatcher`:**
1. Delegate tasks that exceed Claude Code's interactive timeout to a remote VPS
2. Run long-running autonomous agents on a host other than the operator's machine
3. Capture PID correctly across SSH+sudo+nohup+bash wrapping (where `$!` returns the wrapper, not the binary)
4. Survive SSH disconnect via `nohup` + verified PID file

## Required Tools & Capabilities

- `bot-deploy-verifier` — `tools: Bash` (executes systemctl, journalctl, git over SSH; reads journal output)
- `claim-auditor` — `tools: Read` (read-only — never modifies the report it audits; this is a security property)
- `remote-agent-dispatcher` — `tools: Bash, Read` (Bash for scp/ssh/install; Read for spec-header validation only)

No external network calls beyond what `Bash` allows the user to invoke. No hardcoded credentials, paths, or hostnames (all user-facing examples use `<your-ssh-host>` placeholders). No filesystem mutations by `claim-auditor`.

## Example Interactions

**Example 1 — bot-deploy-verifier catches a silent skip:**

```
User: "Deploy the new trail-stop config to trading-bot.service"
Caller agent: "Edited config/live.yaml: trailing_stop_min_dist: 0. Restarted service. DONE."

bot-deploy-verifier (invoked after restart):
- service active: ✅
- config-drift gate: ❌ FAILED — journal shows trailing_stop_min_dist: 4 (old value)
- expected keys verified: 0/4
- untouched siblings: ✅

Returns: BLOCKED, NO ROLLBACK
Cause: service was restarted at 14:42 UTC but the journal shows it loaded the OLD config from before the edit (the agent edited the wrong file or the systemd unit references a different path).
```
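
The config-drift gate in Example 1 can be sketched as follows. This is a minimal illustration under assumptions: the service is assumed to log loaded values as `key: value` or `key=value`, and the journal lines are assumed to be already restricted to entries after the restart. `config_drift` and `verdict` are hypothetical names, and the log line shown is made up.

```python
import re


def config_drift(expected: dict, journal_lines: list) -> dict:
    """Map each drifted key to the last value the journal reported.

    A key that never appears in the journal maps to None. An empty result
    means every expected key was observed with its expected value.
    """
    observed = {}
    for line in journal_lines:
        for key in expected:
            # ASSUMED log format: "<key>: <value>" or "<key>=<value>".
            match = re.search(rf"\b{re.escape(key)}\s*[:=]\s*(\S+)", line)
            if match:
                observed[key] = match.group(1)  # later lines win
    return {
        key: observed.get(key)
        for key, value in expected.items()
        if observed.get(key) != value
    }


def verdict(expected: dict, journal_lines: list) -> str:
    """BLOCKED if any expected key drifted or is missing, DONE otherwise."""
    return "BLOCKED" if config_drift(expected, journal_lines) else "DONE"


lines = ["14:42:01 bot[1]: loaded trailing_stop_min_dist: 4"]
print(verdict({"trailing_stop_min_dist": "0"}, lines))
```

Treating a missing key as drift is deliberate in this sketch: a value the service never logged cannot be verified, so the gate stays closed.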

**Example 2 — claim-auditor catches probability stacking:**

```
User: "Audit eval_results.md before I commit to the strategy"

claim-auditor (severity_floor: P2):

# Claim Audit — eval_results.md

## P1 (decision-shaping, must address)
| Quote | Why wrong | Correct number |
|---|---|---|
| "3 evals at 35.7% per eval = ~90% chance one passes" | P(≥1 of N) = 1−(1−p)^N, not N×p | 73.4%, not 90% |

## Summary
- 1 P1 error / 0 P2 / 0 P3
- Recommended action: Recompute eval-cost economics before committing capital
```
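
The corrected number in Example 2 follows from the complement rule, assuming the three evals are independent: 1 − (1 − 0.357)^3 = 1 − 0.643^3 ≈ 0.734. Summing instead gives 3 × 0.357 = 1.071, which exceeds 1 and so cannot be a probability. A short sketch of the check (function names are illustrative):

```python
def p_at_least_one(p: float, n: int) -> float:
    """P(at least one success in n independent trials with success rate p)."""
    return 1.0 - (1.0 - p) ** n


def naive_stacked(p: float, n: int) -> float:
    """The erroneous 'stacked' estimate n * p, kept only for comparison."""
    return n * p


p, n = 0.357, 3
print(f"correct: {p_at_least_one(p, n):.1%}")  # 73.4%
print(f"naive:   {naive_stacked(p, n):.1%}")   # 107.1%
```

If the evals are not independent (for example, they share the same market regime), the complement rule no longer applies directly, which is a separate finding the auditor would need to flag.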

**Example 3 — remote-agent-dispatcher captures correct PID:**

```
User: "Dispatch HANDOFF_LONG_BACKTEST.md to my-vps"

remote-agent-dispatcher:
- spec validated locally (HANDOFF header present)
- scp + install on host: ✅
- spawn pattern executed (nohup + sudo -u daemon + sleep 18s)
- pgrep -f basename: 1 claude binary PID found, PID 18432
- PID file written: /opt/agent-work/HANDOFF_LONG_BACKTEST_agent.pid
- done-conditions verified (file exists, PID owned by daemon, alive ≥5s)

Returns: DISPATCHED as PID 18432 on my-vps
Caller schedules first heartbeat check in 15-30 min.
```
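
The wrapper-aware PID selection in Example 3 can be sketched as a filter over `pgrep -a` style output (one `PID command-line` pair per line). This is an illustration, not the dispatcher's actual spawn pattern: it assumes the agent process shows `claude` as its executable (argv[0]), which may not hold if the binary is launched through an interpreter. The command lines below are made up, and `binary_pids` is a hypothetical name.

```python
import os


def binary_pids(pgrep_output: str, binary: str = "claude") -> list:
    """Return PIDs whose executable basename equals `binary`.

    Wrapper processes (sudo, nohup, bash -c ...) mention the binary in their
    arguments but have a different argv[0], so they are skipped.
    """
    pids = []
    for line in pgrep_output.splitlines():
        pid_text, _, cmdline = line.strip().partition(" ")
        parts = cmdline.split()
        if not pid_text.isdigit() or not parts:
            continue  # malformed or empty line
        if os.path.basename(parts[0]) == binary:
            pids.append(int(pid_text))
    return pids


# Illustrative output: two wrappers and one real binary.
sample = (
    "18420 sudo -u daemon nohup bash -c claude -p spec.md\n"
    "18425 bash -c claude -p spec.md\n"
    "18432 /usr/local/bin/claude -p spec.md\n"
)
print(binary_pids(sample))
```

In this sketch a caller would treat any result other than exactly one PID as a failed gate instead of guessing which process is the agent.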

## Your Expertise

- 3+ years building and operating production trading bots (futures, crypto, prediction markets) — the failure modes these agents catch are drawn directly from real production incidents
- Documented $106K backtest swing caused by a silent-skipped config restart (the originating incident for `bot-deploy-verifier`): https://github.com/LaterKidsXD/claude-code-audit-stack/blob/main/docs/silent-skip-incident.md
- Real audit findings (redacted) showing `claim-auditor` catching probability-stacking errors in autonomous research output: https://github.com/LaterKidsXD/claude-code-audit-stack/blob/main/reports/sample-findings.md
- All three subagents are in active production use against the live systems they were built for

## Additional Context

- **Repository:** https://github.com/LaterKidsXD/claude-code-audit-stack (MIT, public)
- **Already-shipped GitHub Action wrapping `claim-auditor`:** https://github.com/marketplace/actions/claim-auditor
- **Anthropic Claude Code plugin marketplace:** submitted 2026-05-03, currently under review

If maintainers green-light this proposal, I'm happy to submit it as a polished PR following the `plugins/<plugin-name>/` layout used by recent community plugins (e.g., `signed-audit-trails` PR #496, `protect-mcp` PR #503). All schema requirements (subagent frontmatter, tool restrictions, MIT license, no hardcoded paths) are already met in the upstream repo.
