Releases: luiseiman/dotforge
v3.7.1 — Evidence-based compaction policy (80% threshold)
Evidence-based compaction policy
Research combining academic work (Liu et al., Stanford 2023; Chroma Research; Greg Kamradt) with field practice reported on X (Boris Cherny and Cat Wu of Anthropic, Daniel San, Avthar, Paweł Huryn), consolidated into dotforge's operational policy. Canonical threshold: 80% of the context window (not 50%, and not the 96.7% default).
New pieces
- `domain/compaction-strategy.md` (new domain rule, 70 lines): evidence-based policy covering the 80% threshold, the distinction between `/compact`, `/clear`, and a subagent, confirmed anti-patterns, cache economy (Paweł Huryn: prefix-cache invalidation), and re-anchoring to mitigate "lost in the middle" (Liu et al.). Citations include URLs.
- `/forge compact-task` (new slash command): wrapper around `/compact` with a standardized dotforge hint: preserve decisions, files modified, pending TODOs, behaviors disabled, and last commit; drop verbose tool output. Resolves the anti-pattern of running `/compact` without custom instructions.
- `/forge context-status` (new slash command): read-only report on estimated context-window usage, a cache-health proxy (based on the `tool-latency.sh` p50), recent edits, and a recommended action. It does not compact.
- `pre-compact-warning.sh` (new hook, wired into `UserPromptSubmit`): proactive alert at 80% (warning) and 90% (urgent). Estimate: transcript bytes / 5. Configurable via env vars: `CLAUDE_CONTEXT_LIMIT`, `CLAUDE_COMPACT_WARN_PCT`, `CLAUDE_COMPACT_URGENT_PCT`. Smoke-tested in 3 scenarios (below threshold, warning, urgent).
- `docs/internal/compaction-strategy.md` (~200 lines): canonical operational guide with an ASCII flow chart, a `/compact` vs `/clear` vs subagent decision table, per-project-type configuration (light/standard/heavy), and a complete bibliography (academic + X).
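The estimate used by `pre-compact-warning.sh` (transcript bytes / 5, warning at 80%, urgent at 90%) can be sketched in a few lines. This is an illustration only, not the hook's actual code: the function names are hypothetical, and the 200,000-token default for `CLAUDE_CONTEXT_LIMIT` is an assumption, since the release notes do not state the hook's default.

```python
import os

def estimate_tokens(transcript_bytes: int) -> int:
    # Heuristic quoted in the release notes: transcript bytes / 5.
    return transcript_bytes // 5

def compaction_level(transcript_bytes: int, env=os.environ) -> str:
    """Return "ok", "warning" or "urgent" for a transcript of the given size."""
    # Assumed default limit; the real hook may use a different value.
    limit = int(env.get("CLAUDE_CONTEXT_LIMIT", "200000"))
    warn_pct = int(env.get("CLAUDE_COMPACT_WARN_PCT", "80"))
    urgent_pct = int(env.get("CLAUDE_COMPACT_URGENT_PCT", "90"))
    used_pct = 100 * estimate_tokens(transcript_bytes) / limit
    if used_pct >= urgent_pct:
        return "urgent"
    if used_pct >= warn_pct:
        return "warning"
    return "ok"
```

Passing the environment as a parameter keeps the three smoke-test scenarios (below threshold, warning, urgent) easy to reproduce without touching the real process environment.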
Wiring
- `.claude/settings.json`: new `UserPromptSubmit` block with `pre-compact-warning.sh` (timeout 3s)
- `template/settings.json.tmpl`: same, for propagation to 12 projects
- `template/hooks/pre-compact-warning.sh`: propagable copy
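As a rough picture of what this wiring amounts to, the sketch below adds a command hook to a settings dictionary idempotently. The nested `hooks` layout (event name, list of groups, each with a `hooks` list of `type`/`command`/`timeout` entries) is an assumption about the `settings.json` schema, and `wire_hook` is a hypothetical helper, not a dotforge script.

```python
import json

def wire_hook(settings: dict, event: str, command: str, timeout: int) -> dict:
    """Add a command hook under `event` unless the same command is already wired."""
    groups = settings.setdefault("hooks", {}).setdefault(event, [])
    for group in groups:
        for hook in group.get("hooks", []):
            if hook.get("command") == command:
                return settings  # already wired: keep the call idempotent
    groups.append(
        {"hooks": [{"type": "command", "command": command, "timeout": timeout}]}
    )
    return settings

settings = wire_hook({}, "UserPromptSubmit", ".claude/hooks/pre-compact-warning.sh", 3)
print(json.dumps(settings, indent=2))
```

Idempotence matters when the same template is propagated to many projects: rerunning the wiring step should never duplicate an entry.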
Key research findings
- Liu et al. (Stanford 2023): 30%+ accuracy loss for information in the middle of the context (7-50% depth)
- Chroma Research (2024): 18 frontier models degrade within their declared window; Sonnet 200K shows drops starting at 50K tokens
- Greg Kamradt: GPT-4 recall degrades beyond 73K tokens
- Boris Cherny (Anthropic): auto-compact fires at ~155K tokens; plan acceptance auto-clears context
- Cat Wu (Anthropic): defends auto-compact as preserving critical information
- Daniel San (X): keeps auto-compact OFF, with an 80% hook in production ("every time it triggered for me, I lost important context")
- Avthar (X): "actively clear context yourself using /clear or /compact rather than waiting for auto-compact to happen mid-task, which can hurt performance"
- Paweł Huryn (X): cache economics dominate the decision; a March 2026 bug caused 20× cost inflation from a broken cache
Where the evidence converges
| Threshold | Verdict |
|---|---|
| 50% | Too aggressive. Loses the recent thread; the summary accumulates degradation |
| 80% | Sweet spot. Daniel San, Avthar, and the academic evidence agree, with a safety margin |
| 96.7% (default) | Too late. Quality is already degraded by the time auto-compact fires |
v3.7.0 — Smart init — startup snapshot + drift + Setup validation
Smart init: startup snapshot + drift detection + Setup validation
Four new pieces that complete the symmetry with auto-compact (v3.6.3): SessionStart now captures, compares, and persists the initial state; the Setup hook validates invariants before any tool call.
New hooks
- `.claude/hooks/session-startup.sh` (wired into `SessionStart`, every `source` ≠ `compact`):
  - Captures branch, short HEAD, working-tree count, `.claude/` files edited in the last 24h, pending TODOs/FIXMEs, and disabled behaviors
  - Compares the current HEAD with the HEAD of the last snapshot in `startup-history/` and emits a "drift" line with commits-ahead
  - Writes `.claude/session/last-startup.md` (full snapshot) + `startup-history/<ISO>.md` (rotating, last 5)
  - Injects a brief to stdout (Claude receives it as initial context) ONLY if something is notable: dirty tree, recent edits, drift, behaviors off, TODOs pending. Silent if everything is clean
  - Silent on `source=compact` (delegated to `session-restore.sh`)
- `.claude/hooks/pre-session-check.sh` (wired into `Setup`, matchers `init` and `maintenance`):
  - Validates invariants on `claude --init-only` / `claude --maintenance`:
    - `settings.json` is valid JSON
    - `block-destructive.sh` present + executable (security baseline)
    - `behaviors/index.yaml` is valid YAML (if it exists)
    - All wired hooks exist and are executable
    - `DOTFORGE_DIR` resolves (warn only)
  - Exit 2 blocks session start if there are critical errors
  - Output: silent on success, full checklist on failure
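Two of the invariants listed for `pre-session-check.sh` (valid `settings.json`, wired hooks present and executable) can be sketched as follows. This is a minimal illustration, not the hook itself: the real check is a bash script, the function names are hypothetical, and the sketch assumes each wired `command` is a plain path relative to the project root and that `settings.json` nests hooks as event, groups, then `hooks` entries.

```python
import json
import os
from pathlib import Path

def wired_commands(settings: dict) -> list[str]:
    """Collect every command hook referenced in a settings dictionary."""
    commands = []
    for groups in settings.get("hooks", {}).values():
        for group in groups:
            for hook in group.get("hooks", []):
                if hook.get("type") == "command":
                    commands.append(hook["command"])
    return commands

def check_invariants(project_dir) -> list[str]:
    """Return a list of error strings; an empty list means all checks pass."""
    root = Path(project_dir)
    try:
        settings = json.loads((root / ".claude" / "settings.json").read_text())
    except (OSError, json.JSONDecodeError) as exc:
        return [f"settings.json is not valid JSON: {exc}"]
    errors = []
    for command in wired_commands(settings):
        script = root / command  # assumption: command is a project-relative path
        if not script.is_file():
            errors.append(f"Wired hook missing: {command}")
        elif not os.access(script, os.X_OK):
            errors.append(f"Wired hook not executable: {command}")
    return errors

def exit_code(errors: list[str]) -> int:
    # Exit 2 signals critical errors, mirroring the contract in the notes.
    return 2 if errors else 0
```

Returning the errors as a list, rather than exiting on the first failure, matches the "full checklist on failure" output described above.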
Changes
- `template/settings.json.tmpl`: newly wired hooks:
  - `SessionStart` adds a third entry: `session-startup.sh` (timeout 10s)
  - `Setup` with matchers `init` and `maintenance` points to `pre-session-check.sh`
- `template/hooks/session-startup.sh` and `template/hooks/pre-session-check.sh`: propagable copies for the 12 projects on the next `/forge sync`
- `domain/hook-events.md`: documents dotforge's wiring in `SessionStart` (3 hooks) and `Setup` (pre-session-check). Reflects `enabled` in `index.yaml` for the 3 hooks and the matchers for Setup.
Verification
Smoke tests on the dotforge project itself:
$ printf '{"source":"startup"}' | bash .claude/hooks/session-startup.sh
## Session Startup Brief
**Branch:** main @ fffc0b6
**Working tree:** 6 changed files
**Recent .claude/ edits (24h):** 14
**Behaviors disabled:** search-first,plan-before-code,objection-format
$ bash .claude/hooks/pre-session-check.sh
✓ dotforge pre-session check: all invariants pass
$ printf '{"source":"compact"}' | bash .claude/hooks/session-startup.sh
(silent — delegated to session-restore.sh)
$ # Inject broken hook reference, run check
$ bash .claude/hooks/pre-session-check.sh
── dotforge pre-session check ──
Errors (1):
✗ Wired hook missing: .claude/hooks/nonexistent.sh
─────────────────────────────────
exit=2
What this closes from the initial audit
| Pre-v3.7.0 | Post-v3.7.0 |
|---|---|
| `SessionStart` re-injects context only on `compact` | Covers all 4 sources (startup, resume, compact, clear) |
| No drift detection between sessions | `session-startup.sh` compares HEAD against the last snapshot |
| Setup hook never wired despite being documented | `pre-session-check.sh` validates invariants on `--init-only` |
| No history of session starts | `startup-history/` rotating, last 5 |
| No visibility of disabled behaviors at startup | Brief includes an explicit list |
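The rotating `startup-history/` behavior (keep only the last 5 snapshots) could look roughly like the sketch below. It is an assumption-laden illustration: `rotate_history` is a hypothetical name, and it relies on snapshot file names being ISO-style timestamps in one consistent format so that lexicographic order equals chronological order.

```python
from pathlib import Path

def rotate_history(history_dir, name: str, content: str, keep: int = 5) -> list[str]:
    """Write `<name>.md` and delete all but the `keep` newest snapshots."""
    directory = Path(history_dir)
    directory.mkdir(parents=True, exist_ok=True)
    (directory / f"{name}.md").write_text(content)
    # Assumption: names are same-format ISO timestamps, so sorting by name
    # is the same as sorting by time.
    snapshots = sorted(directory.glob("*.md"))
    excess = max(0, len(snapshots) - keep)
    for old in snapshots[:excess]:
        old.unlink()
    return [p.name for p in snapshots[excess:]]
```

Sorting by file name rather than by mtime keeps the rotation stable even if files are copied or touched later.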
v3.6.3 — Smart auto-compact — filter + rotating history
Smart auto-compact: filtering and history
A filtering layer on top of the compact_summary that Claude Code generates. Two concrete improvements to the existing pipeline (PostCompact → last-compact.md → SessionStart restore):
- `scripts/compact-filter.py` (new): conservative pipe filter that shrinks the summary before persisting it. Safe heuristics:
  - Fenced blocks (triple backtick) >40 lines → first 5 + last 5 + elision note
  - Runs of ≥30 unprotected lines (no markdown structure, no paths, no decision/error/fix/pending keywords) → first 3 + last 3 + note
  - Paragraphs duplicated ≥3 times → a single copy
  - Runs of >2 consecutive newlines → collapsed to 2
  - Never filters: lines with `#`/`-`/`|`/`>`/`=`, paths (`.md`/`.sh`/`.py`/etc.), critical tokens (`decision`, `error`, `fix`, `pending`, `next step`, `commit`, `todo`, `blocker`, `warning`, `fail`), the first 10 lines
  - Output to stdout, metrics (in/out bytes, ratio) to stderr
  - Tests: 2253B → 730B = 68% reduction on a verbose summary; 22453B → 22447B = ~0% on an already dense summary (does no harm).
- `.claude/hooks/post-compact.sh` + `template/hooks/post-compact.sh`: pipe the summary through compact-filter, falling back to the raw summary if the filter fails. The `[compact-filter]` metric is stored in the checkpoint frontmatter.
- Rotating history: the last 5 compactions under `.claude/session/compact-history/<ISO>.md`. Enables diffing consecutive compactions or recovery if `last-compact.md` went stale.
- `domain/context-window-optimization.md`: updated with a note on the hook's new behavior.
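Two of the heuristics listed for `compact-filter.py` are simple enough to sketch: eliding long fenced blocks (more than 40 lines become first 5 + note + last 5) and collapsing runs of more than 2 newlines. This is not the script's actual code; the function names and the wording of the elision note are assumptions, and the protected-line rules are left out for brevity.

```python
import re

FENCE = "`" * 3  # built programmatically so this example can sit inside a fence

def elide_fenced_blocks(text: str, max_lines: int = 40, keep: int = 5) -> str:
    """Shorten fenced blocks longer than `max_lines` to head + note + tail."""
    out, block = [], None
    for line in text.split("\n"):
        if line.lstrip().startswith(FENCE):
            if block is None:
                out.append(line)  # opening fence
                block = []
            else:
                if len(block) > max_lines:
                    dropped = len(block) - 2 * keep
                    note = f"[... {dropped} lines elided ...]"
                    block = block[:keep] + [note] + block[-keep:]
                out.extend(block)
                out.append(line)  # closing fence
                block = None
        elif block is not None:
            block.append(line)
        else:
            out.append(line)
    if block is not None:
        out.extend(block)  # unterminated fence: pass the content through untouched
    return "\n".join(out)

def collapse_blank_runs(text: str) -> str:
    # Runs of more than 2 consecutive newlines collapse to exactly 2.
    return re.sub(r"\n{3,}", "\n\n", text)
```

Both functions leave input below their thresholds byte-for-byte unchanged, which is the property the "does no harm" test on dense summaries relies on.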
Verification
End-to-end smoke test with synthetic JSON (40 filler lines + decision + next steps):
[compact-filter] in=2253B out=730B saved=1523B ratio=0.32
On the real last-compact.md from the current session (22 KB of dense summary):
[compact-filter] in=22453B out=22447B saved=6B ratio=1.00
Expected behavior: dense summaries pass through almost untouched, while summaries with verbose tool dumps shrink by 30-70%. Worst case, the file stays the same: the filter is a safety net, not aggressive compression.
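The `[compact-filter]` metric lines above follow directly from the input and output sizes: `saved` is the byte difference and `ratio` is output over input, rounded to two decimals. A small sketch (the function name is hypothetical) reproduces both reported lines:

```python
def filter_metrics(in_bytes: int, out_bytes: int) -> str:
    """Format the metric line from input and output sizes in bytes."""
    saved = in_bytes - out_bytes
    ratio = out_bytes / in_bytes if in_bytes else 1.0  # empty input: nothing removed
    return f"[compact-filter] in={in_bytes}B out={out_bytes}B saved={saved}B ratio={ratio:.2f}"

print(filter_metrics(2253, 730))
print(filter_metrics(22453, 22447))
```

A ratio of 0.32 corresponds to the 68% reduction quoted for the verbose summary, and a ratio that rounds to 1.00 corresponds to the ~0% reduction on the dense one.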
v3.6.2 — Audit follow-ups — signal gate, metric rename, parallel-sessions split
Closing out audit follow-ups
Applied the 4 pending items recorded in v3.6.1.
- `hooks/detect-claude-changes.sh`: signal gate: skip the auto-stub if TOTAL < 15 files AND there are no structural changes (agents/commands/skills = 0). Eliminates inbox noise the user could not evaluate.
- Honest metric: `not-applicable` → `informational` in `practices/metrics.yml` (35 entries), `skills/update-practices/SKILL.md`, `practices/active/*.md` (11 frontmatters), `practices/README.md`, `docs/config-validation.md`, `docs/internal/config-validation-flow.md`. Validation rate is now computed over 19 trackable practices (not over 54), yielding 0/19 = 0%: a realistic metric, not inflated by general information.
- `registry/projects.yml`: header rewritten as an explicit "EXAMPLE / REFERENCE FILE". Clarifies that the source of truth is `projects.local.yml` (gitignored) and why there are two files.
- `domain/parallel-sessions.md`: 81 → 38 lines. The CLI-flag sections unrelated to parallelism moved to the new `domain/cli-flags.md` (53 lines), with different globs (`CLAUDE.md`, `agents/*`, `skills/**/SKILL.md`, `scripts/**/*.sh`, `.github/workflows/*.yml`), so they load in a different context.
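The signal gate and the revised validation rate are both one-line rules, sketched here for concreteness. The function names are hypothetical; the real gate lives in a bash hook and the rate is computed by dotforge's own tooling.

```python
def should_create_stub(total_files: int, structural_changes: int, min_files: int = 15) -> bool:
    """Skip the auto-stub only when the change is both small and non-structural."""
    return not (total_files < min_files and structural_changes == 0)

def validation_rate(validated: int, trackable: int) -> float:
    """Share of trackable practices that are validated (informational ones excluded)."""
    return validated / trackable if trackable else 0.0

print(should_create_stub(total_files=6, structural_changes=0))   # small and non-structural
print(should_create_stub(total_files=6, structural_changes=1))   # structural change
print(validation_rate(0, 19))
```

Note the AND in the gate: a single change under agents/commands/skills is enough to create a stub even when few files changed.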
Domain rules > 50 lines after this pass (8 remaining, non-critical)
| File | Lines | Over |
|---|---|---|
| rule-effectiveness.md | 68 | +18 |
| hook-architecture.md | 63 | +13 |
| auto-mode.md | 62 | +12 |
| permission-managed-settings.md | 60 | +10 |
| permission-model.md | 59 | +9 |
| agent-orchestration.md | 59 | +9 |
| context-control-patterns.md | 54 | +4 |
| cli-flags.md | 53 | +3 |
| plugin-distribution.md | 52 | +2 |
| context-window-optimization.md | 52 | +2 |
Diminishing returns: trim wording in the next iteration without further fragmentation.
v3.6.1 — Audit fixes — search-first off, permission-model split
Critical audit + quality polish
A thorough audit session found three real degradations, and the cheap, high-return fixes were applied.
Changes
- `behaviors/index.yaml`: `search-first.enabled: false`. Evidence: counter=7, it escalated to `soft_block`, and the user disabled it manually mid-session. The current design (the flag is consumed after each Write/Edit) generates false positives in sessions after compaction or with context already loaded via an initial Read. Revisit when a "sticky-flag" mode exists.
- Generated `search-first` hooks removed: deleted from `.claude/hooks/generated/` and from `settings.json`. PreToolUse: 8 → 6 hooks. Lower net latency; the remaining hooks (`block-destructive`, `no-destructive-git`, `respect-todo-state` ×2, `verify-before-done` ×2) stay active.
- `domain/permission-model.md` split: 112 lines → 59 (core: modes, cascade, prefix detection, core rules, auto-approvals tightening, glob/grep platform note). The new `domain/permission-managed-settings.md` (60 lines) absorbs enterprise managed settings, MCP server config, and dynamic permissions from hooks. Different globs (`managed-settings.json`, `.mcp.json`), so they load only when applicable.
- Filesystem cleanup: deleted 9 orphaned `settings.json.bak.20260428-*` backups (dotforge + 8 projects); the zombie worktree `reverent-banzai` no longer appeared.
Audit: what DOES add value (with evidence)
- `block-destructive.sh`: active in 12 projects, never disabled, intercepts new patterns (`find -delete`, `xargs rm`)
- `session-report.sh` fix (v3.3.1): fixed a silent 5-month bug in metrics
- `tool-latency.sh`: data is arriving: Bash p50=53ms, Edit p50=11ms (hooks are not the bottleneck)
- Domain rules with specific globs: load only when applicable
- `scripts/audit_all.py` + `sync_all.py` + `wire_hooks_all.py`: real automation, 12× → 1×
Audit: pending items (non-critical)
- 8 domain rules are still >50 lines (our own limit). Batch into the next refactor, no urgency.
- `practices/metrics.yml`: 35/54 = `not-applicable`. Misleading metric: "validated" should mean "prevented an error", not "5 cycles with nothing happening". Rename the field to `informational` and exclude it from the validation rate.
- Registry shadow: `projects.yml` (committed, example) vs `projects.local.yml` (gitignored, real): clarify in docs.
- Automatic `inbox/*-session-changes.md` files without detail are noise. Filter them in the post-session hook if they are only counts.
v3.6.0 — Sync from CC v2.1.120-128 round 2 + Setup hook coverage
Sync from Claude Code v2.1.120-128 — round 2 (deeper coverage)
Seven practices captured this morning from a fresh /forge watch pass — all incorporated. One auto-stub rejected. Inbox: 0.
Domain rule updates
- `domain/hook-architecture.md`: added the `Setup` event to the Session-level cadence (32 events total now). Documents the `Setup` lifecycle: fires for `--init-only` / `--maintenance` runs with matchers `init` and `maintenance`, distinct from `SessionStart` (every session); Setup only fires on explicit request. Also added a design tradeoff note for `PostToolUse.updatedToolOutput`: it now works for ALL tools (v2.1.121+), not just MCP, but rewriting can hide errors and break the audit trail. Prefer `additionalContext` for augmentation; reserve `updatedToolOutput` for redaction or compression.
- `domain/hook-events.md`: generalized `updatedToolOutput` from MCP-only to all tools (v2.1.121+). New `Setup` event payload: matchers `init|maintenance`, non-blockable, used for credential rotation / env-var provisioning / prerequisite checks BEFORE the session starts.
- `domain/permission-model.md`: added 5 managed-only enterprise fields (`allowManagedPermissionRulesOnly`, `network.allowManagedDomainsOnly`, `filesystem.allowManagedReadPathsOnly`, `strictKnownMarketplaces`, `blockedMarketplaces`) plus `pluginTrustMessage` for org-specific guidance. New MCP server config section consolidating `enableAllProjectMcpServers`, `enabledMcpjsonServers`, etc., with the new `alwaysLoad: true` option (v2.1.121+) that bypasses tool-search deferral per server, and the `workspace` reserved name (v2.1.128+).
- `domain/rule-effectiveness.md`: new "Runtime placeholders in skill content (v2.1.120+)" section: `${CLAUDE_EFFORT}` resolves to the active effort tier in the skill markdown body (not just frontmatter). New "Settings fields worth knowing (beyond permissions)" section: `availableModels`, `effortLevel`, `defaultShell`, `viewMode`, `enableWeakerNestedSandbox`, `pluginTrustMessage`.
- `domain/parallel-sessions.md`: the `--init-only` / `--maintenance` flag now cross-references the `Setup` hook (matchers `init` / `maintenance`) and points to `hook-events.md`.
Docs
- `docs/usage-guide.md`: new section "5b. CI / automation" covering `claude ultrareview [target]` non-interactive code review (v2.1.120+, exit 0/1 contract, `--json` / `--timeout` flags, GitHub Actions sketch with `claude setup-token`). Subprocess attribution note: `AI_AGENT=claude-code` is auto-set in subprocesses for platforms that surface it.
- `docs/best-practices.md`: new "Minor tooling tips (v2.1.120-128)" subsection batching: `--plugin-dir .zip`, `claude plugin prune` / `--prune` cascade, `AI_AGENT` subprocess env, `ANTHROPIC_BEDROCK_SERVICE_TIER`, `--channels` API-key auth (`channelsEnabled: true` requirement), `workspace` reserved MCP name, `claude install [version|stable|latest]` for CI pinning.
- `integrations/channels/README.md`: added API-key auth note (v2.1.128+): console / API-key users must set `channelsEnabled: true`; Claude.ai sessions don't need this flag.
Practices
- 7 practices moved `inbox/ → active/`, frontmatter `incorporated_in: ['3.6.0']`.
- 1 rejected (`invisigtht-session-changes`: auto-stub, summary-only).
- Inbox: 0 pending.
- `metrics.yml`: 1 new `monitoring` (posttooluse-updated-output-all-tools, error_type=logic), 6 `not-applicable`.
v3.5.0 — Sync from Claude Code v2.1.120-128 + agent memory checklist
Sync from Claude Code v2.1.120 → v2.1.128 + agent memory checklist
Six practices incorporated. Three security-relevant (monitoring), one auto-stub rejected.
New domain rule
- `.claude/rules/domain/plugin-distribution.md`: covers `${CLAUDE_PLUGIN_DATA}` (v2.1.126+ persistent state for plugins surviving updates), `CLAUDE_CODE_PLUGIN_SEED_DIR` multi-dir layered overlays (base + corporate + personal), managed marketplace governance (`strictKnownMarketplaces`, `blockedMarketplaces`, `allowManagedPermissionRulesOnly`, `pluginTrustMessage`), reserved server names (`workspace` since v2.1.128), and lifecycle hygiene (`claude plugin prune`, `--plugin-dir .zip`).
- Migration of dotforge's `practices/metrics.yml` and `inbox/` to `${CLAUDE_PLUGIN_DATA}` is documented as a candidate but explicitly out of scope this release (multi-commit work).
Skill / docs / agent updates
- `skills/reset-project/SKILL.md`: new Step 5b suggesting `claude project purge $PWD` post-reset (v2.1.126+) to drop orphaned transcripts, task lists, and the `~/.claude.json` entry. Verifies CLI availability before suggesting; never runs automatically.
- `docs/usage-guide.md`: new "Layered distribution (multi-seed)" subsection covering the `CLAUDE_CODE_PLUGIN_SEED_DIR` overlay pattern; new "PR review flow tip" noting `/resume` accepts pasted PR URLs (v2.1.122+, GitHub/Enterprise/GitLab/Bitbucket).
- `docs/security-checklist.md`: new "`--dangerously-skip-permissions` tradeoffs (v2.1.121+)" subsection documenting that the flag now bypasses prompts for `.claude/skills`, `agents`, `commands` writes, with an explicit warning against pairing it with prompts that include unverified content (injection vector that can now write to template files unprompted).
- `agents/{architect,code-reviewer,implementer,security-auditor}.md`: appended a "Memory persistence" section to each agent prompt with a concrete checklist on when (and when not) to write to `.claude/agent-memory/<agent>.md`. Targets the `agent-memory-underused` finding from `/forge insights` 2026-04-21 (≤2 entries per agent across 5 months).
Practices
- 6 practices moved `inbox/ → active/`, frontmatter `incorporated_in: ['3.5.0']`.
- 1 rejected (`tradingbot-session-changes`: auto-stub, summary-only).
- Inbox: 0 pending.
- `metrics.yml`: 4 new `monitoring` entries (plugin-data-variable, claude-project-purge, skip-permissions-claude-paths, agent-memory-underused), 2 `not-applicable`.
Verified against
- Claude Code v2.1.128 (latest as of 2026-05-04). Watch-upstream pass surfaced additional v2.1.120-128 deltas captured for next cycle (PostToolUse.updatedToolOutput for all tools, Setup hook event, alwaysLoad MCP option, claude ultrareview, ${CLAUDE_EFFORT} placeholder, missing settings fields).
v3.4.1 — Backtesting ADR gate rule
New rule — stacks/trading/rules/backtesting-adr-gate.md
Captured from a real ADR retrospective in the tradingview repo: a "Dual Momentum SPY/QQQ/BIL 12m" strategy was declared the official baseline of the passive-US sleeve based on walk-forward OOS Sharpe 1.08 vs QQQ B&H 1.04 (delta = +0.04) and Calmar 2.78 vs 1.66. After fixing a look-ahead bug in the rebalancer, the OOS metrics deflated to 1.06 vs 1.04. Computing PSR(QQQ B&H) per Bailey & López de Prado (2012) for all 9 strategies tested in the repo showed none passed the 0.95 threshold — the "best" strategy gave 70% probability of beating B&H, i.e. 30% probability of being worse.
The new rule encodes:
- PSR(benchmark) > 0.95 required to claim "baseline", "winner", or "supersedes" in any ADR
- DSR (Deflated Sharpe Ratio) required when testing > 5 strategies in the same project (multiple-testing correction)
- Below threshold: ADR may document the strategy as alternative, but must not use the strong words
- Implementation: ~50 lines, stdlib-only via `statistics.NormalDist`; no scipy needed
Generalization beyond trading: when ranking N options by a noisy metric, compute Pr(top option genuinely better than alternatives). Below threshold, the ranking is decoration — don't anchor decisions on it.
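For readers unfamiliar with the Probabilistic Sharpe Ratio, the sketch below shows one way to compute it with `statistics.NormalDist`, using the formula from Bailey & López de Prado (2012): the observed Sharpe ratio minus the benchmark, scaled by sqrt(n - 1) and divided by sqrt(1 - skew * SR + (kurtosis - 1) / 4 * SR^2). It is not the rule's actual implementation. Assumptions: population (not sample) moments, non-excess kurtosis, a non-constant return series, and a benchmark Sharpe expressed per period at the same frequency as `returns` (not annualized). All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean, pstdev

def sharpe_ratio(returns) -> float:
    """Per-period Sharpe ratio (no annualization, zero risk-free rate assumed)."""
    return fmean(returns) / pstdev(returns)

def probabilistic_sharpe_ratio(returns, benchmark_sr: float) -> float:
    """Probability that the true Sharpe ratio exceeds `benchmark_sr`."""
    n = len(returns)
    mu, sigma = fmean(returns), pstdev(returns)
    sr = mu / sigma
    # Population skewness and (non-excess) kurtosis of the return series.
    skew = sum((r - mu) ** 3 for r in returns) / (n * sigma ** 3)
    kurt = sum((r - mu) ** 4 for r in returns) / (n * sigma ** 4)
    denom = sqrt(1 - skew * sr + (kurt - 1) / 4 * sr ** 2)
    return NormalDist().cdf((sr - benchmark_sr) * sqrt(n - 1) / denom)

def may_claim_baseline(psr: float, threshold: float = 0.95) -> bool:
    """Gate from the rule: strong words require PSR(benchmark) > 0.95."""
    return psr > threshold
```

With this gate, a strategy whose PSR against the benchmark is 0.70 (as in the retrospective above) may be documented as an alternative but not as a "baseline" or "winner".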
Changed
- `stacks/trading/plugin.json`: bumped to v2.1.0, `components.rules` now lists both rules.
- `practices/active/2026-04-27-psr-gate-baseline-adrs.md`: `incorporated_in` `['3.4.1']`.
- `metrics.yml`: monitoring entry, error_type=logic.
Inbox processing
- 1 accepted (psr-gate-baseline-adrs → above)
- 1 rejected (tradingview-session-changes — auto-stub, summary-only)
- 1 deferred (agent-memory-underused — low-priority, needs more usage data to evaluate)
v3.4.0 — Sync from Claude Code v2.1.92→v2.1.119 + audit/behavior fixes
Highlights
/forge watch pass against code.claude.com covering Claude Code v2.1.92 → v2.1.119. 14 practices accepted, 6 auto-generated rejects, 1 deferred.
Domain rules refreshed
- Hook event catalogue → 33+ with `UserPromptExpansion` (slash-command expansion, blockable) and `PostToolBatch` (end-of-batch validation, blockable). Documented `mcp_tool` as a fifth hook type with `${tool_input.*}` substitution (v2.1.118+). `PostToolUse` / `PostToolUseFailure` now carry `duration_ms` (v2.1.119+). `UserPromptSubmit` can return `hookSpecificOutput.sessionTitle` (v2.1.94+).
- Auto mode `"$defaults"` placeholder (v2.1.118+): extends `autoMode.allow|soft_deny|environment` instead of replacing them. Removes the all-or-nothing trade-off when shipping custom rules.
- Permission tightening (v2.1.113+): `Bash(find:*)` allow rules no longer auto-approve `-exec` / `-delete`; deny rules now match `env` / `sudo` / `watch` / `ionice` / `setsid` wrappers; macOS `/private/{etc,var,tmp,home}` are dangerous removal targets under `Bash(rm:*)`.
- Native macOS/Linux builds (v2.1.117+) replace `Glob` / `Grep` with embedded `bfs` / `ugrep` via `Bash`. `Glob(...)` / `Grep(...)` permission specifiers and hook matchers are now platform-dependent.
- TUI + idle-return recap: `tui` setting + `/tui` toggle (v2.1.110+); `awaySummaryEnabled` + `/recap` (v2.1.108+, default-on for telemetry-disabled deployments since v2.1.110). Coexists with dotforge's `last-compact.md`: different problems (idle return vs compaction survival).
- Git attribution refresh: `attribution.commit` / `attribution.pr` supersede `includeCoAuthoredBy`; `prUrlTemplate` for self-hosted GitHub/GitLab/Bitbucket.
- CLI surface fully documented in `domain/parallel-sessions.md`: 16+ flags and 6 subcommands (`claude install`, `auth`, `agents`, `auto-mode`, `remote-control`, `setup-token`).
Operational fixes
- Audit checklist item 14: scoring v3 behavior coverage now requires ENFORCEMENT (compiled hook under `.claude/hooks/generated/` AND a `settings.json` reference), not just a `behaviors/index.yaml` declaration. Closes the false-positive that scored projects 1/1 with no runtime effect.
- `verify-before-done` regex extended to match `bash tests/*.sh`, `bash <path>/test-*.sh`, `./tests/*.sh`. Fixes legitimate `git push` from dotforge being soft-blocked after `bash tests/test-*.sh` runs. Recompiled hook included so the change takes runtime effect (caveat: behavior YAML edits are inert until `scripts/compiler/compile.sh` is rerun).
- `docs/claude-vs-forge.md`: `/usage` is the canonical command (`/cost` and `/stats` are aliases since v2.1.118).
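To make the three command shapes in the `verify-before-done` fix concrete, here is a hypothetical regular expression that accepts them. It is not the pattern from the behavior YAML, which the release notes do not show; `TEST_COMMAND` and `is_test_run` are illustrative names only.

```python
import re

# Hypothetical pattern covering the three shapes named in the release notes:
#   bash tests/*.sh, bash <path>/test-*.sh, ./tests/*.sh
TEST_COMMAND = re.compile(
    r"^\s*(?:"
    r"bash\s+tests/\S+\.sh"       # bash tests/<anything>.sh
    r"|bash\s+\S+/test-\S+\.sh"   # bash <path>/test-<anything>.sh
    r"|\./tests/\S+\.sh"          # ./tests/<anything>.sh
    r")(?:\s|$)"
)

def is_test_run(command: str) -> bool:
    """True when a Bash command looks like one of the recognized test invocations."""
    return TEST_COMMAND.search(command) is not None

print(is_test_run("bash tests/test-runtime.sh"))
print(is_test_run("git push origin main"))
```

The trailing `(?:\s|$)` keeps a name such as `tests/run.sh.bak` from counting as a test run.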
Verified
33/33 tests pass — 19 skills + 8 runtime + 1 compiler + 5 behavior CLI.
Catch-up note
Latest published release was v3.0.0 (2026-04-13). Versions v3.1.0 → v3.3.1 shipped between releases — see docs/changelog.md for the full intervening history.
Full diff: v3.0.0...v3.4.0
v3.0.0 — Behavior Governance
dotforge v3.0.0 — Behavior Governance
dotforge v3 ships a runtime behavior governance layer on top of the v2.9 configuration layer. Behaviors are declarative policies on tool calls, compiled to `PreToolUse` hooks that share a session-scoped state file. Opt-in and non-breaking — v2.9 projects run unchanged.
What's new
- Behavior catalogue (`behaviors/`): `no-destructive-git`, `search-first`, `verify-before-done`, `respect-todo-state` (core, on by default) + `plan-before-code`, `objection-format` (opinionated, opt-in)
- Declarative DSL: `behavior.yaml` with closed field/operator set, 5-level escalation (silent → nudge → warning → soft_block → hard_block), flag-based temporal gating, template rendering
- Compiler: YAML → bash hooks + `settings.json` snippet. Conditions enforced at runtime via `regex_match`, `contains`, `starts_with`, …
- Runtime: mkdir-based locking, TTL 24h, counters, flags, pending_block reinvocation detection for override audit
- CLI: `/forge behavior list | describe | status | on | off | strict | relaxed` with project and session scopes
- Audit dimension 14: `/forge audit` now scores v3 behavior coverage (0-1)
- 33 tests green across runtime, compiler, CLI, and per-behavior scenarios
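The mkdir-based locking mentioned for the runtime relies on directory creation being atomic: only one process can create a given directory, so whoever succeeds holds the lock. The dotforge runtime does this in bash; the Python sketch below only illustrates the technique, and `dir_lock`, its timeout, and its polling interval are hypothetical choices.

```python
import os
import time
from contextlib import contextmanager

@contextmanager
def dir_lock(lock_dir: str, timeout: float = 2.0, poll: float = 0.05):
    """Hold an exclusive lock for the duration of the `with` block."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            os.mkdir(lock_dir)  # atomic: fails if another holder created it first
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire lock {lock_dir}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.rmdir(lock_dir)  # release the lock even if the body raised
```

A usage example would wrap each read-modify-write of the shared state file in `with dir_lock(...)`. A crashed holder leaves the directory behind, so a production version needs stale-lock handling, which this sketch omits.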
Example
```
Bash(git push origin main --force)
PreToolUse:Bash hook returned blocking error
PreToolUse:Bash says: Destructive git operation blocked: force push,
hard reset, clean -f, and forced branch delete
are not allowed.
Error: Hook PreToolUse:Bash denied this tool
```
Spec of record
- `docs/v3/SPEC.md` — evaluation algorithm, level table
- `docs/v3/SCHEMA.md` — `behavior.yaml v1`
- `docs/v3/RUNTIME.md` — state.json, locking, TTL, flags
- `docs/v3/MIGRATION.md` — v2.9 → v3 upgrade path
- `docs/changelog.md` — full release notes
Breaking changes
None. v3 is purely additive. A v2.9.1 project upgraded to v3.0.0 continues to work with zero changes until the user explicitly creates `behaviors/` and wires compiled hooks into `settings.json`.
Test suite
- runtime: 8
- compiler: 1
- CLI: 5
- search-first: 5
- no-destructive-git: 2
- verify-before-done: 3
- respect-todo-state: 2
- plan-before-code: 3
- objection-format: 2
Total: 33 tests green (up from 18 in v3.0.0-alpha.1).