
feat(broker,worker): skip scope check for master-self (operator==actor)#195

Merged
hanwencheng merged 1 commit into main from claude/master-self-scope-skip on Jun 4, 2026
Conversation

@hanwencheng
Member

Why

A master operating on its own data (memory / credentials / email) shouldn't have to grant itself on-chain scope — the scope system exists to bound agents, not the operator over its own actor. Today both the broker cap-mint (handlers/cap.rs) and the worker re-verify (verify.rs::check_chain_scope) call isServiceInScope(operator, actor, service) unconditionally, so the master would need a self-grant ceremony. This removes that friction.

Unblocks the W3 master-self memory plant (#191) without a self-scope bootstrap step.

What

Skip isServiceInScope when operator_omni == actor_omni, symmetrically in:

  • broker handlers/cap.rs: in_scope = if req_omni == req_actor { true } else { call_is_service_in_scope(...) }
  • worker verify.rs::check_chain_scope — early return Ok(()) when a == b

Both sides must agree (defense-in-depth) — otherwise the worker would reject a cap the broker minted.
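
A minimal sketch of the symmetric skip described above, not the code from this PR. The `ScopeError` type, the function names, and the `chain_lookup` closure (standing in for the real `isServiceInScope` RPC call) are assumptions made for illustration.

```rust
/// Hypothetical error type for this sketch; the real crates define their own.
#[derive(Debug, PartialEq)]
pub enum ScopeError {
    /// The chain answered "not in scope".
    NotInScope,
    /// The chain could not be consulted (RPC failure).
    Chain(String),
}

/// Broker side: decide `in_scope` for a cap-mint request.
/// The chain is only consulted when operator and actor differ.
pub fn broker_in_scope<F>(
    operator_omni: &str,
    actor_omni: &str,
    chain_lookup: F,
) -> Result<bool, ScopeError>
where
    F: FnOnce() -> Result<bool, ScopeError>,
{
    if operator_omni == actor_omni {
        // Master-self: no on-chain grant is required.
        Ok(true)
    } else {
        chain_lookup()
    }
}

/// Worker side: re-verify the same rule so both layers agree.
pub fn worker_check_chain_scope<F>(
    operator_omni: &str,
    actor_omni: &str,
    chain_lookup: F,
) -> Result<(), ScopeError>
where
    F: FnOnce() -> Result<bool, ScopeError>,
{
    if operator_omni == actor_omni {
        // Early return before any RPC is attempted.
        return Ok(());
    }
    if chain_lookup()? {
        Ok(())
    } else {
        Err(ScopeError::NotInScope)
    }
}
```

If only one side applied the skip, a master-self cap minted by the broker would reach the worker, fall through to the chain lookup, and be rejected for lack of a grant.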

Why it's bounded-safe (not a hole)

The per-actor device binding in the same flow already pins device.actor_omni == req.actor_omni, so a session can only ever mint for its own actor. The skip therefore only opens bots/<O_master>/… — it can never widen to an agent's or another operator's prefix. It's a deliberate skip, not a removal of the scope-grant mechanism (retained; a future design may re-introduce an explicit master-self grant).
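
The ordering argument above can be sketched as a small decision function. This is an illustration under assumptions: `Device`, `CapRequest`, `MintDecision`, and `decide` are hypothetical names, and the real handler carries more fields and checks.

```rust
/// Hypothetical view of the device record bound to the session.
pub struct Device {
    pub actor_omni: String,
}

/// Hypothetical view of a cap-mint request.
pub struct CapRequest {
    pub operator_omni: String,
    pub actor_omni: String,
}

#[derive(Debug, PartialEq)]
pub enum MintDecision {
    /// Session tried to mint for an actor its device is not bound to.
    RejectDeviceMismatch,
    /// operator == actor: master-self, no scope grant needed.
    SkipScopeCheck,
    /// operator != actor: the on-chain scope grant still decides.
    ConsultChain,
}

pub fn decide(device: &Device, req: &CapRequest) -> MintDecision {
    // The device binding is checked first, so the request's actor is
    // always the session's own actor by the time the skip is evaluated.
    if device.actor_omni != req.actor_omni {
        return MintDecision::RejectDeviceMismatch;
    }
    if req.operator_omni == req.actor_omni {
        MintDecision::SkipScopeCheck
    } else {
        MintDecision::ConsultChain
    }
}
```

Because the skip requires `operator_omni == actor_omni` and the binding requires `actor_omni == device.actor_omni`, the skip is reachable only for a session whose own device belongs to the master actor.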

Documented in arch.md §12.4.

Tests

  • check_chain_scope_skips_when_operator_is_actor — proves the skip fires before the RPC (passes an unreachable rpc_url; still Ok).
  • check_chain_scope_consults_chain_when_operator_differs — cross-actor still hits the chain (unreachable RPC → Err, never a silent pass).
  • 33 worker tests green; broker compiles.
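
The unreachable-RPC technique used by these two tests can be sketched as follows. This is not the worker's code: `VerifyError`, `query_chain`, and the `host:port` address form are assumptions, and the stand-in only attempts a TCP connection instead of performing a JSON-RPC call.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum VerifyError {
    Chain(String),
    NotInScope,
}

/// Stand-in for the chain query. It only tries to connect; the actual
/// JSON-RPC exchange is omitted, so it never returns Ok in this sketch.
fn query_chain(rpc_addr: &str) -> Result<bool, VerifyError> {
    let addr: SocketAddr = rpc_addr
        .parse()
        .map_err(|e| VerifyError::Chain(format!("bad rpc address: {e}")))?;
    TcpStream::connect_timeout(&addr, Duration::from_millis(200))
        .map_err(|e| VerifyError::Chain(e.to_string()))?;
    Err(VerifyError::Chain(
        "JSON-RPC exchange omitted in this sketch".to_string(),
    ))
}

pub fn check_chain_scope(
    rpc_addr: &str,
    operator_omni: &str,
    actor_omni: &str,
) -> Result<(), VerifyError> {
    // The skip fires before the RPC, so a dead endpoint cannot matter here.
    if operator_omni == actor_omni {
        return Ok(());
    }
    if query_chain(rpc_addr)? {
        Ok(())
    } else {
        Err(VerifyError::NotInScope)
    }
}
```

Pointing both cases at a dead endpoint such as `127.0.0.1:1` separates them cleanly: the master-self call still returns `Ok(())`, while the cross-actor call surfaces a chain error instead of passing silently.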

Follow-on (issue-#90 test discipline)

The live harness/v2-stage3-demo.sh should gain a step asserting (a) a master-self cap mints with no scope grant, and (b) a cross-actor cap still returns ServiceNotInScope. That's live-chain-gated, so it's noted here as a required follow-on rather than run in this PR; the unit tests above cover the worker side deterministically.

🤖 Generated with Claude Code

The master owns its data classes (memory / credentials / email); scope gates
AGENTS, not the operator over its own actor. Skip isServiceInScope when
operator_omni == actor_omni, symmetrically in the broker cap-mint
(handlers/cap.rs) and the worker re-verify (verify.rs check_chain_scope) so
defense-in-depth agrees (else the worker would reject a broker-minted cap).

Bounded-safe: the per-actor device binding already pins device.actor_omni ==
req.actor_omni, so the skip only ever opens bots/<O_master>/ — never an agent's
or another operator's prefix, never widens cross-actor. Deliberate SKIP, not a
removal of the scope-grant mechanism (retained for a future design).

arch.md §12.4 documents the skip. Tests: worker check_chain_scope skips when
operator==actor (proven via unreachable rpc_url) + still consults the chain when
operator!=actor. 33 worker tests green; broker compiles.
@hanwencheng
Member Author

Companion to #191 (this unblocks its master-self memory plant — no self-scope grant needed). The issue-#90 harness step noted in the description (assert a master-self cap mints with no scope grant + a cross-actor cap still returns ServiceNotInScope) is carried into #196, where the live on-chain master register makes it runnable end-to-end.

@hanwencheng hanwencheng merged commit 5bd3bf0 into main Jun 4, 2026
7 checks passed
hanwencheng added a commit that referenced this pull request Jun 5, 2026
…lant 400)

memory_put_real/memory_get_real send operator_omni + actor_omni = ctx.omni, sourced from the onboarding session omni which is stored BARE (no 0x). The broker cap-mint input-validates that operator_omni starts with 0x and 400s ('operator_omni must start with 0x') before normalizing — so the web plant failed AFTER the device was registered. Normalize ctx.omni to 0x once in real_memory_ctx (covers put + get); the broker normalize_hex32's it for the device-binding match, and master-self (operator==actor) hits the #195 skip so no scope grant is needed. cargo check -p agentkeys-daemon clean.
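
A minimal sketch of the one-time normalization this commit describes, assuming a bare lowercase hex omni as input. The helper name `ensure_0x_prefix` is hypothetical; the commit only states that `ctx.omni` is normalized inside `real_memory_ctx`.

```rust
/// Return `omni` with exactly one leading "0x".
/// Hypothetical helper; an uppercase "0X" prefix is not handled here.
pub fn ensure_0x_prefix(omni: &str) -> String {
    if omni.starts_with("0x") {
        // Already prefixed: return unchanged so the call is idempotent.
        omni.to_string()
    } else {
        format!("0x{omni}")
    }
}
```

Checking for an existing prefix instead of prepending unconditionally keeps the call idempotent, so applying it to an already-prefixed value cannot produce a doubled `0x0x…` string.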
hanwencheng added a commit that referenced this pull request Jun 5, 2026
…200)

* feat: #164 broker-sponsored ERC-4337 master register + v2-demo harness restructure

Broker-sponsored, gas-free ERC-4337 master onboarding (#164 E6/E7): new broker 'sponsor' module — verifiable UserOp + VerifyingPaymaster.getHash encoding with an EIP-191 broker co-sign (pure functions, byte-exact with the live contracts, zero-gas read-only verification); lib.rs exports it. CLI gains k11 webauthn passkey keygen/sign + the sponsored-register flow; daemon ui_bridge wires the flow into the desktop UI.

Harness + docs: v2-demo.sh restructured into a single 5-phase front door (1-3 stages, 4 memory-plant, 5 wire) with PHASE.STEP addressing; sandbox auto-detect now probes the aiosandbox HTTP API (not a local openviking install). v2-stage3 agent-side steps (11-12/14-15) DEFER to the sandbox on the operator (green) with mock reserved for CI; clearer stale-broker guidance on the #195 master-self step. stage1/2 default to Touch-ID WebAuthn for operators, stub for CI. Adds harness/CLAUDE.md (harness rules extracted from root CLAUDE.md), operator runbooks (harness, web-memory), and erc4337 register/fund helpers.

* fix(broker-deploy): hard-reset to origin/$PULL_REF so --locked builds don't trip a stale Cargo.lock

setup-broker-host.sh --ref did 'git checkout -f' + 'git pull --ff-only', which can no-op against a stale local branch tip or leave a build-modified Cargo.lock on disk. A subsequent 'cargo build --locked' then fails with 'cannot update the lock file'. A deploy target must match origin EXACTLY — replace the ff-only pull with 'git reset --hard origin/$PULL_REF' (HEAD + index + working tree, Cargo.lock included). Idempotent.

* test(worker,harness): positive scope-delegation coverage (granted agent -> success)

Closes the gap where only the NEGATIVE/skip scope paths were asserted. Worker verify.rs: two new unit tests with an in-process std::net JSON-RPC mock (no new dep, Cargo.lock untouched) — check_chain_scope_ok_when_chain_grants (operator!=actor, chain returns true -> Ok) and check_chain_scope_rejects_when_chain_denies (false -> NotInScope). Harness v2-stage3: new standalone step 18 'POSITIVE: granted agent (operator!=actor) mints memory cap for the GRANTED service -> 200' — extracts the scope-grant assertion out of the steps 11-12 roundtrip; operator-authenticated mint (no agent key), defers on a §10.2 agent whose device isn't paired yet, mocks on CI. Completes the scope triad with step 16 (master-self skip) + step 17 (cross-actor denied). Cleanup renumbered 18->19, STEP_TOTAL=19. Runbook updated.

* fix(harness): memory-plant wallet/start uses {address,chain_id}, not {evm_address} (HTTP 422)

Phase 4 step 2 POSTed {evm_address:...} to /v1/auth/wallet/start, but the broker's WalletStartRequest requires {address: String, chain_id: u64} (both mandatory) — so axum rejected it with 422 'missing field address'. Reproduced live: {evm_address} -> 422, {address,chain_id} -> 200. Aligns memory-plant with the broker contract + the shape stage-1/stage-3/web-memory-bootstrap already use. Broker was correct; this was a stale client field name.

* fix(harness): memory-plant wallet/start sent 0x0x… deployer addr (HTTP 400 malformed address)

DEPLOYER_ADDR is already 0x-prefixed (cast wallet address output), but step 2 prepended another 0x -> '0x0x941cb1…' -> broker 400 'malformed address'. The wrong field name (422, prior fix) had masked this. Reproduced live: 0x0x… -> 400, 0x… -> 200. The omni (line 74) + cap-mint (line 98) already use the correct forms (broker hashes agentkeysevm+0x-addr, verified in omni_account.rs). Also switch the wallet start/verify curls from -sSf to -sS --fail-with-body so the broker's error JSON is shown on 4xx instead of an opaque 'curl: (NN) … error: CODE' (this step hid its cause twice).

* fix(harness,daemon): address codex adversarial review of phase 4/5

[high] Bash 3.2 (memory-plant-demo.sh): dropped the 'declare -A CAP' associative array (bash 4+; the operator platform is macOS bash 3.2.57 where it errors and CAP[$ns] under set -u is an unbound arithmetic var). Step 3 now just proves cap-mint per namespace; step 4 re-mints fresh (short-TTL). Verified runnable under 3.2.

[high] Partial plant (daemon ui_bridge.rs): the real-chain plant only failed when ZERO entries planted, so a partial (some namespaces succeed, one fails) returned 200 + audit + updated state. Now ANY durable-write failure returns 502 before the success audit/response; succeeded writes stay in master_memory so a re-plant is idempotent and resumes. 35 ui_bridge tests still pass.

[medium] Phase 5 skip (v2-demo.sh): an auto-skip (no aiosandbox) returned 0 and printed 'all green' + 'agent paired' — an unexecuted proof read as a pass. Now run_wire_phase records WIRE_RESULT (wired/skipped/disabled); an auto-skip reports 'v2-demo INCOMPLETE' and exits non-zero, the loop shows the phase as SKIPPED (not green ok), and the final pass/paired text only prints when the wire actually ran. --wire none is the explicit clean-skip escape (CI uses it). Also fixed the skip hint to the correct aiosandbox 'docker run …' (was the wrong openviking-sandbox-setup.sh). Runbook + harness/CLAUDE.md synced.

* fix(dev,daemon): web app runs the on-chain onboarding ceremony (not deferred) + real memory plant

dev.sh launched 'agentkeys-daemon --ui-bridge' with NO --register-master-script, so finish_chain_register hit its 'register_master_script = None' branch and silently SKIPPED the on-chain registerFirstMasterDevice (chain: none) — the ceremony was deferred while K11 enroll still reported success. It also passed no --memory-url/--memory-role-arn, so the plant button fell back to the in-memory RwLock instead of cap-mint → STS → worker → S3.

dev.sh now sources scripts/operator-workstation.env and ALWAYS passes --register-master-script (in-repo heima-register-first-master.sh; a missing deployer key / chain config now surfaces chain_error, never a silent skip), plus --memory-url/--memory-role-arn/--region when the env supplies them (real plant; logged). The daemon↔script arg contract was already correct (--operator-omni/--actor-omni/--k11-cose-hex/--k11-cred-id/--rp-id-hash) and real_memory_ctx sources the device hash from the K11-finish register, so the un-deferred ceremony is exactly what feeds the plant.

Name drift called out: the daemon's --memory-url env is AGENTKEYS_MEMORY_URL but operator-workstation.env spells it AGENTKEYS_WORKER_MEMORY_URL; bridged in dev.sh via the explicit flag (accepts either). Also un-stale the ui_bridge.rs module doc that still claimed the register is stubbed.

* fix(harness): v2-demo phase 5 runs the wire with --webauthn (grant the agent's memory scope)

Phase 5 ran 'phase1-wire-demo.sh --real' with NO --webauthn, so the wire's P.3 scope grant (heima-scope-set --webauthn) was SKIPPED — the §10.2 agent paired but the master never granted it the memory:<ns> scope. The agent's memory.get(travel) mints a cap for service 'memory:travel' (mcp-server/src/tools/memory.rs: format!("memory:{namespace}")), the broker checks isServiceInScope(O_master, agent, memory:travel) and returns service_not_in_scope -> Act1 (3.1) + inject (4.2) fail. Now auto + real pass --real --webauthn so the master grants memory:<ns> via Touch ID (one prompt, like phases 1-2; heima-scope-set is idempotent so re-runs skip). Service strings match (grant + cap both memory:<ns>); the master's K11 is enrolled+registered in phases 1-2 so setScopeWithWebauthn verifies. Runbook + harness/CLAUDE.md synced.

* fix(web): surface the daemon's real plant error, not a misleading 'Connect a daemon' toast

The /memory plant button only renders when the daemon is connected (status.kind==='connected'), so a plant failure is almost never 'no daemon' — yet plantDone's else-branch always showed 'Connect a daemon to plant prepared memory', masking the daemon's actual reason (which postJson already captured in r.status.detail, e.g. 409 'no master session — complete onboarding first' / 'master device not registered on chain yet', or a 502 worker error). Now it extracts + shows the real reason. tsc --noEmit clean.

* fix(daemon): cap-mint operator_omni must be 0x-prefixed (web memory plant 400)

memory_put_real/memory_get_real send operator_omni + actor_omni = ctx.omni, sourced from the onboarding session omni which is stored BARE (no 0x). The broker cap-mint input-validates that operator_omni starts with 0x and 400s ('operator_omni must start with 0x') before normalizing — so the web plant failed AFTER the device was registered. Normalize ctx.omni to 0x once in real_memory_ctx (covers put + get); the broker normalize_hex32's it for the device-binding match, and master-self (operator==actor) hits the #195 skip so no scope grant is needed. cargo check -p agentkeys-daemon clean.

* fix(web): plant toast shows plant-count vs list-count (expose cache-vs-store gap)

After a successful plant, plantDone read listMasterMemory but only setMemories on ok and toasted just the plant counts — so a daemon-cache miss (e.g. after a restart) silently showed an empty list. Now the toast shows '<planted> new … <list.length> in the memory view' (so 'N new but 0 in view' is visible) and surfaces a failed list GET. Note: GET /v1/master/memory reads the daemon IN-MEMORY cache, not S3 — so the list is empty after any daemon restart even though the data is durable in S3.

* feat(broker,worker): DataClass::Config + /v1/cap/config-{store,fetch} (Phase 0)

Phase 0 of docs/plan/web-flow/config-data-class-memory-list.md (lazy, config-driven memory list). Adds the DataClass::Config variant to both cap.rs + verify.rs (serializes 'config'), the broker cap_config_store/cap_config_fetch handlers (statically derive {op, data_class: Config}) + routes /v1/cap/config-store + /v1/cap/config-fetch. check_data_class is generic, so a Config cap is rejected by the cred + memory workers (and a memory cap by the config worker) — covered by new unit tests. Infra-free: the endpoints mint Config caps, but the config bucket/role/worker land in Phases 1-2. cargo check (broker) clean; 4 worker data_class tests pass.

* style: cargo fmt --all + fix clippy unusual_byte_groupings — green CI

harness-ci 'cargo fmt + clippy + test' failed at fmt: this PR's sponsor/webauthn/cli/daemon/verify code (committed earlier) wasn't rustfmt-clean, plus my new config routes in lib.rs. Ran cargo fmt --all (6 PR files reformatted, no unrelated drift). Also fixed clippy::unusual_byte_groupings in sponsor.rs:189 (0x0102_03 -> 0x010203, value-identical) that -D warnings rejected in the lib-test target. Verified locally: fmt --check clean, clippy --workspace --all-targets -- -D warnings exit 0, cargo test --workspace 40 results ok / 0 failed.

* feat(harness,daemon): web↔agent parity as v2-demo phase 6 + ui-bridge seed seam (W6)

Implements wire-real-paths W6 as a v2-demo PHASE, not a standalone script (no second front door, no re-bootstrap). daemon: add --ui-bridge-seed-session-jwt + --ui-bridge-seed-omni — seeds the ui-bridge onboarding session with the master's existing J1 + omni so the parity phase drives the REAL plant chain WITHOUT re-running interactive email/WebAuthn onboarding (pairs with the existing --master-device-key-hash). harness/web-parity-demo.sh = phase 6: reuses the preflight build + live chain/broker + the master registered in phases 1-2, boots agentkeys-daemon --ui-bridge SEEDED, plants a probe ns via POST /v1/master/memory/plant; a 200 proves the daemon's chain (cap-mint→STS→worker→S3) == the agent/harness chain — the web↔harness drift gate. Cost: one daemon boot + one plant, no re-build/re-chain/re-enroll; real-only (skips without a broker). Wired into v2-demo (default 1→6, --from/--stage/--only addressing). Docs synced (runbook, harness/CLAUDE.md, wire-real-paths W6). cargo fmt+clippy --workspace --all-targets clean; bash -n clean. NOTE: statically verified (compiles + wired + prereqs met); the live end-to-end smoke is bash harness/v2-demo.sh --stage 6 on real infra.
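
The partial-plant rule from the adversarial-review fix above (any durable-write failure returns 502, while succeeded writes are kept) can be sketched as below. `PlantSummary` and `summarize_plant` are hypothetical names, and the real handler in ui_bridge.rs also writes audit records and state that this sketch leaves out.

```rust
/// Hypothetical summary of one plant request.
#[derive(Debug, PartialEq)]
pub struct PlantSummary {
    /// HTTP status the handler would return.
    pub status: u16,
    /// Namespaces whose durable write succeeded (kept for an idempotent re-plant).
    pub planted: Vec<String>,
    /// Namespaces whose durable write failed.
    pub failed: Vec<String>,
}

/// Fold per-namespace write results into a single response decision.
pub fn summarize_plant(results: Vec<(String, Result<(), String>)>) -> PlantSummary {
    let mut planted = Vec::new();
    let mut failed = Vec::new();
    for (namespace, outcome) in results {
        match outcome {
            Ok(()) => planted.push(namespace),
            Err(_) => failed.push(namespace),
        }
    }
    // Any failure is a 502; the earlier behavior only failed when
    // nothing at all had been planted.
    let status = if failed.is_empty() { 200 } else { 502 };
    PlantSummary { status, planted, failed }
}
```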