feat(broker,worker): skip scope check for master-self (operator==actor) #195
Merged
The master owns its data classes (memory / credentials / email); scope gates AGENTS, not the operator over its own actor. Skip isServiceInScope when operator_omni == actor_omni, symmetrically in the broker cap-mint (handlers/cap.rs) and the worker re-verify (verify.rs check_chain_scope) so defense-in-depth agrees (else the worker would reject a broker-minted cap). Bounded-safe: the per-actor device binding already pins device.actor_omni == req.actor_omni, so the skip only ever opens bots/<O_master>/ — never an agent's or another operator's prefix, never widens cross-actor. Deliberate SKIP, not a removal of the scope-grant mechanism (retained for a future design). arch.md §12.4 documents the skip. Tests: worker check_chain_scope skips when operator==actor (proven via unreachable rpc_url) + still consults the chain when operator!=actor. 33 worker tests green; broker compiles.
Companion to #191 (this unblocks its master-self memory plant — no self-scope grant needed). The issue-#90 harness step noted in the description (assert a master-self cap mints with no scope grant + a cross-actor cap still returns |
This was referenced Jun 4, 2026
hanwencheng added a commit that referenced this pull request Jun 5, 2026
…lant 400)
memory_put_real/memory_get_real send operator_omni + actor_omni = ctx.omni, sourced from the onboarding session omni which is stored BARE (no 0x). The broker cap-mint input-validates that operator_omni starts with 0x and 400s ('operator_omni must start with 0x') before normalizing — so the web plant failed AFTER the device was registered. Normalize ctx.omni to 0x once in real_memory_ctx (covers put + get); the broker normalize_hex32's it for the device-binding match, and master-self (operator==actor) hits the #195 skip so no scope grant is needed. cargo check -p agentkeys-daemon clean.
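The normalization described in this commit message can be sketched as a small idempotent helper. This is a minimal illustration, not the daemon's code: the name `ensure_0x_prefix` is hypothetical, and the real `real_memory_ctx` is not shown in this PR text.

```rust
/// Hypothetical helper (not the daemon's actual function): return the omni
/// with exactly one leading `0x`, whether the input was stored bare or
/// already prefixed.
fn ensure_0x_prefix(omni: &str) -> String {
    // Strip one existing prefix if present, then add it back, so the
    // function is idempotent for bare and prefixed inputs alike.
    let bare = omni.strip_prefix("0x").unwrap_or(omni);
    format!("0x{bare}")
}
```

Doing this once where the context is built, rather than at each call site, is what lets a single fix cover both the put and the get path.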
hanwencheng added a commit that referenced this pull request Jun 5, 2026
…200)

* feat: #164 broker-sponsored ERC-4337 master register + v2-demo harness restructure

Broker-sponsored, gas-free ERC-4337 master onboarding (#164 E6/E7): new broker 'sponsor' module - verifiable UserOp + VerifyingPaymaster.getHash encoding with an EIP-191 broker co-sign (pure functions, byte-exact with the live contracts, zero-gas read-only verification); lib.rs exports it. CLI gains k11 webauthn passkey keygen/sign + the sponsored-register flow; daemon ui_bridge wires the flow into the desktop UI. Harness + docs: v2-demo.sh restructured into a single 5-phase front door (1-3 stages, 4 memory-plant, 5 wire) with PHASE.STEP addressing; sandbox auto-detect now probes the aiosandbox HTTP API (not a local openviking install). v2-stage3 agent-side steps (11-12/14-15) DEFER to the sandbox on the operator (green) with mock reserved for CI; clearer stale-broker guidance on the #195 master-self step. stage1/2 default to Touch-ID WebAuthn for operators, stub for CI. Adds harness/CLAUDE.md (harness rules extracted from root CLAUDE.md), operator runbooks (harness, web-memory), and erc4337 register/fund helpers.

* fix(broker-deploy): hard-reset to origin/$PULL_REF so --locked builds don't trip a stale Cargo.lock

setup-broker-host.sh --ref did 'git checkout -f' + 'git pull --ff-only', which can no-op against a stale local branch tip or leave a build-modified Cargo.lock on disk. A subsequent 'cargo build --locked' then fails with 'cannot update the lock file'. A deploy target must match origin EXACTLY: replace the ff-only pull with 'git reset --hard origin/$PULL_REF' (HEAD + index + working tree, Cargo.lock included). Idempotent.

* test(worker,harness): positive scope-delegation coverage (granted agent -> success)

Closes the gap where only the NEGATIVE/skip scope paths were asserted. Worker verify.rs: two new unit tests with an in-process std::net JSON-RPC mock (no new dep, Cargo.lock untouched): check_chain_scope_ok_when_chain_grants (operator!=actor, chain returns true -> Ok) and check_chain_scope_rejects_when_chain_denies (false -> NotInScope). Harness v2-stage3: new standalone step 18 'POSITIVE: granted agent (operator!=actor) mints memory cap for the GRANTED service -> 200'. It extracts the scope-grant assertion out of the steps 11-12 roundtrip; operator-authenticated mint (no agent key), defers on a §10.2 agent whose device isn't paired yet, mocks on CI. Completes the scope triad with step 16 (master-self skip) + step 17 (cross-actor denied). Cleanup renumbered 18->19, STEP_TOTAL=19. Runbook updated.

* fix(harness): memory-plant wallet/start uses {address,chain_id}, not {evm_address} (HTTP 422)

Phase 4 step 2 POSTed {evm_address:...} to /v1/auth/wallet/start, but the broker's WalletStartRequest requires {address: String, chain_id: u64} (both mandatory), so axum rejected it with 422 'missing field address'. Reproduced live: {evm_address} -> 422, {address,chain_id} -> 200. Aligns memory-plant with the broker contract + the shape stage-1/stage-3/web-memory-bootstrap already use. Broker was correct; this was a stale client field name.

* fix(harness): memory-plant wallet/start sent 0x0x… deployer addr (HTTP 400 malformed address)

DEPLOYER_ADDR is already 0x-prefixed (cast wallet address output), but step 2 prepended another 0x -> '0x0x941cb1…' -> broker 400 'malformed address'. The wrong field name (422, prior fix) had masked this. Reproduced live: 0x0x… -> 400, 0x… -> 200. The omni (line 74) + cap-mint (line 98) already use the correct forms (broker hashes agentkeysevm+0x-addr, verified in omni_account.rs). Also switch the wallet start/verify curls from -sSf to -sS --fail-with-body so the broker's error JSON is shown on 4xx instead of an opaque 'curl: (NN) … error: CODE' (this step hid its cause twice).

* fix(harness,daemon): address codex adversarial review of phase 4/5

[high] Bash 3.2 (memory-plant-demo.sh): dropped the 'declare -A CAP' associative array (bash 4+; the operator platform is macOS bash 3.2.57 where it errors and CAP[$ns] under set -u is an unbound arithmetic var). Step 3 now just proves cap-mint per namespace; step 4 re-mints fresh (short-TTL). Verified runnable under 3.2.

[high] Partial plant (daemon ui_bridge.rs): the real-chain plant only failed when ZERO entries planted, so a partial (some namespaces succeed, one fails) returned 200 + audit + updated state. Now ANY durable-write failure returns 502 before the success audit/response; succeeded writes stay in master_memory so a re-plant is idempotent and resumes. 35 ui_bridge tests still pass.

[medium] Phase 5 skip (v2-demo.sh): an auto-skip (no aiosandbox) returned 0 and printed 'all green' + 'agent paired', so an unexecuted proof read as a pass. Now run_wire_phase records WIRE_RESULT (wired/skipped/disabled); an auto-skip reports 'v2-demo INCOMPLETE' and exits non-zero, the loop shows the phase as SKIPPED (not green ok), and the final pass/paired text only prints when the wire actually ran. --wire none is the explicit clean-skip escape (CI uses it). Also fixed the skip hint to the correct aiosandbox 'docker run …' (was the wrong openviking-sandbox-setup.sh). Runbook + harness/CLAUDE.md synced.

* fix(dev,daemon): web app runs the on-chain onboarding ceremony (not deferred) + real memory plant

dev.sh launched 'agentkeys-daemon --ui-bridge' with NO --register-master-script, so finish_chain_register hit its 'register_master_script = None' branch and silently SKIPPED the on-chain registerFirstMasterDevice (chain: none); the ceremony was deferred while K11 enroll still reported success. It also passed no --memory-url/--memory-role-arn, so the plant button fell back to the in-memory RwLock instead of cap-mint → STS → worker → S3. dev.sh now sources scripts/operator-workstation.env and ALWAYS passes --register-master-script (in-repo heima-register-first-master.sh; a missing deployer key / chain config now surfaces chain_error, never a silent skip), plus --memory-url/--memory-role-arn/--region when the env supplies them (real plant; logged). The daemon↔script arg contract was already correct (--operator-omni/--actor-omni/--k11-cose-hex/--k11-cred-id/--rp-id-hash) and real_memory_ctx sources the device hash from the K11-finish register, so the un-deferred ceremony is exactly what feeds the plant. Name drift called out: the daemon's --memory-url env is AGENTKEYS_MEMORY_URL but operator-workstation.env spells it AGENTKEYS_WORKER_MEMORY_URL; bridged in dev.sh via the explicit flag (accepts either). Also un-stale the ui_bridge.rs module doc that still claimed the register is stubbed.

* fix(harness): v2-demo phase 5 runs the wire with --webauthn (grant the agent's memory scope)

Phase 5 ran 'phase1-wire-demo.sh --real' with NO --webauthn, so the wire's P.3 scope grant (heima-scope-set --webauthn) was SKIPPED: the §10.2 agent paired but the master never granted it the memory:<ns> scope. The agent's memory.get(travel) mints a cap for service 'memory:travel' (mcp-server/src/tools/memory.rs: format!("memory:{namespace}")), the broker checks isServiceInScope(O_master, agent, memory:travel) and returns service_not_in_scope -> Act1 (3.1) + inject (4.2) fail. Now auto + real pass --real --webauthn so the master grants memory:<ns> via Touch ID (one prompt, like phases 1-2; heima-scope-set is idempotent so re-runs skip). Service strings match (grant + cap both memory:<ns>); the master's K11 is enrolled+registered in phases 1-2 so setScopeWithWebauthn verifies. Runbook + harness/CLAUDE.md synced.

* fix(web): surface the daemon's real plant error, not a misleading 'Connect a daemon' toast

The /memory plant button only renders when the daemon is connected (status.kind==='connected'), so a plant failure is almost never 'no daemon'. Yet plantDone's else-branch always showed 'Connect a daemon to plant prepared memory', masking the daemon's actual reason (which postJson already captured in r.status.detail, e.g. 409 'no master session — complete onboarding first' / 'master device not registered on chain yet', or a 502 worker error). Now it extracts + shows the real reason. tsc --noEmit clean.

* fix(daemon): cap-mint operator_omni must be 0x-prefixed (web memory plant 400)

memory_put_real/memory_get_real send operator_omni + actor_omni = ctx.omni, sourced from the onboarding session omni which is stored BARE (no 0x). The broker cap-mint input-validates that operator_omni starts with 0x and 400s ('operator_omni must start with 0x') before normalizing, so the web plant failed AFTER the device was registered. Normalize ctx.omni to 0x once in real_memory_ctx (covers put + get); the broker normalize_hex32's it for the device-binding match, and master-self (operator==actor) hits the #195 skip so no scope grant is needed. cargo check -p agentkeys-daemon clean.

* fix(web): plant toast shows plant-count vs list-count (expose cache-vs-store gap)

After a successful plant, plantDone read listMasterMemory but only setMemories on ok and toasted just the plant counts, so a daemon-cache miss (e.g. after a restart) silently showed an empty list. Now the toast shows '<planted> new … <list.length> in the memory view' (so 'N new but 0 in view' is visible) and surfaces a failed list GET. Note: GET /v1/master/memory reads the daemon IN-MEMORY cache, not S3, so the list is empty after any daemon restart even though the data is durable in S3.

* feat(broker,worker): DataClass::Config + /v1/cap/config-{store,fetch} (Phase 0)

Phase 0 of docs/plan/web-flow/config-data-class-memory-list.md (lazy, config-driven memory list). Adds the DataClass::Config variant to both cap.rs + verify.rs (serializes 'config'), the broker cap_config_store/cap_config_fetch handlers (statically derive {op, data_class: Config}) + routes /v1/cap/config-store + /v1/cap/config-fetch. check_data_class is generic, so a Config cap is rejected by the cred + memory workers (and a memory cap by the config worker); covered by new unit tests. Infra-free: the endpoints mint Config caps, but the config bucket/role/worker land in Phases 1-2. cargo check (broker) clean; 4 worker data_class tests pass.

* style: cargo fmt --all + fix clippy unusual_byte_groupings - green CI

harness-ci 'cargo fmt + clippy + test' failed at fmt: this PR's sponsor/webauthn/cli/daemon/verify code (committed earlier) wasn't rustfmt-clean, plus my new config routes in lib.rs. Ran cargo fmt --all (6 PR files reformatted, no unrelated drift). Also fixed clippy::unusual_byte_groupings in sponsor.rs:189 (0x0102_03 -> 0x010203, value-identical) that -D warnings rejected in the lib-test target. Verified locally: fmt --check clean, clippy --workspace --all-targets -- -D warnings exit 0, cargo test --workspace 40 results ok / 0 failed.

* feat(harness,daemon): web↔agent parity as v2-demo phase 6 + ui-bridge seed seam (W6)

Implements wire-real-paths W6 as a v2-demo PHASE, not a standalone script (no second front door, no re-bootstrap). daemon: add --ui-bridge-seed-session-jwt + --ui-bridge-seed-omni, which seed the ui-bridge onboarding session with the master's existing J1 + omni so the parity phase drives the REAL plant chain WITHOUT re-running interactive email/WebAuthn onboarding (pairs with the existing --master-device-key-hash). harness/web-parity-demo.sh = phase 6: reuses the preflight build + live chain/broker + the master registered in phases 1-2, boots agentkeys-daemon --ui-bridge SEEDED, plants a probe ns via POST /v1/master/memory/plant; a 200 proves the daemon's chain (cap-mint→STS→worker→S3) == the agent/harness chain, which is the web↔harness drift gate. Cost: one daemon boot + one plant, no re-build/re-chain/re-enroll; real-only (skips without a broker). Wired into v2-demo (default 1→6, --from/--stage/--only addressing). Docs synced (runbook, harness/CLAUDE.md, wire-real-paths W6). cargo fmt+clippy --workspace --all-targets clean; bash -n clean. NOTE: statically verified (compiles + wired + prereqs met); the live end-to-end smoke is bash harness/v2-demo.sh --stage 6 on real infra.
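The Phase 0 entry in the commit message above says that `check_data_class` is generic, so a Config cap is rejected by the memory and credential workers and vice versa. A minimal sketch of that idea follows. The enum layout, the `wire_name` helper, the error type, and the function signature are all assumptions for illustration; only the `Config` variant and its `'config'` serialization come from the commit message.

```rust
/// Simplified stand-in for the data classes; the real enum lives in
/// cap.rs / verify.rs and may have a different shape.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DataClass {
    Memory,
    Credentials,
    Config,
}

impl DataClass {
    /// Wire name. Only "config" is stated in the commit message; the other
    /// two strings are assumed here.
    fn wire_name(self) -> &'static str {
        match self {
            DataClass::Memory => "memory",
            DataClass::Credentials => "credentials",
            DataClass::Config => "config",
        }
    }
}

/// Hypothetical error type for a cap presented to the wrong worker.
#[derive(Debug, PartialEq, Eq)]
enum VerifyError {
    WrongDataClass { expected: &'static str, got: &'static str },
}

/// One comparison serves every worker: a cap is accepted only when its data
/// class equals the class the worker serves.
fn check_data_class(worker: DataClass, cap: DataClass) -> Result<(), VerifyError> {
    if worker == cap {
        Ok(())
    } else {
        Err(VerifyError::WrongDataClass {
            expected: worker.wire_name(),
            got: cap.wire_name(),
        })
    }
}
```

Because the check is a plain equality on the class, adding a new variant needs no per-worker changes for the cross-class rejection to hold.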
Why
A master operating on its own data (memory / credentials / email) shouldn't have to grant itself on-chain scope: the scope system exists to bound agents, not the operator over its own actor. Today both the broker cap-mint (`handlers/cap.rs`) and the worker re-verify (`verify.rs::check_chain_scope`) call `isServiceInScope(operator, actor, service)` unconditionally, so the master would need a self-grant ceremony. This removes that friction.

Unblocks the W3 master-self memory plant (#191) without a self-scope bootstrap step.
What
Skip `isServiceInScope` when `operator_omni == actor_omni`, symmetrically in:

- `handlers/cap.rs`: `in_scope = if req_omni == req_actor { true } else { call_is_service_in_scope(...) }`
- `verify.rs::check_chain_scope`: early `return Ok(())` when `a == b`

Both sides must agree (defense-in-depth); otherwise the worker would reject a cap the broker minted.
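The worker-side skip can be sketched as follows. This is an illustration under stated assumptions, not the code in `verify.rs`: the signature, the closure standing in for the on-chain `isServiceInScope` call, and the `Chain` error variant are hypothetical, and both omnis are assumed to be normalized before comparison.

```rust
/// Hypothetical error type; `NotInScope` is named in this PR's follow-up
/// tests, `Chain` is an assumed variant for RPC failures.
#[derive(Debug, PartialEq, Eq)]
enum ScopeError {
    NotInScope,
    Chain(String),
}

/// Sketch of the re-verify: `is_service_in_scope` stands in for the chain
/// lookup and is never called when operator and actor are the same.
fn check_chain_scope<F>(
    operator_omni: &str,
    actor_omni: &str,
    service: &str,
    is_service_in_scope: F,
) -> Result<(), ScopeError>
where
    F: FnOnce(&str, &str, &str) -> Result<bool, String>,
{
    // Master-self: the operator acts on its own data, so the scope gate
    // does not apply and the chain is not consulted.
    if operator_omni == actor_omni {
        return Ok(());
    }
    // Cross-actor: the chain decides, and an RPC failure is an error,
    // never a silent pass.
    match is_service_in_scope(operator_omni, actor_omni, service) {
        Ok(true) => Ok(()),
        Ok(false) => Err(ScopeError::NotInScope),
        Err(e) => Err(ScopeError::Chain(e)),
    }
}
```

Passing a lookup that always fails mirrors the unreachable-RPC trick used by the PR's tests: the master-self call still returns `Ok(())`, while a cross-actor call surfaces the failure.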
Why it's bounded-safe (not a hole)
The per-actor device binding in the same flow already pins `device.actor_omni == req.actor_omni`, so a session can only ever mint for its own actor. The skip therefore only opens `bots/<O_master>/…`; it can never widen to an agent's or another operator's prefix. It's a deliberate skip, not a removal of the scope-grant mechanism (retained; a future design may re-introduce an explicit master-self grant).

Documented in arch.md §12.4.
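The ordering argument above (binding first, skip second) can be illustrated with a small sketch. Every type and function name here is hypothetical, and the exact prefix layout beyond `bots/<O_master>/` is an assumption; the real broker types are not shown in this PR text.

```rust
/// Hypothetical, simplified shapes for the registered device and the
/// cap-mint request.
struct Device {
    actor_omni: String,
}

struct CapRequest {
    operator_omni: String,
    actor_omni: String,
}

#[derive(Debug, PartialEq, Eq)]
enum MintError {
    DeviceActorMismatch,
}

#[derive(Debug, PartialEq, Eq)]
enum ScopeDecision {
    /// operator == actor: scope check skipped; only the master's own
    /// prefix is reachable.
    SkipMasterSelf { prefix: String },
    /// operator != actor: the on-chain scope grant must be consulted.
    ConsultChain,
}

fn decide_scope(device: &Device, req: &CapRequest) -> Result<ScopeDecision, MintError> {
    // Device binding runs first: a session can only mint for its own actor.
    if device.actor_omni != req.actor_omni {
        return Err(MintError::DeviceActorMismatch);
    }
    // With the actor pinned to the device, operator == actor means the
    // master is acting on its own prefix (layout assumed for illustration).
    if req.operator_omni == req.actor_omni {
        Ok(ScopeDecision::SkipMasterSelf {
            prefix: format!("bots/{}/", req.operator_omni),
        })
    } else {
        Ok(ScopeDecision::ConsultChain)
    }
}
```

In this sketch a request that sets operator and actor to some other omni fails at the binding check before the skip is ever considered, which is the sense in which the skip cannot widen access across actors.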
Tests
- `check_chain_scope_skips_when_operator_is_actor`: proves the skip fires before the RPC (passes an unreachable `rpc_url`; still `Ok`).
- `check_chain_scope_consults_chain_when_operator_differs`: cross-actor still hits the chain (unreachable RPC → `Err`, never a silent pass).

Follow-on (issue-#90 test discipline)
The live `harness/v2-stage3-demo.sh` should gain a step asserting (a) a master-self cap mints with no scope grant, and (b) a cross-actor cap still returns `ServiceNotInScope`. That's live-chain-gated, so it's noted here as a required follow-on rather than run in this PR; the unit tests above cover the worker side deterministically.

🤖 Generated with Claude Code