ci(harness): guard the stage-3 step-21 config-worker skip against silent permanence (#209)#210
Merged
Merged
Conversation
3 tasks
…mo structure #204 (#203 backend-client refactor) merged after #205 and restructured harness-e2e from three v2-stage{1,2,3}-demo.sh steps into one v2-demo.sh --ci orchestrator — which already carried forward the config-worker-unreachable allow-skip. Re-apply the codex-review guard (PR #210) onto that new structure: a self-dissolving Guard step after the v2-demo run that warns while config-test is unprovisioned (#209) and FAILS once it becomes reachable, so the step-21 isolation skip can't silently persist. Resolves the #210<->main conflict; the --allow-skip invocation is unchanged from main.
29b9837 to
a931f47
Compare
Member
Author
|
Rebased onto current main (9ff3f8a) to resolve the conflict with #204 (which merged after #205 and restructured harness-e2e from three |
hanwencheng
added a commit
that referenced
this pull request
Jun 6, 2026
Conflicts resolved: - ui_bridge.rs route block: kept #204's MASTER_MEMORY_PLANT_ROUTE constant + my new /v1/master/config/{presets,init} + /v1/master/classify/{tag,propose} routes. My inline classify/config cap-mint paths coexist with #204's agentkeys-backend-client (the #201 config/memory plant paths #204 deliberately kept inline — same pattern). - harness-ci.yml: kept #210's v2-demo.sh --ci structure + the #209 step-21 quarantine guard; added classify-not-configured,classify-worker-unavailable to --allow-skip. - Cargo.lock: took main's + cargo regenerated to include agentkeys-catalog + agentkeys-worker-classify alongside agentkeys-backend-client. Post-merge green: workspace build + clippy clean; daemon 100 + broker 36 + catalog/ worker tests; frontend tsc + next build; both #203/#204 drift checks pass.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Question
Should harness-CI stage-3 step 21 (the config worker's live
cap_data_class_mismatchrejection) be allow-skipped viaconfig-worker-unreachable(added in #205)? Asked for a/codex:adversarial-reviewsecond opinion, challenging both the skip and the un-skip.Codex verdict —
needs-attention, cuts both waysconfig-worker-unreachabledoesn't restore the proof — it just hard-fails main CI on a known-unprovisionedconfig-testDNS/TLS dependency (Provision config-test (DNS + cert) so stage-3 step 21 is a live gate again (drop config-worker-unreachable allow-skip) #209), reddening every run without fixing anything.${AGENTKEYS_WORKER_CONFIG_URL}/v1/config/putwith a cross-class cap. Step 20 (memory/cred workers reject a config cap) +agentkeys-worker-configunit tests verify the handler logic, but cannot catch deployed route/nginx/TLS/BROKER_CAP_PUBKEY/middleware drift — which only step 21 catches. So the skip must not become silently permanent.Resolution (this PR)
Keep the quarantined skip (CI stays green — the host genuinely isn't up), but make the quarantine honest, not silent via a new Guard step that probes
config-test.<zone>/healthz:::warning::every run (visible, links #209) → exit 0, CI green::error::→ exit 1, forcing removal of the now-unnecessary allowance so step 21 runs as a live gate::notice::→ exit 0 (guard inert — self-dissolving)The
--allow-skipline itself is unchanged vs main; the diff is the Guard step + an expanded comment. The guard's three branches are unit-simulated (see the commit). This PR's own harness-CI is green (config-test unreachable → warning), but now the gap is loud + the tolerance can't outlive #209.Closes the loop
config-testDNS+cert). When that lands, the guard fails-loud → dropconfig-worker-unreachable+ this guard → step 21 runs live.🤖 Generated with Claude Code