
fix(browser): harden agent consult reliability#177

Merged
steipete merged 3 commits into steipete:main from pdurlej:pdurlej/agent-ux-foundation
May 5, 2026


Contributor

pdurlej commented May 5, 2026

Summary

  • hardens ChatGPT browser model selection so gpt-5.5-pro fails loudly if the UI resolves to Thinking instead of Pro/Extended
  • updates the MCP chatgpt-pro-heavy preset and dry-run output so Claude/Codex/OpenCode can see the resolved engine/model/browser settings before live execution
  • captures DOM/screenshot diagnostics on assistant response timeouts and marks sessions error + incomplete_capture instead of leaving fake-running sessions
  • preserves headful browser sessions for assistant-timeout / assistant-recheck without reclassifying them as Cloudflare challenges
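
A minimal sketch of the "fail loudly" check from the first bullet, not the code in this PR. The names `ModelSelectionError` and `assertProResolved` are hypothetical, and the rule that any picker label containing the word "Pro" is acceptable is an assumption made for illustration.

```typescript
// Hypothetical names; this illustrates the idea, not the project's API.
class ModelSelectionError extends Error {
  constructor(requested: string, resolvedLabel: string) {
    super(`Requested ${requested} but the model picker resolved to "${resolvedLabel}"`);
    this.name = "ModelSelectionError";
  }
}

// Assumption: a label containing the whole word "Pro" counts as a Pro resolution.
const PRO_LABEL = /\bpro\b/i;

function assertProResolved(requested: string, resolvedLabel: string): void {
  // Only Pro requests are validated in this sketch.
  if (requested !== "gpt-5.5-pro") return;
  if (!PRO_LABEL.test(resolvedLabel)) {
    // Throw instead of silently continuing with the wrong model.
    throw new ModelSelectionError(requested, resolvedLabel);
  }
}
```

With this shape, a request for gpt-5.5-pro that lands on a "Thinking" label aborts the run before any prompt is sent, which is the behaviour the bullet describes.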

Notes

This is PR 0 of a larger agent UX hardening roadmap. It is intentionally foundation-only: no auto-archive/project mutation behavior is added here.

Oracle dogfood

  • First GPT-5.5 Pro Extended review returned NOT ACCEPTABLE and found issues in browserModelStrategy:"current", Pro validation, timeout diagnostics, MCP dry-run coverage, and browser preservation. Accepted and fixed the actionable points.
  • Second GPT-5.5 Pro Extended review found assistant timeout preservation was being routed through Cloudflare guidance. Accepted and fixed.
  • Final focused GPT-5.5 Pro Extended review (agent-ux-preserve-error-check) returned ACCEPTABLE with no blocking issues.
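
The timeout handling these reviews converged on can be sketched roughly as below. This is an illustration under assumptions, not the merged implementation: the types `FailureKind` and `SessionRecord`, both function names, and the guidance strings are all hypothetical. It shows two things: a timed-out session is marked as error plus incomplete capture instead of staying "running", and assistant timeouts get their own guidance and are not treated as Cloudflare challenges.

```typescript
// All names below are hypothetical and chosen for illustration.
type FailureKind = "assistant-timeout" | "assistant-recheck" | "cloudflare-challenge";

interface TimeoutDiagnostics {
  domSnapshotPath?: string;
  screenshotPath?: string;
}

interface SessionRecord {
  status: "running" | "completed" | "error";
  incompleteCapture: boolean; // mirrors the "incomplete_capture" marker in the summary
  errorMessage?: string;
  diagnostics?: TimeoutDiagnostics;
}

// Mark a timed-out session as failed so it never looks like it is still running.
function markAssistantTimeout(session: SessionRecord, diagnostics: TimeoutDiagnostics): SessionRecord {
  return {
    ...session,
    status: "error",
    incompleteCapture: true,
    errorMessage: "assistant response timed out",
    diagnostics,
  };
}

// Decide whether to keep the browser open, with guidance specific to the failure kind.
function preservationPlan(kind: FailureKind, headful: boolean): { keepBrowserOpen: boolean; guidance: string } {
  if (!headful) {
    return { keepBrowserOpen: false, guidance: "Headless run: no browser window to preserve." };
  }
  switch (kind) {
    case "cloudflare-challenge":
      return { keepBrowserOpen: true, guidance: "Complete the Cloudflare challenge in the open browser." };
    case "assistant-timeout":
    case "assistant-recheck":
      return { keepBrowserOpen: true, guidance: "Assistant response is incomplete; inspect the open tab and captured diagnostics." };
  }
}
```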

Tests

  • pnpm vitest run tests/browser/modelSelection.test.ts tests/browser/index.test.ts tests/mcp/consult.test.ts tests/cli/sessionRunner.test.ts tests/browser/domDebug.test.ts
  • pnpm run check
  • pnpm run build
  • pnpm test
  • git diff --check

No secrets or account-specific details are included.

pdurlej force-pushed the pdurlej/agent-ux-foundation branch from 1941df0 to 42c6a8f May 5, 2026 20:45
Contributor Author

pdurlej commented May 5, 2026

Rebased onto current upstream main after the live ChatGPT tab harvest landed. Conflict resolution kept the new live-tab imports/tests and this PR’s reliability hardening. Local focused tests, pnpm run check, pnpm run build, and git diff --check passed; GitHub CI is green on macOS, Ubuntu, and Windows.

@steipete steipete merged commit 8dc0b41 into steipete:main May 5, 2026
4 checks passed
Owner

steipete commented May 5, 2026

Landed in 8dc0b41. Thank you @pdurlej.

I kept the PR shape and added two small verified fixups before merge:

  • Treat gpt-5.5-pro + browserThinkingTime:"extended" as Pro Extended once the picker resolves to Pro, so the current ChatGPT UI no longer logs a misleading missing thinking-chip fallback.
  • Clear transient browser errorMessage metadata on successful completion/reattach, which a live run exposed after Chrome cleanup raced session metadata reads.
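
As a rough sketch of the two fixups, assuming hypothetical helper names (`describeThinkingTime`, `markCompleted`) and a simplified metadata shape that is not taken from the repository: the first helper reports Pro Extended when the picker already resolved to Pro, and the second drops a stale errorMessage when a run completes.

```typescript
// Hypothetical shapes and names, for illustration only.
interface SessionMetadata {
  status: "running" | "completed" | "error";
  errorMessage?: string;
}

// Fixup 1: gpt-5.5-pro plus browserThinkingTime "extended" counts as Pro Extended
// once the picker shows Pro, so no thinking-chip fallback is needed.
function describeThinkingTime(
  model: string,
  browserThinkingTime: string | undefined,
  pickerLabel: string,
): string | undefined {
  const pickedPro = /\bpro\b/i.test(pickerLabel);
  if (model === "gpt-5.5-pro" && browserThinkingTime === "extended" && pickedPro) {
    return "Pro Extended (via model selection)";
  }
  return undefined; // caller keeps its usual thinking-time handling
}

// Fixup 2: a successful completion clears any transient error left by an earlier race.
function markCompleted(meta: SessionMetadata): SessionMetadata {
  const next: SessionMetadata = { ...meta, status: "completed" };
  delete next.errorMessage;
  return next;
}
```

Returning a copy from `markCompleted` keeps the caller's original record untouched, which makes a racing metadata read easier to reason about.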

Verification:

  • Local full gate: pnpm docs:list && pnpm test && pnpm run check && pnpm run build && git diff --check
  • Live ChatGPT browser smoke: cookie sync copied 6 cookies, login probe passed, Model picker: Pro, Thinking time: Pro Extended (via model selection), output marker PR177_LIVE_OK_2, session status completed, no stale errorMessage.
  • PR CI green on macOS, Ubuntu, Windows.
  • Main push CI green on macOS, Ubuntu, Windows: https://github.com/steipete/oracle/actions/runs/25407927294
