cuabot shell execution loses exit status and timeout semantics #1403

@zouyonghe

Summary

libs/cuabot currently flattens shell execution results in a way that makes failed commands look successful. This affects both the daemon API and the cuabot --bash CLI.

Confirmed behavior

  • libs/cuabot/src/cuabotd.ts returns only { stdout, stderr, pid? } from execInContainer().
  • Synchronous shell execution keeps only stdout and stderr, and does not return exit_code, success, or timeout metadata.
  • Failed shell commands are compressed into { stdout, stderr }, losing err.code, err.killed, and timeout information.
  • cuabot --bash "exit 42" exits locally with status 0.
  • cuabot --bash "sleep 65; echo after" times out without a structured timeout result.
  • Browser/GUI startup env only sets DISPLAY=:100; it does not explicitly set USER, LOGNAME, HOME, XAUTHORITY, or XDG_RUNTIME_DIR.

Reproduction

Examples observed locally:

cuabot --bash "exit 42"
echo $?
# expected: 42 or non-zero
# actual: 0

cuabot --bash "which firefox"
# remote command exits 1 when not found, but API does not expose structured exit code

cuabot --bash "sleep 65; echo after"
# expected: structured timeout / non-zero result
# actual: empty stdout/stderr without timeout metadata

Expected contract

Shell execution should return structured status, for example:

{
  stdout: string,
  stderr: string,
  exit_code: number | null,
  success: boolean,
  timed_out?: boolean,
  signal?: string,
  pid?: number
}
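As a rough illustration of how the fields lost today could be preserved, here is a minimal sketch that maps the error object Node's child_process.exec / execFile passes to its callback onto the shape above. The names toShellResult and ExecErrorLike are hypothetical and not taken from cuabotd.ts, and treating err.killed as "timed out" is an assumption that only holds when the kill came from the timeout option.

```typescript
// Result shape proposed in this issue's "Expected contract".
interface ShellResult {
  stdout: string;
  stderr: string;
  exit_code: number | null;
  success: boolean;
  timed_out?: boolean;
  signal?: string;
  pid?: number;
}

// Hypothetical subset of the error that Node's child_process.exec/execFile
// hands to its callback (code, killed, signal are the relevant fields).
interface ExecErrorLike {
  code?: number | string | null;
  killed?: boolean;
  signal?: string | null;
}

// Hypothetical helper: build a structured result instead of { stdout, stderr }.
function toShellResult(
  err: ExecErrorLike | null,
  stdout: string,
  stderr: string,
  pid?: number,
): ShellResult {
  const result: ShellResult = {
    stdout,
    stderr,
    // A numeric err.code is the process exit status; string codes
    // (for example spawn errors) are not exit statuses, so report null.
    exit_code: err === null ? 0 : typeof err.code === "number" ? err.code : null,
    success: err === null,
  };
  if (pid !== undefined) result.pid = pid;
  if (err !== null) {
    if (err.signal) result.signal = err.signal;
    // Assumption: the only reason the parent kills the child is the
    // `timeout` option, so `killed` is read as "timed out".
    if (err.killed) result.timed_out = true;
  }
  return result;
}
```

With this mapping, a command such as "exit 42" would yield exit_code 42 and success false, and a command killed by the timeout would yield exit_code null with timed_out true.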

The CLI should also use this status:

  • success=false should exit non-zero.
  • Non-zero remote exit_code should propagate or map to non-zero.
  • Timeouts should produce a non-zero local exit code and explicit timeout output.
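The three rules above could be sketched as a single mapping from the structured result to a local exit code. This is only an illustration: localExitCode and ShellStatus are hypothetical names, and the value 124 for timeouts is an assumption borrowed from the GNU timeout convention, not something cuabot defines.

```typescript
// Minimal view of the structured result needed to pick an exit code.
interface ShellStatus {
  exit_code: number | null;
  success: boolean;
  timed_out?: boolean;
}

// Assumption: reuse GNU timeout's convention of exiting 124 on timeout.
const TIMEOUT_EXIT_CODE = 124;

// Hypothetical helper: derive the CLI's own exit code from the result.
function localExitCode(result: ShellStatus): number {
  // Timeouts always map to a dedicated non-zero code.
  if (result.timed_out) return TIMEOUT_EXIT_CODE;
  // Propagate a non-zero remote exit code, clamped to the 0-255 range
  // that process exit statuses can carry; never let it collapse to 0.
  if (result.exit_code !== null && result.exit_code !== 0) {
    return (result.exit_code & 0xff) || 1;
  }
  // No usable exit code: fall back to the success flag.
  return result.success ? 0 : 1;
}
```

In the CLI this value would be assigned to process.exitCode after printing stdout and stderr, together with an explicit timeout message on stderr when timed_out is set.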

Notes

The Python cua-sandbox layer already has failure/timeout handling tests and result-shape protections, so the problem appears to be specific to the cuabot shell/GUI execution contract rather than the SDK shell API as a whole.
