Commit d3471a8

Merge pull request #226 from haorui-harry/main
compiles valuable GUI trajectories into parameterized, self-verifying CLI macros for agents
2 parents e1e48ba + d6002a1 commit d3471a8

30 files changed

Lines changed: 4887 additions & 0 deletions
Lines changed: 185 additions & 0 deletions
@@ -0,0 +1,185 @@
# OpenClaw Macro System: Agent Harness SOP

## What Is This?

**OpenClaw Macro System** is a layered CLI that turns valuable GUI workflows into
parameterized, agent-callable macros. The agent sends one command:

```bash
cli-anything-openclaw macro run export_png --param output=/tmp/out.png --json
```

The system handles everything else: parameter validation, precondition checks,
backend selection, step execution, postcondition verification, and structured
result output. The agent never touches the GUI directly.

## Architecture

```
Agent
  └─▶ cli-anything-openclaw macro run <name> --param k=v --json   (L6: CLI)

      MacroRuntime   (L5)
        │  1. Validate params against MacroDefinition schema
        │  2. Check preconditions (file_exists, process_running, …)
        │  3. For each step:
        │       RoutingEngine → select backend by priority       (L3)
        │       Backend.execute(step, resolved_params, context)  (L2)
        │  4. Check postconditions
        │  5. Collect declared outputs
        │  6. Record telemetry in ExecutionSession
        └─▶ { success, output, error, telemetry }
```

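
The lifecycle in the diagram above can be sketched in plain Python. This is a minimal illustration, not the harness's `MacroRuntime`: the `run_macro` function, the dict-shaped macro, and the callable steps and conditions are all hypothetical, and backend routing is left out.

```python
import time
from typing import Any


def run_macro(macro: dict[str, Any], params: dict[str, Any]) -> dict[str, Any]:
    """Run the lifecycle phases and return a structured result (hypothetical shape)."""
    started = time.monotonic()
    result: dict[str, Any] = {"success": False, "output": {}, "error": "", "telemetry": {}}

    def finish(error: str = "") -> dict[str, Any]:
        # An empty error string means the run succeeded.
        result["success"] = not error
        result["error"] = error
        result["telemetry"]["duration_ms"] = (time.monotonic() - started) * 1000
        return result

    # 1. Validate params: every required name must be supplied.
    missing = [name for name in macro.get("required", []) if name not in params]
    if missing:
        return finish(f"missing params: {missing}")

    # 2. Check preconditions before executing anything.
    if not all(check(params) for check in macro.get("preconditions", [])):
        return finish("precondition failed")

    # 3. Execute each step; here a step is just a callable returning a dict.
    for step in macro.get("steps", []):
        try:
            result["output"].update(step(params))
        except Exception as exc:  # in this sketch any step error aborts the run
            return finish(f"step failed: {exc}")

    # 4. Verify postconditions.
    if not all(check(params) for check in macro.get("postconditions", [])):
        return finish("postcondition failed")

    # 5-6. Outputs were collected above; record telemetry and return.
    result["telemetry"]["steps_run"] = len(macro.get("steps", []))
    return finish()
```

Returning one result dict from every exit path keeps the caller's handling uniform: the agent inspects `success` and `error` rather than catching exceptions.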
## Layer Mapping

| Layer | Name | Implementation |
|-------|------|----------------|
| L7 | Agent Task Interface | Caller (OpenClaw or any agent) |
| L6 | Unified CLI Entry | `openclaw_cli.py` (Click CLI) |
| L5 | Macro Execution Runtime | `core/runtime.py` |
| L4 | Parameterized Macro Model | `core/macro_model.py` + `macro_definitions/*.yaml` |
| L3 | Backend Routing Engine | `core/routing.py` |
| L2 | Execution Backends | `backends/` (5 backends) |
| L1 | Target Application | Any GUI-first or closed-source app |

## Execution Backends

| Backend | Priority | Trigger | Use case |
|---------|----------|---------|----------|
| `native_api` | 100 | `backend: native_api` | subprocess / shell commands |
| `gui_macro` | 80 | `backend: gui_macro` | precompiled coordinate replay (pyautogui) |
| `file_transform` | 70 | `backend: file_transform` | XML, JSON, text file editing |
| `semantic_ui` | 50 | `backend: semantic_ui` | accessibility API + keyboard (xdotool) |
| `recovery` | 10 | `backend: recovery` | retry + fallback orchestration |

The RoutingEngine respects the step's explicit `backend:` field; if that backend
is unavailable, it walks down the priority list (highest priority first) until it
finds an available one.

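
The routing rule can be sketched as follows. `BackendInfo` and `select_backend` are hypothetical names for illustration; the priorities come from the table above. The sketch assumes the fallback starts from the top of the priority list, which this document does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BackendInfo:
    """Hypothetical stand-in for a registered backend."""
    name: str
    priority: int
    available: bool = True


def select_backend(backends: list[BackendInfo], requested: Optional[str]) -> BackendInfo:
    """Prefer the step's explicit backend; otherwise fall back by priority."""
    by_name = {b.name: b for b in backends}
    explicit = by_name.get(requested) if requested else None
    if explicit is not None and explicit.available:
        return explicit
    # Fallback: the highest-priority backend that is currently available.
    for candidate in sorted(backends, key=lambda b: b.priority, reverse=True):
        if candidate.available:
            return candidate
    raise RuntimeError("no execution backend is available")


BACKENDS = [
    BackendInfo("native_api", 100),
    BackendInfo("gui_macro", 80, available=False),  # e.g. pyautogui not installed
    BackendInfo("file_transform", 70),
    BackendInfo("semantic_ui", 50),
    BackendInfo("recovery", 10),
]
```

With this list, a step that requests `semantic_ui` gets it directly, while a step that requests the unavailable `gui_macro` is routed to `native_api`.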
## Macro Definition Format

Macros live in `cli_anything/openclaw/macro_definitions/` as YAML files:

```yaml
name: export_png
version: "1.0"
description: Export the active diagram to PNG.

parameters:
  output:
    type: string
    required: true
    example: /tmp/diagram.png

preconditions:
  - process_running: draw.io
  - file_exists: /path/to/input.drawio

steps:
  - id: export
    backend: native_api
    action: run_command
    params:
      command: [draw.io, --export, --output, "${output}", input.drawio]
    timeout_ms: 30000
    on_failure: fail    # or: skip | continue

postconditions:
  - file_exists: ${output}
  - file_size_gt:
      - ${output}
      - 100

outputs:
  - name: exported_file
    path: ${output}

agent_hints:
  danger_level: safe
  side_effects: [creates_file]
  reversible: true
```

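
The `${output}` placeholders in the definition above have to be substituted before a step runs. One way to do that, shown here as a sketch rather than the harness's actual resolver, is the standard library's `string.Template`, which uses the same `${name}` syntax. The `resolve` helper is a hypothetical name.

```python
from string import Template
from typing import Any


def resolve(value: Any, params: dict[str, Any]) -> Any:
    """Recursively replace ${name} placeholders in strings, lists, and dicts."""
    if isinstance(value, str):
        # substitute() raises KeyError for an unknown placeholder, which
        # surfaces typos instead of passing "${typo}" through to a command.
        return Template(value).substitute({k: str(v) for k, v in params.items()})
    if isinstance(value, list):
        return [resolve(item, params) for item in value]
    if isinstance(value, dict):
        return {key: resolve(item, params) for key, item in value.items()}
    return value  # numbers, booleans, and None pass through unchanged


step_params = {"command": ["draw.io", "--export", "--output", "${output}", "input.drawio"]}
resolved = resolve(step_params, {"output": "/tmp/out.png"})
```

One caveat of this approach: `Template` treats every `$` as special, so a literal dollar sign in a macro value would need to be written as `$$`.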
### Supported Condition Types

| Type | Args | Checks |
|------|------|--------|
| `file_exists` | path | `os.path.exists(path)` |
| `file_size_gt` | [path, min_bytes] | `os.stat(path).st_size > min_bytes` |
| `process_running` | name | `pgrep -x name` or psutil |
| `env_var` | name | `name in os.environ` |
| `always` | true/false | constant pass/fail |

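
The table above maps naturally onto a small dispatcher. The sketch below is illustrative only: `check_condition` is a hypothetical name, the psutil path is omitted, and treating a missing `pgrep` as "not running" is an assumption.

```python
import os
import shutil
import subprocess
from typing import Any


def check_condition(kind: str, arg: Any) -> bool:
    """Evaluate one condition written as {kind: arg} in a macro definition."""
    if kind == "file_exists":
        return os.path.exists(arg)
    if kind == "file_size_gt":
        path, min_bytes = arg
        return os.path.exists(path) and os.stat(path).st_size > int(min_bytes)
    if kind == "process_running":
        if shutil.which("pgrep") is None:
            return False  # assumption: "cannot check" counts as not running
        # pgrep -x exits with status 0 when a process name matches exactly.
        proc = subprocess.run(["pgrep", "-x", str(arg)], capture_output=True)
        return proc.returncode == 0
    if kind == "env_var":
        return arg in os.environ
    if kind == "always":
        return bool(arg)
    raise ValueError(f"unknown condition type: {kind}")
```

Raising on an unknown type, rather than returning `False`, separates a misspelled condition from one that was evaluated and failed.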
## Package Layout

```
openclaw-skill/
└── agent-harness/
    ├── setup.py                     entry_point: cli-anything-openclaw
    └── cli_anything/openclaw/
        ├── openclaw_cli.py          Main Click CLI
        ├── macro_definitions/       YAML macro registry
        │   ├── manifest.yaml
        │   └── examples/
        │       ├── export_file.yaml
        │       ├── transform_json.yaml
        │       └── undo_last.yaml
        ├── core/
        │   ├── macro_model.py       MacroDefinition + YAML loader
        │   ├── registry.py          MacroRegistry
        │   ├── routing.py           RoutingEngine
        │   ├── runtime.py           MacroRuntime (full lifecycle)
        │   └── session.py           ExecutionSession + telemetry
        ├── backends/
        │   ├── base.py              Backend ABC + StepResult
        │   ├── native_api.py        subprocess backend
        │   ├── file_transform.py    XML/JSON/text backend
        │   ├── semantic_ui.py       accessibility backend
        │   ├── gui_macro.py         compiled replay backend
        │   └── recovery.py          retry/fallback backend
        ├── skills/SKILL.md          Agent-readable skill definition
        ├── utils/repl_skin.py       Unified REPL skin (cli-anything standard)
        └── tests/
            ├── test_core.py         Unit tests (49 tests, no external deps)
            └── test_full_e2e.py     E2E + CLI subprocess tests (15 tests)
```

## Installation

```bash
cd openclaw-skill/agent-harness
pip install -e .
```

**Runtime dependencies:** Python 3.10+, PyYAML, click, prompt-toolkit.

**Optional (for specific backends):**
- `xdotool`: semantic_ui backend on Linux
- `pyautogui`: gui_macro backend
- `psutil`: richer process_running checks

## Running Tests

```bash
cd openclaw-skill/agent-harness
python3 -m pytest cli_anything/openclaw/tests/ -v -s
# 64 passed
```

## Key Design Decisions

**Why YAML macros, not Python?** YAML macros are readable by agents without
running code, inspectable via `macro info`, and editable without touching the
harness source.

**Why 5 backends?** Real GUI applications expose many different control
surfaces. The routing engine picks the most reliable one available; the agent
doesn't need to know which one ran.

**Why preconditions and postconditions?** Agents operate in environments where
state is uncertain. Failing loudly before execution (preconditions) and
verifying after (postconditions) catches problems the agent can act on.

**Why `on_failure: skip | continue`?** Some macro steps are best-effort (e.g.,
confirming a dialog that may or may not appear). Skipping lets the macro
continue to the real work.
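
A best-effort step policy could look like the sketch below. The names `StepSpec` and `run_steps` are hypothetical. This document does not define how `skip` and `continue` differ, so the sketch treats them identically; only `fail` aborts the run.

```python
from typing import Callable

# Hypothetical step shape: (step id, on_failure policy, action returning success).
StepSpec = tuple[str, str, Callable[[], bool]]


def run_steps(steps: list[StepSpec]) -> dict:
    """Run steps in order, honoring each step's on_failure policy."""
    completed: list[str] = []
    tolerated: list[str] = []
    for step_id, on_failure, action in steps:
        if action():
            completed.append(step_id)
        elif on_failure in ("skip", "continue"):
            # Best-effort step: record the failure and move on.
            tolerated.append(step_id)
        else:
            # on_failure: fail stops the macro at the first failing step.
            return {"success": False, "completed": completed,
                    "tolerated": tolerated, "failed": step_id}
    return {"success": True, "completed": completed,
            "tolerated": tolerated, "failed": None}
```

For example, a `dismiss_dialog` step marked `skip` can fail without preventing a later `export` step from running.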
Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
# OpenClaw Macro System

**OpenClaw Macro System** is a layered CLI that converts GUI workflows into
parameterized, agent-callable macros. Agents call `macro run <name>` through
a stable CLI; the system routes execution to the right backend (native API,
file transform, semantic UI, or compiled GUI replay), and the choice of backend
is invisible to the agent.

## Installation

```bash
pip install -e .
```

**Dependencies:** Python 3.10+, PyYAML, click, prompt-toolkit

## Usage

```bash
# List available macros
cli-anything-openclaw macro list --json

# Inspect a macro
cli-anything-openclaw macro info export_file --json

# Execute a macro
cli-anything-openclaw macro run transform_json \
    --param file=/tmp/config.json \
    --param key=theme --param value=dark --json

# Dry run
cli-anything-openclaw --dry-run macro run export_file \
    --param output=/tmp/out.txt --json

# Interactive REPL
cli-anything-openclaw
```

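
An agent written in Python might wrap these commands as sketched below. `build_macro_command` and `call_macro` are hypothetical helpers, not part of the package, and the sketch assumes that `--json` makes the CLI print a single JSON object on stdout.

```python
import json
import subprocess


def build_macro_command(name: str, params: dict[str, str], dry_run: bool = False) -> list[str]:
    """Build the argv for `cli-anything-openclaw macro run <name> ... --json`."""
    argv = ["cli-anything-openclaw"]
    if dry_run:
        argv.append("--dry-run")  # global flag, placed before the subcommand
    argv += ["macro", "run", name]
    for key, value in params.items():
        argv += ["--param", f"{key}={value}"]
    argv.append("--json")
    return argv


def call_macro(name: str, params: dict[str, str], dry_run: bool = False) -> dict:
    """Invoke the CLI and parse its result (requires the package to be installed)."""
    proc = subprocess.run(build_macro_command(name, params, dry_run),
                          capture_output=True, text=True)
    # Assumption: stdout holds exactly one JSON document.
    return json.loads(proc.stdout)
```

Passing the arguments as a list, rather than a shell string, avoids quoting problems when a parameter value contains spaces.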
## Run Tests

```bash
cd openclaw-skill/agent-harness
pip install -e ".[dev]"
python -m pytest cli_anything/openclaw/tests/ -v -s
```

## Architecture

```
cli-anything-openclaw (CLI)
  └─▶ macro run <name> --param key=value

      MacroRuntime
        │  validate params
        │  check preconditions
        │  for each step:
        │    RoutingEngine → select backend
        │    Backend.execute(step, params, context)
        │  check postconditions
        └─▶ ExecutionResult { success, output, telemetry }
```

**Backends:**
- `native_api`: subprocess / shell commands
- `file_transform`: XML, JSON, text file editing
- `semantic_ui`: accessibility controls + keyboard shortcuts
- `gui_macro`: precompiled coordinate-based replay
- `recovery`: retry + fallback orchestration

## Adding a Macro

1. Create `cli_anything/openclaw/macro_definitions/my_macro.yaml`
2. Add it to `macro_definitions/manifest.yaml`
3. Verify: `cli-anything-openclaw macro validate my_macro --json`

See `skills/SKILL.md` (installed with the package) for the full macro YAML schema.
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
# cli_anything/openclaw package
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
"""Enable: python3 -m cli_anything.openclaw"""
from cli_anything.openclaw.openclaw_cli import cli

if __name__ == "__main__":
    cli()

openclaw-skill/agent-harness/cli_anything/openclaw/backends/__init__.py

Whitespace-only changes.
Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
"""Backend base classes and result types.

All execution backends inherit from Backend and return StepResult.
"""

from __future__ import annotations

import time
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StepResult:
    """Result of a single macro step execution."""
    success: bool
    output: dict = field(default_factory=dict)
    error: str = ""
    duration_ms: float = 0.0
    backend_used: str = ""

    def to_dict(self) -> dict:
        """Return a plain-dict form suitable for JSON output."""
        return {
            "success": self.success,
            "output": self.output,
            "error": self.error,
            "duration_ms": self.duration_ms,
            "backend_used": self.backend_used,
        }


class BackendContext:
    """Runtime context passed to each backend during step execution."""

    def __init__(
        self,
        params: dict,
        previous_results: Optional[list[StepResult]] = None,
        dry_run: bool = False,
        timeout_ms: int = 30_000,
    ):
        self.params = params
        self.previous_results: list[StepResult] = previous_results or []
        self.dry_run = dry_run
        self.timeout_ms = timeout_ms
        # Monotonic clock: elapsed time is unaffected by system clock changes.
        self._start = time.monotonic()

    def elapsed_ms(self) -> float:
        """Milliseconds elapsed since this context was created."""
        return (time.monotonic() - self._start) * 1000


class Backend(ABC):
    """Abstract base class for all execution backends.

    Concrete backends implement execute() and return a StepResult.
    """

    name: str = "base"
    priority: int = 0

    @abstractmethod
    def execute(
        self,
        step: "MacroStep",  # type: ignore[name-defined]
        params: dict,
        context: BackendContext,
    ) -> StepResult:
        """Execute a macro step.

        Args:
            step: The MacroStep definition being executed.
            params: Fully resolved (substituted) parameters.
            context: Runtime context with previous results and flags.

        Returns:
            StepResult describing success/failure and captured output.
        """

    def is_available(self) -> bool:
        """Return True if this backend can be used in the current environment."""
        return True

    def describe(self) -> dict:
        """Return name, priority, and availability for listings and routing."""
        return {
            "name": self.name,
            "priority": self.priority,
            "available": self.is_available(),
        }
