A self-evolving coding agent in Rust.
Chat with it, let it reflect, and let it improve itself.
AutoHarness is a compact Rust agent that runs as an interactive REPL. It logs everything to `.evo/`, verifies self-edits with `cargo build --release`, and uses the LLM as the judge — no numeric reward model.
Type `/evolve` inside the running agent to trigger a reflection + self-improvement loop. When evolution finishes, the process re-execs itself with the updated binary automatically.
```sh
# Build
cargo build --release

# Run
./target/release/auto-harness

# Inside the REPL:
#   /evolve — reflect on past sessions and rewrite the agent, then relaunch
#   /exit   — clean shutdown
```

Use any OpenAI-compatible backend:
```sh
# Local model (Ollama)
export OPENROUTER_API_KEY=unused
export INFERENCE_BASE_URL=http://localhost:11434/v1
export MODEL_NAME=llama3

# OpenRouter
export OPENROUTER_API_KEY=<your-key>
```

```mermaid
flowchart TD
    A[auto-harness] --> B[interactive REPL]
    B --> C{input}
    C -->|/exit| Z[clean shutdown]
    C -->|user message| E[LLM: chat + tools]
    E --> C
    C -->|/evolve| D[reflect → evolve → refine → lint/test → doc update]
    D --> R[exec evolved binary]
```
- Async stdin queue (`VecDeque` fed by a background thread)
- LLM decides if each message starts a new task or continues the current one
- Task artifacts go to `outputs/<ts>/task_N/`
- All events logged to `.evo/sessions/<ts>/traj.jsonl`
- Slash commands: `/exit` (quit), `/evolve` (evolve + relaunch)
- Reflect: analyze unprocessed trajectories (progressive disclosure — stripped summary first; LLM reads more via `read_file path start..end`) → one concrete improvement suggestion
- Evolve: unbounded iterations; LLM sees full prompt files, `AGENTS.md`, `memory/` index (filepath + description), and `main.rs`; proposes one change per iteration; stops on `SKIP`
- Refine: clippy + test output fed to the LLM for `write_file` fixes
- Final lint/test: `cargo clippy --no-deps -- -D warnings` + `cargo test --release` — the authoritative gate
- Doc update: rewrite `CLAUDE.md` and `README.md` (reflects the verified, working state)
- Relaunch: `exec()` replaces the process with the freshly built binary
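The relaunch step can be sketched with `CommandExt::exec` on Unix. This is a sketch under two assumptions not stated in the source: the function name `relaunch` is hypothetical, and the binary path is the usual cargo output location.

```rust
use std::os::unix::process::CommandExt;
use std::process::Command;

// Hypothetical sketch: exec() replaces the current process image with the
// freshly built binary, so no stale parent process lingers. exec() only
// returns on failure, yielding the error.
fn relaunch(binary: &str) -> std::io::Error {
    Command::new(binary).exec()
}

fn main() {
    // e.g. relaunch("./target/release/auto-harness");
    let err = relaunch("/nonexistent/binary");
    eprintln!("relaunch failed: {err}");
}
```

Because `exec` never returns on success, any code after a successful call is unreachable — the evolved binary simply takes over the process.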
| Artifact | Tool | Notes |
|---|---|---|
| `src/main.rs` | `write_file` | Atomic: backup → write → build-verify → restore on fail |
| Any `src/**` non-`.bak` | `write_file` | All `.rs`, `.md`, `.txt` under `src/` |
| `CLAUDE.md` / `README.md` | `write_file` | Doc update step |
Both `read_file` and `write_file` accept an optional `start..end` char-offset range for partial reads/patches.
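A char-offset patch of that shape might look like the helper below. The helper is hypothetical, and the real tools' edge-case behavior (e.g. out-of-range offsets) may differ.

```rust
// Hypothetical sketch: replace the chars in `start..end` (char offsets,
// not byte offsets) with `replacement`, mirroring the tools' range
// parameter. Operating on chars keeps multi-byte UTF-8 text safe.
fn patch_range(text: &str, start: usize, end: usize, replacement: &str) -> String {
    let chars: Vec<char> = text.chars().collect();
    let mut out: String = chars[..start].iter().collect();
    out.push_str(replacement);
    out.extend(chars[end..].iter());
    out
}
```

Char offsets (rather than byte offsets) matter as soon as a file contains any non-ASCII text, since slicing a `&str` at a non-boundary byte would panic.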
Evolution file rules (enforced at runtime):
- `write_file` allowed for any `src/**` path (non-`.bak`), `CLAUDE.md`, `README.md`
- `src/main.rs` writes trigger `cargo build --release`; a failed build auto-reverts
- `delete_file` restricted to `src/`; `src/main.rs` and `src/AGENTS.md` are protected
- All modified files auto-backed-up as `<stem>.<ts>.<ext>.bak`
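The backup-then-verify rule above might be implemented along these lines. This is a sketch, not the project's code: `atomic_write` is a hypothetical name, and the `verify` parameter stands in for `cargo build --release`.

```rust
use std::fs;
use std::path::Path;
use std::process::Command;

// Hypothetical sketch of the atomic self-edit: back up the current
// contents, write the new ones, run the verify command, and restore the
// backup when verification fails. Returns Ok(true) iff the edit survived.
fn atomic_write(path: &Path, new_contents: &str, verify: &mut Command) -> std::io::Result<bool> {
    let backup = fs::read_to_string(path)?;
    fs::write(path, new_contents)?;
    let ok = verify.status()?.success();
    if !ok {
        fs::write(path, backup)?; // verification failed: revert
    }
    Ok(ok)
}
```

Keeping the old contents in memory (and on disk as a `.bak`) means a broken self-edit can never leave the agent with an unbuildable `main.rs`.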
```
.
├── Cargo.toml
├── README.md
├── CLAUDE.md
├── src/
│   ├── main.rs
│   ├── AGENTS.md
│   ├── memory/          ← reference notes, evolved freely
│   └── prompts/
│       ├── chat_system.txt
│       ├── reflect_system.txt
│       ├── evolve_system.txt
│       └── doc_system.txt
├── .evo/
│   ├── sessions/<ts>/traj.jsonl
│   └── learned_until.txt
└── outputs/<ts>/task_N
```
| Variable | Default | Description |
|---|---|---|
| `OPENROUTER_API_KEY` | required | API key |
| `INFERENCE_BASE_URL` | `https://openrouter.ai/api/v1` | OpenAI-compatible API endpoint |
| `MODEL_NAME` | `anthropic/claude-opus-4` | Model identifier |
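Reading the table's variables with their defaults could look like this (a hypothetical helper, not necessarily how the agent does it):

```rust
use std::env;

// Hypothetical sketch: fall back to the documented default when a
// variable is unset.
fn env_or(key: &str, default: &str) -> String {
    env::var(key).unwrap_or_else(|_| default.to_string())
}

fn main() {
    let base_url = env_or("INFERENCE_BASE_URL", "https://openrouter.ai/api/v1");
    let model = env_or("MODEL_NAME", "anthropic/claude-opus-4");
    println!("endpoint={base_url} model={model}");
}
```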
Core constants in `src/main.rs`:

- `SELF_PATH = "src/main.rs"` — the file the agent reads and rewrites
- `WATERMARK_PATH = ".evo/learned_until.txt"` — tracks the last reflected session
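The watermark might be read and advanced like this. Both function names are assumptions made for illustration; only the file path comes from the source.

```rust
use std::fs;
use std::path::Path;

// Hypothetical sketch: the reflect step skips sessions at or before the
// stored timestamp and bumps the watermark once a session is processed.
fn learned_until(path: &Path) -> Option<String> {
    fs::read_to_string(path).ok().map(|s| s.trim().to_string())
}

fn mark_learned(path: &Path, session_ts: &str) -> std::io::Result<()> {
    fs::write(path, session_ts)
}
```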
The evolution loop is unbounded — it runs until the LLM replies `SKIP`.
```bibtex
@software{autoharness2026,
  title  = {AutoHarness: A Self-Evolving Coding Agent in Rust},
  author = {Zhao, Zhimin},
  year   = {2026},
  url    = {https://github.com/Engineering4AI/AutoHarness}
}
```