This document explains why multi-agent architecture is needed and how to design it.
Run Claude Code and enter this prompt:
Clone https://github.com/ailnk0/claude-multiagent-template.git and cd into it.
Then read docs/example-task.md and execute each Task using the Task tool (subagent).
After each Task completes, update the Progress Tracker and record the deliverables.
What the example does:
Task 0: Market Research (WebSearch to select 3 competitors)
│
├── Task 1a: Competitor A Research ─┐
├── Task 1b: Competitor B Research ─┼── (parallel)
└── Task 1c: Competitor C Research ─┘
│
▼
Task 2: Comparative Analysis (feature matrix, insights)
│
▼
Task 3: HTML Report (open in browser)
Final output: output/report.html - professional competitor analysis report
Copy docs/task-template.md and work with Claude to customize it for your needs.
This step requires significant effort. Work iteratively with Claude to define:
- Clear goals for each task
- Prerequisites and dependencies
- Success criteria (must be self-verifiable by the agent)
- File-based knowledge transfer between tasks
The quality of your task definitions determines the success of multi-agent execution. Read the guide below to understand the principles.
That's it. For details on why this works and how to customize, read on.
Common issues when running complex tasks with Claude Code:
Attempt approach A
↓ fails
Switch to approach B
↓ fails
Return to approach A ← failure loop
↓ fails
...repeat
Cause 1: Context is finite
- Claude Code's context window is approximately 200K tokens
- Large codebase + long conversation = quickly hitting limits
Cause 2: Context rot
- As conversation lengthens, earlier decisions get "buried"
- When Auto Compact runs, previous context is summarized and details are lost
- Agent forgets "why we abandoned approach A earlier"
Main Agent (Orchestrator)
│
├── Sub-agent 1: Execute Task 0 → Record results
│
├── Sub-agent 2: Execute Task 1 (reads Task 0 results) → Record results
│
└── Sub-agent 3: Execute Task 2 (reads Task 1 results) → Record results
Each sub-agent:
- Starts with fresh context
- Reads only necessary information
- Explicitly records results before terminating
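A minimal Python sketch of this contract (the task shape and the `execute` callback are hypothetical; in practice each task runs as a Task-tool sub-agent, not a function):

```python
import json
from pathlib import Path

# Hypothetical sketch: a task declares its input files and its output path,
# runs with only that information, and records its result before exiting.
def run_task(task, execute):
    # Fresh context: the task sees only its declared input files.
    missing = [p for p in task["inputs"] if not Path(p).exists()]
    if missing:
        raise RuntimeError(f"{task['id']}: prerequisites missing: {missing}")
    inputs = {p: Path(p).read_text() for p in task["inputs"]}

    # Execute (here a callback; in practice a sub-agent).
    result = execute(task["id"], inputs)

    # Explicitly record results to disk so the next agent can read them.
    out = Path(task["output"])
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(result))
    return result
```

The point of the sketch is the shape of the handoff: prerequisites are checked up front, and nothing survives the task except what was written to the Output path.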
Bad example:
Task 1: Refactor entire system
Good example:
Task 1: Extract UserService class
Task 2: Define UserRepository interface
Task 3: Write UserService tests
Criteria:
- Can it be completed in one session without Auto Compact?
- Is there a single goal?
- Can success/failure be clearly determined?
Agents don't share conversation history. Knowledge transfer must happen through files.
### Output
Files for the next Task:
- `docs/research/api-spec.json`
- `docs/research/decisions.md`

### Prerequisites
- Task 0 completed
- `docs/research/api-spec.json` file exists

Bad example:
- [ ] Code is clean
- [ ] Performance is good
Good example:
- [ ] `./build.sh` succeeds (exit code 0)
- [ ] `./test.sh` all tests pass
- [ ] `./benchmark.sh` result < 100ms
The agent must be able to check the boxes itself.
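Because good criteria are executable commands, the check itself can be mechanical. A sketch, with the actual criteria list left to the task definition:

```python
import subprocess

# Sketch of self-verifiable Success Criteria: each criterion is a command
# whose exit code decides pass/fail, so the agent can check the boxes itself.
def verify(criteria):
    results = {}
    for cmd in criteria:
        proc = subprocess.run(cmd, capture_output=True)
        results[" ".join(cmd)] = (proc.returncode == 0)
    return results

# e.g. verify([["./build.sh"], ["./test.sh"]])
```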
### Dependencies
- Task 0 (needs api-spec.json)
- Task 1 (needs MyService.java)

This enables the main agent to:
- Determine execution order
- Identify parallel execution opportunities
- Decide where to restart on failure
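Deriving execution order from declared Dependencies is a topological sort; tasks that land in the same "wave" can run in parallel, and a cycle means the task definitions are broken. A sketch with hypothetical task IDs:

```python
# Hypothetical sketch: map each task to its Dependencies, then peel off
# "waves" of tasks whose dependencies are all satisfied.
DEPS = {
    "task-0": [],
    "task-1a": ["task-0"],
    "task-1b": ["task-0"],
    "task-1c": ["task-0"],
    "task-2": ["task-1a", "task-1b", "task-1c"],
}

def execution_waves(deps):
    remaining = {t: set(d) for t, d in deps.items()}
    waves = []
    while remaining:
        # Tasks with no unmet dependencies can all run in parallel.
        ready = sorted(t for t, d in remaining.items() if not d)
        if not ready:
            raise ValueError("circular dependency detected")
        waves.append(ready)
        for t in ready:
            del remaining[t]
        for d in remaining.values():
            d.difference_update(ready)
    return waves
```

For the `DEPS` above this yields three waves: the research task, the three parallel implementations, then integration.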
The most basic structure:
Task 0: Research
│ - Collect API specs
│ - Analyze existing code
│ - Document constraints
▼
Task 1: Implementation
│ - Implement core functionality
│ - Reference Task 0 results
▼
Task 2: Verification
- Run tests
- Document results
Develop independent modules simultaneously:
Task 0: Research
│
├── Task 1a: Implement Module A ──┐
│ │
├── Task 1b: Implement Module B ──┼── Task 2: Integration
│ │
└── Task 1c: Implement Module C ──┘
│
▼
Task 3: Verification
When optimization or tuning is needed:
Task 0: Measure baseline
│
▼
Task 1: Experiment 1 → Record results
│
▼
Task 2: Experiment 2 (reference Task 1 results) → Record results
│
▼
Task 3: Experiment 3 (reference Task 1,2 results) → Record results
│
▼
Task 4: Select and apply optimal configuration
Key: Each experiment result is recorded to file, so the next agent knows "what was already tried."
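A sketch of this record-then-read loop, assuming attempt notes are stored as `docs/experiments/attempt-N.md` files:

```python
from pathlib import Path

# Hypothetical sketch: before starting a new experiment, the agent scans
# the experiments folder so it knows what was already tried and why it failed.
def prior_attempts(exp_dir="docs/experiments"):
    attempts = []
    for path in sorted(Path(exp_dir).glob("attempt-*.md")):
        attempts.append({"file": path.name, "notes": path.read_text()})
    return attempts

# After each experiment, the agent records its result for the next one.
def record_attempt(n, notes, exp_dir="docs/experiments"):
    d = Path(exp_dir)
    d.mkdir(parents=True, exist_ok=True)
    (d / f"attempt-{n}.md").write_text(notes)
```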
Some decisions cannot be made before execution.
Need to call API in Task 0 to know response structure
→ Task 1's parsing logic depends on Task 0 results
## Decision Points (Runtime-Dependent)
### After Task 0 (based on API response)
| Decision | How to Verify | Impact |
|----------|---------------|--------|
| Response structure | Check Task 0 result JSON | Task 1 parsing logic |
| Pagination method | Check API response headers | Task 1 iteration logic |

After Task 0 completes, humans review the Decision Points and update Task 1's specs.
Symptom: A → B → A → B failure loop
Cause:
- Task is too large (Auto Compact occurs)
- Previous attempt results not recorded
Solution:
- Split task into smaller pieces
- Record each attempt result to file
### Output
- `docs/experiments/attempt-1.md` - Failure reason: X
- `docs/experiments/attempt-2.md` - Failure reason: Y

Symptom: Task 1 doesn't read Task 0 results
Cause:
- Prerequisites not clear
- Missing "Context for Agent" section
Solution:
### Context for Agent
Instructions for the agent performing this task:
1. First read `docs/research/api-spec.json`
2. The `endpoints` section in this file is the implementation target
3. Reference existing code patterns in `src/existing/`

Symptom: Task 1a and Task 1b modify the same file
Cause:
- Output scope overlaps between parallel tasks
Solution:
- Clearly separate Output paths for each task
- Common files modified only in Integration Task
Task 1a Output: `src/moduleA/` (modify only this folder)
Task 1b Output: `src/moduleB/` (modify only this folder)
Task 2 (Integration): `src/main.java` (integration)

Research → Implementation → Verification
Research → Design → Implementation → Verification
- Architecture decisions in Design Task
- Implementation follows Design document
Research → Design → Implementation → Integration → Verification
- When implementing multiple modules separately
- Integration connects modules
Research → Design ─┬→ Impl A ─┐
├→ Impl B ─┼→ Integration → Verification
└→ Impl C ─┘
When creating a new Task, verify:
- Completable in one session?
- Single goal?
- Can finish before Auto Compact?
- All Prerequisites listed?
- Required file paths specified?
- Context for Agent section exists?
- Output file paths specified?
- Clear how next Task uses this Output?
- Success Criteria specific?
- Agent can check boxes itself?
- Test Method is executable command?
- All Dependencies listed?
- No circular dependencies?
- Parallel execution possibility clear?
Using Claude Code's Task tool runs each Task in a separate context. This is the most effective way to prevent context rot.
Read task.md and execute each Task as a separate Task.
After each Task completes, update the Progress Tracker,
and get my confirmation before starting the next Task.
When to use:
- 4 or more Tasks
- Each Task is complex (code generation, analysis, etc.)
- Failure loops are occurring
Open a new session or use /clear for each Task.
This is the most reliable way to separate context.
Session 1:
Execute only Task 0 from task.md.
When complete, update Progress Tracker,
and tell me the Output file paths.
Session 2 (new window or after /clear):
Execute Task 1 from task.md.
Task 0 results are in docs/research/.
Check Prerequisites and begin.
When to use:
- Human review required between Tasks
- Want to carefully check each Task result
- First time trying multi-agent
Simple to use for small projects. However, due to context rot risk, it is recommended only for 3 or fewer Tasks.
Read task.md and execute sequentially from Task 0.
Get my confirmation after each Task before proceeding.
Update Progress Tracker on each Task completion.
When to use:
- 3 or fewer Tasks
- Each Task is simple (file copy, simple edits, etc.)
- Rapid prototyping
Used to run independent Tasks at the same time: launch Claude Code in multiple terminals/windows.
Window 1:
Execute only Task 1a from task.md.
Write Output only to src/moduleA/.
Do not modify other folders.
Window 2 (simultaneously):
Execute only Task 1b from task.md.
Write Output only to src/moduleB/.
Do not modify other folders.
When to use:
- There are Tasks with no Dependencies on each other
- Want to reduce time
- Output paths for each Task are clearly separated
- Read Prerequisites files
- Verify Success Criteria
- Update Progress Tracker
- Record deliverables to Output path
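The Progress Tracker update can be as simple as flipping a markdown checkbox. A sketch, assuming the tracker is a `- [ ] Task N` checklist:

```python
from pathlib import Path

# Hypothetical sketch of a Progress Tracker: a markdown checklist that each
# sub-agent updates when its task completes.
def mark_done(tracker_path, task_name):
    lines = Path(tracker_path).read_text().splitlines()
    updated = []
    for line in lines:
        # Flip only the exact matching unchecked item.
        if line.strip() == f"- [ ] {task_name}":
            line = line.replace("- [ ]", "- [x]", 1)
        updated.append(line)
    Path(tracker_path).write_text("\n".join(updated) + "\n")
```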
- Review results at Task transitions
- Make directional decisions (when multiple options exist)
- Instruct re-execution when quality is insufficient
- Split Tasks or modify specs on failure
Task 0: Research
│
├── Task 1a: Feature A Implementation ─┐
├── Task 1b: Feature B Implementation ─┼── (parallel)
└── Task 1c: Feature C Implementation ─┘
│
▼
Task 2: Integration & Verification
For small projects without parallel needs:
Task 0: Research
Task 1: Core Implementation
Task 2: Verification
For complex projects:
Task 0: Research
Task 1: Design (NEW)
│
├── Task 2a: Feature A Implementation ─┐
├── Task 2b: Feature B Implementation ─┼── (parallel)
└── Task 2c: Feature C Implementation ─┘
│
▼
Task 3: Integration & Verification
- Separate Output paths: Each Task writes to different folders
- Warning in Context for Agent: "Do NOT modify other folders"
- Integration Task required: Task to merge parallel results is mandatory
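The separation of Output paths can be checked mechanically before launching parallel Tasks. A sketch that treats one path nested inside another as a conflict:

```python
from pathlib import PurePosixPath

# Hypothetical sketch: given each parallel task's declared Output path,
# flag any pair where the paths are equal or one contains the other.
def outputs_conflict(paths):
    norm = [PurePosixPath(p) for p in paths]
    for i, a in enumerate(norm):
        for b in norm[i + 1:]:
            if a == b or a in b.parents or b in a.parents:
                return True
    return False
```

For example, `["src/moduleA/", "src/moduleB/"]` is safe, while `["src/", "src/moduleA/"]` conflicts because the second path lies inside the first.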
If the agent repeats the same mistakes:
- Is scope focused on a single goal?
- Can one agent complete within context?
- Should the Task be split smaller?
- Are Prerequisites clear?
- Is previous Task's Output path recorded?
- Does Context for Agent section have necessary info?
- Can agent verify Success Criteria itself?
- Is Test Method specific?
The key to multi-agent is memory, not intelligence.
- Context is limited, and earlier decisions get buried as it lengthens
- Both the size and structure of memory must be engineered
Split tasks, transfer knowledge via files, and make success criteria clear.
Then Claude Code escapes the failure loop.
v1.0 - 2025-01-06