Coding Agents in Practice (4/5) — Multi-Agent Patterns: Orchestrator and Specialist Separation

May 5, 2026

When a single LLM session does everything, the context breaks. Role separation fixes it.

Key takeaways

Multi-agent is not "let's use more models"; it's "let's separate context"
Primary sources: Anthropic Claude Code subagents docs; OpenClaw / Hermes operational experience
Three core patterns: orchestrator + specialists / pipeline / collaboration
Cost is not a simple multiplication — context savings and reduced retries often make multi-agent cheaper overall
Common pitfalls: context contamination, permission leakage, debugging difficulty, over-engineering

1. Why one agent doing everything fails

When a single LLM session handles design + implementation + testing + review + docs, three failure modes appear:

Context contamination: details from one task pollute another. Debug-time scratch code becomes a PR-review baseline.
Context limit overflow: even a 200K-token window fills quickly under heavy load. Auto-compaction then drops the reasoning behind earlier decisions.
Permission asymmetry: the same session has publish + credential access + code change authority. One mistake has wide blast radius.

Solution: separate agents by task type and have one orchestrator on top.

2. Orchestrator + specialists pattern

The most common topology.

        [Orchestrator = main]
            │
   ┌────────┼────────┬────────┐
   ▼        ▼        ▼        ▼
 [Edit]  [Test]  [Review]  [Document]
 executor  tester  reviewer   writer

Orchestrator's role:
- Receive a request and decide which specialist to call.
- Absorb specialist results in summary form (not full output).
- Decide the next step.

Specialist's role:
- Work deeply in their specialty.
- Return a condensed report, protecting the orchestrator's context.

Core principle: orchestrator sees the whole picture, specialists see depth. Two kinds of context, neither overloaded.
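
The split above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular tool implements it: `Orchestrator`, `summarize`, and `fake_tester` are hypothetical names, and a specialist is modeled as a plain function instead of a real LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for a specialist agent: takes a task, returns its full output.
Specialist = Callable[[str], str]


def summarize(full_output: str, max_chars: int = 80) -> str:
    """Keep only the first line; a real specialist would write its own condensed report."""
    stripped = full_output.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    return first_line[:max_chars]


@dataclass
class Orchestrator:
    specialists: Dict[str, Specialist]
    context: List[str] = field(default_factory=list)  # summaries only, never full output

    def delegate(self, role: str, task: str) -> str:
        full_output = self.specialists[role](task)  # depth stays with the specialist
        summary = summarize(full_output)
        self.context.append(f"{role}: {summary}")   # the orchestrator sees the big picture
        return summary


def fake_tester(task: str) -> str:
    # Simulates a specialist that produces a short verdict plus a long log.
    return "3 passed, 1 failed: test_login\n" + "verbose log line\n" * 500
```

Calling `Orchestrator({"tester": fake_tester}).delegate("tester", "run the suite")` leaves one short line in the orchestrator's context, while the 500 log lines never reach it.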

3. Three topologies

3.1 Supervisor (hub-and-spoke)

One orchestrator + N specialists, as above.
Simple, predictable, easy to debug.
Limit: when specialists need to talk directly, every hop goes through the orchestrator.

3.2 Pipeline (serial)

Specialist A → B → C, output of one becomes input of the next.
Strong fit for clear-stage work — data transform, validation, publish.
Limit: a failure at one stage halts everything.
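
A pipeline of this kind might look like the following sketch. The stage names and the `StageFailed` exception are hypothetical, chosen only to show how the output of one stage feeds the next and how a single failure stops the run.

```python
from typing import Callable, List, Tuple

# Each stage is a (name, function) pair; functions stand in for specialist agents.
Stage = Tuple[str, Callable[[str], str]]


class StageFailed(Exception):
    """Raised when a stage fails; records which stage halted the pipeline."""

    def __init__(self, stage_name: str, cause: Exception):
        super().__init__(f"stage '{stage_name}' failed: {cause}")
        self.stage_name = stage_name


def run_pipeline(stages: List[Stage], payload: str) -> str:
    for name, stage in stages:
        try:
            payload = stage(payload)  # output of one stage becomes input of the next
        except Exception as exc:
            raise StageFailed(name, exc) from exc  # one failure halts everything
    return payload


def validate(text: str) -> str:
    if not text:
        raise ValueError("empty payload")
    return text


STAGES: List[Stage] = [
    ("transform", str.strip),
    ("validate", validate),
    ("publish", lambda text: f"published:{text}"),
]
```

Recording the stage name on the exception keeps the pipeline's main strength, easy tracing, even when it halts.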

3.3 Collaboration (mesh / consensus)

Multiple specialists see the same input in parallel and produce answers; results are merged or voted.
Good for code review where different lenses (security / performance / readability) matter.
Most expensive: same tokens processed N times.
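
As a rough sketch of the voting variant, assume each reviewer is a function that returns "approve" or "reject" for the same diff. The three lenses and their toy rules below are illustrative assumptions, not real review logic.

```python
from collections import Counter
from typing import Callable, Dict, Tuple

# Hypothetical reviewer: sees the diff, returns "approve" or "reject".
Reviewer = Callable[[str], str]


def security_lens(diff: str) -> str:
    return "reject" if "eval(" in diff else "approve"


def performance_lens(diff: str) -> str:
    return "approve"  # toy rule: nothing to flag


def readability_lens(diff: str) -> str:
    too_long = any(len(line) > 100 for line in diff.splitlines())
    return "reject" if too_long else "approve"


def review_by_vote(diff: str, reviewers: Dict[str, Reviewer]) -> Tuple[str, Dict[str, str]]:
    # Every reviewer processes the same input, which is why cost scales with N.
    verdicts = {name: reviewer(diff) for name, reviewer in reviewers.items()}
    tally = Counter(verdicts.values())
    decision = "approve" if tally["approve"] > len(verdicts) / 2 else "reject"
    return decision, verdicts


REVIEWERS: Dict[str, Reviewer] = {
    "security": security_lens,
    "performance": performance_lens,
    "readability": readability_lens,
}
```

Note that a plain majority can outvote the security lens; a team that wants a veto for one lens would merge verdicts with a different rule than the one sketched here.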

Selection rule: clear stages → pipeline; clear division of labor → supervisor; different lenses → collaboration.

4. Claude Code's `subagents` — Real implementation

Claude Code invokes subagents through the Agent tool: subagent_type selects the specialist, and the prompt must be self-contained.

Prompt-writing principles (where most teams stumble):
- The subagent does not know the main agent's context. Put the what, the why, and what has already been tried into the prompt.
- Specify the output shape: "report in 200 words," "JSON result," "return only the URL," etc.
- Don't delegate understanding: instead of "analyze and decide," say "in file X line Y, change Z to W." The main agent must understand first, then delegate.
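
One way to make these principles hard to skip is to build the delegation prompt from required fields. The `Delegation` class below is a hypothetical helper, not part of Claude Code; it only assembles the prompt string that the main agent would hand to a subagent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Delegation:
    goal: str                 # what
    why: str                  # why it matters
    already_tried: List[str]  # what has been tried, so the subagent does not repeat it
    instruction: str          # concrete change, not "analyze and decide"
    output_shape: str         # e.g. "report in 200 words"

    def to_prompt(self) -> str:
        tried = "\n".join(f"- {item}" for item in self.already_tried) or "- nothing yet"
        return (
            f"Goal: {self.goal}\n"
            f"Why: {self.why}\n"
            f"Already tried:\n{tried}\n"
            f"Instruction: {self.instruction}\n"
            f"Output: {self.output_shape}"
        )


# Illustrative values only; the file name and line number are made up.
EXAMPLE = Delegation(
    goal="Fix the failing login test",
    why="It blocks the release branch",
    already_tried=["Re-running the suite", "Clearing the test cache"],
    instruction="In file X line Y, change Z to W",
    output_shape="Report in 200 words",
)
```

Because every field is mandatory, a main agent that has not yet understood the problem cannot fill in `instruction`, which is the point of the third principle.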

Parallel vs serial:
- Independent investigations can fan out in parallel.
- Dependent work runs serial.
- Parallel is faster but the total token cost is roughly the same.
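
The difference can be sketched with `asyncio`. Here `call_subagent` is a hypothetical placeholder that sleeps briefly instead of calling a model; only the control flow is the point.

```python
import asyncio
from typing import List


async def call_subagent(prompt: str) -> str:
    # Placeholder for a real subagent call; the sleep simulates latency.
    await asyncio.sleep(0.01)
    return f"summary({prompt})"


async def fan_out(prompts: List[str]) -> List[str]:
    # Independent investigations: start them all, wait for all; order is preserved.
    return list(await asyncio.gather(*(call_subagent(p) for p in prompts)))


async def chain(prompts: List[str]) -> str:
    # Dependent work: each step receives the previous step's summary.
    previous = ""
    for prompt in prompts:
        full_prompt = f"{prompt} given [{previous}]" if previous else prompt
        previous = await call_subagent(full_prompt)
    return previous
```

Both functions make the same number of calls, which matches the note above: fanning out shortens wall-clock time but does not reduce the tokens processed.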

5. Cost — Not a simple multiplication

"Multi-agent = N× cost" is half-true.

What costs more:
- Each subagent re-loads its own context (CLAUDE.md, rules, tool defs).
- Orchestrator-specialist message round-trips themselves cost tokens.

What costs less:
- The main context stays light, so less compaction (and compaction triggers re-asks).
- Specialists can run on smaller models (e.g., Haiku) when the work allows.
- Fewer retries: when the main is contaminated, decision quality drops and whole tasks restart.

Heuristic: simple 1–2 step tasks → single agent, almost always cheaper. 5+ step or mixed-specialty work → multi-agent often wins on total cost.
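
A toy model makes the crossover visible. Everything here is an assumption for illustration: the token counts are invented, a single agent is assumed to re-read its whole accumulated context at every step, and caching, retries, and smaller specialist models are ignored.

```python
def single_agent_tokens(steps: int, base: int, per_step: int) -> int:
    # Step k re-reads the base context plus the output of all k steps so far.
    return sum(base + k * per_step for k in range(1, steps + 1))


def multi_agent_tokens(steps: int, base: int, per_step: int, per_summary: int) -> int:
    # Each specialist loads the base context fresh and does one step of work.
    specialists = steps * (base + per_step)
    # The orchestrator accumulates only summaries, not full step output.
    orchestrator = sum(base + k * per_summary for k in range(1, steps + 1))
    return specialists + orchestrator


# Made-up numbers: 5,000-token base context, 8,000 tokens per step, 300 per summary.
BASE, PER_STEP, PER_SUMMARY = 5_000, 8_000, 300
```

With these invented numbers, a 2-step task costs 34,000 tokens single-agent versus 36,900 multi-agent, while a 6-step task costs 198,000 versus 114,300. The exact crossover depends entirely on the assumed sizes.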

6. Four common pitfalls

6.1 Cross-contamination

The main agent ingests a specialist's full output and stores it in memory → contamination.
Fix: specialists return summaries only; the main agent trusts only the summary.

6.2 Permission leakage

Every agent gets every tool → wide blast radius from one mistake.
Fix: per-agent tool whitelists. Only a designated agent can publish.
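
A per-agent whitelist can be as simple as a lookup checked before every tool call. The agent and tool names below are hypothetical; the sketch only shows the deny-by-default shape.

```python
from typing import Dict, FrozenSet

# Hypothetical allowlist: only the publisher may call "publish".
TOOL_ALLOWLIST: Dict[str, FrozenSet[str]] = {
    "executor": frozenset({"read_file", "edit_file"}),
    "tester": frozenset({"read_file", "run_tests"}),
    "reviewer": frozenset({"read_file"}),
    "publisher": frozenset({"read_file", "publish"}),
}


class ToolNotAllowed(PermissionError):
    """Raised when an agent requests a tool outside its allowlist."""


def authorize(agent: str, tool: str) -> None:
    # Unknown agents get an empty set, so the default is deny.
    allowed = TOOL_ALLOWLIST.get(agent, frozenset())
    if tool not in allowed:
        raise ToolNotAllowed(f"agent '{agent}' may not call '{tool}'")
```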

6.3 Debugging difficulty

"Which agent went wrong, and where?" is hard to trace.
Fix: every agent call logs a correlation ID + result summary. At session end, archive logs separately.
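
A minimal sketch of that logging, assuming agents are plain functions and the trace is an in-memory list. `traced_call` and `archive` are hypothetical helpers; a real setup would write to a file or a log service.

```python
import json
import uuid
from typing import Callable, Dict, List


def new_correlation_id() -> str:
    # One ID per user request, shared by every agent call it triggers.
    return uuid.uuid4().hex


def traced_call(
    trace: List[Dict[str, str]],
    correlation_id: str,
    agent: str,
    fn: Callable[[str], str],
    task: str,
) -> str:
    summary = fn(task)
    trace.append({
        "correlation_id": correlation_id,
        "agent": agent,
        "task": task,
        "summary": summary[:200],  # log the result summary, not the full output
    })
    return summary


def archive(trace: List[Dict[str, str]]) -> str:
    # One JSON object per line, ready to be written out at session end.
    return "\n".join(json.dumps(record, sort_keys=True) for record in trace)
```

Filtering the archived lines by `correlation_id` then answers "which agent went wrong, and where?" for a single request.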

6.4 Over-engineering

Inserting multi-agent into trivial work.
Fix: a hard rule like "3+ step tasks enter multi-agent flow." Below that, single agent + self-check.
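
Such a rule is easiest to enforce when it lives in code. This is a small sketch; `choose_flow` and its return labels are hypothetical, and the default threshold of 3 is taken from the rule above.

```python
def choose_flow(step_count: int, threshold: int = 3) -> str:
    """Route a task by its estimated number of steps."""
    if step_count < 0:
        raise ValueError("step_count must be non-negative")
    if step_count >= threshold:
        return "multi-agent"
    return "single-agent + self-check"
```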

7. At a glance

Pattern	Best for	Strength	Weakness
Orchestrator + specialists (supervisor)	Clear division of labor	Simple, easy to debug	Specialists can't talk directly
Pipeline	Clear stages	Easy to trace	One failed stage halts the whole run
Collaboration (mesh / consensus)	Different lenses	Diversity	Most expensive
Single agent	≤2 step tasks	Cheapest	Context contamination risk

Design rule: task types diverge → agents diverge. Otherwise, single agent.

Next up

Part 5/5: Coding Agent Cost Management — Tokens, Caching, Routing. If multi-agent splits structure, the next post splits cost.

References

Anthropic, Subagents in Claude Code — code.claude.com/docs/agents (verified 2026-05-05).
Anthropic, How We Built Our Multi-Agent Research System — anthropic.com/research/built-multi-agent-research-system (verified 2026-05-05).
The "What Is Harness Engineering?" series — theoretical background.

This is part 4/5 of the Coding Agents in Practice series.

MaJu Tech Notes