Agent Operations Design Notes (3/9) — Agent Team Design 101

Adding more agents does not automatically make a system smarter. More often, it makes responsibility blurrier, verification more expensive, and context conflicts easier to trigger. Good separation is not about building a more impressive structure. It is about building a structure where failure stays cheaper and easier to detect.


Key Takeaways

  • The default is usually a single agent. As structure expands, coordination cost and verification cost usually expand with it.
  • A subagent is worth adding only when the ownership boundary is clear, not because the role label sounds elegant.
  • A multi-agent design becomes valuable only when you can afford the cost of conflict management, not just when parallelism looks attractive.
  • The useful design questions come first: who can modify what, who is accountable for verification, and who merges final context before a decision.
  • In real operations, systems usually break less from raw reasoning quality than from context collisions, duplicated work, missed verification, and unclear responsibility.

1. Start with smaller definitions

For practical work, three structures are enough:

  • single agent: one working context handles planning, execution, and checking
  • subagent: a lead agent delegates a bounded slice of work and later integrates the result
  • multi-agent: several agents collaborate with relatively independent roles, state, or authority

The name matters less than the responsibility model behind it.

Anthropic's subagent model is close to a delegated unit with its own context window, tool limits, and permission surface. OpenAI Agents SDK handoffs are closer to transferring the conversation to another specialized agent. Some systems may look multi-agent in the UI while behaving more like one main agent with an extended tool surface. Others may look single-agent while internally separating execution, review, and routing.

So the real questions are simpler:

  1. Who can modify which assets?
  2. Who owns verification failure?
  3. Who reconciles context before the final decision?
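
One way to make these three questions concrete is to write the answers down as data before any agent runs. The sketch below is illustrative only: the `Responsibility` record, its field names, and the "exactly one owner" rule are assumptions made for this example, not part of any framework mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Responsibility:
    """One agent's answers to the three questions (hypothetical schema)."""
    agent: str
    may_modify: frozenset = frozenset()  # assets this agent is allowed to write
    owns_verification: bool = False      # accountable when verification fails
    reconciles_context: bool = False     # merges context before the final decision


def open_questions(team):
    """Return the questions a team definition leaves unanswered.

    Assumption for this sketch: each asset has at most one writer, and
    verification and reconciliation each have exactly one owner.
    """
    gaps = []
    writers = {}
    for member in team:
        for asset in member.may_modify:
            writers.setdefault(asset, []).append(member.agent)
    contested = sorted(a for a, owners in writers.items() if len(owners) > 1)
    if contested:
        gaps.append("more than one writer: " + ", ".join(contested))
    if sum(m.owns_verification for m in team) != 1:
        gaps.append("verification failure has no single owner")
    if sum(m.reconciles_context for m in team) != 1:
        gaps.append("context reconciliation has no single owner")
    return gaps
```

An empty list means the three questions have unambiguous answers on paper. It says nothing about whether the agents will respect them at run time.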

2. Why most teams should start with a single agent

The biggest advantage of a single agent is not simplicity for its own sake. It is that conflict tends to stay visible in one place.

If planning goes off course, you can inspect the instructions. If the wrong file gets touched, you can narrow the scope. If verification is missing, you can add it inside the same flow. The failure location stays relatively legible.

As soon as you split the structure, you are dealing with two problems instead of one:

  • the original task problem
  • the coordination problem between agents

Many teams underestimate the second category. Once you separate roles, new costs appear quickly:

  • handoff format design
  • repeated context transfer
  • merge criteria
  • verification ownership
  • retry paths after failure

That is why structure should not grow until the limits of a single flow are visible. Before adding more agents, it is usually cheaper to improve:

  • task scope
  • checklists
  • mid-stream verification checkpoints
  • separation between read-only stages and write stages
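
The last two items can be combined inside a single flow. The following is a minimal sketch, assuming files are held as an in-memory mapping and that the "agent" is a stand-in function that rewrites `TODO` markers. All function names here are hypothetical.

```python
def read_only_stage(files):
    """Propose edits without mutating anything (stand-in for the agent's planning)."""
    return {path: text.replace("TODO", "DONE")
            for path, text in files.items() if "TODO" in text}


def checkpoint(proposed, editable):
    """Mid-stream check: every proposed edit must target a file declared editable."""
    out_of_scope = sorted(set(proposed) - set(editable))
    if out_of_scope:
        raise PermissionError("edits outside task scope: " + ", ".join(out_of_scope))


def write_stage(files, proposed):
    """Apply already-checked edits and return a new mapping; the input is untouched."""
    updated = dict(files)
    updated.update(proposed)
    return updated


def run(files, editable):
    proposed = read_only_stage(files)    # stage 1: read only
    checkpoint(proposed, editable)       # verification before any write
    return write_stage(files, proposed)  # stage 2: write
```

Because the checkpoint sits between the two stages, a scope violation is caught before anything is written, and the failure still surfaces in one place.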

3. When subagents are actually useful

Subagents are best understood as a middle structure between one agent and a full team. Their value does not come from the role label being different. It comes from a narrow delegation boundary and a result that is cheap to discard or redo if it turns out wrong.

Subagents tend to fit when:

  • only a specific file set or question needs investigation
  • you want several independent draft candidates for comparison
  • background research or auxiliary review should not flood the main context
  • the returned output is easy to merge, such as a summary, list, or candidate set

They tend to fit poorly when:

  • multiple subagents must edit the same file
  • the interpretation criteria are fuzzy
  • outputs are likely to be applied without verification
  • the lead agent still has to reread large raw context dumps to make sense of the result

In other words, a subagent should behave like a context compression device for the main agent. If it sends back a large block of unresolved context, much of the value of delegation is gone.
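
The compression idea can be enforced with a bounded return format. This is a sketch under stated assumptions: the `SubagentResult` shape and both size limits are invented for illustration and should be tuned per project.

```python
from dataclasses import dataclass

MAX_SUMMARY_CHARS = 600  # assumed budget, tune per project
MAX_FINDINGS = 5         # assumed cap on returned items


@dataclass(frozen=True)
class SubagentResult:
    question: str         # the bounded question that was delegated
    summary: str          # short answer the lead agent can merge directly
    findings: tuple = ()  # candidate list, one line per item


def rejection_reasons(result):
    """Return why a result is not mergeable as-is; an empty list means accept."""
    reasons = []
    if not result.summary.strip():
        reasons.append("summary is empty")
    if len(result.summary) > MAX_SUMMARY_CHARS:
        reasons.append("summary exceeds character budget")
    if len(result.findings) > MAX_FINDINGS:
        reasons.append("too many findings")
    if any("\n" in item for item in result.findings):
        reasons.append("finding spans multiple lines")
    return reasons
```

A result that fails these checks goes back to the subagent for tightening, so the lead agent is never asked to absorb a raw context dump.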

4. When multi-agent design becomes justified

Multi-agent structure starts to make sense not when role names multiply, but when different ownership and verification rules must coexist at the same time.

Typical examples:

  • generation and verification must be separated intentionally
  • external execution and internal approval need different authority
  • long-running asynchronous work has to operate in multiple lanes
  • each agent needs a distinct view of tools, data, or permissions

Even then, designing for parallelism alone is a common mistake. Parallel work is valuable only while shared understanding remains stable. The moment agents start operating on different assumptions, speed gains turn into reconciliation waste.

Multi-agent design earns its keep only if you can answer:

  • who guarantees the quality of each agent's input
  • who has priority when results conflict
  • where global failure is detected
  • who counts as the final author of the outcome

Without those answers, the system may look like specialization while functioning as responsibility diffusion.
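
The "who has priority when results conflict" question can be answered in configuration rather than left to run-time negotiation. A minimal sketch, assuming three hypothetical role names and hashable result values such as strings:

```python
PRIORITY = ("verifier", "executor", "drafter")  # hypothetical roles, highest first


def resolve(results):
    """Pick the final value from {agent: value} using the declared priority order.

    Returns (value, deciding_agent, conflicted). Values must be hashable.
    """
    unknown = sorted(set(results) - set(PRIORITY))
    if unknown:
        raise ValueError("results from agents without a priority: " + ", ".join(unknown))
    ranked = [agent for agent in PRIORITY if agent in results]
    if not ranked:
        raise LookupError("no agent returned a result")
    winner = ranked[0]
    conflicted = len({results[agent] for agent in ranked}) > 1
    return results[winner], winner, conflicted
```

Returning the deciding agent alongside the value keeps a named final author on record, and the `conflicted` flag makes disagreement visible instead of silently overwritten.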

5. First test: are ownership boundaries sharp enough

The first design question is not performance. It is ownership.

Good separation is usually described with sentences like these:

  • this agent only investigates
  • this agent only drafts
  • this agent only verifies
  • this agent only edits one designated file

Bad separation sounds like this:

  • both agents write, but one is more strategic
  • the reviewer can also patch things when needed
  • the researcher can produce the final version if the situation is urgent

The second set feels flexible at first, but it often creates edit conflicts and unclear accountability later.

In practice, permissions should usually be narrower than role descriptions.

Structure    | Recommended ownership pattern
-------------+------------------------------------------------------------------------------------------
single agent | keep read, write, and verification responsibility coherent inside one bounded scope
subagent     | narrow the readable surface or the return format tightly
multi-agent  | separate modification rights, approval rights, and verification rights explicitly

As the number of agents rises, it becomes more important to define not just what each agent can do, but what each agent must not do.
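
Stating what an agent must not do is easiest with an explicit deny list layered on a default-deny check. The sketch below is an assumption-laden illustration: the `Policy` type and the `verb:target` action strings are invented here and do not mirror any specific product's permission syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    allow: frozenset               # actions the agent may perform
    deny: frozenset = frozenset()  # actions it must never perform, even if allowed


def is_permitted(policy, action):
    """Default-deny check: an explicit deny always wins over an allow."""
    if action in policy.deny:
        return False
    return action in policy.allow


# Hypothetical reviewer: may read and comment, must never patch the draft.
REVIEWER = Policy(
    allow=frozenset({"read:draft.md", "comment:draft.md"}),
    deny=frozenset({"write:draft.md"}),
)
```

With this shape, "the reviewer can also patch things when needed" is no longer something an agent can decide for itself. It requires a visible policy change.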

6. Second test: does verification get cheaper or more expensive

This is where many designs collapse.

Splitting roles can make each role more legible, but it often makes the full result harder to trust. A single agent may only need verification like:

  • did it stay in scope
  • did it match the required format
  • did it support the key claim correctly

Once subagents or multiple agents enter the picture, extra verification layers usually appear:

  • was handoff information lost
  • did agents work from different assumptions
  • did edits overlap or conflict
  • did the merge distort meaning

So verification no longer happens only once at the end. It becomes layered:

  1. internal verification inside each agent
  2. handoff verification
  3. merge verification
  4. final artifact verification

If your team is not ready to pay for those layers, the better move is often to keep one main flow and insert verification checkpoints inside it.
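
To see what those four layers cost in practice, it helps to sketch them as ordered checks that report the first layer to fail. Everything below is hypothetical: the artifact layout (a dict with `goal`, `parts`, and `merged`) and the individual checks are simplified stand-ins for real verification.

```python
def agents_self_checked(artifact):
    """Layer 1: every part was verified by the agent that produced it."""
    return all(part["self_checked"] for part in artifact["parts"])


def handoff_intact(artifact):
    """Layer 2: every part was produced against the same goal statement."""
    return all(part["goal"] == artifact["goal"] for part in artifact["parts"])


def merge_complete(artifact):
    """Layer 3: the merged text still contains every part's contribution."""
    return all(part["text"] in artifact["merged"] for part in artifact["parts"])


def final_nonempty(artifact):
    """Layer 4: the final artifact is not blank."""
    return bool(artifact["merged"].strip())


LAYERS = [
    ("agent-internal", agents_self_checked),
    ("handoff", handoff_intact),
    ("merge", merge_complete),
    ("final", final_nonempty),
]


def first_failing_layer(artifact, layers=LAYERS):
    """Run layers in order and return the name of the first failure, or None."""
    for name, check in layers:
        if not check(artifact):
            return name
    return None
```

Each layer is a separate piece of code someone has to write, run, and own, which is the cost the paragraph above refers to.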

The practical decision rule is simple:

  • split when verification becomes cheaper after separation
  • wait when verification becomes more expensive after separation
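
The rule reduces to a comparison of two totals. In the sketch below the minute estimates are invented purely for illustration. Only the comparison itself reflects the rule above.

```python
def total_cost(checks):
    """Sum the estimated cost of a set of checks (any consistent unit)."""
    return sum(checks.values())


def should_split(checks_before, checks_after):
    """Split only if total verification cost goes down after separation."""
    return total_cost(checks_after) < total_cost(checks_before)


# Invented estimates in minutes, for illustration only.
single_flow = {"scope": 5, "format": 2, "key claim": 10}
after_split = {"agent-internal": 8, "handoff": 6, "merge": 9, "final": 10}
decision = "split" if should_split(single_flow, after_split) else "wait"
```

With these made-up numbers the split adds verification work, so the rule says to wait.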

7. Third test: does the design reduce or amplify context collisions

In operations, systems often fail not because the model is weak, but because multiple contexts stay alive at once and start conflicting.

Context collisions usually show up in four forms:

  • the same goal is interpreted differently
  • the same file state is remembered differently
  • only some agents update to the latest decision
  • the same term gets reused with different meanings

With a single agent, these conflicts usually surface inside one flow. With multiple agents, each side may build a plausible intermediate output before the contradiction becomes visible. That makes the conflict more expensive, because each output already looks partially complete.

To reduce context collisions, it is often better to tighten the shared frame before adding agents:

  • lock the common goal statement into one short sentence
  • predeclare which files are editable
  • summarize the latest decision in one paragraph
  • standardize the return format
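
These four items can live in one small, versioned record that every agent reads before acting. A minimal sketch, assuming a hypothetical `SharedFrame` type and a simple integer version counter:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedFrame:
    version: int               # bumped whenever the latest decision changes
    goal: str                  # one short sentence
    editable_files: frozenset  # predeclared write surface
    latest_decision: str       # one-paragraph summary


def stale_agents(frame, seen_versions):
    """Return agents whose last-seen frame version is older than the current one."""
    return sorted(agent for agent, seen in seen_versions.items()
                  if seen < frame.version)
```

Checking `stale_agents` before a merge targets the third collision form listed earlier, where only some agents have picked up the latest decision.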

Subagents work well when they shrink context. Multi-agent systems often fail when too much free-form context is left open between roles.

8. Practical decision table

The structure choice can be summarized without much drama.

Situation                                         | Single agent            | Subagent            | Multi-agent
--------------------------------------------------+-------------------------+---------------------+----------------------------------
Scope is small and clear                          | best fit                | usually unnecessary | usually unnecessary
Research and drafting should be loosely separated | possible                | good fit            | often excessive
Several roles need to touch the same asset        | risky                   | usually poor fit    | risky without strict coordination
Verifier independence matters                     | limited                 | supportive          | potentially strong fit
Parallelism benefit is large                      | limited                 | conditional         | potentially strong fit
Keeping context up to date is hard                | relatively safer        | requires care       | most fragile
Approval and operating rules are complex          | can become bloated fast | conditional         | potentially strong fit

The recommended default sequence is usually:

  1. start with a single agent
  2. split only easy-to-recover work into subagents
  3. adopt multi-agent structure only when ownership and verification boundaries can be stated explicitly

9. Three common misunderstandings

Misunderstanding 1. More role names means more sophistication

Usually not. Reducing failure paths matters more than naming more roles.

Misunderstanding 2. Parallelism always raises productivity

Parallelism pays only when merge cost is low. If merging is expensive, you have not really accelerated the system. You have only made complexity simultaneous.
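
A back-of-the-envelope model makes the point. It assumes the tasks are fully independent, run at the same time, and are followed by one serial merge step. Real systems are messier, and the function names are hypothetical.

```python
def sequential_time(task_times):
    """One agent does every task back to back, with nothing to merge."""
    return sum(task_times)


def parallel_time(task_times, merge_cost):
    """Agents run concurrently, then one serial merge reconciles the outputs."""
    return max(task_times) + merge_cost


def parallelism_pays(task_times, merge_cost):
    """True only if the parallel plan finishes sooner than the sequential one."""
    return parallel_time(task_times, merge_cost) < sequential_time(task_times)
```

For three tasks of 10, 12, and 8 units, the sequential plan takes 30. A merge cost of 5 gives a parallel time of 17, which is a real gain. A merge cost of 20 gives 32, which is slower than not splitting at all.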

Misunderstanding 3. A separate verifier automatically raises quality

A verifier only works as an independent quality control layer if input criteria, checklists, and decision authority are defined along with the role. A title alone is not a verification system.

10. A checklist for teams that want to start small

Before splitting the system, answer these five questions:

  1. Is the most expensive failure today a generation-quality problem, or a coordination problem?
  2. If we add one more agent, who owns which files or state?
  3. After separation, do verification steps actually decrease, or do they increase?
  4. Can different agents reliably share the latest decision?
  5. If conflict appears, is the final authority clearly assigned?

If these questions are hard to answer, it is usually a sign that the structure should stay simpler for now.

Conclusion

Single-agent, subagent, and multi-agent setups often get framed like maturity levels. In practice, they are not. More agents do not automatically mean a higher stage of design. The real issue is where the system chooses to place complexity.

Good separation does not imitate teamwork for appearance. It makes accountability clearer, verification cheaper, and context collisions visible earlier. By that standard, many teams do not need a bigger agent structure next. They need sharper ownership boundaries and cheaper verification design.

References

  • Anthropic, Create custom subagents
  • Anthropic, How Claude remembers your project
  • Anthropic, Configure permissions
  • OpenAI Agents SDK, Agents
  • OpenAI Agents SDK, Handoffs
  • OpenAI Agents SDK, Guardrails
