Multi-Agent System Design — Role Separation, Integration Patterns, and Real ROI

Architectural principles that surface when operating multiple agents — what structures hold, and what collapses

Key Takeaways

In a multi-agent system, ROI depends on clarity of role separation, not on the number of agents
Adding agents without an orchestrator causes management overhead to grow faster than linearly, because each new agent can interact with every existing one
Synergy between agents does not emerge on its own — data flows and integration points must be explicitly defined

Background

One agent becomes two. Two becomes five. Before long, multiple agents are running in parallel. Each was created for a specific reason, but stepping back and viewing the system as a whole reveals patterns. Some agents deliver value every day. Others were built and rarely used.

This post documents the role classification model, integration patterns, and ROI evaluation methodology developed from running a multi-agent system in production.

Body

1. Role Classification

Agents fall into four categories by function.

Infrastructure Layer — provides the operational foundation for all other agents. Central orchestrator, dashboard, and monitoring belong here. Without this layer, every other agent operates in isolation.

Engine Layer — provides the technical substrate that other agents consume: local AI processing, LLM routing, prompt harnesses. Not directly user-facing, but determines system-wide performance.

Domain Layer — handles work in a specific field. Content writing, stock analysis, narrative writing — agents with clearly defined deliverables.

App Layer — produces end-user-facing artifacts such as mobile apps. ROI is latent during development and only materializes post-launch.
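The four layers can be written down as a small registry, which makes it easy to see at a glance what each agent is for. The sketch below is illustrative only: the `Layer` enum, the `Agent` dataclass, the `always_on` flag, and every agent name are assumptions made for the example, not the actual structure of the system described in this post.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    INFRASTRUCTURE = "infrastructure"  # orchestrator, dashboard, monitoring
    ENGINE = "engine"                  # local AI processing, LLM routing, prompt harnesses
    DOMAIN = "domain"                  # content writing, stock analysis, narrative writing
    APP = "app"                        # end-user-facing artifacts such as mobile apps


@dataclass(frozen=True)
class Agent:
    name: str        # hypothetical identifier
    layer: Layer
    always_on: bool  # False for project-scoped agents


def group_by_layer(agents):
    """Return a dict mapping every Layer to the names of the agents in it."""
    grouped = {layer: [] for layer in Layer}
    for agent in agents:
        grouped[agent.layer].append(agent.name)
    return grouped


# Hypothetical registry; the names are placeholders.
REGISTRY = [
    Agent("orchestrator", Layer.INFRASTRUCTURE, always_on=True),
    Agent("prompt-harness", Layer.ENGINE, always_on=True),
    Agent("content-agent", Layer.DOMAIN, always_on=True),
    Agent("mobile-app-agent", Layer.APP, always_on=False),
]

for layer, names in group_by_layer(REGISTRY).items():
    print(f"{layer.value}: {names}")
```

An empty list for a layer is itself a signal: a registry with no infrastructure entry, for example, means every other agent is running in isolation.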

2. ROI Assessment — Which Agents Generated Value

High ROI:

Content agent: Automated blog publishing and image generation. Work that would have taken hundreds of hours manually is now automated.
Infrastructure orchestrator: Serves as the hub for all agents — handling monitoring, dashboard, and message routing. Without it, every other agent is isolated.
Prompt harness: Standardized CLAUDE.md, skills, and memory structures across all agents. Reduced new agent setup time from hours to minutes.

Moderate ROI:

Local AI engine: Accumulated operational knowledge. High in value, moderate in daily utilization.
Think-tank agent: Valuable for decision support and countering confirmation bias, but quantitative ROI is difficult to measure. This is a category where the value is mostly long-term.
Signal analysis agent: Automated signal collection and scoring. Whether it contributes to actual outcomes requires a longer validation window.

Low ROI:

Single-project agents: Objective achieved, but nothing left to do after completion. The inherent limitation of project-scoped versus always-on agents.
Suspended agents: Intentionally halted. Current ROI is zero; value can resume if reactivated.

Deferred Judgment:

App development agents: ROI can spike sharply once an app ships and acquires users, but these are still in development.

3. Integration Patterns — Synergy Must Be Designed

Evaluating agents individually misses the inter-agent integration layer.

Content pipeline: Think-tank agent ↔ content agent. Life experiences become blog material, and research findings feed back as decision references. The bidirectional data flow reinforces both ends.

Infrastructure hub pattern: Central orchestrator → all agents. A single dashboard makes agent status visible at a glance. Issue detection time drops significantly.

Shared technology stack: Agents on the same stack share UI and state management patterns. A problem solved in one project transfers immediately to another.

These patterns do not emerge without explicit upfront design. The core question for integration design: "Whose output becomes whose input?" Define this early.
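One way to make "Whose output becomes whose input?" concrete is to declare each data flow as an edge and then check which agents appear in no flow at all. This is a minimal sketch under stated assumptions: the `FLOWS` list, the `AGENTS` set, and all agent names (including `legacy-tool`) are hypothetical.

```python
# Each tuple is (producer, consumer, what is passed). All names are hypothetical.
FLOWS = [
    ("think-tank", "content", "experiences as blog material"),
    ("content", "think-tank", "research findings as decision references"),
    ("orchestrator", "think-tank", "status requests"),
    ("orchestrator", "content", "status requests"),
    ("orchestrator", "signal-analysis", "status requests"),
]

AGENTS = {"orchestrator", "think-tank", "content", "signal-analysis", "legacy-tool"}


def inputs_of(agent, flows):
    """Producers whose output this agent consumes, sorted by name."""
    return sorted({src for src, dst, _ in flows if dst == agent})


def isolated_agents(agents, flows):
    """Agents that appear in no declared flow, as producer or consumer."""
    connected = {src for src, _, _ in flows} | {dst for _, dst, _ in flows}
    return sorted(agents - connected)


print(inputs_of("content", FLOWS))
print(isolated_agents(AGENTS, FLOWS))
```

Keeping the flow list in one place means an agent with no edges shows up as soon as it is added, rather than months later.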

4. Failures and Lessons

Agents scaled too fast: Early on, the bar for adding an agent was "if we need it, build it." Above a certain count, management overhead compounded rapidly. The current rule: build a new agent only when existing agents cannot solve the problem.
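A rough way to see why overhead compounds is to count the links that may need attention. The sketch assumes, as a simplification, that overhead scales with the number of potential agent-to-agent links, and that a hub adds exactly one link per agent. Both function names are made up for this example.

```python
def pairwise_links(n):
    """Potential direct links when each of n agents may talk to every other."""
    return n * (n - 1) // 2


def hub_links(n):
    """Links when each of n agents connects only to one central orchestrator."""
    return n


for n in (2, 5, 10, 20):
    print(f"{n:>2} agents: pairwise={pairwise_links(n):>3}  hub={hub_links(n):>2}")
```

Under these assumptions, going from 5 to 20 agents raises the pairwise count from 10 to 190, while the hub count only goes from 5 to 20.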

The importance of orchestration, realized too late: Individual agent management was viable at small scale. Beyond that threshold, a central orchestrator became non-negotiable — and that realization came later than it should have. The orchestrator should be built first, not last.
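To illustrate what "build the orchestrator first" can mean at its smallest, here is a sketch of an in-process hub that registers agents, routes messages, and reports a per-agent status. It is not the orchestrator used in this system: the `Orchestrator` class, its method names, and the heartbeat-based staleness rule are all assumptions for the example.

```python
import time


class Orchestrator:
    """Minimal in-process hub: registers agents, routes messages, tracks heartbeats."""

    def __init__(self, stale_after_s=60.0):
        self._handlers = {}   # agent name -> callable(message) -> reply
        self._last_seen = {}  # agent name -> timestamp of last heartbeat
        self._stale_after_s = stale_after_s

    def register(self, name, handler):
        """Add an agent and record an initial heartbeat for it."""
        self._handlers[name] = handler
        self.heartbeat(name)

    def heartbeat(self, name, now=None):
        """Record that an agent is alive; `now` can be injected for testing."""
        self._last_seen[name] = time.monotonic() if now is None else now

    def route(self, target, message):
        """Deliver a message to one agent; unknown targets fail loudly."""
        if target not in self._handlers:
            raise KeyError(f"no agent registered as {target!r}")
        return self._handlers[target](message)

    def status(self, now=None):
        """Dashboard view: 'ok' or 'stale' for every registered agent."""
        now = time.monotonic() if now is None else now
        return {
            name: "ok" if now - seen <= self._stale_after_s else "stale"
            for name, seen in self._last_seen.items()
        }


hub = Orchestrator(stale_after_s=60.0)
hub.register("content", lambda msg: f"content got: {msg}")
print(hub.route("content", "draft post"))
print(hub.status())
```

Even a hub this small gives every later agent one place to register and one place to be observed, which is cheaper to add at the start than to retrofit.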

The reality of migration: Moving a running system to a new architecture proved more complex than expected. Migrating from OpenClaw to Hermes triggered a token runaway problem. The system was reverted to OpenClaw, and Hermes is now being redesigned and validated. This is why the Strangler Fig pattern, in which the new system takes over incrementally while the existing one keeps running, is the safer approach.
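An incremental cutover can be sketched as a router that sends a share of requests to the new system, stops doing so once a token budget is used up, and falls back to the old system on failure. Everything here is hypothetical: `StranglerRouter`, the `(reply, tokens_used)` return convention, and the budget figures are assumptions for illustration and do not reflect the OpenClaw or Hermes interfaces.

```python
import random


class StranglerRouter:
    """Routes a share of traffic to a new system while the old one stays live."""

    def __init__(self, old_system, new_system, new_share=0.1,
                 token_budget=10_000, rng=None):
        # Both systems are callables: request -> (reply, tokens_used).
        self.old_system = old_system
        self.new_system = new_system
        self.new_share = new_share        # fraction of requests sent to the new system
        self.token_budget = token_budget  # cumulative cap for the new system
        self.tokens_spent_new = 0
        self.fallbacks = 0
        self._rng = rng or random.Random()

    def handle(self, request):
        # The budget check only blocks further calls; a single call can still
        # overshoot the cap, so the new system needs its own per-call limit too.
        within_budget = self.tokens_spent_new < self.token_budget
        if within_budget and self._rng.random() < self.new_share:
            try:
                reply, tokens = self.new_system(request)
                self.tokens_spent_new += tokens
                return reply
            except Exception:
                # Any failure in the new system falls through to the old one.
                self.fallbacks += 1
        reply, _ = self.old_system(request)
        return reply


def old_system(request):
    return f"old:{request}", 5


def new_system(request):
    return f"new:{request}", 60


router = StranglerRouter(old_system, new_system, new_share=1.0, token_budget=100)
for request in ("a", "b", "c"):
    print(router.handle(request))
```

Raising `new_share` in small steps, and only after the spend and fallback counters look healthy, keeps a revert down to a configuration change rather than a second migration.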

Documentation debt: Agents were built fast, and the reasoning behind decisions was not recorded. Reconstructing that context later consumed unnecessary time. An ontology and memory system that records decisions as they are made addresses this problem directly.

5. Evaluate Value, Not Agent Count

"What is the right number of agents?" is a question that comes up often. The honest answer: there is no right number.

More useful questions:
- Does each agent's value exceed its management cost?
- Are agents generating synergy, or are they isolated?
- Is strengthening an existing agent more efficient than building a new one?

The count is not the metric. What matters is continuously evaluating each agent's value and adjusting honestly. Sunsetting an agent that no longer delivers is itself a system management decision.
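The first two questions above can be turned into a rough periodic review. The sketch uses hours per month as a crude proxy for value and cost, which will undercount agents whose benefit is hard to quantify, such as decision support. The `AgentReview` fields, the `recommend` function, and the sample numbers are all assumptions, not measurements from this system.

```python
from dataclasses import dataclass


@dataclass
class AgentReview:
    name: str
    value_hours_per_month: float   # estimated manual work saved (assumption)
    upkeep_hours_per_month: float  # estimated time spent maintaining the agent
    integrations: int              # declared data flows with other agents


def recommend(review):
    """Return a coarse recommendation from net value and integration count."""
    net = review.value_hours_per_month - review.upkeep_hours_per_month
    if net <= 0:
        return "sunset or suspend"
    if review.integrations == 0:
        return "keep, but look for integration points"
    return "keep"


# Hypothetical sample data.
reviews = [
    AgentReview("content", 40, 4, 2),
    AgentReview("signal-analysis", 5, 2, 0),
    AgentReview("one-off-project", 0, 1, 0),
]

for review in reviews:
    print(f"{review.name}: {recommend(review)}")
```

The third question, whether to strengthen an existing agent instead of building a new one, is a judgment call that a score like this can inform but not answer.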

Closing

The most important lesson from operating a multi-agent system: building agents is easier than running them. Creation takes a short burst of effort. Sustaining consistent value requires continuous evaluation and adjustment.

Do not attempt to design the perfect system upfront. Build what is needed, evaluate its value, and adjust. Build the orchestrator first. Design integration patterns explicitly. Validate migrations incrementally. That iteration is what develops the system.


MaJu Tech Notes