"Why I Ditched One All-Purpose AI for Nine Specialists — A Multi-Agent Design Story"
A role-based multi-agent architecture that solves context pollution and security vulnerabilities
Key Summary
- A single-agent setup failed within three days due to context pollution and security gaps
- Tasks were split into four categories — coordination, execution, research, writing — forming a nine-agent system
- Model assignment strategy, fallback chains, and least-privilege access controls tackle both cost and security
Background
When I first set up my AI agent platform, I assigned everything to a single agent. Email checks, web research, code generation, memory maintenance — one agent handled it all.
It failed in three days. Two critical problems surfaced:
- Context Pollution — Massive web-crawl results lingered in memory while the agent drafted email replies, causing the model to confuse contexts and waste tokens.
- No Permission Boundaries — File-system access for memory management and external web searches require fundamentally different permissions. Giving one agent blanket tool access was a security time bomb.
The Design
Classification: Four Categories
A clear taxonomy was needed to divide agents. All tasks map to one of four categories:
| Category | Characteristics | Required Tools |
|---|---|---|
| Coordination | Request analysis, agent delegation | sessions_spawn, exec, read, write |
| Execution | Code generation, CLI, automation | read, write, edit, exec |
| Research | Web search, data collection | web_search, web_fetch, read |
| Writing | Documentation, email drafts | read, write, edit |
Adding dedicated roles for memory management and monitoring brought the total to nine agents.
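The category table above can be written down as data. The following is a minimal sketch, not the platform's actual configuration format: the tool names are taken from the table, while the `Category` enum, the `REQUIRED_TOOLS` dict, and the `categories_with` helper are hypothetical names introduced for illustration.

```python
from enum import Enum


class Category(Enum):
    """The four task categories from the table (enum itself is illustrative)."""
    COORDINATION = "coordination"
    EXECUTION = "execution"
    RESEARCH = "research"
    WRITING = "writing"


# Tool names are copied from the table; the dict layout is an assumption.
REQUIRED_TOOLS: dict[Category, frozenset[str]] = {
    Category.COORDINATION: frozenset({"sessions_spawn", "exec", "read", "write"}),
    Category.EXECUTION: frozenset({"read", "write", "edit", "exec"}),
    Category.RESEARCH: frozenset({"web_search", "web_fetch", "read"}),
    Category.WRITING: frozenset({"read", "write", "edit"}),
}


def categories_with(tool: str) -> list[str]:
    """Return the categories whose tool set includes `tool`, in table order."""
    return [c.value for c in Category if tool in REQUIRED_TOOLS[c]]


print(categories_with("exec"))        # ['coordination', 'execution']
print(categories_with("web_search"))  # ['research']
```

Listing the tools per category this way makes overlaps visible at a glance: only two of the four categories need `exec`, and only one needs web access.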
Nine Agents at a Glance
| Agent ID | Role | Model Tier |
|---|---|---|
| mir | Coordinator, user interface | Premium |
| monitor | System health checks, pattern detection | Cheap/Free |
| researcher | Web research, trend scanning | Balanced |
| communicator | Content creation, email/document drafting | Premium |
| orchestrator | Coding, CLI automation, skill building | Balanced |
| | Google Workspace API execution (Gmail, Calendar, Drive) | Cheap/Free |
| memory-manager | Reflect pipeline orchestration | Balanced |
| memory-runner | Reflect execution (spawned by manager) | Free (Local) |
| reviewer | Monthly memory review and summarization | Cheap |
Model Assignment Strategy
Different tasks get different models:
- mir (Premium) — Needs top-tier models for accurate intent parsing
- researcher (Balanced) — Requires fast throughput for large-volume text summarization, so speed-optimized models take priority
- memory-runner (Free) — Simple repetitive file I/O runs entirely on a local model
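One way to express this assignment is a two-step lookup: agent to tier, then tier to model. The sketch below assumes that layout. The tier labels follow the agent table (with "Cheap/Free" simplified to `cheap` and "Free (Local)" to `local`), the model names are placeholders rather than real model identifiers, and the agent whose ID is not given in the table is left out.

```python
# Tier per agent, following the agent table. Simplified labels are an assumption.
AGENT_TIER = {
    "mir": "premium",
    "monitor": "cheap",
    "researcher": "balanced",
    "communicator": "premium",
    "orchestrator": "balanced",
    "memory-manager": "balanced",
    "memory-runner": "local",
    "reviewer": "cheap",
}

# Placeholder model names; substitute whatever each provider actually offers.
TIER_MODEL = {
    "premium": "example-premium-model",
    "balanced": "example-balanced-model",
    "cheap": "example-cheap-model",
    "local": "example-local-model",
}


def model_for(agent_id: str) -> str:
    """Resolve an agent to its model via its tier. Raises KeyError if unknown."""
    return TIER_MODEL[AGENT_TIER[agent_id]]


print(model_for("mir"))            # example-premium-model
print(model_for("memory-runner"))  # example-local-model
```

Keeping tiers as an intermediate layer means a model swap touches one entry in `TIER_MODEL` instead of every agent definition.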
Fallback Chains and Security
Three principles guard against cloud API failures:
- Provider Diversification — Alternate across multiple providers so a single outage never takes down the system
- Performance-Based Degradation — Start with the highest-performance model and step down to cheaper alternatives on failure
- Local Safety Net — A local model serves as the last line of defense, ensuring minimum functionality even during internet outages
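The three principles above can be sketched as an ordered chain that is tried from top to bottom. This is an illustration under stated assumptions, not the platform's real fallback mechanism: `call_with_fallback`, `AllModelsFailed`, and the three stand-in provider functions are hypothetical, and the simulated failures exist only to show the chain falling through to the local model.

```python
from typing import Callable, Sequence

ModelCall = Callable[[str], str]


class AllModelsFailed(RuntimeError):
    """Raised when every entry in the chain has failed."""


def call_with_fallback(
    prompt: str, chain: Sequence[tuple[str, ModelCall]]
) -> tuple[str, str]:
    """Try each (label, call) in order; return (label, reply) of the first success."""
    errors: list[str] = []
    for label, call in chain:
        try:
            return label, call(prompt)
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(f"{label}: {exc}")
    raise AllModelsFailed("; ".join(errors))


# Hypothetical stand-ins for real provider clients.
def premium_provider_a(prompt: str) -> str:
    raise ConnectionError("provider A is down")  # simulated outage


def cheaper_provider_b(prompt: str) -> str:
    raise TimeoutError("provider B timed out")  # simulated outage


def local_model(prompt: str) -> str:
    return f"[local] {prompt}"


# Ordered from highest performance to the local safety net, across two providers.
CHAIN = [
    ("provider-a/premium", premium_provider_a),
    ("provider-b/cheap", cheaper_provider_b),
    ("local", local_model),
]

print(call_with_fallback("ping", CHAIN))  # ('local', '[local] ping')
```

Returning the label alongside the reply lets the caller log which tier actually served the request, which is useful for spotting silent degradation.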
Security follows least-privilege: the monitor agent gets only read and exec permissions; the researcher gets web_search and web_fetch but no file-system access. Agent spawning is controlled through explicit allowlists.
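A deny-by-default check captures both rules from the paragraph above. In this sketch the grants for monitor and researcher follow that paragraph (the category table also lists `read` for Research, which is omitted here to match the sentence), and the memory-manager entry reflects the agent table's note that memory-runner is spawned by the manager. The spawn list for mir, and the names `AGENT_TOOLS`, `SPAWN_ALLOWLIST`, `can_use`, and `can_spawn`, are assumptions made for illustration.

```python
# Per-agent tool grants. Anything not listed is denied.
AGENT_TOOLS = {
    "monitor": frozenset({"read", "exec"}),
    "researcher": frozenset({"web_search", "web_fetch"}),
}

# Which agents each parent may spawn. The mir entry is a hypothetical example.
SPAWN_ALLOWLIST = {
    "mir": frozenset({"researcher", "communicator", "orchestrator"}),
    "memory-manager": frozenset({"memory-runner"}),
}


def can_use(agent_id: str, tool: str) -> bool:
    """Deny by default: unknown agents and unlisted tools both return False."""
    return tool in AGENT_TOOLS.get(agent_id, frozenset())


def can_spawn(parent_id: str, child_id: str) -> bool:
    """A parent may spawn a child only if the child is explicitly allowlisted."""
    return child_id in SPAWN_ALLOWLIST.get(parent_id, frozenset())


print(can_use("monitor", "exec"))                    # True
print(can_use("researcher", "write"))                # False
print(can_spawn("memory-manager", "memory-runner"))  # True
print(can_spawn("researcher", "memory-runner"))      # False
```

Because both lookups fall back to an empty set, adding a new agent without an explicit entry gives it no tools and no spawn rights.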
Lessons Learned
Too few agents and you get context pollution plus security holes. Too many — splitting "email read-only" from "email write-only," for instance — and management overhead explodes. The sweet spot is dividing by role-sized units: research, memory management, monitoring, and so on.
Takeaway
A nine-agent system organized around coarse-grained roles keeps each agent's context clean while improving both security and cost efficiency. Giving up the convenience of a single agent bought system-wide stability and scalability.