OpenClaw to Hermes Migration (1/13) — Current Structure Inventory: Snapshot Before Migration
Single-persona interface + backend role separation, 4-tier memory, cron / LaunchAgent / script layering — an anatomy of what a solo AI system is actually made of.
What This Post Covers
- The design pattern of separating concerns into a single conversational persona and multiple backend agents
- The criteria for dividing workloads across cron, macOS LaunchAgent, and the script layer
- A 4-tier memory structure (Identity / Curated / Project / Episode) that manages token budget and recall quality independently
- The actual configuration values for oMLX-based hybrid search (embedding + keyword + MMR + temporal decay)
- From a migration perspective, the criteria that distinguish "reusable assets" from "platform-coupled assets"
OpenClaw started as a solo AI system built on Claude Code and has since grown to include multiple agents, multiple schedulers and daemons, multiple operational scripts, several skills, and a 4-tier memory architecture. Ahead of the decision to migrate to a Hermes base, this inventory provides a complete overhead view of the current system. The structure is documented so that anyone building or referencing a similar system can understand exactly which role belongs to which layer.
1. Agent Layer — Single Persona + Role Separation
Multiple agents exist behind the scenes, but the user-facing interface is fixed to a single agent: mir. All others operate as backend roles. This design allows (a) a single conversational context to be maintained while (b) heavy workloads are delegated to dedicated agents, each with independently tuned models and prompts.
| Agent | Model | Role |
|---|---|---|
| mir | gpt-5.4 (high) | Sole conversational interface, routing |
| monitor | omlx gemma-4 | System status monitoring |
| researcher | gemini-3.1-flash-lite | Web research, AutoSearch Deep, Scrapling |
| communicator | gpt-5.4 (medium) | Writing |
| orchestrator | gpt-5.3-codex | Coding + CLI, sub-agent/skill factory |
| | gemini-3.1-flash-lite | Google Workspace API |
| reviewer | gemini-3.1-flash-lite | Monthly memory review |
| memory-runner | omlx gemma-4 | bank/ file I/O |
| memory-manager | gemini-2.5-flash | Reflect semantic judgment |
| telegram-ops | gemini-3.1-flash-lite | Telegram delivery |
| planner | mir-inherited (subagent) | PRD / planning |
Model Selection Criteria
- Heavy judgment / routing: GPT-5.4 high
- Coding and CLI: GPT-5.3 codex
- Lightweight tasks (research, delivery, review): Gemini 3.1 Flash Lite
- Local security (never sent over the network): oMLX gemma-4 26b
Model assignment is determined by two axes only: inference cost and security boundary. Fixing the conversational interface to a single agent ensures that model heterogeneity is never exposed to the user experience.
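The two-axis rule can be sketched as a small lookup. This is a minimal illustration, not the system's actual routing code: the task-class labels, the `pick_model` function, and the `must_stay_local` flag are hypothetical names, while the model names come from the list above.

```python
# Model names are taken from the Model Selection Criteria list.
# The task-class keys and this function are hypothetical, for illustration only.
LOCAL_MODEL = "omlx gemma-4 26b"

MODEL_BY_TASK_CLASS = {
    "judgment": "gpt-5.4 (high)",            # heavy judgment / routing
    "coding": "gpt-5.3-codex",               # coding and CLI
    "lightweight": "gemini-3.1-flash-lite",  # research, delivery, review
}


def pick_model(task_class: str, must_stay_local: bool = False) -> str:
    """Apply the security boundary first, then fall back to the cost tier."""
    if must_stay_local:
        # Data that must never leave the machine always goes to the local model.
        return LOCAL_MODEL
    return MODEL_BY_TASK_CLASS[task_class]
```

Checking the security axis before the cost axis means a cheap remote model can never be chosen for data that must stay local, regardless of task class.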
2. Automation Layer — Cron
Scheduled batch work runs on cron. The core jobs fall into two main categories: the memory cycle and self-review.
| name | schedule | agent |
|---|---|---|
| ai-research-scan | 09:00 | researcher |
| reddit-scan | every 15 min / 6 hr | researcher |
| reflect-orchestrate | 03:00 | memory-manager |
| memory-micro-cycle | :07 and :37 every hour | mir |
| daily-summary | 22:40 | mir |
| self-review-scan | 22:05 | mir |
| self-review-apply | 22:12 | mir |
| self-review-fix | every 10 min / 22–23h | mir |
| monthly-memory-review | 1st of month, 09:00 | reviewer |
How It Works
- Memory cycle: Reflect runs at 03:00 (semantic merge and consolidation), and the micro-cycle runs at :07 and :37 each hour (short-interval housekeeping). Together they keep bank/ and memory/ aligned, with the heavy consolidation placed in hours when the user is not active.
- Self-review: 22:05 scan → 22:12 apply → fix loop from 22:00 to 23:00. Rule violations and drift are self-corrected at the end of each day.
The cron layer is where the principle of "the system consolidates and corrects itself while the user is asleep" is implemented.
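To make the timing concrete, the schedule can be modeled as minute and hour sets and checked against a clock. This is a sketch under assumptions: the `SCHEDULE` structure and `due_jobs` function are hypothetical, only a subset of jobs with unambiguous times is included, and the real system uses cron itself rather than a Python matcher.

```python
from datetime import datetime

# (minutes, hours) per job; hours=None means "every hour".
# Times are copied from the cron table; the data structure is hypothetical.
SCHEDULE = {
    "reflect-orchestrate": ({0}, {3}),
    "memory-micro-cycle": ({7, 37}, None),
    "self-review-scan": ({5}, {22}),
    "self-review-apply": ({12}, {22}),
    "daily-summary": ({40}, {22}),
}


def due_jobs(now: datetime) -> list[str]:
    """Return the names of jobs whose minute and hour fields match `now`."""
    due = []
    for name, (minutes, hours) in SCHEDULE.items():
        if now.minute in minutes and (hours is None or now.hour in hours):
            due.append(name)
    return sorted(due)
```

Laying the jobs out this way makes the ordering visible: the scan at 22:05 finishes before the apply at 22:12, and neither collides with the micro-cycle minutes.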
3. Daemon Layer — macOS LaunchAgent
Where cron is time-based, LaunchAgent is process-based. Anything that must remain alive goes here. Representative daemons:
- ai.openclaw.gateway — Gateway daemon (port 18789)
- ai.openclaw.update-check — Automatic updates
- com.openclaw.omlx — oMLX server (gemma-4 26b)
- com.openclaw.omlx-proxy — oMLX proxy
- com.openclaw.daily-audit — 05:00 daily audit (Phase A–E)
- com.claude-agent.watchdog — Session watchdog
- com.claude-agent.autostart — Auto-start
The oMLX server and proxy form the local embedding and inference infrastructure. Data that must not leave the machine is routed exclusively through these two endpoints. The minimum unit for attaching local inference is: server + proxy + routing configuration pointing to the proxy.
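A keep-alive daemon definition of this kind can be generated with Python's standard `plistlib`. The sketch below is an assumption about what such a LaunchAgent might look like, not the actual plist: `Label`, `ProgramArguments`, `RunAtLoad`, and `KeepAlive` are standard launchd keys, but the script path is a placeholder and `build_launch_agent` is a hypothetical helper.

```python
import plistlib


def build_launch_agent(label: str, program_args: list[str]) -> bytes:
    """Serialize a minimal keep-alive LaunchAgent definition as XML plist bytes."""
    plist = {
        "Label": label,
        "ProgramArguments": program_args,
        "RunAtLoad": True,   # start when the agent is loaded
        "KeepAlive": True,   # restart the process if it exits
    }
    return plistlib.dumps(plist)


# The label comes from the daemon list; the script path is a placeholder.
omlx_plist = build_launch_agent(
    "com.openclaw.omlx",
    ["/bin/sh", "/path/to/omlx-serve.sh"],
)
```

`KeepAlive` is what distinguishes this layer from cron: launchd restarts the process when it dies instead of waiting for the next scheduled time.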
4. Execution Layer — Operational Scripts
This is the largest layer by file count, with clearly defined functional categories.
Memory Pipeline
- retain-extract.py / retain-merge.py — Retain tag extraction and merging
- recall-tree.py / recall-match.py — Topic-cued recall
- recall-cleanup.py — TTL cleanup
- confidence-decay.py — Confidence decay for opinions
- topics-validate.py / topics-expand.py — Topic consistency
- session-cleanup.py / session-archive.py — Session lifecycle
- user-pattern-stage.py — U-tag dialectic (observation → hypothesis → verification)
- bank-lint.py / bank-size-watch.py — bank/ validation
- entity-audit.py — entities/ staleness detection
- memory-micro-cycle.py — 30-minute cycle orchestrator
- decision-prepare.py / decision-apply.py — LLM decision queue
- memory-warning-report.py / memory-optimize.py
- memory-archive.sh — memory/ → archived/
Heartbeat
- heartbeat-router.sh / heartbeat-update.sh / heartbeat-tick.py / heartbeat-health.py
- proactive-exception-alerts.py
- check-quotas.sh
- gws-check.sh — Gmail/Calendar quick check
Self-Review
- self-review-prescan.py / self-review-apply.py / self-review-fix.py
- skill-candidate-stage.py
Daily Audit
- daily-audit.sh / docs-snapshot.sh / release-check.sh / docs-update-ai.sh
- agent-linter.sh / config-drift.sh / cron-analytics.sh / session-audit.sh / reflect-trace.py
External Integration
- scrape.py — Scrapling + playwright + curl_cffi (Cloudflare bypass)
- gws-wrapper.sh — Google Workspace CLI
- omlx-serve.sh — oMLX daemon
Hooks (.claude/settings)
- PostToolUse: JSON validation, AgentLinter, DOC_SYNC, debug warnings
- PreToolUse (git commit): DOC_SYNC gate, AgentLinter
- SessionStart / SessionEnd / PreCompact
Scripts follow a one-task-per-file principle. Cron, LaunchAgent, and hooks call these scripts as entry points. The upper layers decide "when and why"; scripts handle only "what."
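The one-task-per-file principle can be illustrated with a TTL cleanup script shaped like the ones listed above. This is a hypothetical sketch, not the contents of recall-cleanup.py: the function names, the `--ttl-days` flag, and the use of file modification time as the expiry signal are all assumptions.

```python
import argparse
import time
from pathlib import Path


def cleanup(directory: Path, ttl_seconds: float, now: float) -> list[Path]:
    """Delete files whose modification time is older than the TTL."""
    removed = []
    for path in sorted(directory.iterdir()):
        if path.is_file() and now - path.stat().st_mtime > ttl_seconds:
            path.unlink()
            removed.append(path)
    return removed


def main(argv: list[str]) -> int:
    """The script only knows "what"; the caller decides "when and why"."""
    parser = argparse.ArgumentParser(description="Remove expired files from one directory.")
    parser.add_argument("directory", type=Path)
    parser.add_argument("--ttl-days", type=float, required=True)
    args = parser.parse_args(argv)
    removed = cleanup(args.directory, args.ttl_days * 86400, time.time())
    print(f"removed {len(removed)} file(s)")
    return 0
```

An entry point would call `sys.exit(main(sys.argv[1:]))`, so cron, a LaunchAgent, or a hook can invoke the same file and rely on the exit code without knowing anything about its internals.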
5. Interface Layer — Skills
Slash-command skills are the entry points the user invokes directly.
- General: brainstorming, writing-plans, verification, deep-interview, code-review, testing, git-commit, project-doctor, self-audit
- OpenClaw-specific: oc-config, oc-agent, oc-channel, oc-skill, oc-memory, oc-doctor, oc-deploy, oc-backup
Classification from a Migration Perspective
- General skills are platform-independent — portable to any other harness as-is.
- oc-* skills are coupled to OpenClaw's internal structure (agent composition, channel routing, bank/ schema) and must be redesigned.
This binary classification is the first-pass filter for migration scope.
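As a first-pass filter, the split can be expressed as a prefix check. This is a minimal sketch: the skill names come from the lists above, while `classify_skill` and the "portable" / "redesign" labels are hypothetical.

```python
SKILLS = [
    "brainstorming", "writing-plans", "verification", "deep-interview",
    "code-review", "testing", "git-commit", "project-doctor", "self-audit",
    "oc-config", "oc-agent", "oc-channel", "oc-skill",
    "oc-memory", "oc-doctor", "oc-deploy", "oc-backup",
]


def classify_skill(name: str) -> str:
    """First-pass migration filter: oc-* skills are platform-coupled."""
    return "redesign" if name.startswith("oc-") else "portable"


# Group skill names by migration category.
migration_scope = {"portable": [], "redesign": []}
for skill in SKILLS:
    migration_scope[classify_skill(skill)].append(skill)
```

A naming convention that encodes coupling makes this filter mechanical; a skill that is coupled to the platform but lacks the prefix would slip through, so the result is a starting point rather than a final scope.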
6. Memory Architecture — v0.6 4-Tier
This is the core design and the most carefully managed asset in the system. Four tiers separate concerns: token budget management and recall quality are handled independently.
- Identity: MEMORY.md (600 tokens), the identity file, always injected at the front of the context window
- Curated: bank/ (world / experience / opinions / patterns / _changelog / _conflict_log / _map / index / topics)
- Project: bank/entities/
- Episode: memory/ (daily logs)
Search Stack
- Embedding: oMLX bge-m3-mlx-fp16
- Hybrid weighting: vector 0.7 / keyword 0.3
- Reranking: MMR
- Temporal correction: temporalDecay 30 days
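The stack above can be sketched in plain Python to show how the pieces combine. Several details are assumptions rather than the system's actual behavior: the 30-day value is interpreted here as an exponential half-life, the MMR trade-off `lam=0.7` is an arbitrary illustrative value, and all function names are hypothetical. Only the 0.7 / 0.3 weights and the 30-day figure come from the configuration above.

```python
import math

VECTOR_WEIGHT = 0.7    # from the search stack configuration
KEYWORD_WEIGHT = 0.3   # from the search stack configuration
DECAY_DAYS = 30.0      # assumption: treated as a half-life in days


def hybrid_score(vector_sim: float, keyword_score: float, age_days: float) -> float:
    """Blend vector and keyword relevance, then discount by age."""
    base = VECTOR_WEIGHT * vector_sim + KEYWORD_WEIGHT * keyword_score
    return base * 0.5 ** (age_days / DECAY_DAYS)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr(candidates: list[tuple[str, float, list[float]]], k: int, lam: float = 0.7) -> list[str]:
    """Maximal Marginal Relevance over (doc_id, relevance, embedding) tuples."""
    selected: list[tuple[str, float, list[float]]] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr_value(c: tuple[str, float, list[float]]) -> float:
            # Penalize similarity to anything already selected.
            redundancy = max((cosine(c[2], s[2]) for s in selected), default=0.0)
            return lam * c[1] - (1 - lam) * redundancy
        best = max(remaining, key=mmr_value)
        selected.append(best)
        remaining.remove(best)
    return [c[0] for c in selected]
```

The reranking step is where diversity comes from: a near-duplicate of an already selected entry loses to a less relevant but distinct one, which keeps the retrieved set from spending its token budget on repeats.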
Reflect
- Multi-phase pipeline
- U-tag dialectic (observation → hypothesis → verification)
Design Rationale
- Identity is "always injected" → hard-capped at 600 tokens.
- Curated is "retrieved based on query context" → vector + keyword + MMR ensures diversity; temporal decay weights recency.
- Episode is "raw log" → rarely read directly; Reflect promotes entries to Curated.
- Project is entity-scoped isolation → per-entity staleness detection (entity-audit) is possible.
The 4-tier design physically separates four distinct roles: what goes into context / what is retrieved / what is a promotion candidate / what is the source of record. When these roles are conflated, tokens are wasted or recall is contaminated with noise.
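The separation between "always injected" and "retrieved" can be sketched as a context assembler. This is an illustration under assumptions: `count_tokens` uses a whitespace split as a stand-in for a real tokenizer, and `build_context` and its `budget` parameter are hypothetical. Only the 600-token identity cap comes from the design above.

```python
IDENTITY_CAP = 600  # hard cap on the always-injected Identity tier


def count_tokens(text: str) -> int:
    """Stand-in tokenizer (assumption): counts whitespace-separated words."""
    return len(text.split())


def build_context(identity: str, retrieved: list[str], budget: int) -> str:
    """Inject Identity first, then add retrieved Curated chunks until the budget is spent."""
    used = count_tokens(identity)
    if used > IDENTITY_CAP:
        raise ValueError(f"identity is {used} tokens, cap is {IDENTITY_CAP}")
    parts = [identity]
    for chunk in retrieved:
        cost = count_tokens(chunk)
        if used + cost > budget:
            break  # chunks arrive ranked, so stop at the first one that does not fit
        parts.append(chunk)
        used += cost
    return "\n\n".join(parts)
```

Because Identity is a fixed cost on every request, capping it is what leaves a predictable remainder of the budget for retrieved content.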
7. Channel Routing
Notifications are split by purpose.
- Discord main channel (1487837031420792832): all reflect and self-review events
- Telegram Forum topics: General / Server / Bot / Research / Memory / Heartbeat
When all events converge on a single channel, they become noise and notifications get ignored. Splitting by topic means the choice of which channel to monitor is itself the decision about "what must not be missed." The routing concept is not platform-coupled and is likely to carry over intact after migration.
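Because the routing concept is independent of the platform, it reduces to a small function. This is a hypothetical sketch: the event-type strings, the `route` function, and the default topic are assumptions, while the Discord and Telegram destinations come from the list above.

```python
TELEGRAM_TOPICS = {"General", "Server", "Bot", "Research", "Memory", "Heartbeat"}
DISCORD_EVENTS = {"reflect", "self-review"}  # event-type labels are hypothetical


def route(event_type: str, topic: str = "General") -> tuple[str, str]:
    """Return (platform, destination) for a notification."""
    if event_type in DISCORD_EVENTS:
        # All reflect and self-review events go to the Discord main channel.
        return ("discord", "main")
    if topic not in TELEGRAM_TOPICS:
        raise ValueError(f"unknown Telegram topic: {topic}")
    return ("telegram", topic)
```

Rejecting unknown topics instead of falling back silently keeps a misrouted event from landing in a channel nobody watches.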
Limitations and Scope
- Layer count is complexity: The multi-layer structure — agent / cron / LaunchAgent / script / hook / skill — is near the upper bound of what a single person can maintain. Beyond this scale, layers must be consolidated.
- oMLX local inference assumes macOS Apple Silicon: Implementing the same structure on a different OS requires replacing the local inference stack.
- Cron + LaunchAgent is macOS-specific: Moving to Linux allows a 1:1 substitution with systemd timers and units, but when the scheduler changes, the failure-recovery behavior of the self-review loop must be re-verified as well.
- Reusable vs. platform-coupled: Role separation principle, 4-tier memory, channel routing, and general skills are portable. oc-* skills, the Gateway daemon, and hook configuration require redesign per harness.
Open Questions
- Can the 4-tier memory be reduced further — what is lost by collapsing to a 2-tier Identity + Curated structure?
- In the single-persona + backend-separation model, how does routing overhead scale when the persona count increases from one to N?
- For the Hermes migration, which assets carry over unchanged, which require redesign, and which are discarded — is the classification criterion "platform coupling" or "design debt"?
Series overview: Series index