OpenClaw to Hermes Migration (1/13) — Current Structure Inventory: Snapshot Before Migration

Single-persona interface + backend role separation, 4-tier memory, cron / LaunchAgent / script layering — an anatomy of what a solo AI system is actually made of.


What This Post Covers

  • The design pattern of separating concerns into a single conversational persona and multiple backend agents
  • The criteria for dividing workloads across cron, macOS LaunchAgent, and the script layer
  • A 4-tier memory structure (Identity / Curated / Project / Episode) that manages token budget and recall quality independently
  • The actual configuration values for oMLX-based hybrid search (embedding + keyword + MMR + temporal decay)
  • From a migration perspective, the criteria that distinguish "reusable assets" from "platform-coupled assets"

OpenClaw started as a solo AI system built on Claude Code and has since grown to include multiple agents, multiple schedulers and daemons, multiple operational scripts, several skills, and a 4-tier memory architecture. Ahead of the decision to migrate to a Hermes base, this inventory provides a complete overhead view of the current system. The structure is documented so that anyone building or referencing a similar system can understand exactly which role belongs to which layer.


1. Agent Layer — Single Persona + Role Separation

Multiple agents exist behind the scenes, but the user-facing interface is fixed to a single agent: mir. All others operate as backend roles. This design allows (a) a single conversational context to be maintained while (b) heavy workloads are delegated to dedicated agents, each with independently tuned models and prompts.

Agent           Model                     Role
mir             gpt-5.4 (high)            Sole conversational interface, routing
monitor         omlx gemma-4              System status monitoring
researcher      gemini-3.1-flash-lite     Web research, AutoSearch Deep, Scrapling
communicator    gpt-5.4 (medium)          Writing
orchestrator    gpt-5.3-codex             Coding + CLI, sub-agent/skill factory
google          gemini-3.1-flash-lite     Google Workspace API
reviewer        gemini-3.1-flash-lite     Monthly memory review
memory-runner   omlx gemma-4              bank/ file I/O
memory-manager  gemini-2.5-flash          Reflect semantic judgment
telegram-ops    gemini-3.1-flash-lite     Telegram delivery
planner         mir-inherited (subagent)  PRD / planning

Model Selection Criteria

  • Heavy judgment / routing: GPT-5.4 high
  • Coding and CLI: GPT-5.3 codex
  • Lightweight tasks (research, delivery, review): Gemini 3.1 Flash Lite
  • Local-only, security-sensitive work (never sent over the network): oMLX gemma-4 26b

Model assignment is determined by two axes only: inference cost and security boundary. Fixing the conversational interface to a single agent ensures that model heterogeneity is never exposed to the user experience.
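
The security-boundary axis can be sketched as a simple filter over the agent table. This is a minimal illustration, not OpenClaw's actual configuration format: `AgentSpec`, `local_only`, and `eligible_agents` are hypothetical names, and only a few rows of the table are reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    name: str
    model: str
    local_only: bool  # True when the model runs on the local oMLX server


# A few rows from the agent table. The local_only flag is an assumption
# inferred from the "omlx" prefix on the model name.
AGENTS = [
    AgentSpec("mir", "gpt-5.4 (high)", False),
    AgentSpec("researcher", "gemini-3.1-flash-lite", False),
    AgentSpec("monitor", "omlx gemma-4", True),
    AgentSpec("memory-runner", "omlx gemma-4", True),
]


def eligible_agents(sensitive: bool) -> list[str]:
    """Return agents allowed to see a payload; sensitive data stays local."""
    return [a.name for a in AGENTS if a.local_only or not sensitive]
```

With this shape, a sensitive payload can only ever reach `monitor` or `memory-runner`, while ordinary work may go to any agent.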


2. Automation Layer — Cron

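Scheduled batch work runs on cron. Most jobs belong to one of two groups, the memory cycle or the self-review loop; research scans and the daily summary run alongside them.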

Name                   Schedule                  Agent
ai-research-scan       09:00                     researcher
reddit-scan            every 15 min / 6 hr       researcher
reflect-orchestrate    03:00                     memory-manager
memory-micro-cycle     :07 and :37 every hour    mir
daily-summary          22:40                     mir
self-review-scan       22:05                     mir
self-review-apply      22:12                     mir
self-review-fix        every 10 min / 22–23h     mir
monthly-memory-review  1st of month, 09:00       reviewer

How It Works

  • Memory cycle: Reflect runs at 03:00 (semantic merge and consolidation), when the user is not active; the micro-cycle runs at :07 and :37 each hour (short-interval housekeeping). Together they keep bank/ and memory/ aligned.
  • Self-review: 22:05 scan → 22:12 apply → fix loop from 22:00 to 23:00. Rule violations and drift are self-corrected at the end of each day.

The cron layer is where the principle of "the system consolidates and corrects itself while the user is asleep" is implemented.
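
To make the timing concrete, the memory and self-review rows of the schedule can be written as "minute hour" cron fields and checked against a clock. This is a sketch under assumptions: the field strings are a transcription of the table, not the real crontab, and `field_matches` and `due_jobs` are hypothetical helpers that support only `*`, `*/n`, and comma lists.

```python
from datetime import datetime

# Hypothetical (minute, hour) cron fields transcribed from the schedule table.
JOBS = {
    "reflect-orchestrate": ("0", "3"),
    "memory-micro-cycle": ("7,37", "*"),
    "self-review-scan": ("5", "22"),
    "self-review-apply": ("12", "22"),
    "self-review-fix": ("*/10", "22"),
}


def field_matches(spec: str, value: int) -> bool:
    """Match one cron field: '*', '*/n', or a comma-separated list."""
    if spec == "*":
        return True
    if spec.startswith("*/"):
        return value % int(spec[2:]) == 0
    return value in {int(part) for part in spec.split(",")}


def due_jobs(now: datetime) -> list[str]:
    """Names of jobs whose minute and hour fields both match `now`."""
    return sorted(
        name
        for name, (minute, hour) in JOBS.items()
        if field_matches(minute, now.minute) and field_matches(hour, now.hour)
    )
```

Calling `due_jobs` for each minute between 22:00 and 23:00 is a quick way to see how the scan, apply, and fix steps interleave.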


3. Daemon Layer — macOS LaunchAgent

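Where cron is time-based, LaunchAgent is process-based: anything that must remain alive goes here. Most entries are resident processes; the daily audit is the visible exception, started by launchd at a fixed time. Representative daemons: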

  • ai.openclaw.gateway — Gateway daemon (port 18789)
  • ai.openclaw.update-check — Automatic updates
  • com.openclaw.omlx — oMLX server (gemma-4 26b)
  • com.openclaw.omlx-proxy — oMLX proxy
  • com.openclaw.daily-audit — 05:00 daily audit (Phase A–E)
  • com.claude-agent.watchdog — Session watchdog
  • com.claude-agent.autostart — Auto-start

The oMLX server and proxy form the local embedding and inference infrastructure. Data that must not leave the machine is routed exclusively through these two endpoints. The minimum unit for attaching local inference is: server + proxy + routing configuration pointing to the proxy.
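
The difference between a resident daemon and a scheduled LaunchAgent comes down to a few launchd keys. The sketch below builds two property lists with Python's standard `plistlib`. `Label`, `ProgramArguments`, `RunAtLoad`, `KeepAlive`, and `StartCalendarInterval` are standard launchd keys; the script paths are placeholders, and the assumption that these two labels invoke these two scripts is mine, not confirmed by the inventory.

```python
import plistlib


def launch_agent(label, argv, keep_alive=False, daily_at=None) -> bytes:
    """Build a LaunchAgent property list as XML bytes."""
    job = {"Label": label, "ProgramArguments": list(argv)}
    if keep_alive:
        # Resident process: start at load and restart if it exits.
        job["RunAtLoad"] = True
        job["KeepAlive"] = True
    if daily_at is not None:
        # Scheduled job: launchd starts it at a fixed local time.
        hour, minute = daily_at
        job["StartCalendarInterval"] = {"Hour": hour, "Minute": minute}
    return plistlib.dumps(job)


# Placeholder paths; the real install locations are not given in this post.
omlx = launch_agent("com.openclaw.omlx", ["/path/to/omlx-serve.sh"], keep_alive=True)
audit = launch_agent("com.openclaw.daily-audit", ["/path/to/daily-audit.sh"], daily_at=(5, 0))
```

Writing the returned bytes to a `.plist` file under `~/Library/LaunchAgents/` is the usual way to register such a job.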


4. Execution Layer — Operational Scripts

This is the largest layer by file count, with clearly defined functional categories.

Memory Pipeline

  • retain-extract.py / retain-merge.py — Retain tag extraction and merging
  • recall-tree.py / recall-match.py — Topic-cued recall
  • recall-cleanup.py — TTL cleanup
  • confidence-decay.py — Confidence decay for opinions
  • topics-validate.py / topics-expand.py — Topic consistency
  • session-cleanup.py / session-archive.py — Session lifecycle
  • user-pattern-stage.py — U-tag dialectic (observation → hypothesis → verification)
  • bank-lint.py / bank-size-watch.py — bank/ validation
  • entity-audit.py — entities/ staleness detection
  • memory-micro-cycle.py — 30-minute cycle orchestrator
  • decision-prepare.py / decision-apply.py — LLM decision queue
  • memory-warning-report.py / memory-optimize.py
  • memory-archive.sh — memory/ → archived/

Heartbeat

  • heartbeat-router.sh / heartbeat-update.sh / heartbeat-tick.py / heartbeat-health.py
  • proactive-exception-alerts.py
  • check-quotas.sh
  • gws-check.sh — Gmail/Calendar quick check

Self-Review

  • self-review-prescan.py / self-review-apply.py / self-review-fix.py
  • skill-candidate-stage.py

Daily Audit

  • daily-audit.sh / docs-snapshot.sh / release-check.sh / docs-update-ai.sh
  • agent-linter.sh / config-drift.sh / cron-analytics.sh / session-audit.sh / reflect-trace.py

External Integration

  • scrape.py — Scrapling + playwright + curl_cffi (Cloudflare bypass)
  • gws-wrapper.sh — Google Workspace CLI
  • omlx-serve.sh — oMLX daemon

Hooks (.claude/settings)

  • PostToolUse: JSON validation, AgentLinter, DOC_SYNC, debug warnings
  • PreToolUse (git commit): DOC_SYNC gate, AgentLinter
  • SessionStart / SessionEnd / PreCompact

Scripts follow a one-task-per-file principle. Cron, LaunchAgent, and hooks call these scripts as entry points. The upper layers decide "when and why"; scripts handle only "what."
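
A one-task-per-file script in this style might look like the sketch below: a single pure function plus a thin entry point, with no scheduling logic of its own. The TTL rule (file modification time older than N days) and the names `stale_files` and `main` are illustrative assumptions, not the contents of recall-cleanup.py.

```python
import time
from pathlib import Path


def stale_files(root: Path, ttl_days: float, now: float) -> list[Path]:
    """Return files under `root` whose mtime is older than the TTL."""
    cutoff = now - ttl_days * 86400
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.stat().st_mtime < cutoff
    )


def main(argv: list[str]) -> int:
    """Entry point: print stale files and return an exit code."""
    root, ttl_days = Path(argv[1]), float(argv[2])
    for path in stale_files(root, ttl_days, time.time()):
        print(path)
    return 0

# In a real script file the last lines would be:
#   if __name__ == "__main__":
#       sys.exit(main(sys.argv))
```

Because the function takes `now` as a parameter, cron, a LaunchAgent, or a hook can all call the same file, and a test can pin the clock.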


5. Interface Layer — Skills

Slash-command skills are the entry points the user invokes directly.

  • General: brainstorming, writing-plans, verification, deep-interview, code-review, testing, git-commit, project-doctor, self-audit
  • OpenClaw-specific: oc-config, oc-agent, oc-channel, oc-skill, oc-memory, oc-doctor, oc-deploy, oc-backup

Classification from a Migration Perspective

  • General skills are platform-independent — portable to any other harness as-is.
  • oc-* skills are coupled to OpenClaw's internal structure (agent composition, channel routing, bank/ schema) and must be redesigned.

This binary classification is the first-pass filter for migration scope.
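
Since the coupling is encoded in the `oc-` prefix, the first-pass filter is mechanical. A minimal sketch, with `split_by_coupling` as a hypothetical helper name:

```python
SKILLS = [
    "brainstorming", "writing-plans", "verification", "deep-interview",
    "code-review", "testing", "git-commit", "project-doctor", "self-audit",
    "oc-config", "oc-agent", "oc-channel", "oc-skill",
    "oc-memory", "oc-doctor", "oc-deploy", "oc-backup",
]


def split_by_coupling(skills: list[str]) -> tuple[list[str], list[str]]:
    """Partition skills into (portable, platform-coupled) by the oc- prefix."""
    portable = [s for s in skills if not s.startswith("oc-")]
    coupled = [s for s in skills if s.startswith("oc-")]
    return portable, coupled


portable, coupled = split_by_coupling(SKILLS)
```

The prefix test is only a first pass; a general skill that quietly reads bank/ would still need manual review.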


6. Memory Architecture — v0.6 4-Tier

This is the core design and the most carefully managed asset in the system. Four tiers separate concerns: token budget management and recall quality are handled independently.

  • Identity: MEMORY.md (600 tokens) — Identity file, always injected at the front of the context window
  • Curated: bank/ (world / experience / opinions / patterns / _changelog / _conflict_log / _map / index / topics)
  • Project: bank/entities/
  • Episode: memory/ (daily logs)

Search Stack

  • Embedding: oMLX bge-m3-mlx-fp16
  • Hybrid weighting: vector 0.7 / keyword 0.3
  • Reranking: MMR
  • Temporal correction: temporalDecay 30 days

Reflect

  • Multi-phase pipeline
  • U-tag dialectic (observation → hypothesis → verification)

Design Rationale

  • Identity is "always injected" → hard-capped at 600 tokens.
  • Curated is "retrieved based on query context" → vector + keyword + MMR ensures diversity; temporal decay weights recency.
  • Episode is "raw log" → rarely read directly; Reflect promotes entries to Curated.
  • Project is entity-scoped isolation → per-entity staleness detection (entity-audit) is possible.

The 4-tier design physically separates four distinct roles: what goes into context / what is retrieved / what is a promotion candidate / what is the source of record. When these roles are conflated, tokens are wasted or recall is contaminated with noise.
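
The split between "always injected" and "retrieved" can be sketched as a context builder that enforces the Identity cap first and only then spends the remaining budget on Curated snippets. The word-count tokenizer is a crude stand-in for a real one, and `build_context` and its `budget` parameter are hypothetical; only the 600-token cap comes from the design above.

```python
IDENTITY_CAP = 600  # hard cap for the always-injected Identity tier


def approx_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def build_context(identity: str, retrieved: list[str], budget: int) -> list[str]:
    """Identity goes first and must fit its cap; retrieved snippets fill the rest."""
    used = approx_tokens(identity)
    if used > IDENTITY_CAP:
        raise ValueError(f"identity tier is {used} tokens, cap is {IDENTITY_CAP}")
    parts = [identity]
    for snippet in retrieved:  # assumed to arrive already ranked
        cost = approx_tokens(snippet)
        if used + cost > budget:
            break
        parts.append(snippet)
        used += cost
    return parts
```

Keeping the cap check separate from retrieval is what lets the token budget and recall quality be tuned independently.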


7. Channel Routing

Notifications are split by purpose.

  • Discord main channel (1487837031420792832): all reflect and self-review events
  • Telegram Forum topics: General / Server / Bot / Research / Memory / Heartbeat

When all events converge on a single channel, they become noise and notifications get ignored. Splitting by topic means the choice of which channel to monitor is itself the decision about "what must not be missed." The routing concept is not platform-coupled and is likely to carry over intact after migration.
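
As a sketch, the routing rule reduces to a small lookup. Only the reflect and self-review rule and the topic names come from the list above; the event kinds and the kind-to-topic mapping are hypothetical examples.

```python
DISCORD_MAIN = "1487837031420792832"
TELEGRAM_TOPICS = {"General", "Server", "Bot", "Research", "Memory", "Heartbeat"}

# Hypothetical event kinds mapped to Telegram Forum topics.
TOPIC_BY_KIND = {
    "research-scan": "Research",
    "heartbeat": "Heartbeat",
    "memory-warning": "Memory",
}


def route(kind: str) -> tuple[str, str]:
    """Return (platform, target) for an event kind."""
    if kind in {"reflect", "self-review"}:
        return ("discord", DISCORD_MAIN)
    # Unknown kinds fall back to the General topic (an assumption).
    return ("telegram", TOPIC_BY_KIND.get(kind, "General"))
```

Because the function knows nothing about OpenClaw internals, the same table could be reused under a different harness.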


Limitations and Scope

  • Layer count is complexity: The multi-layer structure — agent / cron / LaunchAgent / script / hook / skill — is near the upper bound of what a single person can maintain. Beyond this scale, layers must be consolidated.
  • oMLX local inference assumes macOS Apple Silicon: Implementing the same structure on a different OS requires replacing the local inference stack.
  • LaunchAgent is macOS-specific: cron jobs carry over to Linux largely unchanged, and LaunchAgents map roughly 1:1 onto systemd units and timers. When the scheduler changes, however, the failure-recovery behavior of the self-review loop must be re-verified as well.
  • Reusable vs. platform-coupled: Role separation principle, 4-tier memory, channel routing, and general skills are portable. oc-* skills, the Gateway daemon, and hook configuration require redesign per harness.

Open Questions

  • Can the 4-tier memory be reduced further — what is lost by collapsing to a 2-tier Identity + Curated structure?
  • In the single-persona + backend-separation model, how does routing overhead scale when the persona count increases from one to N?
  • For the Hermes migration, which assets carry over unchanged, which require redesign, and which are discarded — is the classification criterion "platform coupling" or "design debt"?
