Agent Self-Improvement Harness (12/12) — Why Router_Control Was Never Integrated

Build common infrastructure yourself vs. delegate to an external framework — a routing-layer design decision


ํ•ต์‹ฌ ์š”์•ฝ

  • Router_Control is an OpenAI-compatible proxy (/v1/chat/completions, port 3939) designed to auto-route requests across models using 14-dimension complexity scoring, targeting 70–80% cost reduction.
  • By the time the design and code were complete, the parent framework (Hermes) already provided a routing layer at the same abstraction level. The self-built proxy was never integrated and remains on HOLD.
  • What this post covers: (1) the 14-dimension complexity classification schema, (2) the architectural decision criterion "don't build common infrastructure — borrow it," and (3) how the decision not to integrate affected migration count.

Router_Control Architecture

Router_Control is structured as an OpenAI-compatible proxy. Clients only need to change OPENAI_BASE_URL.

  • Endpoint: /v1/chat/completions
  • Port: 3939
  • Behavior: receive request → score complexity across 14 dimensions → map score range to model → forward to that model
  • Target effect: 70–80% cost reduction (simpler requests downshift to cheaper models)

The key design point: classification logic and transport layer are coupled inside a single process. As discussed below, this coupling becomes a decisive variable in the integration decision.

14-Dimension Complexity Classification Schema

Each dimension is scored 0–3. The total (or weighted sum) maps to a model tier.

  1. Input token count
  2. Code / natural-language ratio
  3. Multi-language presence
  4. Request type (question / generation / transformation)
  5. Expected output length
  6. Domain specificity
  7. Estimated reasoning depth
  8. Tool-call requirement
  9. Context dependency
  10. Response-time sensitivity
  11. User label (experiment / production)
  12. Historical outcome for the same pattern
  13. Token-cache suitability
  14. Output determinism requirement

The concept is straightforward. When the sum (or weighted sum) falls below the threshold, the request goes to a low-cost model; above it, to a high-cost model. Weights are expected to be fine-tuned on production data.

Rationale for Holding Integration

At the time design was complete, two conditions held simultaneously.

  1. Parent framework has an equivalent layer. Hermes' smart_model_routing.py performs routing at the same abstraction level. The number of classification dimensions differs, but the routing layer itself is identical.
  2. Doubled migration cost. Integrating Router_Control into OpenClaw and then porting it to Hermes means transplanting the routing layer twice: first OpenClaw → Router_Control, then Router_Control → smart_model_routing.py.

Under these conditions, holding integration produces lower total porting cost. Router_Control stopped at design-and-code-complete status and remains on HOLD.

What Integration Would Have Cost

Assumption: Router_Control had been integrated into OpenClaw first.

  • 11 locations in OpenClaw would depend on Router_Control. Each dependency becomes an individual modification target during the Hermes move.
  • Moving to Hermes requires rewriting the smart_model_routing.py adapter to match the 14-dimension schema. The existing adapter cannot be reused.
  • Result: the surviving asset is not the 14-dimension classification schema but the OpenAI-compatible proxy implementation — a domain where external frameworks win decisively.

Asset Separation Principle

The generalizable form of this decision:

  • Decision logic (14-dimension classification, weights): domain-specific knowledge. An asset. Maintain as documentation and policy files.
  • Transport layer (OpenAI-compatible proxy, forwarding): generic infrastructure. Delegate to an external framework.

For Hermes integration: code porting = 0 lines, policy porting = approximately 20 lines (moving the 14-dimension weights into smart_model_routing.py's routing policy file as prompt heuristics).

Limitations and Unmeasured Items

The costs accepted by staying unintegrated are explicit.

  • The 70–80% cost reduction is a target. During the period when the self-built proxy was not running, this reduction did not materialize on the Router_Control side.
  • Hermes' smart_model_routing.py achieves a similar effect through different heuristics, but the exact reduction rate has not yet been measured (verification required).
  • Per-dimension weights in the 14-dimension schema require post-tuning on production data. The current version is an initial estimate.

Applicable Conditions

This pattern is reusable when:

  • A self-developed infrastructure component operates at the same abstraction level as an existing layer in an external framework.
  • Two porting steps need to be avoided — when the cost of the first port has a significant chance of voiding the second.
  • The decision logic and transport layer of the self-built component are separable. (If they are not separable, building it yourself may be justified in the first place.)

Open Questions

  • Of the 14 dimensions, how many actually contribute to cost reduction? If only 3–4 are effective, the schema can be simplified.
  • How much does policy ported as prompt heuristics degrade compared to code-based routing? If degradation is significant, patching the external framework's routing layer directly may be a better path.
  • Where is the boundary of "borrow common infrastructure"? How does this principle apply to cases where decision logic and transport layer are mixed and difficult to separate?

Series overview: Series index

๋Œ“๊ธ€

์ด ๋ธ”๋กœ๊ทธ์˜ ์ธ๊ธฐ ๊ฒŒ์‹œ๋ฌผ

Agent Memory Engine (2/10) — Building an AI Agent Memory System with SQLite Alone

"ML Foundations (9/9) — PyTorch vs TensorFlow, and the Road to Local LLMs"

"RAG Core Study (14/26) — Evaluation Sets with RAGAS & DeepEval"

"ML Foundations (8/9) — Deep Learning Architectures: CNN, RNN, Attention"

"ML Foundations (7/9) — Deep Learning Training: Optimizers, Regularization, Initialization"

OpenClaw to Hermes Migration (2/13) — What to Preserve, Partially Port, or Discard

AI Agents I Built (5/7) — Building an Automated Blogger API Publishing System