Agent Self-Improvement Harness (12/12) — Why Router

Agent Self-Improvement Harness (12/12) — Why Router_Control Was Never Integrated

4월 08, 2026

Build common infrastructure yourself vs. delegate to an external framework — a routing-layer design decision

핵심 요약

Router_Control is an OpenAI-compatible proxy (/v1/chat/completions, port 3939) designed to auto-route requests across models using 14-dimension complexity scoring, targeting 70–80% cost reduction.
By the time the design and code were complete, the parent framework (Hermes) already provided a routing layer at the same abstraction level. The self-built proxy was never integrated and remains on HOLD.
What this post covers: (1) the 14-dimension complexity classification schema, (2) the architectural decision criterion "don't build common infrastructure — borrow it," and (3) how the decision not to integrate affected migration count.

Router_Control Architecture

Router_Control is structured as an OpenAI-compatible proxy. Clients only need to change OPENAI_BASE_URL.

Endpoint: /v1/chat/completions
Port: 3939
Behavior: receive request → score complexity across 14 dimensions → map score range to model → forward to that model
Target effect: 70–80% cost reduction (simpler requests downshift to cheaper models)

The key design point: classification logic and transport layer are coupled inside a single process. As discussed below, this coupling becomes a decisive variable in the integration decision.

14-Dimension Complexity Classification Schema

Each dimension is scored 0–3. The total (or weighted sum) maps to a model tier.

Input token count
Code / natural-language ratio
Multi-language presence
Request type (question / generation / transformation)
Expected output length
Domain specificity
Estimated reasoning depth
Tool-call requirement
Context dependency
Response-time sensitivity
User label (experiment / production)
Historical outcome for the same pattern
Token-cache suitability
Output determinism requirement

The concept is straightforward. When the sum (or weighted sum) falls below the threshold, the request goes to a low-cost model; above it, to a high-cost model. Weights are expected to be fine-tuned on production data.

Rationale for Holding Integration

At the time design was complete, two conditions held simultaneously.

Parent framework has an equivalent layer. Hermes' smart_model_routing.py performs routing at the same abstraction level. The number of classification dimensions differs, but the routing layer itself is identical.
Doubled migration cost. Integrating Router_Control into OpenClaw and then porting it to Hermes means transplanting the routing layer twice: first OpenClaw → Router_Control, then Router_Control → smart_model_routing.py.

Under these conditions, holding integration produces lower total porting cost. Router_Control stopped at design-and-code-complete status and remains on HOLD.

What Integration Would Have Cost

Assumption: Router_Control had been integrated into OpenClaw first.

11 locations in OpenClaw would depend on Router_Control. Each dependency becomes an individual modification target during the Hermes move.
Moving to Hermes requires rewriting the smart_model_routing.py adapter to match the 14-dimension schema. The existing adapter cannot be reused.
Result: the surviving asset is not the 14-dimension classification schema but the OpenAI-compatible proxy implementation — a domain where external frameworks win decisively.

Asset Separation Principle

The generalizable form of this decision:

Decision logic (14-dimension classification, weights): domain-specific knowledge. An asset. Maintain as documentation and policy files.
Transport layer (OpenAI-compatible proxy, forwarding): generic infrastructure. Delegate to an external framework.

For Hermes integration: code porting = 0 lines, policy porting = approximately 20 lines (moving the 14-dimension weights into smart_model_routing.py's routing policy file as prompt heuristics).

Limitations and Unmeasured Items

The costs accepted by staying unintegrated are explicit.

The 70–80% cost reduction is a target. During the period when the self-built proxy was not running, this reduction did not materialize on the Router_Control side.
Hermes' smart_model_routing.py achieves a similar effect through different heuristics, but the exact reduction rate has not yet been measured (verification required).
Per-dimension weights in the 14-dimension schema require post-tuning on production data. The current version is an initial estimate.

Applicable Conditions

This pattern is reusable when:

A self-developed infrastructure component operates at the same abstraction level as an existing layer in an external framework.
Two porting steps need to be avoided — when the cost of the first port has a significant chance of voiding the second.
The decision logic and transport layer of the self-built component are separable. (If they are not separable, building it yourself may be justified in the first place.)

Open Questions

Of the 14 dimensions, how many actually contribute to cost reduction? If only 3–4 are effective, the schema can be simplified.
How much does policy ported as prompt heuristics degrade compared to code-based routing? If degradation is significant, patching the external framework's routing layer directly may be a better path.
Where is the boundary of "borrow common infrastructure"? How does this principle apply to cases where decision logic and transport layer are mixed and difficult to separate?

Series overview: Series index

이 블로그 검색

MaJu Tech Notes