Agent Memory Engine (7/10) — Multi-Agent Shared Knowledge Infrastructure: wikycore

RRF Hybrid Search + Explicit Skill Registry to Eliminate Redundant Work


Key Summary

  • When agents operate solely with their own CLAUDE.md, they solve the same problems repeatedly. This post covers the design and mechanics of two shared infrastructure components that eliminate that redundancy.
  • wikycore is a shared document store (SQLite-backed, hosted on a home server) that provides hybrid search: vector embeddings (bge-m3, 1024-dim) and FTS5 keyword search, fused via RRF.
  • The Knowledge Graph is a skill registry that answers "who is the expert for this domain." It separates auto-inferred registration from explicit registration (docPath required) so that expert claims stay credible.
  • What you can take from this post: how to run hybrid search without per-component threshold tuning, operational rules that prevent inflation in self-assessed registries, and a principled boundary between shared and private memory.

Problem Definition — Missing Entry Point for Shared Memory

When running multiple agents in parallel, the real cost driver is not model tokens — it is redundant work. One agent solves an OAuth token refresh flow; another implements the same logic from scratch. One agent resolves a sqlite-vec macOS build failure; another searches the identical error message independently.

The root cause is not the absence of shared memory but the absence of a discoverable entry point into shared memory. An agent can document findings in its own docs/, but if no external agent can locate that path, the knowledge stays private. The entry-point problem splits along two axes, each handled by its own system.

System | Question it answers | Unit
wikycore | "Is there a document covering this topic?" | Content
Knowledge Graph | "Who is the registered expert for this domain?" | Agent

wikycore — Shared Document Store with Hybrid Search

wikycore is a wiki built on SQLite and hosted on the home server (a Mac mini on a Tailscale internal network). Current state:

Metric | Value
Docs | Multiple
Chunks | Multiple
Embedding model | bge-m3 (oMLX, 1024-dim, fp16)
Audit log | Cumulative
DB size | Lightweight (low MB range)

Failure Modes of Vector-Only and Keyword-Only Search

The failure patterns observed in production mirror each other.

  • Vector-only: A query for "OAuth token refresh" misses a document titled "credential rotation." The two are semantically close, but when the embedding does not capture the synonym relationship well enough, the cosine similarity does not clear the cutoff.
  • Keyword-only: A query for "BM25" matches only documents containing the exact string "BM25." Adjacent-concept documents such as "hybrid search" or "rank fusion" are missed.

RRF (Reciprocal Rank Fusion) Combination

Results from both retrievers are merged via Reciprocal Rank Fusion. Each document's rank in each retriever is converted to a score of 1 / (k + rank), where k is a smoothing constant, and the scores are summed across retrievers. Only ranks enter the formula, so vector similarity scores and BM25 scores, which live on different scales, are never added directly. That removes the need for score normalization or per-retriever threshold tuning, which is the principal advantage of RRF.
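The fusion step can be sketched in a few lines. This is a minimal illustration of RRF, not wikycore's actual code: the function name rrf_fuse, the document slugs, and the constant k=60 (a commonly used default) are all assumptions made for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion.

    rankings: list of lists, each ordered best-first (rank 1 is index 0).
    k: smoothing constant; 60 is a common default, not a value taken from the post.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Only the rank is used; raw similarity or BM25 scores never appear.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the two retrievers.
vector_hits = ["credential-rotation", "oauth-refresh", "jwt-basics"]
keyword_hits = ["oauth-refresh", "token-expiry", "credential-rotation"]

fused = rrf_fuse([vector_hits, keyword_hits])
fused_ids = [doc_id for doc_id, _ in fused]
```

A document that both retrievers rank reasonably well ("oauth-refresh" here) ends up above a document that only one retriever ranks first, which is the behavior the hybrid mode relies on.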

API interface:

POST /api/wiki/search
{"q": "OAuth token refresh", "k": 10, "mode": "hybrid"}

mode accepts hybrid (default), vector, or keyword. The latter two are for debugging and quality validation; production traffic routes through hybrid.
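A client-side sketch of the call, using only the Python standard library. The endpoint path, field names, and mode values come from the interface above; the host name, the helper build_search_request, and the reading of the "k" field as the number of results to return (rather than the RRF constant) are assumptions. The response schema is not described in this post, so the sketch stops at building the request.

```python
import json
import urllib.request

VALID_MODES = {"hybrid", "vector", "keyword"}


def build_search_request(base_url, query, k=10, mode="hybrid"):
    """Build (but do not send) a POST request for /api/wiki/search."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    body = json.dumps({"q": query, "k": k, "mode": mode}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/wiki/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host; substitute the real address on the internal network.
req = build_search_request("http://wikycore.example:8080", "OAuth token refresh")

# Sending is left to the caller, for example:
#   with urllib.request.urlopen(req, timeout=5) as resp:
#       payload = json.load(resp)
```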

Knowledge Graph — Skill Registry Structure

Where wikycore indexes content, the Knowledge Graph indexes "who has registered expertise in a given domain." Current state:

Metric | Value
Registered skills | Multiple
Registered agents | Multiple teams
Category distribution | ai, mobile, content, devops, frontend, backend, misc, infra

Auto-Registration vs. Explicit Registration

Skill registration is split into two tiers.

  • Auto-registration: Inferred from the languages and frameworks a project uses. This is metadata at the level of "this project uses Flutter."
  • Explicit registration: Requires an expert-level claim plus a docPath field. This is a claim at the level of "implemented Hive encryption pipeline; documentation is at this path."
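One way to model the two tiers is a record type that refuses an explicit registration without a docPath. This is a sketch under assumptions: the class name SkillRegistration, its field names, and the example path are hypothetical, and the real registry schema is not shown in this post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SkillRegistration:
    agent: str
    skill: str
    source: str                      # "auto" or "explicit"
    level: Optional[str] = None      # e.g. "expert"; only meaningful for explicit entries
    doc_path: Optional[str] = None   # mirrors the docPath field described above

    def __post_init__(self):
        if self.source not in ("auto", "explicit"):
            raise ValueError(f"unknown source: {self.source!r}")
        # The rule from the text: an explicit claim must point at documentation.
        if self.source == "explicit" and not self.doc_path:
            raise ValueError("explicit registration requires docPath")


# Auto tier: inferred from the project's stack, no documentation required.
auto_entry = SkillRegistration("Minesweeper", "flutter", "auto")

# Explicit tier: the path below is a made-up example, not a real file.
explicit_entry = SkillRegistration(
    "Minesweeper",
    "flutter-hive-encryption",
    "explicit",
    level="expert",
    doc_path="docs/hive-encryption.md",
)
```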

Explicit registration examples:

Agent | Explicitly Registered Skills
Minesweeper | firebase-auth, flutter-hive-encryption, linear-design-tokens
Video Production Director | shorts-production-pipeline, comfyui-remote-image-gen, ffmpeg-ken-burns-clip
Routing Manager | llm-multi-provider-routing, llm-proxy-optimization, oauth-token-lifecycle
Write Director | research-verification, blogger-publish-pipeline, content-wiki-synthesis

Query Flow

When a new problem arrives, the first step is to read the docPath of agents registered as experts for that domain — adding one lookup step before beginning independent implementation. The pattern holds under the assumption that lookup cost is lower than re-implementation cost.

Application Pattern — Real Query Flow

Adding ComfyUI integration to a blog publish skill illustrates the call sequence:

  1. wikycore search: "comfyui SDXL Tailscale" → returns the Video Production Director's comfyui-windows-setup.md slug.
  2. Knowledge Graph query: comfyui-remote-image-gen skill owner = Video Production Director, level expert.
  3. Read that agent's docPath and reuse the call pattern directly.
  4. Newly acquired knowledge is registered back to wikycore → future queries resolve at step 1.

The Knowledge Graph designates the source of the document to read; wikycore stores the document itself. The two systems are in a pointer-to-content relationship.
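The lookup-before-implement sequence can be sketched with in-memory stand-ins for both systems. Everything here is illustrative: plan_lookup, fake_wiki_search, the registry dictionary shape, and the docPath value are assumptions, not the actual wikycore or Knowledge Graph interfaces.

```python
def plan_lookup(query, skill, wiki_search, registry):
    """Decide what to read before implementing anything from scratch.

    wiki_search: callable returning a list of document slugs for a query.
    registry: mapping of skill name -> {"agent", "level", "docPath"}.
    """
    plan = {"wiki_slugs": wiki_search(query), "owner": None, "doc_path": None}
    entry = registry.get(skill)
    if entry is not None:
        plan["owner"] = entry["agent"]
        plan["doc_path"] = entry["docPath"]
    # Independent implementation is the fallback only when both lookups come up empty.
    plan["implement_from_scratch"] = not plan["wiki_slugs"] and plan["doc_path"] is None
    return plan


def fake_wiki_search(query):
    """Stand-in for the wikycore search call (step 1)."""
    return ["comfyui-windows-setup"] if "comfyui" in query.lower() else []


# Stand-in for the Knowledge Graph (step 2); the docPath is a made-up example.
registry = {
    "comfyui-remote-image-gen": {
        "agent": "Video Production Director",
        "level": "expert",
        "docPath": "docs/comfyui-remote.md",
    }
}

plan = plan_lookup("comfyui SDXL Tailscale", "comfyui-remote-image-gen",
                   fake_wiki_search, registry)
miss = plan_lookup("kubernetes ingress", "k8s-ingress", fake_wiki_search, registry)
```

In this sketch the first call yields a slug, an owner, and a path to read (step 3), while the second call reports that nothing is registered and independent work is justified.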

Operational Rules — Common Judgment Points

Shared (wikycore) vs. Private (docs/) Boundary

Principle: if another agent would find value in a tool, utility, or pattern, it belongs in wikycore; if it is project-internal convention or a work log, it belongs in docs/. When the judgment is ambiguous, reduce it to: "Would this be meaningful if a neighboring agent read it?" If no, keep it private.

Level Inflation in Self-Assessed Registries

Explicit registration is inherently self-assessed. To prevent level inflation, one rule was added to the operational policy: expert level requires the file referenced by docPath to exist. Registration metadata can be overstated, but the content of the referenced file is verifiable.
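The existence rule is cheap to enforce mechanically. The following is a minimal sketch, assuming registrations are plain dictionaries with "level" and "docPath" keys and that docPath is relative to a known root; the function name audit_expert_claims and the example file names are hypothetical.

```python
import os
import tempfile


def audit_expert_claims(registrations, root):
    """Return the expert-level registrations whose docPath file does not exist."""
    rejected = []
    for reg in registrations:
        if reg.get("level") != "expert":
            continue  # the rule only constrains expert-level claims
        doc_path = reg.get("docPath")
        if not doc_path or not os.path.isfile(os.path.join(root, doc_path)):
            rejected.append(reg)
    return rejected


# Demonstration against a throwaway directory with exactly one real document.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "docs"))
    with open(os.path.join(root, "docs", "oauth-token-lifecycle.md"), "w") as fh:
        fh.write("# OAuth token lifecycle\n")

    registrations = [
        {"skill": "oauth-token-lifecycle", "level": "expert",
         "docPath": "docs/oauth-token-lifecycle.md"},
        {"skill": "llm-proxy-optimization", "level": "expert",
         "docPath": "docs/missing.md"},
    ]
    rejected = audit_expert_claims(registrations, root)

rejected_skills = [reg["skill"] for reg in rejected]
```

The check confirms only that a file is present; judging whether its content supports the claim still requires reading it.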

Document Staleness

A wiki document is a snapshot of facts at the time of writing. Code changes. The stated rule: when memory conflicts with code, trust the code. The wiki is a starting point, not a conclusion. Without this rule, an outdated wiki entry actively points agents in the wrong direction.
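One candidate signal for spotting such conflicts is whether code linked from a document changed after the document was written. The sketch below assumes each wiki document records its last update time and a set of linked code paths with their last change times; neither the function changed_since_doc nor that metadata is confirmed to exist in wikycore, and the paths and dates are invented.

```python
from datetime import datetime, timezone


def changed_since_doc(doc_updated_at, linked_code_changes):
    """Return the linked code paths that changed after the document was last updated.

    linked_code_changes: mapping of code path -> last change time.
    A non-empty result means the document should be re-checked against the code.
    """
    return sorted(
        path
        for path, changed_at in linked_code_changes.items()
        if changed_at > doc_updated_at
    )


# Hypothetical timestamps and paths.
doc_updated_at = datetime(2025, 1, 10, tzinfo=timezone.utc)
linked = {
    "src/auth/refresh.py": datetime(2025, 2, 1, tzinfo=timezone.utc),
    "src/auth/login.py": datetime(2024, 12, 20, tzinfo=timezone.utc),
}

suspect_paths = changed_since_doc(doc_updated_at, linked)
```

A flag like this only prompts a re-read; the code remains the source of truth.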

Scope and Open Questions

This structure is valid under the following conditions:

  • Multiple agents run in parallel with partially overlapping work domains.
  • A continuously available node exists to host the shared store (home server, VPS, or internal network).
  • Each agent has the permissions and network path to call the shared API.

Conversely, for a single-agent setup or a configuration where domains are entirely independent, the overhead may exceed the benefit.

Open questions:

  • What signals should automate wiki staleness detection — document last-accessed timestamp, recent commit date of linked code paths, or something else?
  • What signals beyond self-assessment can calibrate Knowledge Graph level values — can cross-agent reference frequency serve as a corrective signal?
  • Is per-domain tuning of the RRF k parameter worth the complexity, or is a single fixed value sufficient?

Series overview: Series index
