Agent Memory Engine (3/10) — memcore Completed: 18 Modules, 3,300 Lines
20 Tables, 8 CLI Commands, 295 Curated Rows, and Graceful Degradation
Key Summary
- memcore is a library that rebuilds a file-based memory pipeline on top of a single SQLite file. 18 modules, approximately 3,300 lines.
- 8 CLI commands cover the full operational lifecycle, each schedulable with a single cron entry: migrate / lint / warn / decay / wiki-lint / stats / backfill-vectors / ontology-sync.
- The core design principle is graceful degradation — when a dependency is absent, functionality degrades rather than the system halting.
What You Will Take Away
- Module partitioning and table design for a SQLite single-file agent memory engine
- Hybrid prefetch strategy that falls back to FTS5 when vector search is unavailable
- Pattern of delegating memory lifecycle operations (decay, validation, cleanup) to CLI commands and cron
- How to decouple memory implementation from the host runtime using the MemoryProvider interface
Scope — Differences from the Initial Build
The initial version of memcore included 9 tables, 51 tests, and a migration path from the file-based bank/ structure. The "completed form" covered in this article grows the schema to 20 tables and adds a full CLI command set, a vector search module, and ontology synchronization. In other words, the target is a version where the operational lifecycle and fallback strategies are built into the engine itself, not just the storage layer.
Module Structure (18 Modules)
| # | Module | Responsibility |
|---|---|---|
| 1 | core | SQLite connection, WAL mode, schema initialization |
| 2 | ingest | retain-extract / retain-merge unification |
| 3 | prefetch | FTS5 hybrid search (topic → FTS5 → LIKE) |
| 4 | dialectic | U-tag 3-phase (observe → hypothesize → verify) |
| 5 | decay | opinions confidence decay (0.02/day, remove < 0.30) |
| 6 | lint | Retain tag format validation |
| 7 | entities | Entity staleness detection (30 days) |
| 8 | topics | Topic registry + M2M relationships |
| 9 | bank_migrate | bank/ → SQLite conversion (READ-ONLY, --incremental) |
| 10 | decisions | LLM decision queue (TOPIC_CLASSIFY, CONFLICT_RESOLVE, CHANGELOG_SUMMARIZE) |
| 11 | archive | memory/ → archived/ relocation |
| 12 | housekeeping | recall TTL, session cleanup, stale orphan removal |
| 13 | vectors | sqlite-vec + bge-m3 embeddings (optional) |
| 14 | wiki | Karpathy LLM Wiki pattern (topic page CRUD) |
| 15 | wiki_lint | 7 wiki checks (contradiction / stale / orphan / gap / size / citation / frontmatter) |
| 16 | ontology | CLAUDE.md → DB one-way cache (agent / relation / persona layers) |
| 17 | promotion | Cascade promotion (local → lessons → global) |
| 18 | stats | Statistics / health check |
Responsibilities are partitioned in the following order: storage (core) → write (ingest/promotion) → read (prefetch) → cleanup (decay/housekeeping/archive) → validation (lint/wiki_lint) → auxiliary (vectors/wiki/ontology) → observability (stats). Each module has no knowledge of another module's internals; it shares only the connection object and public schema exposed by core.
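The role of core described in the table (connection, WAL mode, schema initialization) can be sketched roughly as follows. This is a minimal illustration, not memcore's actual code: the function name open_db, the key/value layout of the meta table, and the SCHEMA_VERSION value are all assumptions made for the example.

```python
import sqlite3

SCHEMA_VERSION = "1"  # hypothetical value, not taken from memcore


def open_db(path: str) -> sqlite3.Connection:
    """Open the single SQLite file, request WAL mode, and ensure a meta table exists."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    # WAL lets readers proceed while a single writer is active.
    # For ":memory:" databases SQLite keeps the "memory" journal mode instead.
    conn.execute("PRAGMA journal_mode=WAL")
    # Assumed layout: meta as a simple key/value table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT OR IGNORE INTO meta (key, value) VALUES ('schema_version', ?)",
        (SCHEMA_VERSION,),
    )
    conn.commit()
    return conn
```

Under this sketch every other module would receive the returned connection and never open the file itself, which is one way to keep the "shares only the connection object and public schema" rule enforceable.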
20 Tables
| Group | Table | Purpose |
|---|---|---|
| Meta | meta | Schema version, configuration |
| Knowledge | curated, curated_fts | Core knowledge + FTS5 index |
| Topics | topics, topic_curated | Topic registry + M2M mapping |
| Entities | entities | Per-project state |
| Episodes | episodes | Daily log |
| Identity | identity | MEMORY.md cache |
| U-tag | u_patterns, u_observations, u_hypotheses, u_verifications | Dialectic 3-phase |
| Decisions | decisions | LLM decision queue |
| Wiki | wiki_pages, wiki_sources, wiki_log | Karpathy wiki |
| Vectors | vec_curated | sqlite-vec embeddings (optional) |
| Ontology | ont_agents, ont_relations, ont_layers | CLAUDE.md cache |
| Promotions | promotions | local → lessons → global history |
The tables split into three conceptual layers: a fact-recording layer (curated, episodes, entities, identity), a reasoning-process layer (the u_patterns family, decisions, promotions), and an auxiliary index layer (curated_fts, vec_curated, topic_curated). The schema ensures the fact-recording layer remains independently queryable even when the auxiliary index layer is empty.
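A small schema sketch makes the layering concrete. The column names below (kind, body, name) are assumptions for illustration, since the article lists only table names; the point is that the fact tables are created unconditionally, while the FTS5 index is attempted separately and its absence is not an error.

```python
import sqlite3


def init_schema(conn: sqlite3.Connection) -> bool:
    """Create fact and mapping tables; return True if the optional FTS5 index was created."""
    # Fact-recording layer plus the topic M2M mapping (column names are hypothetical).
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS curated (
            id   INTEGER PRIMARY KEY,
            kind TEXT NOT NULL,
            body TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS topics (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE IF NOT EXISTS topic_curated (
            topic_id   INTEGER NOT NULL REFERENCES topics(id),
            curated_id INTEGER NOT NULL REFERENCES curated(id),
            PRIMARY KEY (topic_id, curated_id)
        );
        """
    )
    # Auxiliary index layer: only created when this SQLite build ships FTS5.
    try:
        conn.execute("CREATE VIRTUAL TABLE IF NOT EXISTS curated_fts USING fts5(body)")
        return True
    except sqlite3.OperationalError:
        return False
```

With this shape, a plain SELECT against curated works whether or not curated_fts exists or has been populated, which is the independence property the paragraph above describes.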
8 CLI Commands
memcore migrate # bank/ → SQLite conversion
memcore lint # Retain tag + data integrity check
memcore warn # Memory warning report
memcore decay # Run opinions confidence decay
memcore stats # Statistics / health check
memcore wiki-lint # 7 wiki checks
memcore backfill-vectors # Bulk sqlite-vec embedding generation
memcore ontology-sync # CLAUDE.md → DB synchronization
The CLI exists to move maintenance tasks that do not need the runtime out of it. Operations like decay, lint, and stats must run periodically but do not need to run inside the conversation loop. Delegating them to cron/launchd reduces the runtime's responsibility to prefetch and ingest only, and the system gains a self-maintenance cycle without an orchestration framework.
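One way to picture the "one command, one cron entry" arrangement is a thin argparse dispatcher. The sketch below is hypothetical: the handler bodies are placeholders, and the cron schedules in the comments are illustrative values, not memcore's documented configuration. Only the eight command names come from the list above.

```python
import argparse
from typing import Callable, Dict, List, Optional

COMMAND_NAMES = (
    "migrate", "lint", "warn", "decay",
    "stats", "wiki-lint", "backfill-vectors", "ontology-sync",
)


def _placeholder(name: str) -> Callable[[], int]:
    """Build a stand-in handler; a real one would call into the matching module."""
    def run() -> int:
        print(f"[memcore] {name}: placeholder handler")
        return 0  # exit code 0 so a scheduler treats the run as successful
    return run


COMMANDS: Dict[str, Callable[[], int]] = {n: _placeholder(n) for n in COMMAND_NAMES}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="memcore")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in COMMANDS:
        sub.add_parser(name)
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    args = build_parser().parse_args(argv)
    return COMMANDS[args.command]()


# Hypothetical crontab entries (schedules are illustrative):
#   0 3 * * *   memcore decay
#   15 3 * * *  memcore lint
```

Because each handler returns an exit code and takes no runtime state, the scheduler is the only orchestration the maintenance cycle needs.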
Current Data State
| Item | Count |
|---|---|
| curated rows | 295 |
| topics | 15 |
| entities | 4 |
| topic_curated links | 514 |
| Distribution | knowledge 150 / pattern 28 / daily 17 / world 14 / identity 13 / experience 10 / opinion 4 |
topic_curated holds 514 links against 295 curated rows, which works out to about 1.74 topic links per item (514 / 295 ≈ 1.74). Multi-topic linking, not single tagging, is the default form. Knowledge-type entries are the largest category (150 rows), over half of the total. The low opinion count (4) is consistent with decay removing low-confidence entries, although the counts alone cannot distinguish that from few opinions having been recorded in the first place.
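The decay rule listed for the decay module (0.02/day, remove below 0.30) can be illustrated with a few lines of arithmetic. The article does not say whether the 0.02 is subtracted or applied as a multiplicative rate; the sketch below assumes a linear subtraction per day, and the function names are hypothetical.

```python
DECAY_PER_DAY = 0.02   # from the article
REMOVE_BELOW = 0.30    # from the article


def decayed_confidence(confidence: float, days_elapsed: int) -> float:
    """Assumed linear decay: subtract 0.02 per elapsed day, never going below 0."""
    return max(0.0, confidence - DECAY_PER_DAY * days_elapsed)


def should_remove(confidence: float) -> bool:
    """An opinion is dropped once its confidence falls below the threshold."""
    return confidence < REMOVE_BELOW


def days_until_removal(confidence: float) -> int:
    """Count whole days of decay until the opinion would be removed."""
    days = 0
    while not should_remove(decayed_confidence(confidence, days)):
        days += 1
    return days
```

Under the linear assumption, an opinion that is never reinforced loses 0.20 of confidence every 10 days, so even a fairly confident opinion leaves the table within a few weeks. That would be one plausible reason for a small opinion count in a long-lived database.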
Core Design Principle — Graceful Degradation
memcore is designed so that code paths remain valid when a dependency is absent.
- No sqlite-vec: falls back to FTS5 text search. Semantic search degrades to keyword search, but the path remains.
- No bge-m3: vector build step is skipped. prefetch operates on FTS5 alone.
- No CLAUDE.md: the ontology cache is empty; no other module is affected.
- No wiki pages: wiki_lint returns "0 pages checked" and exits normally.
The practical advantage of this approach is tolerance for installation profile variance. The full-spec configuration is Apple Silicon + oMLX + sqlite-vec; the minimum configuration requires only Python + SQLite. Deploying the same codebase across different environments reduces branching install scripts and replaces them with in-runtime feature flags.
Interface with the Host System — MemoryProvider
memcore couples to the host runtime exclusively through the MemoryProvider abstract interface. The implementation is MemcoreProvider(MemoryProvider).
- prefetch → memcore.prefetch.search
- on_session_end → memcore.ingest
- on_memory_write → memcore.promotion (passes Tier gate)
- system_prompt_block → injects top-N curated entries into context
The host runtime has no knowledge of memcore's existence. It only needs to know the method signatures of the interface. This boundary ensures the host code does not change when the storage layer is replaced or upgraded. The same principle applies to vector backend replacement (sqlite-vec → Qdrant/FAISS) or memory engine experimentation (running two implementations in parallel A/B).
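The boundary can be sketched as an abstract base class with the four hooks named above. The method names come from the article; the parameter lists and return types are assumptions, and InMemoryProvider is a hypothetical stand-in used only to show that a second implementation can be swapped in without touching host code.

```python
from abc import ABC, abstractmethod
from typing import List


class MemoryProvider(ABC):
    """Contract the host runtime depends on (signatures are assumed for illustration)."""

    @abstractmethod
    def prefetch(self, query: str) -> List[str]: ...

    @abstractmethod
    def on_session_end(self, transcript: str) -> None: ...

    @abstractmethod
    def on_memory_write(self, entry: str) -> None: ...

    @abstractmethod
    def system_prompt_block(self, top_n: int = 5) -> str: ...


class InMemoryProvider(MemoryProvider):
    """Hypothetical list-backed implementation, standing in for MemcoreProvider."""

    def __init__(self) -> None:
        self._entries: List[str] = []

    def prefetch(self, query: str) -> List[str]:
        return [e for e in self._entries if query.lower() in e.lower()]

    def on_session_end(self, transcript: str) -> None:
        self._entries.append(transcript)

    def on_memory_write(self, entry: str) -> None:
        self._entries.append(entry)

    def system_prompt_block(self, top_n: int = 5) -> str:
        return "\n".join(f"- {e}" for e in self._entries[:top_n])


def build_context(provider: MemoryProvider, user_message: str) -> str:
    """Host-side code: it sees only the interface, never the storage behind it."""
    recalled = provider.prefetch(user_message)
    return provider.system_prompt_block() + "\n" + "\n".join(recalled)
```

Running two providers side by side for an A/B comparison then amounts to passing different instances to the same host function.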
Limitations and Applicability
- Single-file SQLite: appropriate for agent scenarios with low concurrent write volume. Not suited for multi-writer workloads.
- Embedding cost: backfill-vectors is optimized for batch processing. Streaming or real-time embedding requires a separate path.
- One-way ontology cache: reverse synchronization from DB back to CLAUDE.md is out of scope. Markdown is the source of truth.
- CLI-driven operations: environments without cron/launchd (e.g., pure serverless) require an external scheduler.
The applicable scope can be summarized as: "local execution, single writer, agent environments with high dependency variance that require an observable and stable memory store."
Open Questions
- How should the consistency model be defined when multiple agents reference the same memcore file concurrently?
- Should the decay and promotion policy constants (0.02/day, removal threshold < 0.30, etc.) adapt based on corpus size?
- If the storage layer is replaced with a non-SQLite backend (e.g., key-value store + external FTS) while preserving the MemoryProvider contract, which features degrade?
3,300 lines represents the scale of "one year's accumulated memory judgment logic redistributed onto single-file storage." The substance of the file-based → SQLite transition is not a format change. It is the ability to reduce each operational lifecycle step to a single CLI invocation and delegate it to an external scheduler.