Ontology and Memory Systems (4/13) — Designing an AI Agent Memory System: 23 Components
5 native + 6 custom + 12 automation scripts: the full memory architecture
Key Summary
- The memory system comprises 23 components: native (5) + custom (6) + automation scripts (12)
- The Reflect pipeline cut API token usage by 96%; Confidence Decay automatically attenuates stale opinions
- If the operational timeline order (Reflect before Session Reset) is violated, data loss occurs
Background
Long-running AI agents live or die by their memory management. Accumulate everything and tokens explode; delete indiscriminately and context is lost. This is the full architecture of the memory system I built on top of an AI agent framework.
The Components
5 Native Features (Config-Level)
- Memory Flush: Saves conversations to markdown at a 6,000-token threshold
- Compaction (safeguard mode): 24,000-token minimum reservation to preserve persona
- Session Reset/Maintenance: Daily reset at 4 AM, automatic deletion after 30 days
- Context Pruning: Clears caches unused for 1 hour
- Memory Search (Hybrid): Vector search (0.7 weight) + text matching (0.3 weight)
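The hybrid weighting in Memory Search can be sketched as a weighted sum of two scores. This is only an illustration, not the framework's actual implementation: the function names (`cosine_similarity`, `text_match_score`, `hybrid_score`), the choice of cosine similarity for the vector side, and the word-overlap measure for the text side are all assumptions. Only the 0.7 and 0.3 weights come from the list above.

```python
import math

VECTOR_WEIGHT = 0.7  # weight from the config described above
TEXT_WEIGHT = 0.3    # weight from the config described above


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed vector score)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def text_match_score(query, document):
    """Fraction of distinct query words found in the document (assumed text score)."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    doc_words = set(document.lower().split())
    return len(query_words & doc_words) / len(query_words)


def hybrid_score(query, query_vec, document, doc_vec):
    """Combine both scores using the 0.7 / 0.3 weighting."""
    vector_part = VECTOR_WEIGHT * cosine_similarity(query_vec, doc_vec)
    text_part = TEXT_WEIGHT * text_match_score(query, document)
    return vector_part + text_part
```

With this weighting, a memory that matches semantically but shares no keywords can still score up to 0.7, while a pure keyword hit with an unrelated embedding tops out at 0.3.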
6 Custom Features
- Retain Tags: Classify memory entries as W (knowledge), B (experience), O (opinion + confidence score), S (entity state)
- Bank 4-Tier Structure: Identity, Curated, Project, and Episode layers
- Reflect Pipeline: A local LLM handles first-pass processing and a cloud LLM makes the final judgment. This achieved a 96% token reduction (277k down to 5k)
- Topic-Cued Recall: Sub-100ms real-time keyword-based memory retrieval
- Confidence Decay: Opinion confidence scores decrease by 0.02 per day; entries that fall below 0.30 are auto-deleted
- Resolution Levels: Inactive entities are progressively compressed from L2 (full) to L1 (summary) to L0 (title only)
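As one way to picture Resolution Levels, the sketch below picks a level from how long an entity has been inactive and returns the matching representation. The `Entity` fields, the function names, and especially the two day thresholds are hypothetical; the list above only says that inactive entities move from L2 to L1 to L0.

```python
from dataclasses import dataclass
from enum import IntEnum


class Resolution(IntEnum):
    L0 = 0  # title only
    L1 = 1  # summary
    L2 = 2  # full content


@dataclass
class Entity:
    title: str
    summary: str
    full_text: str
    days_inactive: int


# Hypothetical thresholds: the article does not state when compression kicks in.
SUMMARY_AFTER_DAYS = 14
TITLE_AFTER_DAYS = 60


def resolution_for(entity):
    """Choose a resolution level from the entity's inactivity."""
    if entity.days_inactive >= TITLE_AFTER_DAYS:
        return Resolution.L0
    if entity.days_inactive >= SUMMARY_AFTER_DAYS:
        return Resolution.L1
    return Resolution.L2


def render(entity):
    """Return the text that would be loaded into context at the chosen level."""
    level = resolution_for(entity)
    if level == Resolution.L2:
        return entity.full_text
    if level == Resolution.L1:
        return entity.summary
    return entity.title
```

Keeping all three representations on the entity means compression is reversible: if the entity becomes active again, `render` simply returns the full text.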
12 Automation Scripts
retain-merge.py, memory-archive.sh, conflict-apply.py, and 9 others handle validation, cleanup, and cache management. All are Python or bash scripts, so they add essentially no operational cost.
Pitfalls and Caveats
The operational timeline order is critical. The Reflect pipeline must run at 03:00, followed by Session Reset at 04:00. If this order is reversed, session data gets wiped before Reflect has a chance to process it.
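Relying on cron times alone leaves the ordering implicit, so one option is to have the reset step check that Reflect actually finished. The sketch below is an assumption about how such a guard could look, not part of the system described here: the marker file, the function names, and the `reset_fn` callback are all hypothetical.

```python
from datetime import date
from pathlib import Path


def mark_reflect_done(marker: Path, today: date) -> None:
    """Called at the end of the Reflect run: record the date it completed for."""
    marker.write_text(today.isoformat(), encoding="utf-8")


def reflect_done_for(marker: Path, today: date) -> bool:
    """True only if the marker exists and carries today's date."""
    if not marker.exists():
        return False
    return marker.read_text(encoding="utf-8").strip() == today.isoformat()


def guarded_session_reset(marker: Path, today: date, reset_fn) -> bool:
    """Run reset_fn only after Reflect has completed today.

    Returns True if the reset ran, False if it was skipped.
    """
    if not reflect_done_for(marker, today):
        return False
    reset_fn()
    return True
```

If Reflect fails or runs late, the reset is skipped instead of wiping unprocessed session data, and the skipped run can be surfaced as an alert.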
Time-based arithmetic such as Confidence Decay should always be implemented as a script rather than delegated to LLM judgment: a script applies the same formula on every run and does not hallucinate.
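A minimal sketch of Confidence Decay as a deterministic script, using the 0.02 daily decrease and the 0.30 deletion threshold from the component list. The record layout (dicts with `text`, `confidence`, and `last_decayed` keys) and the function names are assumptions for illustration; the actual script may store opinions differently.

```python
from datetime import date

DAILY_DECAY = 0.02    # confidence lost per day (from the component list)
DELETE_BELOW = 0.30   # entries below this are dropped (from the component list)


def decayed_confidence(confidence, last_decayed, today):
    """Linear decay for the days elapsed since the last decay run."""
    elapsed_days = max((today - last_decayed).days, 0)
    value = max(confidence - DAILY_DECAY * elapsed_days, 0.0)
    # Round so float drift cannot push a value across the threshold.
    return round(value, 4)


def apply_decay(opinions, today):
    """Return surviving opinions with updated confidence.

    Each opinion is assumed to be a dict with 'text', 'confidence',
    and 'last_decayed' (a datetime.date) keys.
    """
    kept = []
    for opinion in opinions:
        new_confidence = decayed_confidence(
            opinion["confidence"], opinion["last_decayed"], today
        )
        if new_confidence < DELETE_BELOW:
            continue  # auto-delete stale opinions
        kept.append({**opinion, "confidence": new_confidence, "last_decayed": today})
    return kept
```

Stamping `last_decayed` with the run date makes the script safe to rerun on the same day: the elapsed time is zero, so nothing decays twice.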
Takeaway
The core of a memory system isn't "what to remember" — it's "what to forget." Twenty-three components sounds like a lot, but layering them as native config, custom logic, and automation scripts keeps the system manageable.