Ontology and Memory Systems (4/13) — Designing an AI Agent Memory System: 23 Components

5 native + 6 custom + 12 automation scripts: the full memory architecture


Key Summary

  • The memory system comprises 23 components: native (5) + custom (6) + automation scripts (12)
  • The Reflect pipeline cut API token usage by 96%; Confidence Decay automatically attenuates stale opinions
  • Operational order matters: if Session Reset runs before Reflect, session data is lost

Background

Long-running AI agents live or die by their memory management. Accumulate everything and tokens explode; delete indiscriminately and context is lost. This is the full architecture of the memory system I built on top of an AI agent framework.

The Components

5 Native Features (Config-Level)

  • Memory Flush: Saves conversations to markdown at a 6,000-token threshold
  • Compaction (safeguard mode): 24,000-token minimum reservation to preserve persona
  • Session Reset/Maintenance: Daily reset at 4 AM, automatic deletion after 30 days
  • Context Pruning: Clears caches unused for 1 hour
  • Memory Search (Hybrid): Vector search (0.7 weight) + text matching (0.3 weight)
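
To make the hybrid weighting concrete, here is a minimal sketch of how a 0.7 / 0.3 blend could rank candidates. The function names and the assumption that both scores are already normalized to the 0..1 range are mine, not the framework's actual API.

```python
# Weights taken from the config described above.
VECTOR_WEIGHT = 0.7
TEXT_WEIGHT = 0.3


def hybrid_score(vector_sim: float, text_score: float) -> float:
    """Blend a vector similarity and a text-match score (both assumed in 0..1)."""
    return VECTOR_WEIGHT * vector_sim + TEXT_WEIGHT * text_score


def rank(candidates: dict[str, tuple[float, float]]) -> list[str]:
    """Return memory ids ordered from highest to lowest blended score.

    `candidates` maps a hypothetical memory id to (vector_sim, text_score).
    """
    return sorted(
        candidates,
        key=lambda memory_id: hybrid_score(*candidates[memory_id]),
        reverse=True,
    )


if __name__ == "__main__":
    print(rank({"a": (0.9, 0.0), "b": (0.4, 1.0), "c": (0.2, 0.2)}))
```

With these weights, a strong semantic match can outrank an exact keyword hit, which is presumably why the vector side gets the larger share.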

6 Custom Features

  • Retain Tags: Classify memory entries as W (knowledge), B (experience), O (opinion + confidence score), S (entity state)
  • Bank 4-Tier Structure: Identity, Curated, Project, and Episode layers
  • Reflect Pipeline: Local LLM first-pass processing, cloud LLM final judgment. Achieved 96% token reduction (277k down to 5k)
  • Topic-Cued Recall: Sub-100ms real-time keyword-based memory retrieval
  • Confidence Decay: Opinion confidence scores decay by 0.02 per day; entries below 0.30 are auto-deleted
  • Resolution Levels: Inactive entities are progressively compressed from L2 (full) to L1 (summary) to L0 (title only)
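
As an illustration of Resolution Levels, the sketch below picks which representation of an entity to load based on how long it has been inactive. The `EntityMemory` class and the 14-day and 60-day thresholds are hypothetical; the post only states the L2 to L1 to L0 progression, not when each step happens.

```python
from dataclasses import dataclass

# Hypothetical inactivity thresholds; the real values are not given in the post.
L1_AFTER_DAYS = 14
L0_AFTER_DAYS = 60


@dataclass
class EntityMemory:
    title: str
    summary: str
    full_text: str
    idle_days: int  # days since the entity was last referenced


def resolution_level(idle_days: int) -> int:
    """Map inactivity to a level: 2 = full, 1 = summary, 0 = title only."""
    if idle_days >= L0_AFTER_DAYS:
        return 0
    if idle_days >= L1_AFTER_DAYS:
        return 1
    return 2


def render(entity: EntityMemory) -> str:
    """Return the text that would be loaded into context for this entity."""
    level = resolution_level(entity.idle_days)
    if level == 2:
        return entity.full_text
    if level == 1:
        return entity.summary
    return entity.title
```

Keeping the level a pure function of idle time means the compression step can run as a plain script, with no LLM call needed to decide what to shrink.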

12 Automation Scripts

retain-merge.py, memory-archive.sh, conflict-apply.py, and 9 others handle validation, cleanup, and cache management. All of them are Python or bash scripts, so the additional operational cost is zero.

Pitfalls and Caveats

The operational timeline order is critical. The Reflect pipeline must run at 03:00, followed by Session Reset at 04:00. If this order is reversed, session data gets wiped before Reflect has a chance to process it.
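
One way to guard against a misconfigured schedule is a small pre-flight check like the sketch below. The function name and the 30-minute runtime budget for Reflect are assumptions for illustration; the check also assumes both jobs run on the same calendar day.

```python
from datetime import date, datetime, time, timedelta


def schedule_is_safe(
    reflect_at: time,
    reset_at: time,
    reflect_budget: timedelta = timedelta(minutes=30),  # assumed worst-case runtime
) -> bool:
    """True if Reflect starts and is expected to finish before Session Reset."""
    day = date(2000, 1, 1)  # arbitrary anchor day, only used for time arithmetic
    reflect_done = datetime.combine(day, reflect_at) + reflect_budget
    return reflect_done <= datetime.combine(day, reset_at)


if __name__ == "__main__":
    # The schedule from the post: Reflect at 03:00, Session Reset at 04:00.
    print(schedule_is_safe(time(3, 0), time(4, 0)))
    # Reversed order: the reset would wipe sessions before Reflect runs.
    print(schedule_is_safe(time(4, 0), time(3, 0)))
```

Including a runtime budget catches the subtler failure where Reflect starts on time but is still running when the reset fires.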

Time-based mathematical decay (like Confidence Decay) should always be implemented as scripts, not LLM judgments — scripts don't hallucinate.
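
A minimal sketch of what such a decay script might look like, using the 0.02-per-day rate and 0.30 deletion threshold described earlier. Computing the decay from a last-confirmed date (rather than subtracting once per daily run) and the `opinions` dictionary shape are assumptions of this example, not the actual script.

```python
from datetime import date

DAILY_DECAY = 0.02   # confidence lost per day, from the post
DELETE_BELOW = 0.30  # opinions under this score are dropped, from the post


def decayed_confidence(confidence: float, last_confirmed: date, today: date) -> float:
    """Linear decay since the opinion was last confirmed, floored at 0."""
    days = max((today - last_confirmed).days, 0)
    return max(confidence - DAILY_DECAY * days, 0.0)


def apply_decay(
    opinions: dict[str, tuple[float, date]], today: date
) -> dict[str, float]:
    """Return surviving opinions with updated scores; the rest are deleted.

    `opinions` maps a hypothetical opinion id to (confidence, last_confirmed).
    """
    kept: dict[str, float] = {}
    for opinion_id, (confidence, last_confirmed) in opinions.items():
        updated = decayed_confidence(confidence, last_confirmed, today)
        if updated >= DELETE_BELOW:
            kept[opinion_id] = round(updated, 4)
    return kept
```

Because the result depends only on dates and constants, the same input always yields the same output, which is the property an LLM judgment cannot guarantee.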

Takeaway

The core of a memory system isn't "what to remember" — it's "what to forget." Twenty-three components sounds like a lot, but layering them as native config, custom logic, and automation scripts keeps the system manageable.
