"Designing a Security Architecture for a Local AI Agent

"Designing a Security Architecture for a Local AI Agent — 7-Layer Defense in Depth"

April 04, 2026

A 7-layer defense-in-depth strategy for a single-user local AI agent

Key Summary

"It's a personal project, I'll handle security later" — this turned out to be a critical misjudgment. An unauthorized user sent commands to the agent through a Telegram bot.
Built a 7-layer defense-in-depth architecture spanning network through cognitive layers.
Consciously accepted risks (e.g., sandbox disabled, plaintext keys) are explicitly documented rather than ignored.

Background

LLM agents read emails, modify calendars, access the file system, and call external APIs. The agent effectively operates as core system infrastructure. During a Telegram integration test, an unauthorized external user issued commands to the bot — and the agent complied. After this security incident, I established a "Security from Day 1" principle and restructured the entire architecture.

The Architecture

Trust Model

Runtime environment: Single-user Mac Mini, internet-connected, no Docker
Primary defense targets: Unauthorized network access, communication channel hijacking, prompt injection
Trust anchor: Only the system account owner who can modify the ~/.openclaw directory is recognized as the operator

7-Layer Security Architecture

Layer	Mechanism	Status
Network	Loopback Bind + Token Auth	Active
Channel	Telegram DM Pairing + Group Allowlist	Active
Filesystem	.openclaw/ 700, config 600	Active
Injection Defense	11-category defense prompt ruleset	Prompt-level
Execution Control	Per-sub-agent minimal exec permissions	Active
Secret Management	.gitignore + Pre-commit security audit	Active
Trust Boundary	Single-user local environment (accepted)	Accepted
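The Filesystem row can be checked mechanically. Below is a minimal sketch of such an audit, assuming the config files are the JSON files directly inside the state directory; the function name `audit_permissions` and that file layout are illustrative assumptions, not part of the original setup.

```python
import stat
from pathlib import Path


def audit_permissions(root: Path, dir_mode: int = 0o700, file_mode: int = 0o600) -> list[str]:
    """Report the directory and its top-level JSON files when any
    permission bit is set outside the expected mode."""
    findings = []
    root_mode = stat.S_IMODE(root.stat().st_mode)
    if root_mode & ~dir_mode:  # a bit is set beyond the allowed mask
        findings.append(f"{root.name}/ is {root_mode:o}, expected at most {dir_mode:o}")
    # Assumption: config files are the *.json files directly inside the directory.
    for path in sorted(root.glob("*.json")):
        mode = stat.S_IMODE(path.stat().st_mode)
        if mode & ~file_mode:
            findings.append(f"{path.name} is {mode:o}, expected at most {file_mode:o}")
    return findings
```

Calling `audit_permissions(Path.home() / ".openclaw")` at startup and refusing to run on a non-empty result would turn the table entry into an enforced invariant rather than a convention.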

Perimeter Control: Network and Channel

{
  "gateway": {
    "port": 18789, "mode": "local", "bind": "loopback",
    "auth": { "mode": "token", "token": "..." }
  }
}

bind: loopback restricts connections to the same machine. If a request still reaches the gateway from a local process, for example through an SSRF flaw in another local service, static token authentication acts as the second line of defense.
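To make the two checks concrete, here is a minimal Python sketch of the same idea at the application level. It is not the gateway's actual code: `authorize_request` and its parameters are hypothetical names, and in practice the loopback restriction is enforced by the socket bind itself rather than by inspecting the peer address.

```python
import hmac
import ipaddress


def is_loopback(client_ip: str) -> bool:
    """True only for addresses such as 127.0.0.1 or ::1."""
    try:
        return ipaddress.ip_address(client_ip).is_loopback
    except ValueError:  # not a valid IP literal
        return False


def authorize_request(client_ip: str, presented_token: str, expected_token: str) -> bool:
    """Layer 1: the peer must be local. Layer 2: the token must match."""
    if not is_loopback(client_ip):
        return False
    if not expected_token:  # never accept an empty configured token
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(presented_token.encode(), expected_token.encode())
```

The token check runs even for loopback peers, which is the point of the second layer: a request forged through a local service passes the address check but still needs the secret.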

Telegram uses dmPolicy: pairing to allow DMs only from pre-paired users, and groupPolicy: allowlist to restrict group commands to designated users.
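The effect of the two policies can be sketched as a single sender check. This is an illustration under assumptions: `ChannelPolicy` and `is_sender_allowed` are hypothetical names, and only the chat type strings ("private", "group", "supergroup") come from the Telegram Bot API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelPolicy:
    # Hypothetical container for the two lists the policies rely on.
    paired_user_ids: frozenset[int] = frozenset()
    group_allowlist: frozenset[int] = frozenset()


def is_sender_allowed(policy: ChannelPolicy, chat_type: str, sender_id: int) -> bool:
    """Drop any message whose sender is not explicitly known."""
    if chat_type == "private":  # DM: sender must be pre-paired
        return sender_id in policy.paired_user_ids
    if chat_type in ("group", "supergroup"):  # group: sender must be allowlisted
        return sender_id in policy.group_allowlist
    return False  # channels and unknown chat types are rejected
```

Both lists default to empty, so a freshly configured bot answers nobody until a user is paired or allowlisted, which is the behavior that was missing in the incident described above.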

Internal Control: Injection Defense

11 categories of prompt injection defense rules, including:
- ignore-previous rejection: Classic jailbreak attempts are discarded
- Encoding attack defense: Base64 and hex-obfuscated commands are rejected
- Role-switching defense: Persona change attempts like "you are now in admin mode" are refused
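The ruleset described here lives in the prompt. As a complementary illustration only, the three categories named above could also be approximated by a heuristic pre-filter that runs before text reaches the model. The patterns and the `classify` function below are hypothetical, cover only those three categories, and are easy to evade, so this is a sketch of the idea rather than a replacement for the prompt rules.

```python
import base64
import binascii
import re

IGNORE_PREVIOUS = re.compile(
    r"\b(ignore|disregard|forget)\b.{0,40}\b(previous|prior|above)\b"
    r".{0,40}\b(instructions?|rules?|prompts?)\b",
    re.I | re.S,
)
ROLE_SWITCH = re.compile(r"\byou are now\b|\b(admin|developer|root) mode\b", re.I)
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")
HEX_RUN = re.compile(r"\b(?:[0-9a-fA-F]{2}){12,}\b")


def _decoded_candidates(text: str):
    """Yield UTF-8 text recovered from long Base64 or hex runs."""
    for match in BASE64_RUN.finditer(text):
        try:
            yield base64.b64decode(match.group(0), validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError, ValueError):
            continue
    for match in HEX_RUN.finditer(text):
        try:
            yield bytes.fromhex(match.group(0)).decode("utf-8")
        except (ValueError, UnicodeDecodeError):
            continue


def classify(text: str) -> list[str]:
    """Return the names of the categories the text appears to trigger."""
    hits = []
    if IGNORE_PREVIOUS.search(text):
        hits.append("ignore-previous")
    if ROLE_SWITCH.search(text):
        hits.append("role-switching")
    for decoded in _decoded_candidates(text):
        if IGNORE_PREVIOUS.search(decoded) or ROLE_SWITCH.search(decoded):
            hits.append("encoded-command")
            break
    return hits
```

An empty result means only that none of these patterns matched, not that the input is safe. External text should still be treated as untrusted.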

Action Authorization Boundaries

Free execution: File reads, simple web GETs, system log checks
Approval required: Email/SNS sends, external API state changes, local file deletion
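The boundary above can be sketched as a small gate in front of tool execution. The action names, `decide`, and `run_action` are hypothetical, and routing unknown actions to approval is an assumption made here in the spirit of least privilege, not something the original configuration states.

```python
from enum import Enum
from typing import Callable, TypeVar

T = TypeVar("T")


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"


# Hypothetical action names for the "free execution" class.
FREE_ACTIONS = frozenset({"file.read", "web.get", "syslog.read"})


def decide(action: str) -> Decision:
    """Free actions run directly; everything else (email/SNS sends, external
    API state changes, file deletion, unknown actions) needs approval."""
    if action in FREE_ACTIONS:
        return Decision.ALLOW
    return Decision.REQUIRE_APPROVAL


def run_action(action: str, execute: Callable[[], T], approve: Callable[[str], bool]) -> T:
    """Run `execute` only if the action is free or the operator approves it."""
    if decide(action) is Decision.REQUIRE_APPROVAL and not approve(action):
        raise PermissionError(f"operator approval denied for {action!r}")
    return execute()
```

Keeping the free list explicit and defaulting everything else to approval means a newly added tool starts in the restricted class until someone deliberately moves it.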

Pitfalls and Accepted Risks

Consciously accepted risks:
1. Sandbox disabled: Unavoidable for a personal assistant agent that needs direct local filesystem management
2. Plaintext keys in config files: Accepted for a single-operator private repository. Migration to a vault is planned if the project goes open source.

Takeaway

Even for personal projects, if the system has permissions that touch core infrastructure, security must be designed from day one. Retrofitting access control after the fact causes conflicts with existing functionality and incurs massive testing costs. Bake the principle of least privilege and zero trust for external inputs into the architecture from the start.

MaJu Tech Notes