Skill System Design — Modularizing Agent Capabilities with Keyword Triggers

SKILL.md structure, keyword trigger tables, and pipeline composition — a practitioner's guide


Summary

  • A skill separates one agent capability into an independent module. One skill file owns one responsibility.
  • A keyword trigger table maps natural-language requests to skills automatically.
  • Connecting skills into a pipeline turns complex tasks into reproducible workflows.

Background

CLAUDE.md can tell an agent what to do. Putting how to do it into CLAUDE.md creates problems: procedures grow long, the file becomes bloated, and every session loads procedures that are irrelevant to the current task.

The skill system solves this. Each how lives in its own file and is loaded only when needed.

A mature harness carries skills for research, compilation, verification, blog publishing, Twitter drafts, deep interviews, brainstorming, code review, project diagnostics, testing, writing plans, and self-audit, among others. Each exists as an independent SKILL.md file.


Body

1. SKILL.md File Structure — Three Required Elements

Every skill file lives at .claude/skills/{skill-name}/SKILL.md. The structure is consistent.

Frontmatter: skill metadata

---
name: research
description: "Web research and information gathering for source file creation.

Trigger: 리서치, 조사, 검색, 찾아줘, research"
user-invocable: true
---

name is the unique identifier. description is what the agent reads to determine whether this skill applies. Keywords listed under Trigger are matched against user input to activate the skill. user-invocable: true allows direct invocation by the user.
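As a rough illustration of how a script outside the agent could read the same fields, here is a minimal sketch. It assumes the frontmatter layout shown above and uses regular expressions rather than a real YAML parser; `parse_frontmatter` and `SKILL_TEXT` are hypothetical names, not part of any harness.

```python
import re

# Sample frontmatter, copied from the research skill above.
SKILL_TEXT = '''---
name: research
description: "Web research and information gathering for source file creation.

Trigger: 리서치, 조사, 검색, 찾아줘, research"
user-invocable: true
---
'''

def parse_frontmatter(text: str) -> dict:
    """Extract the name, trigger keywords, and user-invocable flag."""
    # The frontmatter is the text between the first two '---' delimiter lines.
    match = re.match(r"^---\n(.*?)\n---", text, re.DOTALL)
    if match is None:
        raise ValueError("no frontmatter block found")
    block = match.group(1)

    name = re.search(r"^name:\s*(.+)$", block, re.MULTILINE)
    # Assumption: keywords follow 'Trigger:' on one line, comma-separated.
    trigger = re.search(r"Trigger:\s*([^\"\n]+)", block)
    invocable = re.search(r"^user-invocable:\s*(\w+)$", block, re.MULTILINE)

    return {
        "name": name.group(1).strip() if name else None,
        "triggers": [k.strip() for k in trigger.group(1).split(",")] if trigger else [],
        "user_invocable": bool(invocable) and invocable.group(1).lower() == "true",
    }

meta = parse_frontmatter(SKILL_TEXT)
print(meta["name"], meta["triggers"], meta["user_invocable"])
```

A script like this is only useful for linting or indexing skill files. The agent itself reads the description as natural language.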

Procedure: execution steps

This is the core of the skill. It specifies what to do at each step and what judgment criteria to apply. For the research skill:

1. Parse topic → confirm intent → bound scope
2. Search loop (max 3 rounds)
   - Round 1: initial search (secure 3+ independent sources)
   - Round 2+: resolve ambiguities, cross-verify
3. Classify information (fact / interpretation / estimate / opinion)
4. Counter-scenario review
5. Write search log

Explicit procedure means the agent performs the task at the same quality every time. The instruction is not "figure it out" — it is "in this order, by these criteria."

Output: result format

The Output section specifies the format the skill's result must conform to. Without it, the agent produces a differently shaped result on each run. A fixed output format is what lets a skill feed the next step in a pipeline.
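One way to make a fixed output format checkable is to list the headings a result must contain and report the ones that are missing. The sketch below assumes a hypothetical set of required headings for the research skill; the actual format is whatever the skill's Output section defines.

```python
# Hypothetical required headings; the real list comes from the skill's Output section.
REQUIRED_SECTIONS = ("## Sources", "## Findings", "## Search Log")

def missing_sections(output: str, required: tuple = REQUIRED_SECTIONS) -> list:
    """Return the required headings that do not appear as a line in the output."""
    lines = {line.strip() for line in output.splitlines()}
    return [heading for heading in required if heading not in lines]

# A draft that forgot the search log.
draft = """## Sources
- example source A

## Findings
- fact: skills load on demand
"""

gaps = missing_sections(draft)
print(gaps)
```

A check of this kind could run before the result is handed to the next skill, so a malformed result stops the pipeline instead of propagating.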

2. Keyword Trigger Table — Mapping Natural Language to Skills

When a user says "look into this topic for me," which skill should the agent invoke?

A keyword trigger table in CLAUDE.md handles this mapping automatically:

| Keyword | Skill | Path |
|---|---|---|
| 리서치, 조사, 검색, 찾아줘, research | research | .claude/skills/research/SKILL.md |
| 정리, 컴파일, 소스 작성, 자료 만들어 | compile | .claude/skills/compile/SKILL.md |
| 검증, 팩트체크, 확인, 사실 확인 | verify | .claude/skills/verify/SKILL.md |
| 블로그, 게시, 발행, publish | publish | .claude/skills/publish/SKILL.md |

Design decisions worth noting:

  • Register both Korean and English keywords. The user's language choice is unpredictable.
  • Anchor keywords on verbs. "조사해줘" and "찾아줘" appear more often in natural requests than the noun "리서치" alone.
  • No overlapping keywords. If "확인" maps to two skills, the agent cannot tell which one to load. Each keyword must resolve to exactly one skill.
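The no-overlap rule can be checked mechanically before a new keyword is registered. This is a minimal sketch that assumes the table has already been loaded into a dictionary; `find_collisions` is a hypothetical helper, and it only detects exact duplicates, not one keyword being a substring of another.

```python
from collections import defaultdict

# Keyword rows copied from the trigger table above.
TRIGGERS = {
    "research": ["리서치", "조사", "검색", "찾아줘", "research"],
    "compile": ["정리", "컴파일", "소스 작성", "자료 만들어"],
    "verify": ["검증", "팩트체크", "확인", "사실 확인"],
    "publish": ["블로그", "게시", "발행", "publish"],
}

def find_collisions(triggers: dict) -> dict:
    """Return keywords that are registered for more than one skill."""
    owners = defaultdict(list)
    for skill, keywords in triggers.items():
        for keyword in keywords:
            owners[keyword.lower()].append(skill)
    return {kw: skills for kw, skills in owners.items() if len(skills) > 1}

# The table as given has no duplicates.
collisions = find_collisions(TRIGGERS)

# Simulate registering "확인" under compile as well.
with_dup = find_collisions({**TRIGGERS, "compile": TRIGGERS["compile"] + ["확인"]})
print(collisions, with_dup)
```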

Because this table lives in CLAUDE.md, the agent reads it automatically at session start. Skill activation requires only keyword matching — no separate routing logic.
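In the harness the agent does this matching itself by reading the table. To make the behaviour concrete, here is a sketch of the same lookup as plain substring matching; `route` and `TRIGGER_TABLE` are hypothetical names, and substring matching is an assumption about how a keyword "matches".

```python
from typing import Optional, Tuple

# Rows copied from the trigger table above: (keywords, skill, path).
TRIGGER_TABLE = [
    (["리서치", "조사", "검색", "찾아줘", "research"], "research", ".claude/skills/research/SKILL.md"),
    (["정리", "컴파일", "소스 작성", "자료 만들어"], "compile", ".claude/skills/compile/SKILL.md"),
    (["검증", "팩트체크", "확인", "사실 확인"], "verify", ".claude/skills/verify/SKILL.md"),
    (["블로그", "게시", "발행", "publish"], "publish", ".claude/skills/publish/SKILL.md"),
]

def route(request: str) -> Optional[Tuple[str, str]]:
    """Return (skill, path) for the first row with a keyword in the request."""
    text = request.lower()
    for keywords, skill, path in TRIGGER_TABLE:
        if any(keyword.lower() in text for keyword in keywords):
            return skill, path
    return None

print(route("이 주제 좀 조사해줘"))
print(route("Publish the verified draft"))
print(route("look into this topic for me"))
```

The last call returns None: a paraphrase that contains none of the registered keywords falls through a literal match. That gap is what the description field covers, since the agent reads it to judge whether a skill applies.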

3. Inter-Skill Pipelines — From Individual Skills to Workflows

Each skill is useful on its own. Connecting skills turns complex tasks into reproducible pipelines.

The core pipeline in operation:

research → compile → verify → publish

Each skill's output is the next skill's input:

  • Research output (raw gathered material) → Compile structures it into a source file
  • Compiled source file → Verify runs a four-stage fact-check
  • Verified source → Publish converts it to a blog post and submits it

Pipelines are defined as presets in CLAUDE.md:

| Preset | Pipeline |
|---|---|
| research | deep-interview → research → compile → verify → publish |
| quick-draft | executor → verify |
| review | code-review → verify |

The value of presets is eliminating decision cost. There is no need to reason about the processing order for each request. A complex research request follows the research preset.
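The composition idea behind presets can be sketched in a few lines: each skill is treated as a function from input to output, and a preset is just an ordered list of names. The stub skills below are hypothetical stand-ins that tag their input so the execution order is visible; real skills are procedures the agent follows, not Python functions.

```python
from typing import Callable, Dict

# Presets copied from the table above.
PRESETS = {
    "research": ["deep-interview", "research", "compile", "verify", "publish"],
    "quick-draft": ["executor", "verify"],
    "review": ["code-review", "verify"],
}

Skill = Callable[[str], str]

def make_stub(name: str) -> Skill:
    # Hypothetical stand-in for a skill: appends its name to the payload.
    return lambda payload: f"{payload} > {name}"

def run_preset(preset: str, payload: str, skills: Dict[str, Skill]) -> str:
    """Feed each skill's output into the next skill in the preset."""
    for name in PRESETS[preset]:
        payload = skills[name](payload)
    return payload

# One stub per skill name that appears in any preset.
SKILLS = {name: make_stub(name) for steps in PRESETS.values() for name in steps}

result = run_preset("quick-draft", "request", SKILLS)
print(result)  # request > executor > verify
```

Because the order lives in data rather than in the caller's head, choosing a preset is the only decision left per request.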

4. When to Split a Skill — Separation Criteria

Too granular and maintenance overhead grows. Too coarse and skills lose their meaning. Clear criteria are required.

Signals to split:

  • Can it run independently? The verify skill can run on an existing source file with no prior research. If independent execution is meaningful, it belongs in its own skill.
  • Is it reused across pipelines? verify appears in both the research pipeline and the review pipeline. Reuse is a strong signal to separate.
  • Does the procedure exceed five steps? Longer procedures increase the probability that the agent skips intermediate steps. Splitting is safer.

Signals to merge:

  • Always executed together, and standalone execution is meaningless.
  • Both operations share identical input and output.
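Of the split signals above, reuse is the one that can be read directly off the preset table from section 3. The sketch below counts how many presets each skill appears in; `reuse_counts` is a hypothetical helper, and the other signals (independent execution, procedure length) still need human judgment.

```python
from collections import Counter

# Presets copied from the table in section 3.
PRESETS = {
    "research": ["deep-interview", "research", "compile", "verify", "publish"],
    "quick-draft": ["executor", "verify"],
    "review": ["code-review", "verify"],
}

def reuse_counts(presets: dict) -> Counter:
    """Count the number of presets each skill appears in."""
    counts = Counter()
    for steps in presets.values():
        counts.update(set(steps))  # set() so a repeated step counts once per preset
    return counts

counts = reuse_counts(PRESETS)
reused = sorted(name for name, n in counts.items() if n > 1)
print(reused)
```

With these presets, verify is the only skill that appears in more than one pipeline, which matches the argument for keeping it separate.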

5. Skill Usage Tracking — memory-map.md

Skills that are built but never triggered accumulate as dead weight. A usage tracking table in memory-map.md surfaces this:

## Skill Usage Tracking
| Skill | Last Used | Frequency |
|---|---|---|
| research | 2026-04-03 | frequent |
| compile | 2026-04-03 | frequent |
| verify | 2026-04-02 | frequent |
| publish | 2026-04-01 | moderate |
| deep-interview | 2026-03-30 | occasional |
| brainstorming | — | unused |

An unused skill is one of two things: the trigger keywords are wrong, or the skill was never needed in the first place.

A rule in CLAUDE.md updates last_used and count automatically each time a skill triggers. Tracking that depends on manual updates does not last.
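The bookkeeping itself is small. The sketch below shows the update and the unused-skill query as plain Python, assuming an in-memory tracker; in the harness the agent edits memory-map.md directly, and `Usage`, `record_use`, and `unused` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional

@dataclass
class Usage:
    last_used: Optional[date] = None  # None means the skill has never triggered
    count: int = 0

def record_use(tracker: Dict[str, Usage], skill: str, day: date) -> None:
    """Update last_used and count for a skill that just triggered."""
    entry = tracker.setdefault(skill, Usage())
    entry.last_used = day
    entry.count += 1

def unused(tracker: Dict[str, Usage]) -> List[str]:
    """List registered skills that have never triggered."""
    return sorted(name for name, entry in tracker.items() if entry.count == 0)

# Register three skills, then record a few trigger events.
tracker = {name: Usage() for name in ["research", "verify", "brainstorming"]}
record_use(tracker, "research", date(2026, 4, 3))
record_use(tracker, "verify", date(2026, 4, 2))
record_use(tracker, "research", date(2026, 4, 3))

print(unused(tracker))
```

Registering every skill up front, even with a zero count, is what makes the unused ones visible. A tracker that only records events would never list brainstorming at all.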

6. Skill Group Responsibilities

Skills divide into three functional groups:

| Group | Representative Skills | Role |
|---|---|---|
| Content Pipeline | research, compile, verify | Research → source file structuring → fact-check. Core workflow. |
| Publishing & Delivery | publish, twitter-draft | Blog API submission, Twitter draft delivery. Exit skills. |
| Design & Diagnostics | deep-interview, brainstorming, writing-plans, project-doctor, and others | Planning, interviewing, review, self-audit. Judgment-support skills. |

Because each skill owns one responsibility, individual skill files stay short, typically 50–90 lines.


Lessons Learned

  • Starting without skills and loading every procedure into CLAUDE.md is a common mistake. With the research, verification, and publishing procedures all in CLAUDE.md, the file grew past 400 lines, and the agent began ignoring procedures near the bottom. After the procedures were extracted into skills, CLAUDE.md dropped to the 100-line range, and only the relevant procedure loads at the relevant time.

  • Keyword collisions emerge sooner than expected. "확인" could have matched both verify and compile. Every new keyword must be checked against the existing table before registration.

  • Omitting "why" from a skill file produces mechanical compliance. A skill with procedure but no stated purpose cannot adapt when the situation calls for judgment. The Purpose section is not optional.

  • Pipelines do not need to be fully sequential. Simple requests follow the quick-draft preset and skip intermediate skills. Forcing every request through a four-stage pipeline is over-engineering.


Conclusion

The core of a skill system is separation and composition. Procedures are extracted from CLAUDE.md into independent skill files, matched to user requests via keyword triggers, and connected into pipelines that make complex workflows reproducible.

The structural advantage is extensibility. Adding a new capability requires one new skill file and one new row in the trigger table. Existing skills and CLAUDE.md require no modification.

An agent's capability is not determined by the length of CLAUDE.md. It is determined by the number and quality of its skills.
