"Harness Engineering Basics (3/4) — Why Instruction Structure and Context Design Matter More Than Longer Prompts"

May 18, 2026

Many teams treat agent quality as a prompt-writing problem: should the prompt be longer, more detailed, more explicit? In practice, the more important problem is often structural. Which rules stay always visible, and which information is fetched only when needed? Good instruction files behave less like encyclopedias and more like signposts.

Key Takeaways

Prompt engineering improves one input message. Context design determines what information enters the work loop, when, and at what level.
AGENTS.md, CLAUDE.md, skills, and handoff notes should not all do the same job. Separate shared rules, runtime rules, reusable procedures, and changing state.
Context is a scarce resource, so the better design is usually to show less by default and fetch more on demand.
Long instruction files often feel safer, but they can bury the critical rules and raise the cost of every session start.
In practice, instruction quality is often less about wording and more about placement.

1. Why structure matters more than longer prompts

As Part 2 showed, agents do not live in a one-shot question-answer pattern. They operate across multiple turns, reading material, calling tools, and revising plans. That makes the assembly of inputs more important than any single sentence.

So context design is not just prompt engineering with a bigger budget. It is a higher-level operating design that includes prompts inside it.

Question	Long-prompt mindset	Context-design mindset
Goal	explain everything at once	stage the right information by layer
Assumption	more detail helps	too much detail can bury the signal
Failure mode	bloated, conflicting instructions	needed material arrives too late or not at all
Fix	add more wording	split structure, reference, load on demand

Strong agents are therefore less like "agents with better prompts" and more like agents with better information routing.

2. Instruction files are different layers, not interchangeable documents

The reading notes and series design document make the same point: instruction, context, and memory should not all be piled into one file. They have different roles.

In this repository, a four-layer split is a natural minimum.

Layer	Example	Role
Shared rules	`AGENTS.md`	project boundaries, hard constraints, role policy
Runtime rules	`CLAUDE.md` family	workflow habits, priorities, execution style
Reusable procedures	skills, checklists, templates	encapsulated repeatable processes
Changing state	`tasks/plan.md`, `tasks/handoffs/`, `tasks/sessions/`	current progress and re-entry state

This is not document taxonomy for its own sake. It affects performance directly.

Shared rules should be stable.
Runtime rules should guide behavior without becoming huge.
Reusable procedures can be loaded only when relevant.
Changing state should persist, but not always be injected.

If all of these are mixed together, the agent struggles to distinguish "must always know" from "only matters right now."
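
To make the split concrete, here is a minimal sketch of the four layers as data. The `Layer` and `LoadPolicy` names are hypothetical, and the load policy assigned to each layer is one reading of the table above, not a fixed rule.

```python
from dataclasses import dataclass
from enum import Enum


class LoadPolicy(Enum):
    ALWAYS = "always"        # injected at every session start
    ON_DEMAND = "on_demand"  # fetched only when the task calls for it


@dataclass(frozen=True)
class Layer:
    name: str
    examples: tuple
    policy: LoadPolicy


# Assumption: shared and runtime rules are always loaded, while
# procedures and changing state are fetched on demand.
LAYERS = (
    Layer("shared rules", ("AGENTS.md",), LoadPolicy.ALWAYS),
    Layer("runtime rules", ("CLAUDE.md",), LoadPolicy.ALWAYS),
    Layer("reusable procedures", ("skills", "checklists", "templates"),
          LoadPolicy.ON_DEMAND),
    Layer("changing state",
          ("tasks/plan.md", "tasks/handoffs/", "tasks/sessions/"),
          LoadPolicy.ON_DEMAND),
)


def always_loaded(layers):
    """Return the names of layers the agent sees in every session."""
    return [layer.name for layer in layers if layer.policy is LoadPolicy.ALWAYS]
```

Writing the policy down next to each layer makes the "must always know" versus "only matters right now" distinction explicit instead of leaving it implied by file names.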

3. Good instruction files are maps, not encyclopedias

One of the most reusable ideas from the earlier drafts and notes is simple: AGENTS.md and CLAUDE.md should function more like maps than encyclopedias.

Why?

3.1 Long always-on files raise the cost of every session

If a document is loaded every time, its size becomes recurring overhead.
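
A rough worked example of that overhead, using made-up numbers (the token counts and session count below are purely illustrative):

```python
def recurring_overhead(file_tokens: int, sessions: int) -> int:
    """Total tokens spent re-loading an always-on file across sessions."""
    return file_tokens * sessions


# Hypothetical figures: a 6,000-token instruction file versus an
# 800-token one, each loaded at the start of 200 sessions.
long_file = recurring_overhead(file_tokens=6000, sessions=200)   # 1,200,000
short_file = recurring_overhead(file_tokens=800, sessions=200)   # 160,000
savings = long_file - short_file                                 # 1,040,000
```

The exact numbers do not matter. The point is that the cost scales with the number of sessions, so every paragraph in an always-on file is paid for repeatedly.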

3.2 Critical rules get buried

The one rule that truly matters disappears inside twenty paragraphs of secondary explanation.

3.3 Changing material makes the file stale quickly

Once temporary state, exceptions, and one-off details get mixed in, the document stops being trustworthy.

Good instruction files usually share these traits:

short
clearly scoped
explicit about priorities and prohibitions
detailed procedures moved into separate docs or skills
changing state kept in handoff or plan artifacts

Their purpose is not to explain everything. It is to keep the agent from getting lost.
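
Some of these traits can be checked mechanically. The following is a small lint sketch, not an established tool: the 80-line limit and the marker words are arbitrary assumptions chosen for illustration.

```python
# Hypothetical markers that suggest temporary state has leaked
# into an always-on instruction file.
STATE_MARKERS = ("todo", "in progress", "wip")


def lint_instruction_file(text: str, max_lines: int = 80) -> list:
    """Return warnings for an instruction file that is too long or
    appears to contain changing state."""
    warnings = []
    non_empty = [line for line in text.splitlines() if line.strip()]
    if len(non_empty) > max_lines:
        warnings.append(
            f"too long: {len(non_empty)} non-empty lines (limit {max_lines})"
        )
    for number, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        if any(marker in lowered for marker in STATE_MARKERS):
            warnings.append(
                f"line {number}: looks like changing state, "
                "consider moving it to a plan or handoff file"
            )
    return warnings
```

A check like this will produce false positives, so it is better treated as a prompt for review than as a gate.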

4. Context should be split into "always visible" and "load when needed"

Good context design is not about maximum volume. It is about placement. The simplest useful split is:

Type	What belongs there	Design rule
Always-visible context	core role, hard constraints, output rules, current objective	keep it short and stable
Load-when-needed context	detailed docs, prior session notes, large references, bulky outputs	expose by reference and fetch on demand

This matters because agents do not attend equally well to everything in a large context at once. In many cases, the stronger design is to put only the essential material front and center, while leaving the rest accessible by path or tool.

The repository structure here already follows that pattern.

current active work in tasks/plan.md
memory navigation in docs/memory-map.md
prior session state in tasks/sessions/
durable boundaries in AGENTS.md

That separation makes new-session re-entry much easier.
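
One way to sketch this split in code: inline the always-visible files and list everything else by path only. The function name `build_session_context` and the section headings it emits are assumptions for illustration, and how the resulting string reaches the agent depends on the harness.

```python
from pathlib import Path


def build_session_context(root, always_visible, on_demand):
    """Inline always-visible files; expose the rest by path only."""
    root = Path(root)
    parts = []
    for rel in always_visible:
        path = root / rel
        # Missing files are skipped rather than treated as errors.
        if path.is_file():
            body = path.read_text(encoding="utf-8").strip()
            parts.append(f"## {rel}\n{body}")
    # On-demand material is never read here, only referenced.
    refs = "\n".join(f"- {rel}" for rel in on_demand)
    parts.append("## Available on request (read only when needed)\n" + refs)
    return "\n\n".join(parts)
```

A call such as `build_session_context(".", ["AGENTS.md", "tasks/plan.md"], ["docs/memory-map.md", "tasks/sessions/"])` would keep the durable boundaries and the current objective in view while leaving session history as a pointer.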

5. What belongs where

The most common practical question is straightforward: "Which file should hold this information?"

In practice, the following split is simple and durable.

Put this in `AGENTS.md`

project-wide prohibitions
role policy
language rules
hard boundaries that should never be crossed

Put this in the `CLAUDE.md` layer

workflow order
first documents to read
editing and verification habits
default tool-usage rules

Put this in skills or separate docs

long procedures
infrequent workflows
tasks that need many examples
operational knowledge that should load only when relevant

Put this in `tasks/` artifacts

current progress state
re-entry points for the next session
open risks and unresolved questions

The cleaner this separation is, the easier it becomes for the agent to stay focused on the current turn.

6. Symptoms of bad context design

When context design goes wrong, the model may look inconsistent, but the root cause is often structural. Common symptoms include:

6.1 The agent keeps missing rules

The key rule is likely buried or duplicated across conflicting files.

6.2 The output becomes unnecessarily long

If the input is long and its priorities are unclear, the model cannot tell what is safe to compress, so the output grows to match.

6.3 Every session feels slow to warm up

Your always-loaded instruction layer may be too large, or it may include material that is not needed right now.

6.4 Prior work gets re-derived again and again

If you rely on conversation history instead of handoff artifacts, re-entry becomes expensive and unreliable.
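
A handoff artifact can be very small. The sketch below writes and reads one as JSON. The file naming scheme and the three fields are assumptions for illustration, and a Markdown note under `tasks/handoffs/` would serve the same purpose.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_handoff(directory, done, next_steps, open_questions):
    """Persist the re-entry state for the next session."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    # UTC timestamp in the name keeps files sortable by creation time.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = directory / f"handoff_{stamp}.json"
    note = {
        "done": done,
        "next_steps": next_steps,
        "open_questions": open_questions,
    }
    path.write_text(json.dumps(note, indent=2, ensure_ascii=False),
                    encoding="utf-8")
    return path


def latest_handoff(directory):
    """Load the most recent handoff note, or None if there is none."""
    files = sorted(Path(directory).glob("handoff_*.json"))
    if not files:
        return None
    return json.loads(files[-1].read_text(encoding="utf-8"))
```

The next session starts by reading the latest note instead of replaying conversation history. Note that this naming scheme has one-second resolution, so two handoffs written within the same second would overwrite each other.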

7. Practical checklist for instruction structure

At the beginner level, a few structural questions are more useful than any framework.

Does this rule need to be always visible, or only loaded when needed?
Is this project-wide guidance, or temporary task state?
Should this be repeated as a long instruction, or separated into a reusable skill?
Is this document helping because it is long, or hiding the important rule because it is long?
Could the next session resume from this structure without re-deriving everything?

Those five questions alone often clean up a large part of an agent environment.
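
The first three questions can be read as a small decision procedure. This is one possible reading, sketched with hypothetical parameter names. The ordering of the checks is a judgment call, not something the layers themselves dictate.

```python
def suggest_location(*, temporary_state: bool, always_needed: bool,
                     hard_boundary: bool) -> str:
    """Suggest where a piece of guidance belongs."""
    if temporary_state:
        # Changing state persists, but outside the instruction files.
        return "tasks/ artifact"
    if not always_needed:
        # Long or infrequent material is loaded only when relevant.
        return "skill or separate doc"
    if hard_boundary:
        # Stable, project-wide constraints.
        return "AGENTS.md"
    # Always-needed workflow habits and priorities.
    return "CLAUDE.md layer"
```

For example, "never commit secrets" is a stable, always-needed hard boundary and lands in `AGENTS.md`, while "the migration is half done" is temporary state and lands in a `tasks/` artifact.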

8. Placement often matters more than wording

This does not mean prompt engineering is useless. It means prompt quality is not enough in an agent environment. Even strong wording loses force inside a bad structure.

The reverse is also true. With a good structure, short instructions often become stronger.

keep top-level rules short and fixed
separate changing state into ledgers or handoffs
move long procedures into skills or reference docs
design for on-demand loading

That is the shift from "prompt engineering" toward "harness engineering."

References

docs/blog_series_하네스엔지니어링_총괄_design.md
sources/260518_하네스엔지니어링_15장_블로그활용노트.md
drafts/blog/260429_harness_series_02_context_engineering_en.md
WikiDocs, Chapter 3 notes from 하네스 엔지니어링 백과사전

This is Part 3 of the Harness Engineering Basics series. Next: MCP, tool engineering, and how to design the tool surface deliberately.

Series overview: Harness Engineering Series Guide

MaJu Tech Notes