Agent Operations Design Notes (6/9) — Where to Draw the Line Between Allow, Ask, and Deny
The moment an agent can read files, edit them, run commands, and send something outside the system, quality and safety stop being separate concerns. Many teams still treat permission design as a simple choice between blocking and allowing. In practice, the more useful question is different: what should be allowed by default, what should stop for human approval, and what should be structurally forbidden?
Key Summary
- The core of permission design is not merely having three buckets called allow, ask, and deny. It is having a consistent rule for which actions belong in which bucket.
- The same broad label of tool use hides very different risks. Reading, editing, executing, externally sending, and publishing should not live in one policy bucket.
- Anthropic's Claude Code permission model and the OpenAI Agents SDK guardrail model use different surfaces, but both treat permissions as structure, not wording.
- Permission design is not only a security problem. If it is cut poorly, the agent stalls too often. If it is too loose, the blast radius becomes too large. So permissions are also a quality design problem.
1. Why permission design is also an agent quality problem
Permissions are often discussed only in security language. That is reasonable, but incomplete.
In real operations:
- if too much is allowed, one bad action can travel too far
- if too much asks for approval, the workflow keeps breaking
- if too much is denied, the agent spends more effort on workaround behavior than on useful work
So permission design is not only about how safe the system is. It is also about how reliably the system can keep working from start to finish.
That is why modern agent platforms increasingly discuss tool access, sandboxing, approvals, and hooks as one combined surface rather than as isolated toggles.
2. The base frame: allow, ask, and deny are three kinds of responsibility
Many teams read the three buckets too literally:
- allow: just let it happen
- ask: ask a human first
- deny: block it
Operationally, those are not just three buttons. They imply different ownership models.
| Bucket | Meaning | Operational responsibility |
|---|---|---|
| allow | safe enough to proceed automatically | can logs and post-checks handle failure |
| ask | needs human judgment in the middle | who approves, with what context, and how often |
| deny | should be structurally blocked | what counts as an exception and how it is recorded |
The key idea is simple:
Ask does not merely mean "dangerous." It means the organization does not want the current automation layer making that decision alone.
If that distinction is blurry, ask starts to spread everywhere. Once that happens, approval quality usually collapses.
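One way to keep the ownership distinction visible is to store it next to the decision values themselves. The sketch below mirrors the three rows of the table above. The `Responsibility` type and the owner labels ("automation", "named approver", "policy owner") are assumptions made for illustration, not terms from any particular platform.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"
    DENY = "deny"


@dataclass(frozen=True)
class Responsibility:
    owner: str          # hypothetical label for who is accountable for this bucket
    open_question: str  # what that owner has to be able to answer


# Hypothetical ownership table; the questions come from the table above.
RESPONSIBILITIES = {
    Decision.ALLOW: Responsibility(
        "automation", "Can logs and post-checks handle a failure?"),
    Decision.ASK: Responsibility(
        "named approver", "Who approves, with what context, and how often?"),
    Decision.DENY: Responsibility(
        "policy owner", "What counts as an exception, and how is it recorded?"),
}


def describe(decision: Decision) -> str:
    """Render a bucket together with its owner and open question."""
    r = RESPONSIBILITIES[decision]
    return f"{decision.value}: owned by {r.owner}; must answer: {r.open_question}"
```

If a bucket has no owner who can answer its question, that is usually a sign the bucket is being used as a default rather than as a decision.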
3. First rule: do not put read, edit, execute, send, and publish in one bucket
Permission models usually become vague when superficially similar actions are grouped together.
All of these can look like tool use:
| Action | Surface appearance | Real risk |
|---|---|---|
| reading local docs | lookup | relatively low |
| editing a draft file | write | medium |
| running tests in a shell | execution | medium to high |
| calling an external API | network action | possible data leakage |
| publishing a blog post | transmission / publication | hard to undo |
| touching credential files | read | very high |
That is why a practical system should usually separate at least:
- read
- local edit
- code or command execution
- external sending
- publication or deployment
- credential or protected asset access
Without that separation, allow becomes too broad, ask becomes noisy, and deny arrives too late.
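The separation above can be sketched as a small classifier that maps a tool call to a category before any decision is made. Everything here is illustrative: the tool names, the `PROTECTED_MARKERS` patterns, and the default decisions are assumptions, and a real policy would be set per organization.

```python
from enum import Enum


class Action(Enum):
    READ = "read"
    LOCAL_EDIT = "local_edit"
    EXECUTE = "execute"
    EXTERNAL_SEND = "external_send"
    PUBLISH = "publish"
    PROTECTED_ACCESS = "protected_access"


# Illustrative defaults only, not a recommendation for any specific system.
DEFAULT_POLICY = {
    Action.READ: "allow",
    Action.LOCAL_EDIT: "allow",
    Action.EXECUTE: "ask",
    Action.EXTERNAL_SEND: "ask",
    Action.PUBLISH: "ask",
    Action.PROTECTED_ACCESS: "deny",
}

PROTECTED_MARKERS = (".env", "credentials", "id_rsa")  # hypothetical patterns

TOOL_CATEGORIES = {  # hypothetical tool names
    "read_file": Action.READ,
    "edit_file": Action.LOCAL_EDIT,
    "run_shell": Action.EXECUTE,
    "http_request": Action.EXTERNAL_SEND,
    "publish_post": Action.PUBLISH,
}


def classify(tool: str, target: str) -> Action:
    """Map a tool call to a category; a protected target overrides the tool type."""
    if any(marker in target for marker in PROTECTED_MARKERS):
        return Action.PROTECTED_ACCESS
    # Unknown tools fall into the most restrictive category, not the loosest.
    return TOOL_CATEGORIES.get(tool, Action.PROTECTED_ACCESS)


def decide(tool: str, target: str) -> str:
    return DEFAULT_POLICY[classify(tool, target)]
```

The point of checking the target first is the last row of the table: reading a credential file looks like a read on the surface, but it should not inherit the read policy.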
4. Second rule: use ask only where human judgment adds real value
Many systems quietly fail because ask becomes the lazy default. If something looks risky, the system asks a human and pushes the cost into operations.
But ask is not free:
- it introduces waiting
- it breaks momentum
- it creates inconsistency between reviewers
- it makes important approvals harder to notice if used too often
So ask should be reserved for boundaries where human review is genuinely useful, such as:
- sending content outside the organization
- hard-to-revert writes
- expensive or far-reaching execution
- expanding data access scope
- any action that raises privileges
Low-blast-radius work such as reading docs or editing a narrow draft file usually should not stop the flow.
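A minimal sketch of that rule: an approval prompt is raised only when at least one of the boundaries listed above is crossed, and the prompt carries the reasons with it. The `ActionTraits` fields are hypothetical names for those boundaries, and how each flag gets set is left to the surrounding system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionTraits:
    leaves_organization: bool = False
    hard_to_revert: bool = False
    expensive_or_wide: bool = False
    expands_data_access: bool = False
    raises_privileges: bool = False


def ask_reasons(traits: ActionTraits) -> list[str]:
    """Return why a human should review this action; an empty list means no prompt."""
    checks = [
        (traits.leaves_organization, "sends content outside the organization"),
        (traits.hard_to_revert, "write is hard to revert"),
        (traits.expensive_or_wide, "execution is expensive or far-reaching"),
        (traits.expands_data_access, "expands data access scope"),
        (traits.raises_privileges, "raises privileges"),
    ]
    return [reason for flag, reason in checks if flag]


def should_ask(traits: ActionTraits) -> bool:
    return bool(ask_reasons(traits))
```

Returning reasons rather than a bare boolean gives the reviewer context, which is part of what makes an approval worth its cost.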
5. Third rule: deny should be explainable, not merely absolute
Deny looks simple, but it is the strongest policy surface in the system. That is why it is not enough to know what is blocked. It must also be clear why it is blocked.
Typical deny candidates include:
- direct credential access
- external publication without approval
- edits outside the allowed workspace
- outbound network transmission without policy clearance
- direct modification of protected deploy or publish paths
Good deny rules usually have these properties:
- their scope is explicit
- any exception path is separately defined
- violations are logged
- the human conditions for opening an exception are documented
So deny is less like an arbitrary wall and more like an organization's irreversible boundary.
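The four properties above can be carried on the rule itself, so a block is never just a silent refusal. This is a sketch under assumptions: the rule names, glob patterns, and exception processes are invented for illustration, and `fnmatch` globbing is only one possible way to express scope.

```python
import logging
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Optional

logger = logging.getLogger("agent.permissions")


@dataclass(frozen=True)
class DenyRule:
    name: str
    pattern: str         # explicit scope, as a glob over the target path
    reason: str          # why the boundary exists
    exception_path: str  # documented human process for opening an exception


# Hypothetical rules; patterns and processes are placeholders.
DENY_RULES = [
    DenyRule("credentials", "*/.ssh/*",
             "direct credential access", "security-team ticket"),
    DenyRule("deploy-path", "deploy/*",
             "protected deploy path", "release-manager sign-off"),
]


def check_deny(target: str) -> Optional[DenyRule]:
    """Return the first matching deny rule, logging the violation, or None."""
    for rule in DENY_RULES:
        if fnmatch(target, rule.pattern):
            logger.warning("denied %s by rule %s: %s",
                           target, rule.name, rule.reason)
            return rule
    return None
```

Because the matched rule is returned rather than a bare `False`, the caller can tell the agent and the operator both why the action was blocked and how an exception would be requested.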
6. What Claude Code and the OpenAI Agents SDK have in common
Anthropic and OpenAI implement permission surfaces differently, but the operating lesson is similar.
Claude Code describes permissions through several layers such as hooks, ask rules, deny rules, allow rules, and tool permission callbacks. That makes one point very clear: permissions are not a single boolean setting.
The OpenAI Agents SDK discusses guardrails at multiple surfaces too, including input/output guardrails and tool-level controls. As workflows become more complex, permission logic has to move closer to the actual action surface.
The shared lesson is this:
- permissions live outside the model
- permissions cannot be separated cleanly from tool design
- permissions usually get more granular as workflows become more operational
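To make "permissions live outside the model" concrete, here is a generic sketch of a gate that wraps each tool function. It does not reproduce the Claude Code or OpenAI Agents SDK APIs. The `gated` decorator, the `POLICY` table, and the `auto_reject` approver are all hypothetical stand-ins.

```python
import functools
from typing import Callable


class PermissionDenied(Exception):
    """Raised when policy or a reviewer blocks a tool call."""


def gated(tool_name: str,
          policy: dict[str, str],
          approve: Callable[[str, tuple], bool]):
    """Wrap a tool so the policy is enforced in code, not in the prompt."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args):
            decision = policy.get(tool_name, "deny")  # unlisted tools are denied
            if decision == "deny":
                raise PermissionDenied(f"{tool_name} is denied by policy")
            if decision == "ask" and not approve(tool_name, args):
                raise PermissionDenied(f"{tool_name} was not approved")
            return fn(*args)
        return wrapper
    return decorator


POLICY = {"read_doc": "allow", "send_email": "ask"}  # hypothetical policy


def auto_reject(tool_name: str, args: tuple) -> bool:
    return False  # stand-in for a real approval prompt


@gated("read_doc", POLICY, auto_reject)
def read_doc(path: str) -> str:
    return f"contents of {path}"


@gated("send_email", POLICY, auto_reject)
def send_email(to: str) -> str:
    return f"sent to {to}"
```

Whatever the model writes in its output, `send_email` cannot run here unless the approver returns `True`. That is the sense in which the permission is structure rather than wording.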
7. Five practical questions for cutting a permission model
Abstract principles help, but the most practical questions are usually these:
- If this action fails, is recovery easy?
- Does this action leave an external trace?
- Does it touch protected assets or wide scope?
- Can a human meaningfully make a better judgment here?
- Is a log enough, or does this need prevention before execution?
These questions often lead to a structure like this:
| Result | Default suggestion |
|---|---|
| small blast radius, easy recovery | allow |
| human judgment adds real value | ask |
| irreversible or protected-asset action | deny or strong ask |
This does not replace local policy, but it does reduce the common failure mode of using ask everywhere.
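The five questions and the table above can be sketched as a small decision function. The ordering of the checks is one possible reading, not a rule from the text. Treating "hard to recover plus external trace" as irreversible is an assumption, and so is falling back to deny when prevention is needed but a reviewer would add nothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Answers:
    easy_recovery: bool      # If this action fails, is recovery easy?
    external_trace: bool     # Does it leave an external trace?
    protected_or_wide: bool  # Does it touch protected assets or wide scope?
    human_adds_value: bool   # Can a human meaningfully judge better here?
    needs_prevention: bool   # False means a log after the fact is enough


def suggest(a: Answers) -> str:
    """Suggest a default bucket; local policy can still override it."""
    irreversible = (not a.easy_recovery) and a.external_trace
    if a.protected_or_wide or irreversible:
        # "deny or strong ask": only ask if a reviewer can genuinely judge it.
        return "strong ask" if a.human_adds_value else "deny"
    if a.needs_prevention:
        # A log is not enough, so something must stop the action beforehand.
        return "ask" if a.human_adds_value else "deny"
    if a.human_adds_value:
        return "ask"
    return "allow"
```

For example, a narrow draft edit answers easy recovery, no external trace, no protected assets, no added human value, and log is enough, so it lands in allow and never interrupts the flow.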
8. Three common failure patterns
8.1 Treating all writes as ask
That may look safe at first, but it creates human bottlenecks quickly. Low-risk draft edits do not deserve the same treatment as production-affecting changes.
8.2 Treating external send and local execution as the same risk
They can both look like execution, but undo cost is very different. External send usually leaves a harder-to-reverse trail.
8.3 Adding deny only after incidents happen
Teams often start wide open and tighten later. For credential access or unapproved publication, that is usually backwards. Those boundaries are better designed up front.
9. A quick health check for your permission model
If two or three of these are true, the model's boundaries are probably blurred:
- approval prompts appear so often that humans mostly auto-approve them
- the same kind of action is handled differently depending on the day
- external sending and local editing share the same policy rule
- after a policy failure, it is hard to explain why the system allowed it
- the agent's biggest blocker is not model weakness but permission ambiguity
A strong model does not cripple the agent. It creates broad low-risk allow lanes, sharp high-value ask lanes, and explicit irreversible deny boundaries.
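The first two checks in the list above can be estimated from an approval log. The sketch below assumes a hypothetical `ApprovalEvent` record and an arbitrary three-second threshold for "too fast to have been read". Mixed outcomes for one action kind are only a signal to investigate, since reviewers may have had good reasons to decide differently.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ApprovalEvent:
    action_kind: str
    approved: bool
    seconds_to_decide: float


def rubber_stamp_rate(events, fast_seconds: float = 3.0) -> float:
    """Share of prompts approved faster than the (assumed) reading threshold."""
    if not events:
        return 0.0
    fast = sum(1 for e in events
               if e.approved and e.seconds_to_decide < fast_seconds)
    return fast / len(events)


def mixed_outcome_kinds(events) -> list[str]:
    """Action kinds that were both approved and rejected in the log."""
    outcomes = defaultdict(set)
    for e in events:
        outcomes[e.action_kind].add(e.approved)
    return sorted(kind for kind, seen in outcomes.items() if len(seen) == 2)


log = [
    ApprovalEvent("edit_draft", True, 1.0),
    ApprovalEvent("edit_draft", True, 2.0),
    ApprovalEvent("send_external", True, 40.0),
    ApprovalEvent("send_external", False, 55.0),
]
```

With this sample log, every `edit_draft` prompt was approved within two seconds, which suggests that kind of action may belong in an allow lane rather than an ask lane.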
10. Conclusion: allow, ask, and deny are not just settings, but operating philosophy
Permission design is not only a safety-team problem. In real agent systems, the permission model becomes part of the quality model.
The deeper questions are:
- how small should failure radius be
- where should human approval be spent
- which boundaries should never be crossed automatically
So allow / ask / deny is not just three configuration values. It is a way for an organization to express how much autonomy it grants an agent and where it draws the boundary of responsibility.
Related Internal Links
- In the Managed Agents Era, How Should You Design an Approval Loop?
- Sandboxing Is Not Just a Security Feature. It Is a Quality Structure.
- Agent Evaluation Is Closer to Regression Testing Than to a Scorecard
- In Long-Running Agent Operations, Handoff Design Comes Before Memory
- What a Good Agent Memory Architecture Looks Like
References
- Anthropic, Permissions management
- Anthropic, Hooks reference
- Anthropic, Subagents
- OpenAI Agents SDK, Guardrails
- OpenAI Agents SDK, Tool guardrails
- OpenAI, The next evolution of the Agents SDK, 2026-04-15
Series overview: Series index