"LLM Reasoning Modes (4/6) — Claude Code's effort in Practice: low·medium·high·xhigh·max and ultracode"

Part 3 took the API's effort parameter (lowmax) head-on. Part 4 looks at the same dial once it moves into the Claude Code CLI. The values are identical — what differs is how you set them, how long they persist, and what ultracode is.

Claude Code's effort isn't a new concept. It's the exact same values from the API's output_config.effort (Part 3), surfaced in the CLI. So Part 4's goal isn't to introduce "another knob" — it's to lay out how you turn that knob in practice: the /effort command, the environment variable, session persistence, and ultracode, which appears in the menu but isn't actually an effort level.

In One Paragraph

Claude Code exposes five effort levels, conservative → capable: low, medium, high, xhigh, max. These are the same effort values the API accepts, surfaced in the UI, and the default is the same as the API's: high. xhigh is Anthropic's recommendation for real coding and agent sessions; max is the deepest reasoning with no constraint on token spend. But max applies to the current session only (unless set via the CLAUDE_CODE_EFFORT_LEVEL environment variable) — while low/medium/high/xhigh persist across sessions. ultracode appears in the menu but is not a sixth effort level. It's a Claude Code packaging of xhigh plus standing permission to launch multi-agent workflows; the API's complete effort set is still low–max.


1. Five Levels — The API's Values, Surfaced

Claude Code's effort runs from conservative to capable across five levels.

low → medium → high → xhigh → max

The first thing to note is that these five are not new CLI-only values. The set the API's effort parameter accepts (Part 3) is exactly low / medium / high / xhigh / max, and Claude Code simply surfaces those values in the UI. The set the API accepts and the set the CLI shows are identical.

So the default is identical too. Claude Code's default effort is high — exactly the API's default. This high default is the launch default on Fable 5, Opus 4.8, Opus 4.6, and Sonnet 4.6. In other words, open Claude Code with no configuration and it runs at high.

2. How Behavior Changes — Mirroring the API

When you raise or lower effort, the way Claude Code's behavior shifts mirrors the API behavior from Part 3. Effort controls not just thinking depth but how the whole response spends tokens — the number of tool calls, the preamble, the detail of the completion report.

The lower the effort: - Fewer tool calls. - Starts working without preamble. - Gives a brief completion report.

The higher the effort: - Explains the plan before starting. - Summarizes changes in detail. - Writes more thorough code comments.

So for the same task, lower effort shifts toward "work quietly and fast, report briefly," while higher effort shifts toward "lay out a plan, go through more tools, summarize thoroughly."

3. The Five Levels at a Glance — Persistence, Behavior, Use

Level Persists across sessions? Behavior When to use
low Persists Minimal tool calls, starts without preamble, brief report Narrow one-answer work — a rename, drafting a single line
medium Persists Balanced between low and high The no-think-about-it default
high Persists Plan explanation, detailed summaries, thorough comments (= default) Most substantive work
xhigh Persists Deeper reasoning and longer horizon than high Real coding / agent sessions (Anthropic's recommendation)
max Current session only (env var is the exception) Deepest reasoning, no constraint on token spend Only the hardest, latency-insensitive problems

One row stands apart in this table — max's persistence.

  • low / medium / high / xhigh persist across sessions once set. Reopen Claude Code later and the value holds.
  • max applies to the current session only. Close the session and it clears. Except: when set via the CLAUDE_CODE_EFFORT_LEVEL environment variable — pin it through the env var and max carries across sessions too.

This asymmetry reads as a deliberate safeguard. Because max has no constraint on token spend, the default behavior is designed to keep it from quietly hardening into a session-wide default and burning cost indefinitely.

4. How to Set It — Three Configuration Surfaces

There are three ways to set effort in Claude Code, each with a different persistence scope.

  • /effort (interactive, per-session): type /effort mid-conversation to pick a level. This path applies to the current session.
  • CLAUDE_CODE_EFFORT_LEVEL environment variable (persistent): pin it through the env var and it persists across sessions — as noted above, this is the one path that carries even max across session boundaries.
  • Per-repo settings: you can pin an effort level in a project's settings (e.g., pinning an effortLevel in the project settings). That keeps the working tone for a repo right next to the code.

To distinguish the three in a line — /effort is "this session only," the env var is "persistent across my environment," and per-repo settings are "always, in this project."

5. What ultracode Actually Is — Not a Sixth Effort Level

Here's the most common misconception to clear up. Claude Code's effort menu shows ultracode alongside low / medium / high / xhigh / max. So it's easy to think "ultracode is a sixth effort level above max." It is not.

Here's what's actually true.

  • The API's effort set is still complete at lowmax. There is no additional effort level called ultracode in the API.
  • ultracode pairs the xhigh effort level with standing permission for Claude Code to launch multi-agent workflows. That permission is granted via mid-conversation system messages.
  • In other words, ultracode is not a new level — it's a Claude Code packaging of xhigh + that permission.

Put simply, on the "depth" axis of effort, ultracode equals xhigh. What's added on top is permission — letting Claude Code resolve larger jobs across multi-agent workflows — not reasoning depth. It sits in the menu side by side, so it feels like the same kind of dial, but it's more accurate to read it as "five effort levels, plus one execution mode layered on top."

6. So Which Level, When — Matching in Practice

Map each level's meaning to task difficulty and you get the following. The principle from Part 1 — not "always maximum" but match the dial to task difficulty — applies just as much in the CLI.

  • low — work that narrows to essentially one answer. A rename, drafting a single line — tasks with almost no room to reason.
  • medium — the no-think-about-it default. A solid choice for general flows that need balance.
  • high — most substantive work. The default, and the balance point between quality and token efficiency.
  • xhigh — real coding sessions, agent runs. Anthropic's recommendation, and the starting point when you put it to real work.
  • max — only for the hardest, latency-insensitive problems. Since it has no token constraint, it's also the costliest, so you don't leave it as a standing default (which is why it applies to the current session only by default).

In practice: make xhigh the default starting point for coding and agent work, drop down for light tasks (low/medium), and raise to max only for a single genuinely hard problem — which matches the source's recommendation.

What Comes Next

Parts 3 and 4 have now covered the Claude side of the dial in full — the API's effort and Claude Code's effort in practice.

Part 5 switches camps to OpenAI and Codex's reasoning_effort — how minimal / low / medium / high / xhigh differ across GPT-5 / 5.5 / 5.2-codex, how you set it in config.toml, and what verbosity is, the separate knob OpenAI splits output length onto, unlike Claude's single effort dial.


Claude Code's effort levels, defaults, session persistence, and ultracode's behavior are grounded in Anthropic's primary documentation (effort, Claude Code model config) and the model migration guide.

Series overview: Series index

댓글

이 블로그의 인기 게시물

"ML Foundations (9/9) — PyTorch vs TensorFlow, and the Road to Local LLMs"

Agent Memory Engine (2/10) — Building an AI Agent Memory System with SQLite Alone

"ML Foundations (8/9) — Deep Learning Architectures: CNN, RNN, Attention"

"RAG Core Study (14/26) — Evaluation Sets with RAGAS & DeepEval"

"ML Foundations (7/9) — Deep Learning Training: Optimizers, Regularization, Initialization"

AI Agents I Built (5/7) — Building an Automated Blogger API Publishing System

"ML Foundations (6/9) — Neural Networks: From Perceptron to MLP"