"The Real Failure Mode of AI Research Agents — They Don't Get It Wrong, They Just Don't Finish"
Premature return and how completion contracts fix it
Key Takeaways
- The dominant failure mode of AI agents in production isn't hallucination — it's premature return: delivering results before the task is actually complete
- Root cause: the agent was never given explicit completion criteria
- Fixed with a two-stage verification flow (collect + verify) and an explicit completion contract
Background
An orchestrator agent delegated a product research task to a researcher sub-agent: find candidate air conditioner models that match specific capacity, price range, and installation constraints, then collect their official specs.
What came back was an empty result padded with a lengthy "next steps" list. On the surface it looked like a successful response. In reality, nothing had been completed.
The Core Insight: Premature Return
When people think of LLM agent failures, hallucination comes to mind first. But in actual operations, the more frequent failure is premature return — the agent decides the task is done before the requirements are met and hands back results.
What makes it worse: the agent packages "plans for future work" as if they were actual research findings.
Root Cause: No Completion Criteria
The system provided role definitions, constraints, and tool usage instructions — but never specified what state constitutes "done." Without a completion contract, the agent had no way to distinguish between a valid output and an incomplete one.
The Fix: Completion Contracts
Explicitly define what "not done" looks like:
- Output is empty or contains only placeholders
- Output is a research plan or TODO list
- Claims lack source URLs
- Unverified items are not explicitly marked as such
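The "not done" conditions above can also be checked mechanically before the orchestrator accepts a result. The following is a minimal sketch, not the setup used in this post: the `Claim` structure, the `contract_violations` function, and the marker phrases are all hypothetical names chosen for illustration, and the plan detection is a crude keyword heuristic.

```python
import re
from dataclasses import dataclass, field

# Assumed marker phrases and placeholder tokens; tune these for your own agent.
PLAN_MARKERS = ("next steps", "todo", "i will", "plan to")
PLACEHOLDERS = {"", "tbd", "n/a", "todo", "..."}
ALLOWED_STATUS = {"verified", "unverified", "no data available"}
URL_PATTERN = re.compile(r"https?://\S+")


@dataclass
class Claim:
    """One finding returned by the sub-agent (hypothetical structure)."""
    text: str
    source_urls: list = field(default_factory=list)
    status: str = ""  # expected to be one of ALLOWED_STATUS


def contract_violations(summary, claims):
    """Return the reasons the output is NOT done; an empty list means it passes."""
    violations = []
    # Ignore claims whose text is only a placeholder token.
    real = [c for c in claims if c.text.strip().lower() not in PLACEHOLDERS]
    if not real:
        violations.append("output is empty or contains only placeholders")
    # Crude heuristic: plan-like wording in the summary.
    if any(marker in summary.lower() for marker in PLAN_MARKERS):
        violations.append("output reads like a research plan or TODO list")
    for c in real:
        has_url = any(URL_PATTERN.match(u) for u in c.source_urls)
        if c.status not in ALLOWED_STATUS:
            violations.append(f"claim is not explicitly labeled: {c.text!r}")
        elif c.status == "verified" and not has_url:
            violations.append(f"verified claim lacks a source URL: {c.text!r}")
    return violations
```

An orchestrator could call `contract_violations` on every sub-agent response and send the list of violations back as feedback instead of accepting the result. Returning reasons rather than a boolean keeps the rejection actionable.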
Enforce a two-stage verification flow:
Stage 1 (Candidate Collection): Multi-keyword search, official document filtering, minimum 3 candidates secured
Stage 2 (Verification & Synthesis): Official spec sheet confirmation, cross-verification from at least 2 sources, classification as [verified / unverified / no data available]
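As a rough illustration of the two stages, here is a sketch under stated assumptions: the `search` callable, the hit format (`{"model": ..., "url": ...}`), and the `official_domains` set are hypothetical, and "cross-verification" is interpreted here as at least two sources reporting the same value. The thresholds come from the stage descriptions above.

```python
from urllib.parse import urlparse

MIN_CANDIDATES = 3  # Stage 1: minimum number of candidates
MIN_SOURCES = 2     # Stage 2: sources that must agree on a value


def collect_candidates(keywords, search, official_domains):
    """Stage 1: search every keyword and keep only hits from official domains."""
    candidates = {}
    for keyword in keywords:
        for hit in search(keyword):  # assumed hit shape: {"model": str, "url": str}
            host = urlparse(hit["url"]).hostname or ""
            if host in official_domains:
                candidates.setdefault(hit["model"], hit["url"])
    if len(candidates) < MIN_CANDIDATES:
        # Refuse to hand back a partial result instead of returning early.
        raise RuntimeError(
            f"only {len(candidates)} candidates found; need {MIN_CANDIDATES}"
        )
    return candidates


def classify(source_values):
    """Stage 2: label one spec from the values reported by each source."""
    values = [v for v in source_values if v is not None]
    if not values:
        return "no data available"
    most_common = max(set(values), key=values.count)
    if values.count(most_common) >= MIN_SOURCES:
        return "verified"
    return "unverified"
```

For example, `classify(["7.0 kW", "7.0 kW"])` yields `"verified"`, while a value found in only one source stays `"unverified"` and is reported as such rather than dropped.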
Results
| Metric | Before (Failure) | After (Success) |
|---|---|---|
| Output content | No substance + future plans | Concrete model list + URLs + spec table |
| Official docs | None referenced | Direct verification from official sites |
| Unverified handling | Silently omitted | Explicitly labeled |
Pitfalls and Caveats
- If you define entry conditions, you must also define exit conditions.
- Enumerating failure modes upfront teaches the agent the boundary between "good output" and "bad output."
- A two-stage flow (collect + verify) is more reliable than a single pass.
Takeaway
We're good at telling agents what to do, but bad at telling them when they're done. Simply making the completion contract explicit dramatically improved the agent's real-world success rate.