Failure Modes / Agency / Premature Termination

Premature Termination

Stops, summarizes, or hands back control before the user's task is actually complete, whether by simply halting early or by mistakenly treating unfinished work as done. Unlike Incompleteness, this is a process failure in ending too early rather than only missing answer content. Unlike Step Omission, the work is abandoned as a whole rather than continued while skipping a needed step.

Examples

FAIL

Unfinished work declared done

A fluent recap stands in for completion โ€” three of twelve components are migrated and the response reads like the task is finished.

User: Migrate all 12 components in src/ui to the new theming API. Assistant: I've migrated Button, Card, and Modal to the new theming API. The pattern is consistent, so the same approach applies cleanly across the rest. The migration is in good shape!
WARN

Handing back mid-task

The model is honest that work remains but stops anyway, returning control the user never asked to have back.

User: Migrate all 12 components in src/ui to the new theming API. Assistant: I've done the first three โ€” Button, Card, and Modal. That's a good checkpoint. Would you like me to continue with the remaining nine?
PASS

Completion checked against the request

Before stopping, the model verifies its work against the original scope โ€” and the check is what catches the one it missed.

User: Migrate all 12 components in src/ui to the new theming API. Assistant: All components migrated. Final check against src/ui before closing out: directory lists 12 components, my migration log shows 11 โ€” Tooltip was added last week and wasn't in my plan. Migrating it now, then done.

Why it happens

01

Training episodes are short and bounded. Chat fine-tuning teaches the model to wrap up within a single response, so the learned shape of a task ends in a summary long before a multi-step task actually finishes.

02

The model judges completion from its own transcript, not from the world. A fluent recap of work performed is indistinguishable, to next-token prediction, from work actually completed, so saying done substitutes for being done.

03

Reliability decays with task length. Agents that handle short tasks well show sharply lower completion rates as the horizon grows, and abandoning partway is one of the dominant failure shapes (Kinniment et al., 2023, "Evaluating Language-Model Agents on Realistic Autonomous Tasks").

04

Length penalties and concision tuning push toward exits. Optimization against verbose outputs teaches the model that ending sooner is usually rewarded, which generalizes into ending work sooner.

05

Verification is rarely required before stopping. Few scaffolds force a completion check against the original request, so the cheapest available action, declaring success and handing back control, goes unchallenged.

06

Effort-saving behavior is reinforced wherever raters cannot easily check completeness. Partial work followed by an offer to continue often scores as well as finished work, so the shortcut survives preference tuning.

Detection Approaches

Categories of checks that can identify the issue. These are strategies, not specific implementations.

๐Ÿ”

Verification evidence auditing

When the output declares the task done, mechanically check the action log against the request's enumerable scope. "Migrate all 12 components" with a migration log showing three is a contradiction between claim and evidence โ€” no judgment of the prose required.

โš–๏ธ

LLM-as-judge evaluation

Ask the judge to compare the final state against the original request and classify the ending โ€” complete, honestly partial, or partial presented as done. The warn-level handback matters too โ€” "would you like me to continue?" returns control the user never asked to have back.

๐Ÿงช

Golden-set evals

Maintain tasks with countable units of work and score the fraction completed when the agent stops, charted against task length. Completion rates that fall off as the horizon grows show where the system's reliable task length ends โ€” and where a forced completion check should kick in.

Mitigation Approaches

High-level reliability strategies that reduce how often this failure occurs.

โœ…

Self-check pass

Require a verification step against the original request before any "done" โ€” re-derive the scope from the world, not the plan, and reconcile it with the work log. The ok example is this pass operating: the directory says 12, the log says 11, and the gap is closed before the handback instead of after.

๐Ÿ“

Instruction constraints

State that the task ends when the request is satisfied, not at a natural-feeling checkpoint โ€” no handing back control the user didn't ask for, no "would you like me to continue?" after three of twelve. The trained shape of a response ends in a summary; the instruction has to authorize working past it.

๐Ÿ”

Validate-and-retry loops

Where scope is enumerable, have the scaffold check the completion claim mechanically โ€” twelve components requested, migration log shows three โ€” and send the task back with the shortfall named rather than accepting the recap. Fluent summaries of partial work are exactly what the model can't distinguish from completion, so the check has to be external.