Name your failure
Describe what the model or agent did. Learn more about each phenomenon and how to detect and mitigate it in production.
Citation Hallucination
Invents or fabricates a source artifact such as a citation, URL, paper, author listing, or bibliography entry and presents it as real.
Unknown-Answer Fabrication
Gives a confident answer when the system lacks enough evidence, access, or uncertainty resolution to know the answer.
Entity Hallucination
Introduces a named person, organization, product, place, dataset, model, or other entity that is not supported by the available evidence.
Quote Hallucination
Presents fabricated, paraphrased, or materially altered wording as an exact quote from a person, document, source, tool result, or prior conversation.
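One way to catch a quote hallucination is to check the purported quote against the source text before presenting it as exact. The sketch below is illustrative only: the name `quote_is_verbatim` and the choice to normalize whitespace and curly quotes are assumptions, not part of any particular library.

```python
import re

def _normalize(text: str) -> str:
    # Unify curly quotes and collapse whitespace runs so that layout
    # differences do not count as altered wording.
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    text = text.replace("\u2018", "'").replace("\u2019", "'")
    return re.sub(r"\s+", " ", text).strip()

def quote_is_verbatim(quote: str, source: str) -> bool:
    """Return True only if the quote appears word-for-word in the source."""
    normalized_quote = _normalize(quote)
    return bool(normalized_quote) and normalized_quote in _normalize(source)
```

A paraphrase or a changed number fails this check, so the text can be presented as a summary instead of an exact quote.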
Code/API Hallucination
Invents or misstates code interfaces, libraries, methods, parameters, endpoint behavior, configuration keys, or platform capabilities.
Numerical Hallucination
Produces a number, metric, count, date, measurement, or quantitative claim that is not grounded in the input, sources, or a valid computation.
Authority Hallucination
Falsely strengthens a claim by attributing it to an expert, institution, official source, benchmark, policy, or consensus that does not actually support it.
Specificity Hallucination
Adds precise-looking details, qualifiers, names, settings, mechanisms, or examples that were not established by the input or evidence.
Source Misrepresentation
Misstates, exaggerates, reverses, or selectively distorts what a cited, retrieved, uploaded, or tool-returned source actually says.
Summarization Distortion
Compresses source material in a way that changes its meaning, emphasis, causal structure, uncertainty, or implications.
Self-Contradiction
Makes mutually inconsistent claims within the same response or across closely related turns without resolving the conflict.
Extrinsic Hallucination
Adds information that the provided source material neither supports nor contradicts, and so cannot be verified from it, while making the answer appear source-grounded.
Context-Conflicting Hallucination
States a claim that contradicts information available to the model: the user's explicit input or supplied data, or facts elsewhere in the active context such as prior turns, retrieved text, summaries, or tool outputs.
Citation Span Mismatch
Attaches a citation to a claim, sentence, or paragraph that the referenced passage does not fully support.
Outdated Source Reliance
Bases an answer on sources that are too old for the user's freshness requirement or for the domain's rate of change.
Temporal Hallucination
Presents outdated or temporally wrong information as current, including incorrect present-day facts, timelines, sequence, recency, release status, or the current state of a system, organization, or event.
Version Hallucination
Confuses, invents, or misapplies product, model, package, API, policy, dataset, or document versions.
Date/Deadline Confusion
Misreads or mixes up dates, deadlines, time zones, relative dates, durations, recency windows, or scheduling boundaries in a task.
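A common source of date and deadline confusion is comparing timestamps without their UTC offsets. The sketch below uses a hypothetical deadline and a hypothetical helper `is_on_time`; it assumes a fixed UTC-08:00 offset rather than a named time zone, so it does not model daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical deadline: 23:59 on 1 March 2024 at a fixed UTC-08:00 offset.
UTC_MINUS_8 = timezone(timedelta(hours=-8))
deadline = datetime(2024, 3, 1, 23, 59, tzinfo=UTC_MINUS_8)

def is_on_time(submitted_at: datetime, deadline: datetime) -> bool:
    # Refuse naive datetimes: without an offset the comparison is ambiguous.
    if submitted_at.tzinfo is None or deadline.tzinfo is None:
        raise ValueError("both datetimes must be timezone-aware")
    # Aware datetimes compare by absolute instant, regardless of offset.
    return submitted_at <= deadline

# 06:00 UTC on 2 March is 22:00 on 1 March at UTC-08:00, before the deadline.
submission = datetime(2024, 3, 2, 6, 0, tzinfo=timezone.utc)
```

Comparing only the calendar dates here would wrongly mark the submission as a day late.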
Retrieval Miss
Fails to retrieve relevant material that exists in the available corpus and should have been used.
Retrieval Distractor
Retrieves or elevates irrelevant, superficially similar, or misleading evidence that pulls the answer away from the user's actual need.
Partial Retrieval
Retrieves some relevant evidence but misses other required pieces, leading to incomplete or under-grounded answers.
Chunk Boundary Failure
Misses, fragments, or misinterprets evidence because relevant information was split across retrieval chunks or separated from needed context.
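One common mitigation for chunk boundary failures is to let consecutive chunks overlap, so that a fact straddling a boundary appears whole in at least one chunk. The sketch below is a minimal word-based version; the function name and the `size` and `overlap` parameters are assumptions chosen for illustration.

```python
def chunk_with_overlap(words: list[str], size: int, overlap: int) -> list[list[str]]:
    """Split words into chunks of up to `size` items sharing `overlap` items."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + size])
        # Stop once a chunk has reached the end of the input.
        if start + size >= len(words):
            break
    return chunks
```

Overlap raises index size and can return near-duplicate passages, so the overlap width is a tradeoff rather than a fixed rule.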
Query Rewrite Failure
Reformulates a user's search, retrieval, or tool query in a way that drops intent, adds false constraints, or searches the wrong concept.
Conflicting Source Failure
Fails to detect, compare, qualify, or reconcile retrieved sources that disagree with one another.
Metadata Filter Failure
Applies tags, permissions, tenancy, recency, jurisdiction, document type, or other metadata filters incorrectly, excluding needed records or including forbidden or irrelevant ones.
Index Drift
Lets the retrieval index diverge from the source corpus, permissions, metadata, embeddings, or current document state.
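Index drift in document content can be detected by comparing a hash of each source document with the hash recorded at indexing time. The sketch below assumes a hypothetical setup in which the index stores one SHA-256 hash per document ID; it covers content drift only, not permissions, metadata, or embeddings.

```python
import hashlib

def content_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def find_drift(corpus: dict[str, str], indexed_hashes: dict[str, str]) -> dict[str, list[str]]:
    """Compare current documents against the hashes recorded at index time."""
    # In the corpus but never indexed.
    missing = sorted(d for d in corpus if d not in indexed_hashes)
    # Indexed, but the source text has changed since.
    stale = sorted(
        d for d, text in corpus.items()
        if d in indexed_hashes and indexed_hashes[d] != content_hash(text)
    )
    # Still indexed, but deleted from the corpus.
    orphaned = sorted(d for d in indexed_hashes if d not in corpus)
    return {"missing": missing, "stale": stale, "orphaned": orphaned}
```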
RAG Poisoning
Uses retrieved content that is malicious, deceptive, corrupted, or intentionally crafted to manipulate the answer.
Midsequence Neglect/Lost in the Middle
Overlooks or underuses information located in the middle of a long prompt, document set, or conversation context.
Context Rot
Loses reliable use of earlier context as a long interaction progresses, as facts, plans, constraints, state, or instructions lose force or are misremembered even though they remain nominally available.
Context Dilution
Lets excess surrounding material weaken the influence of the most relevant context, causing important signals to be underweighted.
Recency Bias
Overweights newer context while underweighting earlier information that remains valid and important.
Summarization Loss
Drops important facts, constraints, uncertainty, or nuance when compressing earlier context into a summary.
State Inconsistency
Tracks different parts of the active task state inconsistently, causing the response to use mutually incompatible assumptions about progress, variables, files, decisions, or environment.
Memory Omission
Fails to store, retrieve, or apply information that should have persisted across turns, sessions, tasks, or agent steps.
Memory Staleness
Uses remembered information that was once valid but has been superseded by newer state, preferences, facts, or instructions.
Memory Hallucination
Treats an unstored, unstated, or imagined detail as if it were a real memory.
Memory Contamination
Applies irrelevant, incorrect, or cross-task information from prior interactions as if it belonged to the current task.
Memory Overreach
Applies a valid memory beyond the user, task, project, role, time, or domain scope where it should influence behavior.
Memory Conflict
Mishandles competing memories: fails to notice that stored memories, preferences, prior decisions, or persisted state disagree, or resolves the conflict with the wrong precedence, freshness, authority, or specificity rule.
Memory Scope Leakage
Carries memory across users, tenants, sessions, roles, projects, or tasks that should remain isolated.
Instruction Noncompliance
Fails to follow an explicit, applicable instruction from the governing prompt, user request, or task procedure.
Constraint Violation
Breaks a stated limit, requirement, policy, boundary, allowed action set, or output constraint that should govern the task, including dropping a constraint partway through multi-step reasoning or execution.
Format Failure
Produces an answer in the wrong shape, organization, medium, style, or presentation format for the requested output.
JSON/Schema Failure
Emits invalid JSON, malformed structured data, or output that does not satisfy the required schema.
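A JSON or schema failure is usually caught by parsing and validating model output before anything downstream consumes it. The sketch below uses only the standard library; the `REQUIRED_FIELDS` schema and the function name are hypothetical, and a real system might use a full schema validator instead.

```python
import json

# Hypothetical schema: required field name -> expected Python type.
REQUIRED_FIELDS = {"label": str, "tags": list}

def parse_structured_output(raw: str) -> dict:
    """Parse model output and reject anything that does not fit the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        if not isinstance(data[name], expected_type):
            raise ValueError(f"field {name} must be {expected_type.__name__}")
    return data
```

Raising on the first violation gives the caller a specific message to feed back into a retry or repair step.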
Refusal Overreach
Refuses, blocks, or safety-wraps a request more broadly than policy, risk, or context requires.
Refusal Underreach
Fails to refuse, limit, redirect, or safety-constrain a request that requires stronger boundaries.
Role Confusion
Misunderstands or drifts from its assigned role, persona, authority boundary, operating mode, or relationship to the user and other agents.
Priority Confusion
Applies the wrong hierarchy among system, developer, user, tool, policy, memory, or task-level instructions.
Clarification Underuse
Proceeds without asking when missing or ambiguous information materially affects correctness, safety, or user intent, committing to an interpretation that should have been confirmed first.
Clarification Overuse
Asks the user for clarification when the task is already sufficiently specified, stalling on details the system could reasonably infer or safely proceed without.
Reasoning Error
Draws the wrong conclusion through invalid inference, faulty assumptions, mistaken causal reasoning, unsupported logical steps, or framing the problem with the wrong representation or abstraction.
Arithmetic Error
Computes or transforms numeric inputs incorrectly, including arithmetic, aggregation, unit conversion, comparison, or formula application.
Goal Misinterpretation
Solves the wrong problem because it misunderstood the user's objective, success condition, scope, or intended outcome.
Planning Failure
Builds an ineffective, unsafe, incomplete, or poorly ordered plan for achieving the user's goal.
Step Omission
Leaves out a necessary reasoning, verification, retrieval, tool, communication, or execution step needed for the task to succeed.
Compositional Failure
Fails to combine multiple facts, constraints, operations, sources, or subproblem results into a coherent answer.
Error Accumulation
Allows small mistakes, approximations, stale assumptions, or unverified intermediate results to compound across a multi-step task until the final output fails.
Verification Failure
Does not adequately check whether intermediate steps, tool results, cited evidence, assumptions, or the final answer are correct before relying on them.
Wrong Tool Selection
Chooses a tool that is inappropriate for the user's goal, data type, risk level, environment, or required operation.
Tool Argument Error
Calls a tool with arguments that are malformed, incomplete, unauthorized, stale, poorly scoped, or semantically wrong for the intended operation.
Missing Tool Invocation
Fails to call an available tool when tool use is necessary for correctness, freshness, computation, retrieval, verification, or task completion.
Tool Result Misread
Misinterprets, ignores, overgeneralizes, or incorrectly transforms the result returned by a tool.
Tool Loop
Repeats tool calls unnecessarily or redundantly without gaining new information, changing strategy, or progressing toward completion.
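A simple guard against a tool loop is to count identical calls and stop allowing them past a threshold. The sketch below is one possible shape; `ToolLoopGuard` and `max_repeats` are hypothetical names, and it assumes tool arguments are JSON-serializable.

```python
import json
from collections import Counter

class ToolLoopGuard:
    """Flags a tool call once the same name and arguments repeat too often."""

    def __init__(self, max_repeats: int = 2):
        self.max_repeats = max_repeats
        self._seen = Counter()

    def allow(self, tool_name: str, arguments: dict) -> bool:
        # Sorting keys makes argument order irrelevant to the comparison.
        key = (tool_name, json.dumps(arguments, sort_keys=True))
        self._seen[key] += 1
        return self._seen[key] <= self.max_repeats
```

When `allow` returns `False`, the agent can change strategy or hand control back instead of issuing the same call again.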
Tool Recovery Failure
Responds poorly to a tool error, timeout, empty result, permission denial, rate limit, or unexpected output.
Unsafe Tool Call
Invokes a tool in a way that creates avoidable security, privacy, financial, operational, data-integrity, or user-consent risk.
Idempotency Failure
Repeats, retries, or replays a side-effecting tool action without deduplication or idempotency safeguards, causing duplicate or inconsistent effects.
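The usual safeguard against an idempotency failure is an idempotency key: the caller sends the same key on every retry, and the tool returns the stored result instead of repeating the side effect. The in-memory `PaymentTool` below is a hypothetical stand-in for a real side-effecting tool.

```python
class PaymentTool:
    """Hypothetical side-effecting tool that deduplicates by idempotency key."""

    def __init__(self):
        self._results = {}   # idempotency key -> result of the first call
        self.charges = []    # side effects actually performed

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        # A retry with a known key returns the stored result, with no new charge.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges.append(amount_cents)
        result = {"charge_id": len(self.charges), "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result
```

A production version would persist the keys and handle concurrent requests; this sketch shows only the deduplication logic.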
Tool Context Overload
Feeds the model so much tool output, intermediate state, logs, or scratch data that it loses track of the user's goal or relevant evidence.
Excessive Agency
Takes initiative, actions, decisions, or irreversible steps beyond what the task, permissions, risk, or user intent warrants.
Insufficient Agency
Fails to take obvious, low-risk next steps that are required or strongly implied by the task.
Premature Termination
Stops, summarizes, or hands back control before the user's task is actually complete, whether by simply halting early or by mistakenly treating unfinished work as done.
Runaway Agent Loop
Continues acting autonomously in repeated cycles without converging, reassessing, or handing control back when progress stalls.
Objective Gaming
Optimizes a proxy metric, literal instruction, benchmark target, or local reward while undermining the user's real objective.
Escalation Failure
Does not escalate, pause, ask for approval, or route to a human or higher-authority actor when risk, uncertainty, policy, permissions, or irreversible impact require it, including skipping a review or approval checkpoint that should gate the action.
Workflow Misalignment
Uses an execution pattern, cadence, handoff style, approval flow, or collaboration process that conflicts with the user's expected workflow or the task's operational structure.
Multi-Agent Coordination Failure
Multiple agents, roles, tools, or handoff stages duplicate work, conflict, drop context, misassign ownership, or fail to coordinate toward a shared goal.
Prompt Injection
Lets untrusted input override, weaken, or redirect the system's intended instructions, policies, tool-use rules, or data boundaries.
Jailbreak
Is manipulated into bypassing safety, policy, or behavioral controls that should remain enforced.
Indirect Prompt Injection
Lets retrieved, browsed, uploaded, tool-supplied, or otherwise external content carry malicious instructions into the model's context.
System Prompt Leakage
Reveals hidden system, developer, policy, tool, chain-of-thought, or other protected prompt content that should not be exposed.
Sensitive Information Disclosure
Exposes secrets, credentials, personal data, confidential business information, private user content, or other protected information.
Data Exfiltration
Enables unauthorized extraction, transfer, or reconstruction of protected data from tools, files, memory, retrieval systems, databases, or context.
Insecure Output Handling
Produces output that is unsafe for downstream rendering, execution, storage, parsing, logging, or human trust without sanitization or validation.
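For the rendering case of insecure output handling, the basic mitigation is to escape model output before placing it in HTML. The sketch below uses the standard library's `html.escape`; the wrapper name `render_model_output` is an assumption, and other sinks such as SQL, shells, or logs need their own sanitization.

```python
import html

def render_model_output(text: str) -> str:
    # Escape <, >, &, and quotes so the text is displayed, not interpreted.
    return f"<p>{html.escape(text)}</p>"
```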
Unbounded Consumption
Consumes or triggers excessive tokens, compute, time, bandwidth, money, API quota, storage, or external resources without adequate limits or stopping conditions.
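Unbounded consumption is typically contained with explicit budgets that are checked on every step. The sketch below tracks steps and tokens; `Budget`, `BudgetExceeded`, and the limits are hypothetical, and a real system would likely also cap wall-clock time and cost.

```python
class BudgetExceeded(RuntimeError):
    pass

class Budget:
    """Tracks steps and tokens and raises once either limit is passed."""

    def __init__(self, max_steps: int, max_tokens: int):
        self.max_steps = max_steps
        self.max_tokens = max_tokens
        self.steps = 0
        self.tokens = 0

    def spend(self, tokens: int) -> None:
        # Record the step first so the counters reflect what was consumed.
        self.steps += 1
        self.tokens += tokens
        if self.steps > self.max_steps or self.tokens > self.max_tokens:
            raise BudgetExceeded(
                f"used {self.steps} steps and {self.tokens} tokens"
            )
```

Calling `spend` at the top of each agent iteration turns a runaway loop into a bounded failure that can be logged and surfaced.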
Supply Chain Vulnerability
Introduces or recommends risk through compromised, malicious, abandoned, typosquatted, untrusted, or poorly pinned dependencies, tools, plugins, models, datasets, or upstream content.
Sycophancy
Abandons or reverses a well-supported answer when the user expresses disagreement, doubt, or pressure, conceding to keep the user comfortable rather than holding the correct position.
Social Sycophancy
Mirrors, flatters, validates, or preserves the user's social self-image in a way that distorts judgment or answer quality.
Belief Conformity
Adjusts factual claims, uncertainty, or interpretation to match the user's stated beliefs instead of the evidence.
Preference Pandering
Optimizes for what the user appears to want, like, or prefer over what is accurate, useful, ethical, or safe.
Unsafe Reassurance
Reassures the user despite meaningful uncertainty, danger, insufficient evidence, or a need for stronger caution.
Bias/Stereotyping
Produces unfair, stereotyped, essentializing, or unsupported assumptions about people or groups based on protected or socially salient attributes.
Manipulative Behavior
Uses coercive, deceptive, emotionally exploitative, or overly persuasive tactics to steer the user's choices or beliefs.
Dependency Encouragement
Encourages unnecessary reliance on the model, discourages independent judgment, or positions the system as a substitute for appropriate human expertise, agency, or support.
Verbosity Failure
Provides more detail, repetition, caveats, background, or explanation than the task, user, medium, or decision requires.
Incompleteness
Leaves out information, constraints, caveats, steps, options, or outputs needed to satisfy the user's task.
Irrelevance
Includes content that does not materially help answer the user's question, solve the task, or support the needed decision.
Genericism
Gives vague, boilerplate, or template-like guidance that is too nonspecific or abstract for the user to act on, instead of concrete help grounded in their task.
Audience Mismatch
Uses terminology, assumptions, depth, examples, tone, or framing that does not fit the intended reader's expertise, role, goals, or context.
Concision Failure
Compresses the answer so aggressively that necessary context, reasoning, caveats, instructions, or operational detail is lost.
Poor Structure
Organizes information in a way that makes the answer hard to scan, compare, execute, or verify.
Calibration Failure
Misstates confidence, uncertainty, evidence strength, risk, tradeoffs, or likelihood in the final answer.
Localization Failure
Ignores or misapplies locale-specific language, spelling, units, currencies, laws, formats, idioms, accessibility expectations, or cultural conventions.