The Problem
The request looks standard: produce a short summary of policy changes and add sources.
Traces show something else: in one run the agent returned 7 citations,
but verification showed 3 sources were never fetched, and 2 point to 404.
For the user, the answer looks confident, but it is not reproducible.
The system does not crash.
It just returns plausible citations without real evidence.
Analogy: imagine an auditor who references "folders in the archive" that nobody has ever seen. The report looks professional until someone checks the sources. Hallucinated sources in agent systems work the same way.
Why This Happens
Hallucinated sources usually appear not because of a single model mistake, but because citation control at runtime is not strict.
LLMs have a strong bias toward "complete" answers, so without strict verification the model is more likely to invent a citation than to return an answer without a source.
In production, the chain typically looks like this:
- the agent generates citations as part of a "complete" answer;
- search snippets are treated as evidence even though pages were never opened;
- source_id values are not tied to evidence snapshots;
- without citation verification, runtime passes unfetched or invalid sources;
- if fail-closed is not configured, invented sources reach the user.
In traces this appears as citations_count growing
while citation_validity_rate drops.
The problem is not one bad URL.
Runtime does not block unverified citations before final output.
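This gap is visible directly in run traces. A minimal sketch of such a trace check, assuming a simplified trace format (the event/url/status fields here are illustrative, not a real tracing API):

```python
# Sketch: flag cited URLs that never appeared in a successful fetch step
# of the same run. The trace record shape is an assumption for illustration.

def unfetched_citations(trace: list[dict], cited_urls: list[str]) -> list[str]:
    # URLs actually fetched during the run via fetch-style tool calls
    fetched = {
        e["url"] for e in trace
        if e.get("event") in ("http.get", "kb.read") and e.get("status") == 200
    }
    return [u for u in cited_urls if u not in fetched]

trace = [
    {"event": "http.get", "url": "https://example.com/policy", "status": 200},
    {"event": "search", "url": "https://example.com/ghost"},
]
cited = ["https://example.com/policy", "https://example.com/ghost"]
print(unfetched_citations(trace, cited))  # the "ghost" URL was only a search hit
```

Any non-empty result here is exactly the silent failure described above: a citation the runtime never verified against a real fetch.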
Most Common Failure Patterns
In production, four recurring hallucinated-source patterns appear most often.
Unfetched URL citations
The agent cites a URL that never went through http.get or kb.read in that run.
Typical cause: citations are not restricted to source_id values from the evidence store.
Snippet instead of evidence (Search-as-evidence)
The answer includes "sources" from search results, but the agent has no confirmation of actual page content.
Typical cause: search results are mixed with the evidence layer.
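One way to keep the layers separate is to promote a search result to citable evidence only after its page is actually fetched. A minimal sketch, where the result shape and fetch callback are illustrative assumptions, not a real search API:

```python
# Sketch: search results are candidate sources only; a result becomes
# citable evidence only after its page is actually opened.

def citable_from_search(results: list[dict], fetch) -> list[str]:
    citable = []
    for r in results:
        page_text = fetch(r["url"])   # actually open the page
        if page_text is not None:     # fetch failed -> never citable
            citable.append(r["source_id"])
    return citable

results = [
    {"source_id": "src-a", "url": "https://example.com/a"},
    {"source_id": "src-b", "url": "https://example.com/404"},
]
fake_fetch = lambda url: "page text" if "404" not in url else None
print(citable_from_search(results, fake_fetch))  # only src-a is citable
```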
Citation drift across steps
At an earlier step the source was valid, but after retry or truncation the final answer references another document.
Typical cause: no stable claim -> source_id -> snapshot hash binding.
Pseudo-citations without claim coverage (Claim-source mismatch)
The answer contains a citation block, but key claims have no supporting source.
Typical cause: validation checks only "presence of links", not claim coverage.
How To Detect These Problems
Hallucinated sources are visible through citation and retrieval metrics together.
| Metric | Hallucinated-sources signal | What to do |
|---|---|---|
| citation_validity_rate | share of verified citations drops | introduce fail-closed verification by source_id |
| unfetched_source_rate | many unfetched URLs in answers | forbid URL citations without an evidence snapshot |
| source_404_rate | some sources cannot be opened | check response status and canonical URL during fetch |
| claim_without_citation_rate | claims are not linked to sources | add a claim-level coverage check |
| citation_stop_reason_rate | frequent citations:invalid in runtime | review retrieval quality and tool policy |
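Two of these metrics can be computed from per-run citation records. A sketch, assuming a simplified record shape (the cited/verified/fetched counters are illustrative):

```python
# Sketch: computing two of the table's metrics from per-run records.
# Record fields (cited, verified, fetched) are illustrative assumptions.

def citation_validity_rate(records: list[dict]) -> float:
    cited = sum(r["cited"] for r in records)
    verified = sum(r["verified"] for r in records)
    return verified / cited if cited else 1.0

def unfetched_source_rate(records: list[dict]) -> float:
    cited = sum(r["cited"] for r in records)
    unfetched = sum(r["cited"] - r["fetched"] for r in records)
    return unfetched / cited if cited else 0.0

runs = [
    {"cited": 7, "verified": 4, "fetched": 4},  # an incident-like run
    {"cited": 3, "verified": 3, "fetched": 3},
]
print(citation_validity_rate(runs))  # 0.7
print(unfetched_source_rate(runs))   # 0.3
```

Tracking both together matters: citations_count alone can grow while validity silently drops.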
How To Distinguish Hallucinated Sources From Just An Inaccurate Answer
Not every text inaccuracy means invented sources. The key question is whether the source behind each critical claim can be technically reproduced.
Normal if:
- each citation points to a source_id that exists in the evidence store;
- snapshot metadata exists (URL, timestamp, hash);
- claim checks show sources cover key conclusions.
Dangerous if:
- the answer contains URLs that never appeared in fetch step;
- citations are present only "for form" but do not cover main claims;
- answers cannot be reproduced at run level (run_id -> source_id -> snapshot).
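The run-level reproducibility test can be sketched as a lookup along the run_id -> source_id -> snapshot chain. The registry layout below is an assumption for illustration, not a fixed storage schema:

```python
# Sketch: run-level reproducibility check. Every cited source_id must
# resolve, within its run, to a snapshot with URL, timestamp, and hash.

def is_reproducible(run_id: str, cited: list[str],
                    registry: dict[str, dict[str, dict]]) -> bool:
    snapshots = registry.get(run_id, {})
    return all(
        sid in snapshots
        and {"url", "fetched_at", "text_sha256"} <= snapshots[sid].keys()
        for sid in cited
    )

registry = {
    "run-42": {
        "src-1": {"url": "https://example.com/a", "fetched_at": 1700000000.0,
                  "text_sha256": "ab" * 32},
    }
}
print(is_reproducible("run-42", ["src-1"], registry))  # True
print(is_reproducible("run-42", ["src-2"], registry))  # False: no snapshot
```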
How To Stop These Failures
In practice, it looks like this:
- all sources pass through evidence store (snapshot + hash + timestamp);
- the model returns citations only as source_id, not arbitrary URLs;
- the citation verifier checks that all source_id values exist, were fetched, and are allowed by policy;
- if verification fails, runtime returns a stop reason and a safe fallback.
Minimal guard for citation validation:
```python
from dataclasses import dataclass
import hashlib
import time


@dataclass(frozen=True)
class EvidenceMeta:
    source_id: str
    url: str
    fetched_at: float
    text_sha256: str


class EvidenceStore:
    def __init__(self):
        self.items: dict[str, EvidenceMeta] = {}

    def add_snapshot(self, source_id: str, url: str, text: str) -> None:
        self.items[source_id] = EvidenceMeta(
            source_id=source_id,
            url=url,
            fetched_at=time.time(),
            text_sha256=hashlib.sha256(text.encode("utf-8")).hexdigest(),
        )

    def has(self, source_id: str) -> bool:
        return source_id in self.items


def verify_citations(cited_source_ids: list[str], store: EvidenceStore) -> str | None:
    # cited_source_ids are expected to come from structured output
    if not cited_source_ids:
        return "citations:missing"
    unknown = [sid for sid in cited_source_ids if not store.has(sid)]
    if unknown:
        return "citations:unknown_source_id"
    return None
```
This is a basic guard.
In production, it is usually extended with claim-level coverage check,
allowlist for citation tools, and a separate stop reason for unfetched URLs.
verify_citations(...) is called before final response rendering,
so the user never sees invalid sources.
Where This Is Implemented In Architecture
In production, hallucinated-source control is almost always split across three system layers.
Tool Execution Layer handles evidence fetch: response status, URL normalization, snapshots, and hash. If this layer does not store evidence, citations cannot be verified reliably.
Agent Runtime controls structured output, citation verification, stop reasons, and fail-closed fallback. This is where the final decision is made whether answer can be shown to user.
Memory Layer keeps run-to-evidence linkage:
run_id, source_id, retention, and reproducibility.
Without this layer, teams cannot run a proper incident audit.
Self-check
Quick pre-release check; a short sanity check, not a formal audit:
- all cited sources pass through an evidence store (snapshot + hash + timestamp);
- citations are returned as source_id values, not arbitrary URLs;
- a citation verifier runs before final response rendering;
- verification is fail-closed: invalid citations produce a stop reason and a safe fallback;
- claim-level coverage is checked, not just presence of links;
- unfetched URLs get a dedicated stop reason;
- citation tools are restricted by an allowlist;
- answers are reproducible at run level (run_id -> source_id -> snapshot).
FAQ
Q: Can I just ask the model to "always include sources"?
A: You can, but it is not enough. Without runtime citation verification, this is formatting, not evidence.
Q: Can search results be used as evidence?
A: Usually no. Search gives candidate sources only.
Evidence is only what was fetched and stored as a snapshot.
Q: Do I need to store the full source text?
A: Not always. Minimum for audit is URL, timestamp, hash, and stable source_id. Full text is added where replay or exact quotes are needed.
Q: What should user see when citations are invalid?
A: Explicit stop reason, what was already verified, and a safe next step: partial answer without unverified sources, or rerun with verification.
Hallucinated-sources incidents almost never look like a loud crash; they are a silent loss of trust, usually noticed only after someone checks the sources. So production agents need not only good answers, but strict citation discipline.
Related Pages
If this happens in production, these pages are also useful:
- Why AI agents fail - general map of production failures.
- Context poisoning - how problematic context pushes agents to wrong conclusions.
- Tool failure - how unstable tools break the evidence chain.
- Agent Runtime - where to enforce structured output verification and stop reasons.
- Tool Execution Layer - where to collect snapshots and validate sources.
- Memory Layer - where to keep evidence reproducibility across runs.