What is a citation?

A citation is a machine-verifiable link from something the persona says back to something in the archive. It is not a footnote. It is not a vague gesture at a source. It is a structured object with four required fields, precise enough that an auditor can verify every claim mechanically.

Without this precision, a system could point at a 40-page journal and call it "cited." That is not citation; that is hand-waving. The four fields force the system to say exactly where it looked, which version of the archive it consulted, which passage supports the claim, and how it used that passage.

Field | What it contains | Why it matters
Artifact ID | Hash of the canonicalized artifact payload | Identifies the source uniquely. Canonicalization ensures the hash stays stable across storage formats and encoding changes.
Archive version | The snapshot in which the artifact exists | Pins the citation to a specific state of the archive. If the archive changes, old citations still resolve against the version they reference.
Span | Paragraph index plus character offsets, or a media-specific locator | Points to the exact passage. For text, this means a canonical segment in a normalized encoding. For audio or images, a time range or region locator defined by the archive format.
Relation | How the claim uses the source: "direct quote," "paraphrase," "inference from," or "metadata fact" | Forces the system to declare whether it is quoting or interpreting. This is the difference between grounded output and speculation.

Figure: a claim (shown as a paper note) bound to a citation tag. The tag carries all four fields: artifact ID, archive version, span, and relation. Nothing is left to ambiguity.

What this forbids. Vague attribution. A citation that says "based on journal entries" without pointing to a specific location is not a citation under this definition. Every link must resolve to a verifiable span in a specific archive version.
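A minimal sketch of the four-field citation object may make the definition concrete. The class names, the choice of SHA-256, and the canonical form (NFC Unicode, LF line endings, UTF-8) are illustrative assumptions; the text above only says the artifact ID is a hash of the canonicalized payload.

```python
import hashlib
import unicodedata
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    DIRECT_QUOTE = "direct quote"
    PARAPHRASE = "paraphrase"
    INFERENCE_FROM = "inference from"
    METADATA_FACT = "metadata fact"


@dataclass(frozen=True)
class TextSpan:
    paragraph: int  # zero-based paragraph index (assumed convention)
    start: int      # character offset, inclusive
    end: int        # character offset, exclusive

    def __post_init__(self):
        # A span must name a specific, non-empty range; "somewhere in the
        # journal" cannot be expressed with this type.
        if self.paragraph < 0 or not 0 <= self.start < self.end:
            raise ValueError("span must be a non-empty range in a paragraph")


def canonicalize(text: str) -> bytes:
    # Assumed canonical form: NFC Unicode, LF line endings, UTF-8 bytes.
    normalized = unicodedata.normalize("NFC", text)
    normalized = normalized.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.encode("utf-8")


def compute_artifact_id(text: str) -> str:
    # Hash of the canonicalized payload (SHA-256 is an assumption).
    return hashlib.sha256(canonicalize(text)).hexdigest()


@dataclass(frozen=True)
class Citation:
    artifact_id: str
    archive_version: str
    span: TextSpan
    relation: Relation

    def __post_init__(self):
        # All four fields are required and must have the right shape.
        if not self.artifact_id or not self.archive_version:
            raise ValueError("artifact ID and archive version are required")
        if not isinstance(self.span, TextSpan):
            raise TypeError("span must be a TextSpan")
        if not isinstance(self.relation, Relation):
            raise TypeError("relation must be one of the four Relation values")
```

Under these assumptions, the same text stored as NFC or NFD, with either line-ending style, hashes to the same artifact ID, which is the stability property the table attributes to canonicalization.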

The binding pipeline

Here is the full path from a requestor's question to a bound, signed response. Every stage has a defined input and output. The pipeline is the backbone of the persona's trustworthiness: it turns a question into a response where every claim is accounted for.

Query arrives
Category filter (pass / refuse)
Retrieval (artifacts fetched from archive)
Claim segmentation (output split into discrete claims)
Citation assignment (claims matched to artifacts)
Coverage gate evaluation (per-claim rung assignment)
Display citation filtering (consent, tier, minimum span)
Response bundle assembly (signing, logging)

The category filter runs before retrieval, so out-of-scope questions never touch the archive. Retrieval and claim segmentation happen in sequence. Citation assignment links each claim to the archive spans that support it. The coverage gate evaluates each claim independently. Display filtering applies the consent, tier, and minimum span gates. Finally, the response bundle is assembled, signed, and logged as the system of record.
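The stage ordering and its two early exits can be sketched as follows. This is a skeleton under assumptions: `run_pipeline`, `PipelineResult`, and the `in_scope` and `retrieve` callables are hypothetical names, and the later stages are recorded by name only rather than implemented.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineResult:
    status: str = "answered"   # "answered" or "refused"
    reason: str = ""           # "category" or "coverage" when refused
    stages_run: list = field(default_factory=list)


def run_pipeline(query, in_scope, retrieve):
    """Walk the stages in order, stopping at the first refusal.

    in_scope(query) -> bool and retrieve(query) -> list are supplied by the
    caller; both are placeholders for the real filter and archive lookup.
    """
    result = PipelineResult()
    result.stages_run.append("query")

    # The category filter runs before retrieval, so an out-of-scope
    # question returns here without the archive ever being queried.
    result.stages_run.append("category_filter")
    if not in_scope(query):
        result.status, result.reason = "refused", "category"
        return result

    result.stages_run.append("retrieval")
    artifacts = retrieve(query)
    if not artifacts:
        # No matching artifacts: skip straight to the refused rung.
        result.stages_run.append("coverage_gate")
        result.status, result.reason = "refused", "coverage"
        return result

    # Remaining stages, in the order listed above (not implemented here).
    for stage in ("claim_segmentation", "citation_assignment",
                  "coverage_gate", "display_filtering", "bundle_assembly"):
        result.stages_run.append(stage)
    return result
```

The sketch does not model how refusals themselves are signed or logged; it only shows that retrieval is unreachable for a question the category filter rejects.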

Citation lifecycle

The persona drafts a claim. Each asserted proposition is split into a discrete unit that can be cited independently.

The system checks the citation tag against the archive. Artifact ID, version, span, and relation are all verified. The coverage rung is assigned.

The claim is bound and signed into the response bundle. The auditor gets the full record. The requestor gets the filtered display citation.
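The verification step in the middle of the lifecycle might look like the sketch below. The archive layout (a dict from version to artifact ID to a list of paragraphs), the flat `Citation` fields, and hashing the newline-joined paragraphs with SHA-256 are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass

RELATIONS = {"direct quote", "paraphrase", "inference from", "metadata fact"}


@dataclass(frozen=True)
class Citation:
    artifact_id: str
    archive_version: str
    paragraph: int
    start: int
    end: int
    relation: str


def verify_citation(citation: Citation, archive: dict) -> str:
    """Check all four fields against the archive and return the cited text.

    Raises LookupError if the version or artifact is missing, and
    ValueError if the hash, span, or relation does not check out.
    """
    snapshot = archive.get(citation.archive_version)
    if snapshot is None:
        raise LookupError("unknown archive version")
    paragraphs = snapshot.get(citation.artifact_id)
    if paragraphs is None:
        raise LookupError("artifact not present in this archive version")

    # Recompute the artifact hash (assumed: SHA-256 of joined paragraphs).
    payload = "\n".join(paragraphs).encode("utf-8")
    if hashlib.sha256(payload).hexdigest() != citation.artifact_id:
        raise ValueError("artifact hash mismatch")

    # The span must fall inside a real paragraph.
    if not 0 <= citation.paragraph < len(paragraphs):
        raise ValueError("paragraph index out of range")
    text = paragraphs[citation.paragraph]
    if not 0 <= citation.start < citation.end <= len(text):
        raise ValueError("character offsets out of range")

    if citation.relation not in RELATIONS:
        raise ValueError("unknown relation")
    return text[citation.start:citation.end]
```

Because every check is a lookup or a comparison, an auditor could rerun it mechanically against the pinned archive version without trusting the persona's output.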

Stage details

Stage | Type | What happens | Failure behavior
Query arrives | Input | The requestor's question enters the system. Session context and access tier are established. | Malformed queries are rejected with a generic message.
Category filter | Gate | The question is checked against out-of-scope categories. Borderline cases get a fast metadata check. | Out-of-scope questions are refused with explanation. The archive is never queried.
Retrieval | Processing | Relevant artifacts are fetched from the archive based on the question's content. | If no artifacts match, the pipeline skips to the refused rung of the coverage gate.
Claim segmentation | Processing | The draft output is split into discrete claims. Each claim is a single asserted proposition. | Segmentation errors are logged. The system errs toward finer granularity.
Citation assignment | Processing | Each claim is matched to one or more artifact spans. The relation type is determined. | Claims with no matching spans are flagged for the coverage gate.
Coverage gate | Gate | Each claim receives a rung assignment: supported, narrowed, labeled, or refused. The gate never skips a rung. | Unsupported claims are stripped or labeled.
Display filtering | Gate | Citations pass through consent, tier, and minimum span gates. Excerpts are trimmed or withheld. | Consent blocks produce "source restricted" labels. Tier blocks reduce excerpt depth.
Bundle assembly | Output | The full response bundle is assembled with all metadata, signed, and logged. The display version is sent to the requestor. | Assembly failures halt the response. Nothing is delivered unsigned.

The pipeline is the trust contract. Every claim in the output has been through this pipeline. There is no side channel for "freeform generation" where the persona speaks without citation constraints. That mode does not exist unless explicitly enabled by policy as a separate tier, and it is off by default.

The coverage gate

Once the system has citations, it evaluates whether the coverage is strong enough to answer. The coverage gate enforces a four-rung ladder, evaluated per-claim, not per-response. A claim is a single asserted proposition. The system segments its output into claims before gating. It walks down the ladder until it finds a rung it can satisfy, and it never skips a rung.

A single binary choice (answer or refuse) wastes useful partial information. But a system that silently degrades into speculation is worse than one that refuses outright. The ladder makes degradation visible, predictable, and auditable. Each rung has a name, so logs and auditors can track exactly how often the system narrows or labels versus delivers clean responses.

1. Supported

The claim has at least one citation with relation "direct quote," "paraphrase," or "metadata fact." When every claim in the response reaches this rung, the response is delivered normally. This is the best outcome: the archive fully covers the answer.

2. Narrowed

Some claims can be supported; others cannot. The gate strips unsupported claims and delivers only the supported portion. The response presents a short "removed topics" list explaining what was removed and why. The requestor sees what was cut, not a silently truncated answer.

3. Labeled

Relevant artifacts exist, but the relation would be "inference from," not direct support. The response is delivered with a visible, non-dismissable label on each inferred claim. The excerpt must be directly relevant, not merely adjacent. The inference must be stated in a falsifiable way, tied to specific text. The requestor sees both the inference and the raw excerpt side by side.

4. Refused

No relevant artifacts exist, or the coverage is so thin that even labeling would be misleading. The persona declines and offers an archive inventory view: "I do not have enough in the archive to answer that. Would you like to browse what is there?"
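One possible reading of the ladder, sketched in code: each claim is classified on its own, unsupported claims are stripped into a "removed topics" list, and the response as a whole is reported as narrowed when anything was stripped. The names `Rung`, `claim_rung`, and `gate_response`, and the input shape (claim text mapped to the relations of its citations), are assumptions for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Rung(Enum):
    SUPPORTED = 1
    NARROWED = 2
    LABELED = 3
    REFUSED = 4


DIRECT = {"direct quote", "paraphrase", "metadata fact"}


def claim_rung(relations):
    """Classify one claim from the relations of its citations."""
    if any(r in DIRECT for r in relations):
        return Rung.SUPPORTED
    if "inference from" in relations:
        return Rung.LABELED      # delivered, but with a visible label
    return Rung.REFUSED          # nothing in the archive supports it


@dataclass
class GateResult:
    delivered: list = field(default_factory=list)  # (claim, rung) pairs
    removed: list = field(default_factory=list)    # "removed topics" list
    overall: Rung = Rung.REFUSED


def gate_response(claims):
    """claims maps claim text to the list of relations on its citations."""
    result = GateResult()
    for text, relations in claims.items():
        rung = claim_rung(relations)
        if rung is Rung.REFUSED:
            result.removed.append(text)
        else:
            result.delivered.append((text, rung))

    if not result.delivered:
        result.overall = Rung.REFUSED
    elif result.removed:
        result.overall = Rung.NARROWED
    elif any(rung is Rung.LABELED for _, rung in result.delivered):
        result.overall = Rung.LABELED
    else:
        result.overall = Rung.SUPPORTED
    return result
```

The sketch does not model the judgment that coverage is "so thin that even labeling would be misleading"; a real gate would need a relevance check before accepting an "inference from" citation.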

Coverage gate in action

Supported

The archive covers the question fully. The requestor receives a clean response with citations attached to every claim.

Refused

The archive cannot cover the question. The persona declines and offers to show what is available instead.

Narrowed: unsupported claims are removed and listed under "removed topics"; the supported remainder is delivered.
Labeled: each inferred claim carries a visible label and is shown beside its excerpt.

Override rule. Coverage is necessary but not sufficient. Consent scope and governance conditions can still block a cited response. A claim can pass the coverage gate and still be withheld by the consent layer.

Hedge stacking forbidden. In the labeled rung, at least one explicit epistemic marker per interpreted claim is required. Stacking hedges to sound confident while technically hedged ("it seems clear they definitely...") is a linter violation.
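A toy version of such a linter is sketched below. The phrase lists are invented for illustration and are far from complete; a real check would need a much richer lexicon or a classifier.

```python
import re

# Hypothetical phrase lists, for illustration only.
HEDGES = ("it seems", "it appears", "may have", "might", "possibly", "perhaps")
BOOSTERS = ("definitely", "clearly", "clear", "certainly", "obviously",
            "undoubtedly")


def lint_interpreted_claim(claim: str) -> list:
    """Return the list of violations for one labeled-rung claim."""
    text = claim.lower()

    def has_any(phrases):
        # Whole-word match so "clear" does not fire inside "nuclear".
        return any(re.search(r"\b" + re.escape(p) + r"\b", text)
                   for p in phrases)

    problems = []
    hedged = has_any(HEDGES)
    if not hedged:
        problems.append("missing epistemic marker")
    if hedged and has_any(BOOSTERS):
        problems.append("hedge stacking")
    return problems
```

For instance, `lint_interpreted_claim("It seems clear they definitely wanted to move.")` reports hedge stacking, while a claim with a single plain hedge passes.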

Who sees what

Citations exist at two layers, always. The full citation lives in the response bundle. A filtered view is what any human actually sees. The two layers serve different audiences and follow different rules.

Layer 1: Response bundle (auditor-facing)

The complete citation object for every claim. This is the system of record.

  • Artifact ID, archive version, canonical span, relation type
  • Unredacted excerpt sufficient to verify the link
  • Signed and logged
  • Never shown to requestors
  • Available to auditors under defined conditions
  • Unredacted excerpts encrypted for auditors; hashes provided for automated verifiers
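The "signed" and "hashes for automated verifiers" items can be illustrated with the standard library. This sketch uses an HMAC over a deterministic JSON encoding as a stand-in; the text does not say which signature scheme is used, and a production system would more likely use asymmetric signatures. The function names and bundle shape are hypothetical.

```python
import hashlib
import hmac
import json


def _payload(bundle: dict) -> bytes:
    # Deterministic encoding so the same bundle always signs the same way.
    return json.dumps(bundle, sort_keys=True,
                      separators=(",", ":")).encode("utf-8")


def sign_bundle(bundle: dict, key: bytes) -> dict:
    """Attach an HMAC-SHA256 signature to the bundle (stand-in scheme)."""
    signature = hmac.new(key, _payload(bundle), hashlib.sha256).hexdigest()
    return {"bundle": bundle, "signature": signature}


def verify_bundle(signed: dict, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = hmac.new(key, _payload(signed["bundle"]),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["signature"])


def excerpt_hash(excerpt: str) -> str:
    """Hash an automated verifier could check without seeing the excerpt."""
    return hashlib.sha256(excerpt.encode("utf-8")).hexdigest()
```

Any change to a signed bundle, even a single word of a claim, makes `verify_bundle` return False, which is what lets the bundle serve as the system of record.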

Layer 2: Display citation (requestor-facing)

A simplified, filtered version. Enough to ground the response; not enough to browse the archive.

Shown to the requestor:

  • Relation label ("their words," "paraphrased," "interpreted from")
  • Short excerpt trimmed to the minimum useful span
  • Opaque handles ("Source A," "Source B") that only resolve within one response

Withheld (stays in the bundle only):

  • Real artifact ID
  • Full canonical span
  • Unredacted excerpt
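Deriving the display layer from the bundle could be sketched as below. The pairing of relations to labels is inferred from the order in which the text lists them, the label for "metadata fact" is invented because the text does not name one, and the input dict shape is hypothetical.

```python
from dataclasses import dataclass

RELATION_LABELS = {
    "direct quote": "their words",
    "paraphrase": "paraphrased",
    "inference from": "interpreted from",
    "metadata fact": "archive record",  # hypothetical; not named in the text
}


@dataclass(frozen=True)
class DisplayCitation:
    handle: str   # opaque, meaningful only inside one response
    label: str
    excerpt: str


def handle_name(index: int) -> str:
    """0 -> 'Source A', 25 -> 'Source Z', 26 -> 'Source AA', and so on."""
    letters = ""
    index += 1
    while index:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return "Source " + letters


def build_display(bundle_citations):
    """Map bundle citations to display citations for a single response.

    Each input is a dict with 'artifact_id', 'relation', and 'excerpt'
    (already trimmed). The handle table is local to this call, so handles
    cannot be correlated across responses.
    """
    handles = {}
    shown = []
    for c in bundle_citations:
        if c["artifact_id"] not in handles:
            handles[c["artifact_id"]] = handle_name(len(handles))
        shown.append(DisplayCitation(
            handle=handles[c["artifact_id"]],
            label=RELATION_LABELS[c["relation"]],
            excerpt=c["excerpt"],
        ))
    return shown
```

The display type has no field for the artifact ID or the canonical span, so those values cannot leak through it by accident.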

Three gates before display

Display citations pass through three gates before reaching the requestor. Each gate can reduce what is shown. None can expand it.

1. Consent gate

The decedent's consent scope may restrict which artifacts can be surfaced. An artifact marked "auditor-only" produces a citation that reads: "Supported by archived material (source restricted by the person's instructions)." The claim still gets the coverage rung it earned, but the excerpt is withheld. When existence disclosure is forbidden, the system behaves as if it cannot answer.

2. Tier gate

Requestor access tiers may control citation depth. A close family member might see fuller excerpts than a researcher. Tier rules are set by the decedent's policy, not by the system.

3. Minimum span gate

Even when consent and tier allow it, the excerpt is trimmed to the smallest passage that supports the claim. No dump of surrounding context. The requestor can request more context, subject to a daily evidence budget, excerpt length caps, and rate limiting.
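The three gates can be sketched as one reduce-only function. The consent values (`"display"`, `"auditor-only"`, `"no-disclosure"`), the character cap as the tier's control, and the function name are assumptions; only the restricted-source wording comes from the text.

```python
RESTRICTED_NOTICE = ("Supported by archived material "
                     "(source restricted by the person's instructions).")


def apply_display_gates(excerpt, support_start, support_end, *,
                        consent, tier_max_chars):
    """Return the text a requestor may see, or None if nothing may be shown.

    support_start/support_end mark the smallest passage that supports the
    claim. Every step can only shrink or withhold the excerpt.
    """
    # 1. Consent gate.
    if consent == "no-disclosure":
        return None                 # behave as if there is no answer
    if consent == "auditor-only":
        return RESTRICTED_NOTICE    # rung is kept, excerpt is withheld
    if consent != "display":
        raise ValueError("unknown consent value")  # fail closed

    # 2. Tier gate: the requestor's tier caps the excerpt length.
    # 3. Minimum span gate: keep only the supporting passage.
    shown = excerpt[support_start:support_end]
    return shown[:tier_max_chars]
```

An unknown consent value raises rather than defaulting to display, a design choice in this sketch so that a misconfigured policy cannot widen what is shown.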

Two-layer filtering

Auditor view: all four fields, full excerpt, signed bundle.
Filter gates: consent, tier, minimum span.
Requestor view: relation label, short excerpt, opaque handles.

The full bundle passes through three gates. What emerges is simplified, filtered, and safe to show. The bundle is the system of record; the display citation is what the requestor sees.

Anti-exfiltration. "Request more context" is subject to a daily evidence budget per requestor, excerpt length caps, and rate limiting on repeated topic probing. Exceeding limits is logged as extraction risk. The persona is not a search engine that leaks the archive one excerpt at a time.
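A minimal sketch of the daily evidence budget and excerpt cap follows. The class name, the limits, and measuring the budget in characters are all assumptions, and rate limiting on repeated topic probing is not modeled here.

```python
from collections import defaultdict
from datetime import date


class EvidenceBudget:
    """Track how much extra context each requestor has pulled per day."""

    def __init__(self, daily_chars=2000, max_excerpt_chars=400):
        self.daily_chars = daily_chars              # assumed daily budget
        self.max_excerpt_chars = max_excerpt_chars  # assumed per-excerpt cap
        self._used = defaultdict(int)               # (requestor, day) -> chars
        self.extraction_flags = []                  # stands in for the log

    def request(self, requestor: str, excerpt: str, today: date):
        """Return the capped excerpt, or None if the budget is exhausted."""
        capped = excerpt[: self.max_excerpt_chars]
        key = (requestor, today)
        if self._used[key] + len(capped) > self.daily_chars:
            # Exceeding the limit is recorded as extraction risk.
            self.extraction_flags.append(key)
            return None
        self._used[key] += len(capped)
        return capped
```

Passing the date in explicitly keeps the sketch deterministic; a deployed version would take it from a trusted clock.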

Category check vs. coverage shortfall

When the persona cannot answer, the reason matters. There are two distinct failure modes, and the system must distinguish them before reaching the coverage ladder.

The difference: a category refusal says "this is not what I do." A coverage refusal says "I do not have enough material to answer that well." Separating them makes refusals debuggable. An auditor can see, for example, that 40% of refusals were category mismatches and 60% were coverage shortfalls. Those are very different signals: the first is about question quality, the second about archive quality.

Category filter (pre-retrieval)

The question is outside what the system should attempt, regardless of what the archive contains. Refused or downgraded before retrieval begins.

Coverage gate (post-retrieval)

The question passed the category filter. The coverage ladder from Section 2 evaluates whether the archive has enough to answer.
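If each logged response records which of the two failure modes applied, the breakdown an auditor wants is a short tally. The log entry shape (`outcome` and `reason` keys) is a hypothetical format, not one defined by the text.

```python
from collections import Counter


def refusal_breakdown(log):
    """Return the share of refusals attributed to each reason.

    log is an iterable of dicts with an 'outcome' key and, for refusals,
    a 'reason' key of either 'category' or 'coverage'.
    """
    counts = Counter(entry["reason"] for entry in log
                     if entry["outcome"] == "refused")
    total = sum(counts.values())
    if total == 0:
        return {}
    return {reason: n / total for reason, n in counts.items()}
```

A rising coverage share would point at gaps in the archive, while a rising category share would point at the kinds of questions being asked.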

Out-of-scope categories

The category filter runs first, before any artifacts are retrieved. For borderline cases, a minimal metadata or excerpt search is allowed before deciding, but not a full answer attempt. The filter produces one of three results: proceed to retrieval, refuse with explanation, or downgrade to archive-only search.

Category | Rule | Response
Post-death events | The archive cannot contain evidence about events that had not occurred. | Refused. "Their archive does not extend past [date]. I cannot speak to anything after that." May offer historically grounded context if the topic type existed before death.
Private mental state speculation | May quote and paraphrase first-person emotional self-reports. May not assign diagnoses, motives, or hidden feelings. | Downgraded. May offer what the person actually said about the topic; cannot synthesize conclusions beyond what was written.
General world knowledge | The persona is not a general assistant. | Refused. Narrow exception: brief background definitions are permitted when needed to make a cited excerpt legible. Must be labeled as general context, not from the archive.
Adversarial or instruction injection | Requests that try to collapse the representation/identity distinction, or inject instructions. | Refused without elaboration or debate.
Third-party privacy | Even if the archive contains it, third-party content triggers a consent scope review. | If the referenced individual has not been cleared for surfacing, the system treats this as a consent gate block. Third-party privacy is protected by default.

Conservative by design. When in doubt, the category filter passes the question to retrieval and lets the coverage gate decide. False refusals at the category level are worse than letting a borderline question reach the ladder, because the ladder handles uncertainty gracefully.
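The three-outcome filter and its conservative default can be sketched as a lookup. How a question gets classified into a category is out of scope here; the `category` and `confident` inputs, the `Decision` enum, and the rule table are assumptions built from the table above.

```python
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed to retrieval"
    REFUSE = "refuse with explanation"
    DOWNGRADE = "downgrade to archive-only search"


# Third-party privacy is absent on purpose: per the table it is handled
# as a consent gate block later, not as a category refusal.
# Adversarial requests are refused too, but without elaboration; this
# sketch does not model the wording of the refusal.
RULES = {
    "post-death events": Decision.REFUSE,
    "private mental state speculation": Decision.DOWNGRADE,
    "general world knowledge": Decision.REFUSE,
    "adversarial or instruction injection": Decision.REFUSE,
}


def category_filter(category, confident: bool) -> Decision:
    """Decide before retrieval; when in doubt, let the question through.

    category is the classifier's best guess (or None), and confident says
    whether the classifier is sure of it. Both come from a hypothetical
    upstream classifier.
    """
    if category is None or not confident:
        return Decision.PROCEED   # the coverage gate handles uncertainty
    return RULES.get(category, Decision.PROCEED)
```

Defaulting to `Decision.PROCEED` for unknown or low-confidence categories encodes the preference stated above: a false refusal at this stage is treated as worse than passing a borderline question to the ladder.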