Citation Binding
How the persona proves what it says. Every claim traces back to an artifact in the archive through a structured citation. When coverage is thin, the system refuses or labels. This page explains the pipeline from query to bound response.
What is a citation?
A citation is a machine-verifiable link from something the persona says back to something in the archive. It is not a footnote. It is not a vague gesture at a source. It is a structured object with four required fields, precise enough that an auditor can verify every claim mechanically.
Without this precision, a system could point at a 40-page journal and call it "cited." That is not citation; that is hand-waving. The four fields force the system to say exactly where it looked, which version of the archive it consulted, which passage supports the claim, and how it used that passage.
| Field | What it contains | Why it matters |
|---|---|---|
| Artifact ID | Hash of the canonicalized artifact payload | Identifies the source uniquely. Canonicalization ensures the hash stays stable across storage formats and encoding changes. |
| Archive version | The snapshot in which the artifact exists | Pins the citation to a specific state of the archive. If the archive changes, old citations still resolve against the version they reference. |
| Span | Paragraph index plus character offsets, or a media-specific locator | Points to the exact passage. For text, this means a canonical segment in a normalized encoding. For audio or images, a time range or region locator defined by the archive format. |
| Relation | How the claim uses the source: "direct quote," "paraphrase," "inference from," or "metadata fact" | Forces the system to declare whether it is quoting or interpreting. This is the difference between grounded output and speculation. |
Figure: a claim bound to a citation tag. The tag carries all four fields: artifact ID, archive version, span, and relation. Nothing is left to ambiguity.
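The four fields can be sketched as a small data structure. This is a minimal illustration, not the system's actual schema: the class names, the `v12` version label, and the canonicalization rules (Unicode NFC plus newline normalization, hashed with SHA-256) are assumptions chosen for the example.

```python
import hashlib
import unicodedata
from dataclasses import dataclass

# The four relation types named in the table above.
RELATIONS = {"direct quote", "paraphrase", "inference from", "metadata fact"}


def artifact_id(payload: str) -> str:
    """Hash a canonicalized text payload.

    Assumed canonicalization: Unicode NFC and "\n" line endings, so the
    same text yields the same ID regardless of how it was stored.
    """
    canonical = unicodedata.normalize("NFC", payload).replace("\r\n", "\n")
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class TextSpan:
    paragraph: int  # paragraph index within the artifact
    start: int      # character offset, inclusive
    end: int        # character offset, exclusive


@dataclass(frozen=True)
class Citation:
    artifact_id: str
    archive_version: str
    span: TextSpan
    relation: str

    def __post_init__(self) -> None:
        # Reject anything that is not one of the four declared relations.
        if self.relation not in RELATIONS:
            raise ValueError(f"unknown relation: {self.relation!r}")
        if not (0 <= self.span.start < self.span.end):
            raise ValueError("span offsets must satisfy 0 <= start < end")


note = "Paid the plumber on Tuesday.\r\nCall Ann."
cite = Citation(artifact_id(note), "v12", TextSpan(0, 0, 28), "direct quote")
```

Because the object is frozen and validated on construction, a citation with a missing or free-form relation cannot be created at all, which is the property the four-field requirement is after.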
The binding pipeline
Here is the full path from a requestor's question to a bound, signed response. Every stage has a defined input and output. The pipeline is the backbone of the persona's trustworthiness: it turns a question into a response where every claim is accounted for.
The category filter runs before retrieval, so out-of-scope questions never touch the archive. Retrieval and claim segmentation happen in sequence. Citation assignment links each claim to the archive spans that support it. The coverage gate evaluates each claim independently. Display filtering applies the consent, tier, and minimum span gates. Finally, the response bundle is assembled, signed, and logged as the system of record.
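The stage ordering and its two early exits can be sketched as follows. The function name, the stage labels, and the callable parameters are hypothetical; the real stages do far more than append a label, and what happens after a no-match refusal is not specified here.

```python
from typing import Callable

# Stages that run only when retrieval found something to work with.
STAGES_AFTER_RETRIEVAL = [
    "claim_segmentation",
    "citation_assignment",
    "coverage_gate",
    "display_filtering",
    "bundle_assembly",
]


def run_pipeline(
    question: str,
    in_scope: Callable[[str], bool],
    retrieve: Callable[[str], list[str]],
) -> tuple[list[str], str]:
    """Return the stages visited and the outcome for one question."""
    visited = ["category_filter"]
    if not in_scope(question):
        # Out-of-scope: stop before retrieval, so the archive is untouched.
        return visited, "refused: out of scope"

    visited.append("retrieval")
    artifacts = retrieve(question)
    if not artifacts:
        # No matches: skip straight to the refused rung of the coverage gate.
        visited.append("coverage_gate")
        return visited, "refused: no coverage"

    visited.extend(STAGES_AFTER_RETRIEVAL)
    return visited, "delivered"


stages, outcome = run_pipeline(
    "What did she write about the garden?",
    in_scope=lambda q: True,
    retrieve=lambda q: ["journal-entry"],
)
```

Passing the filter and the retriever in as callables makes it easy to check, in a test, that a refused question never invokes `retrieve`.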
Citation lifecycle
The persona drafts a claim. Each asserted proposition is split into a discrete unit that can be cited independently.
The system checks the citation tag against the archive. Artifact ID, version, span, and relation are all verified. The coverage rung is assigned.
The claim is bound and signed into the response bundle. The auditor gets the full record. The requestor gets the filtered display citation.
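The verification step in the middle of the lifecycle might look like the sketch below. It assumes a toy archive modeled as nested dictionaries (version, then artifact ID, then a list of paragraphs) and a citation passed as a plain dictionary; both shapes are illustrative, not the archive format.

```python
RELATIONS = {"direct quote", "paraphrase", "inference from", "metadata fact"}


def verify(citation: dict, archive: dict) -> list[str]:
    """Return a list of problems; an empty list means the tag checks out."""
    snapshot = archive.get(citation["archive_version"])
    if snapshot is None:
        return ["unknown archive version"]

    paragraphs = snapshot.get(citation["artifact_id"])
    if paragraphs is None:
        return ["artifact not present in this version"]

    problems = []
    paragraph, start, end = citation["span"]
    if not 0 <= paragraph < len(paragraphs):
        problems.append("paragraph index out of range")
    elif not 0 <= start < end <= len(paragraphs[paragraph]):
        problems.append("character offsets out of range")

    if citation["relation"] not in RELATIONS:
        problems.append("unknown relation")
    return problems


# Hypothetical archive with one snapshot and one two-paragraph artifact.
archive = {"v12": {"abc123": ["Paid the plumber on Tuesday.", "Call Ann."]}}
good = {
    "artifact_id": "abc123",
    "archive_version": "v12",
    "span": (0, 0, 28),
    "relation": "direct quote",
}
problems = verify(good, archive)
```

Returning a list of problems rather than a boolean gives the log something specific to record when a tag fails.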
Stage details
| Stage | Type | What happens | Failure behavior |
|---|---|---|---|
| Query arrives | Input | The requestor's question enters the system. Session context and access tier are established. | Malformed queries are rejected with a generic message. |
| Category filter | Gate | The question is checked against out-of-scope categories. Borderline cases get a fast metadata check. | Out-of-scope questions are refused with explanation. The archive is never queried. |
| Retrieval | Processing | Relevant artifacts are fetched from the archive based on the question's content. | If no artifacts match, the pipeline skips to the refused rung of the coverage gate. |
| Claim segmentation | Processing | The draft output is split into discrete claims. Each claim is a single asserted proposition. | Segmentation errors are logged. The system errs toward finer granularity. |
| Citation assignment | Processing | Each claim is matched to one or more artifact spans. The relation type is determined. | Claims with no matching spans are flagged for the coverage gate. |
| Coverage gate | Gate | Each claim receives a rung assignment: supported, narrowed, labeled, or refused. | The gate never skips a rung. Unsupported claims are stripped or labeled. |
| Display filtering | Gate | Citations pass through consent, tier, and minimum span gates. Excerpts are trimmed or withheld. | Consent blocks produce "source restricted" labels. Tier blocks reduce excerpt depth. |
| Bundle assembly | Output | The full response bundle is assembled with all metadata, signed, and logged. The display version is sent to the requestor. | Assembly failures halt the response. Nothing is delivered unsigned. |
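The "nothing is delivered unsigned" rule in the last row can be illustrated with a short sketch. The signature scheme is not specified on this page, so HMAC-SHA256 over canonical JSON stands in for it here; the bundle layout, the key, and the function names are likewise assumptions.

```python
import hashlib
import hmac
import json


def _canonical(bundle: dict) -> bytes:
    # Sorted keys and fixed separators give a stable byte string to sign.
    return json.dumps(bundle, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_bundle(bundle: dict, key: bytes) -> dict:
    signature = hmac.new(key, _canonical(bundle), hashlib.sha256).hexdigest()
    return {"bundle": bundle, "signature": signature}


def verify_bundle(signed: dict, key: bytes) -> bool:
    expected = hmac.new(key, _canonical(signed["bundle"]), hashlib.sha256).hexdigest()
    # Constant-time comparison; a missing signature compares as empty.
    return hmac.compare_digest(expected, signed.get("signature") or "")


def deliver(signed: dict, key: bytes) -> str:
    """Release the display version only if the bundle signature verifies."""
    if not verify_bundle(signed, key):
        raise RuntimeError("assembly failure: refusing to deliver an unsigned bundle")
    return signed["bundle"]["display"]


key = b"demo-key-not-for-production"
signed = sign_bundle({"claims": ["She moved in March."], "display": "She moved in March."}, key)
shown = deliver(signed, key)
```

A production system would more likely use an asymmetric signature so that auditors can verify bundles without holding the signing key; the HMAC here only keeps the example within the standard library.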
The coverage gate
Once the system has citations, it evaluates whether the coverage is strong enough to answer. The coverage gate enforces a four-rung ladder, evaluated per claim, not per response. A claim is a single asserted proposition, and the system segments its output into claims before gating. It walks down the ladder until it finds a rung it can satisfy, and it never skips a rung.
A single binary choice (answer or refuse) wastes useful partial information. But a system that silently degrades into speculation is worse than one that refuses outright. The ladder makes degradation visible, predictable, and auditable. Each rung has a name, so logs and auditors can track exactly how often the system narrows or labels versus delivers clean responses.
1. Supported
Every claim has at least one citation with relation "direct quote," "paraphrase," or "metadata fact." The response is delivered normally. This is the best outcome: the archive fully covers the answer.
2. Narrowed
Some claims can be supported; others cannot. The gate strips unsupported claims and delivers only the supported portion. The response presents a short "removed topics" list explaining what was removed and why. The requestor sees what was cut, not a silently truncated answer.
3. Labeled
Relevant artifacts exist, but the relation would be "inference from," not direct support. The response is delivered with a visible, non-dismissible label on each inferred claim. The excerpt must be directly relevant, not merely adjacent. The inference must be stated in a falsifiable way, tied to specific text. The requestor sees both the inference and the raw excerpt side by side.
4. Refused
No relevant artifacts exist, or the coverage is so thin that even labeling would be misleading. The persona declines and offers an archive inventory view: "I do not have enough in the archive to answer that. Would you like to browse what is there?"
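One way to read the ladder is as a function that tries each rung in order. The sketch below is an interpretation, not the gate's specification: claims are passed as a mapping from claim text to the relation types of their citations, and the mixed case (some directly supported claims alongside inference-only claims) is resolved by the stricter reading, where inference-only claims are stripped at the narrowed rung. The text above does not settle that case.

```python
from enum import Enum


class Rung(Enum):
    SUPPORTED = 1
    NARROWED = 2
    LABELED = 3
    REFUSED = 4


# Relations that count as direct support on the first rung.
DIRECT = {"direct quote", "paraphrase", "metadata fact"}


def claim_support(relations: list[str]) -> str:
    if any(r in DIRECT for r in relations):
        return "direct"
    if "inference from" in relations:
        return "inferred"
    return "none"


def gate(claims: dict[str, list[str]]) -> tuple[Rung, list[str], list[str]]:
    """Return (rung, delivered claims, removed claims), trying rungs in order."""
    support = {text: claim_support(rels) for text, rels in claims.items()}
    direct = [t for t, s in support.items() if s == "direct"]
    inferred = [t for t, s in support.items() if s == "inferred"]

    if claims and len(direct) == len(claims):
        return Rung.SUPPORTED, direct, []
    if direct:
        # Deliver only the supported portion; the rest feeds "removed topics".
        return Rung.NARROWED, direct, [t for t in claims if t not in direct]
    if inferred:
        # Every delivered claim here must carry the visible inference label.
        return Rung.LABELED, inferred, [t for t, s in support.items() if s == "none"]
    return Rung.REFUSED, [], list(claims)


rung, delivered, removed = gate({
    "She moved in March.": ["direct quote"],
    "She disliked the house.": ["inference from"],
    "She sold it in 1990.": [],
})
```

In this example the first claim is delivered and the other two appear in the removed list, so the response lands on the narrowed rung.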
Coverage gate in action
Supported: the archive covers the question fully. The requestor receives a clean response with citations attached to every claim.
Refused: the archive cannot cover the question. The persona declines and offers to show what is available instead.
Who sees what
Citations exist at two layers, always. The full citation lives in the response bundle. A filtered view is what any human actually sees. The two layers serve different audiences and follow different rules.
Layer 1: Response bundle (auditor-facing)
The complete citation object for every claim. This is the system of record.
- Artifact ID, archive version, canonical span, relation type
- Unredacted excerpt sufficient to verify the link
- Signed and logged
- Never shown to requestors
- Available to auditors under defined conditions
- Unredacted excerpts encrypted for auditors; hashes provided for automated verifiers
Layer 2: Display citation (requestor-facing)
A simplified, filtered version. Enough to ground the response; not enough to browse the archive.
- Relation label ("their words," "paraphrased," "interpreted from")
- Short excerpt trimmed to the minimum useful span
- Opaque handles ("Source A," "Source B") that only resolve within one response
- Not shown: real artifact ID (stays in bundle only)
- Not shown: full canonical span (stays in bundle only)
- Not shown: unredacted excerpt (stays in bundle only)
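The projection from bundle citations to display citations can be sketched as below. The dictionary keys are hypothetical, and the pairing of relation types with display labels is inferred from the order in which they are listed; no display label is given for "metadata fact," so the sketch passes that relation through unchanged.

```python
import string

# Assumed pairing of bundle relation types with requestor-facing labels.
DISPLAY_LABELS = {
    "direct quote": "their words",
    "paraphrase": "paraphrased",
    "inference from": "interpreted from",
}


def to_display(bundle_citations: list[dict]) -> list[dict]:
    """Project bundle citations onto the fields a requestor may see."""
    handles: dict[str, str] = {}  # rebuilt per response, so handles never persist
    display = []
    for cite in bundle_citations:
        aid = cite["artifact_id"]
        if aid not in handles:
            # "Source A", "Source B", ...; this sketch supports up to 26 sources.
            handles[aid] = f"Source {string.ascii_uppercase[len(handles)]}"
        display.append({
            "source": handles[aid],
            "relation": DISPLAY_LABELS.get(cite["relation"], cite["relation"]),
            "excerpt": cite["display_excerpt"],
        })
    return display


shown = to_display([
    {"artifact_id": "9f2c", "relation": "direct quote", "display_excerpt": "We moved in March."},
    {"artifact_id": "41ab", "relation": "paraphrase", "display_excerpt": "The roof leaked."},
    {"artifact_id": "9f2c", "relation": "inference from", "display_excerpt": "Ann fixed it."},
])
```

The output dictionaries contain no artifact ID, span, or unredacted excerpt, and because the handle table is local to the call, "Source A" in one response says nothing about "Source A" in another.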
Three gates before display
Display citations pass through three gates before reaching the requestor. Each gate can reduce what is shown. None can expand it.
1. Consent gate
The decedent's consent scope may restrict which artifacts can be surfaced. An artifact marked "auditor-only" produces a citation that reads: "Supported by archived material (source restricted by the person's instructions)." The claim still gets the coverage rung it earned, but the excerpt is withheld. When existence disclosure is forbidden, the system behaves as if it cannot answer.
2. Tier gate
Requestor access tiers may control citation depth. A close family member might see fuller excerpts than a researcher. Tier rules are set by the decedent's policy, not by the system.
3. Minimum span gate
Even when consent and tier allow it, the excerpt is trimmed to the smallest passage that supports the claim. No dump of surrounding context. The requestor can request more context, subject to a daily evidence budget, excerpt length caps, and rate limiting.
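The "reduce, never expand" property of the three gates can be sketched by treating the display state as a character window plus a remaining length allowance. Everything here is an assumption made for illustration: the `Display` type, modeling a tier as a character cap, and the cap values.

```python
from dataclasses import dataclass, replace
from typing import Optional

RESTRICTED_NOTE = (
    "Supported by archived material "
    "(source restricted by the person's instructions)."
)


@dataclass(frozen=True)
class Display:
    window: Optional[tuple[int, int]]  # character range that may be shown
    cap: int                           # excerpt length still allowed
    note: str = ""


def consent_gate(d: Display, auditor_only: bool) -> Display:
    # A restricted artifact loses its excerpt entirely.
    return replace(d, window=None, note=RESTRICTED_NOTE) if auditor_only else d


def tier_gate(d: Display, tier_cap: int) -> Display:
    # A tier can lower the allowance but never raise it.
    return replace(d, cap=min(d.cap, tier_cap))


def min_span_gate(d: Display, support: tuple[int, int]) -> Display:
    if d.window is None:
        return d
    start = max(d.window[0], support[0])
    end = min(d.window[1], support[1], start + d.cap)
    return replace(d, window=(start, end))


def render(text: str, d: Display) -> str:
    return d.note if d.window is None else text[d.window[0]:d.window[1]]


text = "We moved in March. The roof leaked all spring. Ann fixed it in June."
passage = "The roof leaked all spring."
support = (text.index(passage), text.index(passage) + len(passage))


def show(auditor_only: bool, tier_cap: int) -> str:
    d = Display(window=(0, len(text)), cap=len(text))
    d = min_span_gate(tier_gate(consent_gate(d, auditor_only), tier_cap), support)
    return render(text, d)
```

With these definitions, `show(False, 200)` yields only the supporting sentence, a smaller cap yields a prefix of it, and `show(True, 200)` yields the restricted-source note with no excerpt.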
Two-layer filtering
Figure: the full bundle passes through three gates. What emerges is simplified, filtered, and safe to show. The bundle is the system of record; the display citation is what the requestor sees.
Category check vs. coverage shortfall
When the persona cannot answer, the reason matters. There are two distinct failure modes, and the system must distinguish them before reaching the coverage ladder.
The difference: a category refusal says "this is not what I do." A coverage refusal says "I do not have enough material to answer that well." Separating them makes refusals debuggable. An auditor can see that 40% of refusals were category mismatches and 60% were coverage shortfalls. Those are very different signals about archive quality versus question quality.
Category refusal: the question is outside what the system should attempt, regardless of what the archive contains. It is refused or downgraded before retrieval begins.
Coverage shortfall: the question passed the category filter. The coverage ladder described under "The coverage gate" evaluates whether the archive has enough to answer.
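The auditor's breakdown of refusals can be computed from a log with a few lines. The log shape (`outcome` and `reason` keys) and the reason names are hypothetical; the sample data is constructed to reproduce the 40/60 split used as the example above.

```python
from collections import Counter


def refusal_breakdown(log: list[dict]) -> dict[str, float]:
    """Share of refusals by reason; non-refusals are ignored."""
    reasons = Counter(e["reason"] for e in log if e["outcome"] == "refused")
    total = sum(reasons.values())
    return {reason: n / total for reason, n in reasons.items()} if total else {}


# Illustrative log entries, not real data.
log = [
    {"outcome": "refused", "reason": "category"},
    {"outcome": "refused", "reason": "category"},
    {"outcome": "refused", "reason": "coverage"},
    {"outcome": "refused", "reason": "coverage"},
    {"outcome": "refused", "reason": "coverage"},
    {"outcome": "supported", "reason": ""},
]
shares = refusal_breakdown(log)
```

This only works if every refusal is logged with its reason at the point where it is issued, which is the practical argument for separating the two failure modes.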
Out-of-scope categories
The category filter runs first, before any artifacts are retrieved. For borderline cases, a minimal metadata or excerpt search is allowed before deciding, but not a full answer attempt. The filter produces one of three results: proceed to retrieval, refuse with explanation, or downgrade to archive-only search.
| Category | Rule | Response |
|---|---|---|
| Post-death events | The archive cannot contain evidence about events that had not occurred. | Refused. "Their archive does not extend past [date]. I cannot speak to anything after that." May offer historically grounded context if the topic type existed before death. |
| Private mental state speculation | May quote and paraphrase first-person emotional self-reports. May not assign diagnoses, motives, or hidden feelings. | Downgraded. May offer what the person actually said about the topic; cannot synthesize conclusions beyond what was written. |
| General world knowledge | The persona is not a general assistant. | Refused. Narrow exception: brief background definitions are permitted when needed to make a cited excerpt legible. Must be labeled as general context, not from the archive. |
| Adversarial or instruction injection | Requests that try to collapse the representation/identity distinction, or inject instructions. | Refused without elaboration or debate. |
| Third-party privacy | Even if the archive contains it, third-party content triggers a consent scope review. | If the referenced individual has not been cleared for surfacing, the system treats this as a consent gate block. Third-party privacy is protected by default. |
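The mapping from category to filter result can be sketched as a lookup. The category keys are hypothetical identifiers, and detecting which category a question falls into is outside the sketch. Third-party privacy is left out deliberately: per the table it is handled as a consent gate block downstream, not as one of the three filter results.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    PROCEED = "proceed to retrieval"
    REFUSE = "refuse with explanation"
    DOWNGRADE = "downgrade to archive-only search"


# Hypothetical category keys mapped to the responses listed in the table.
CATEGORY_OUTCOMES = {
    "post_death_event": Outcome.REFUSE,
    "private_mental_state": Outcome.DOWNGRADE,
    "general_world_knowledge": Outcome.REFUSE,
    "instruction_injection": Outcome.REFUSE,
}


def category_filter(detected: Optional[str]) -> Outcome:
    """Map a detected out-of-scope category (or None) to a filter result."""
    if detected is None:
        return Outcome.PROCEED
    # An unrecognized category raises KeyError instead of silently proceeding.
    return CATEGORY_OUTCOMES[detected]


result = category_filter("private_mental_state")
```

Raising on an unknown category is a design choice for the sketch: a filter that falls through to retrieval on unexpected input would defeat the point of running it first.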