RAG Citation Grounding — Making LLMs Cite Their Sources
- Author: Sanjeev Sharma (@webcoderspeed1)
Introduction
LLMs hallucinate. They confidently state facts not present in your context. Adding proper citations forces the model to ground responses in retrieved documents, and citation validation lets you detect when it fails.
This post covers citation formats, automated validation, and user-facing citation UI patterns.
- Citation Prompting Techniques
- Inline Citation Format
- Citation Validation
- Hallucination Detection via Citation Checking
- NLI-Based Faithfulness Scoring
- Confidence-Weighted Responses
- Source Ranking by Reliability
- Checklist
- Conclusion
Citation Prompting Techniques
Force citations through explicit prompting:
interface Citation {
  sourceId: string;   // Chunk ID
  sourceText: string; // The relevant quote
  claimIndex: number; // Which claim in the answer
  confidence: number; // 0-1, LLM confidence
}

interface AnswerWithCitations {
  answer: string;
  citations: Citation[];
  hasAllCited: boolean;     // Are all claims cited?
  citationCoverage: number; // 0-1, % of answer that's cited
}
async function generateAnswerWithCitations(
  query: string,
  context: Array<{ id: string; text: string }>,
  llm: LLMClient
): Promise<AnswerWithCitations> {
  const citationPrompt = `
You are answering based on these source documents. You MUST cite every claim.

Sources:
${context.map((c, i) => `[${i}] ${c.text}`).join('\n\n')}

Question: "${query}"

Rules:
1. Answer the question using ONLY information from the sources
2. After each claim, cite the source as [SOURCE_N], where N is the source number
3. Format: "Claim text [SOURCE_0] More text [SOURCE_1]"
4. If information isn't in the sources, say "I don't have information about..."
5. Do not make claims without citations

Answer:`;

  const response = await llm.generate({
    messages: [{ role: 'user', content: citationPrompt }],
    maxTokens: 500,
  });
  // Parse citation markers from the answer
  const answer = response.text;
  const citationRegex = /\[SOURCE_(\d+)\]/g;
  const citations: Citation[] = [];
  let match;
  while ((match = citationRegex.exec(answer)) !== null) {
    const sourceIdx = parseInt(match[1], 10);
    const source = context[sourceIdx];
    if (!source) continue; // Model cited a source index that doesn't exist
    citations.push({
      sourceId: source.id,
      sourceText: source.text.substring(0, 150),
      claimIndex: citations.length,
      confidence: 0.95, // Heuristic prior; refined by validation later
    });
  }

  // Citation coverage: fraction of sentences carrying at least one marker
  const sentences = answer.match(/[^.!?]+[.!?]+/g) ?? [];
  const citedSentences = sentences.filter(s => /\[SOURCE_\d+\]/.test(s)).length;
  const citationCoverage = sentences.length > 0 ? citedSentences / sentences.length : 0;

  return {
    answer: answer.replace(/\[SOURCE_\d+\]/g, ''),
    citations,
    hasAllCited: sentences.length > 0 && citedSentences === sentences.length,
    citationCoverage,
  };
}
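The marker-parsing step can be exercised on its own, without an LLM in the loop. A minimal sketch with a hypothetical two-source context (both the chunks and the answer string below are invented for illustration):

```typescript
// Hypothetical retrieved chunks and a model answer containing markers
const context = [
  { id: 'doc-a', text: 'The Eiffel Tower is 330 metres tall.' },
  { id: 'doc-b', text: 'It was completed in 1889.' },
];
const answer =
  'The tower is 330 m tall [SOURCE_0] and was completed in 1889 [SOURCE_1].';

// Map each [SOURCE_N] marker back to a chunk ID, dropping out-of-range indices
const citedIds = [...answer.matchAll(/\[SOURCE_(\d+)\]/g)]
  .map(m => parseInt(m[1], 10))
  .filter(i => i < context.length)
  .map(i => context[i].id);

console.log(citedIds); // → ['doc-a', 'doc-b']
```

The out-of-range filter matters in practice: models occasionally cite a source number that was never in the prompt.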
Inline Citation Format
Format citations for user display:
interface InlineCitation {
  id: string;
  text: string; // Quoted text from source
  sourceId: string;
  sourceTitle?: string;
  sourceUrl?: string;
}
// Note: pass the raw answer here (with [SOURCE_N] markers still intact),
// not the cleaned version.
function formatAnswerWithInlineCitations(
  answer: string,
  citations: Citation[],
  sources: Map<string, { title?: string; url?: string }>
): {
  formattedAnswer: string;
  citationLinks: InlineCitation[];
} {
  let formattedAnswer = answer;
  const citationLinks: InlineCitation[] = [];
  let citationCounter = 1;

  // Replace raw [SOURCE_N] markers, in order, with numbered citation links.
  // Citations were extracted in document order, so a non-global regex that
  // replaces the first remaining marker keeps them aligned.
  for (const citation of citations) {
    const sourceInfo = sources.get(citation.sourceId);
    const citationLink = `<citation id="cite-${citationCounter}">[${citationCounter}]</citation>`;
    formattedAnswer = formattedAnswer.replace(/\[SOURCE_\d+\]/, citationLink);
    citationLinks.push({
      id: `cite-${citationCounter}`,
      text: citation.sourceText,
      sourceId: citation.sourceId,
      sourceTitle: sourceInfo?.title,
      sourceUrl: sourceInfo?.url,
    });
    citationCounter++;
  }

  return {
    formattedAnswer,
    citationLinks,
  };
}
// HTML output with popover citations. Source-derived strings are escaped
// before interpolation; the answer itself already contains the trusted
// <citation> markup produced above, so it is inserted as-is.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

function renderAnswerHTML(
  answer: string,
  citationLinks: InlineCitation[]
): string {
  let html = `<div class="answer">${answer}</div>`;
  html += '<div class="citations">';
  for (const citation of citationLinks) {
    html += `
  <div class="citation-item">
    <a href="${escapeHtml(citation.sourceUrl || '#')}" data-source="${escapeHtml(citation.sourceId)}">
      ${escapeHtml(citation.id)}: ${escapeHtml(citation.sourceTitle || 'Source')}
    </a>
    <blockquote>${escapeHtml(citation.text)}</blockquote>
  </div>`;
  }
  html += '</div>';
  return html;
}
Citation Validation
Verify that citations actually support the claims:
interface ValidationResult {
  isValid: boolean;
  confidence: number; // 0-1
  reasoning: string;
  claimText: string;
  supportingText: string;
}
async function validateCitation(
  claim: string,
  citedText: string,
  llm: LLMClient
): Promise<ValidationResult> {
  const validationPrompt = `
Does this cited text actually support the claim?

Claim: "${claim}"
Cited text: "${citedText}"

Evaluate:
1. Is the cited text relevant to the claim?
2. Does it actually support or contradict the claim?
3. Is there important context missing?

Respond with ONLY this JSON, no other text:
{
  "confidence": 0.0 to 1.0,
  "reasoning": "explanation",
  "supports": true | false | "partially"
}`;

  const response = await llm.generate({
    messages: [{ role: 'user', content: validationPrompt }],
    maxTokens: 200,
  });

  // LLM output isn't always clean JSON; fail closed on unparseable responses
  let parsed: { confidence?: number; reasoning?: string; supports?: boolean | string };
  try {
    parsed = JSON.parse(response.text);
  } catch {
    parsed = { confidence: 0, reasoning: 'Unparseable validator output', supports: false };
  }

  return {
    isValid: parsed.supports === true || parsed.supports === 'partially',
    confidence: parsed.confidence ?? 0,
    reasoning: parsed.reasoning ?? '',
    claimText: claim,
    supportingText: citedText,
  };
}
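Defensive parsing of LLM judge output deserves its own helper, since responses often arrive wrapped in prose or code fences. One possible approach, sketched here (`safeParseJson` is our own hypothetical helper, not part of any library), is to trim the response to the outermost `{...}` span before parsing:

```typescript
// Hypothetical helper: extract the outermost JSON object from a noisy LLM
// response; return null instead of throwing on garbage.
function safeParseJson<T>(raw: string): T | null {
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start === -1 || end <= start) return null;
  try {
    return JSON.parse(raw.slice(start, end + 1)) as T;
  } catch {
    return null;
  }
}

const ok = safeParseJson<{ supports: boolean }>('Sure! {"supports": true}');
console.log(ok); // → { supports: true }
console.log(safeParseJson('no json here')); // → null
```

Returning `null` lets the caller decide whether a failed parse should count as an invalid citation or trigger a retry.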
// Batch validation of all citations
async function validateAllCitations(
  answer: AnswerWithCitations,
  llm: LLMClient
): Promise<{
  validCitations: number;
  invalidCitations: number;
  overallFaithfulness: number;
}> {
  // Extract claims (sentences) from the answer
  const sentences = answer.answer.match(/[^.!?]+[.!?]+/g) || [];

  const validations = await Promise.all(
    sentences.map(async (sentence, sentenceIdx) => {
      // Find citations for this sentence (use the map index, not indexOf,
      // which breaks on duplicate sentences)
      const relevantCitations = answer.citations.filter(
        c => c.claimIndex === sentenceIdx
      );
      if (relevantCitations.length === 0) {
        // Uncited claim
        return { isValid: false, confidence: 0 };
      }
      const results = await Promise.all(
        relevantCitations.map(c => validateCitation(sentence.trim(), c.sourceText, llm))
      );
      return {
        isValid: results.some(r => r.isValid),
        confidence: results.reduce((sum, r) => sum + r.confidence, 0) / results.length,
      };
    })
  );

  const validCount = validations.filter(v => v.isValid).length;
  const invalidCount = validations.length - validCount;

  return {
    validCitations: validCount,
    invalidCitations: invalidCount,
    overallFaithfulness: validations.length > 0 ? validCount / validations.length : 0,
  };
}
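The sentence-level bookkeeping behind coverage and batch validation is easy to sanity-check with a toy answer (the text below is invented): split into sentences, then count how many carry a citation marker.

```typescript
// Invented answer: sentences 1 and 3 are cited, sentence 2 is not
const toyAnswer =
  'Paris is the capital of France [SOURCE_0]. It has two million residents. ' +
  'The Louvre is in Paris [SOURCE_1].';

const toySentences = toyAnswer.match(/[^.!?]+[.!?]+/g) ?? [];
const citedCount = toySentences.filter(s => /\[SOURCE_\d+\]/.test(s)).length;
const coverage = toySentences.length > 0 ? citedCount / toySentences.length : 0;

console.log(`${citedCount}/${toySentences.length} cited, coverage ${coverage.toFixed(2)}`);
// → "2/3 cited, coverage 0.67"
```

The uncited middle sentence is exactly what the batch validator flags as an unsupported claim.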
Hallucination Detection via Citation Checking
Flag likely hallucinations:
interface HallucinationAssessment {
  hasHallucinations: boolean;
  hallucinations: Array<{
    text: string;
    type: 'unsupported' | 'contradicted' | 'speculative';
    severity: 'low' | 'medium' | 'high';
  }>;
  confidence: number;
}
async function detectHallucinations(
  answer: string,
  retrievedContext: string[],
  llm: LLMClient
): Promise<HallucinationAssessment> {
  const detectionPrompt = `
Check this answer against the provided context for hallucinations or
unsupported claims.

Answer: "${answer}"

Context:
${retrievedContext.join('\n---\n')}

Find any:
1. Claims not supported by context
2. Claims contradicted by context
3. Speculative language presented as fact

Respond with ONLY this JSON:
{
  "hallucinations": [
    {
      "text": "the unsupported claim",
      "type": "unsupported|contradicted|speculative",
      "severity": "low|medium|high"
    }
  ]
}`;

  const response = await llm.generate({
    messages: [{ role: 'user', content: detectionPrompt }],
    maxTokens: 300,
  });

  let hallucinations: HallucinationAssessment['hallucinations'] = [];
  try {
    hallucinations = JSON.parse(response.text).hallucinations ?? [];
  } catch {
    // Unparseable detector output: report nothing rather than crash,
    // but callers should treat this path as low confidence
  }

  return {
    hasHallucinations: hallucinations.length > 0,
    hallucinations,
    confidence: 0.85, // Rough prior for LLM-as-judge detection, not a measurement
  };
}
// Prevent hallucinations at generation time
async function generateWithHallucinationPrevention(
  query: string,
  context: Array<{ id: string; text: string }>,
  llm: LLMClient
): Promise<AnswerWithCitations> {
  const preventionPrompt = `
CRITICAL RULES:
1. ONLY use information from the provided sources
2. If asked about something not in sources, explicitly say so
3. Do NOT speculate, interpolate, or use external knowledge
4. Every factual claim MUST be cited [SOURCE_X]
5. Use phrases like "The sources do not mention..." when needed

Sources:
${context.map((c, i) => `[${i}] ${c.text}`).join('\n\n')}

Question: "${query}"

Answer (following rules strictly):`;

  const response = await llm.generate({
    messages: [{ role: 'user', content: preventionPrompt }],
    maxTokens: 400,
  });

  // Parse and validate
  const answer = response.text;
  const citations = extractCitations(answer, context);

  return {
    answer: removeCitationMarkers(answer),
    citations,
    hasAllCited: citations.length > 0,
    citationCoverage: calculateCoverage(answer, citations),
  };
}
function extractCitations(
  answer: string,
  context: Array<{ id: string; text: string }>
): Citation[] {
  const citationRegex = /\[SOURCE_(\d+)\]/g;
  const citations: Citation[] = [];
  let match;
  while ((match = citationRegex.exec(answer)) !== null) {
    const sourceIdx = parseInt(match[1], 10);
    if (sourceIdx < context.length) {
      citations.push({
        sourceId: context[sourceIdx].id,
        sourceText: context[sourceIdx].text,
        claimIndex: citations.length,
        confidence: 0.95,
      });
    }
  }
  return citations;
}

function removeCitationMarkers(text: string): string {
  return text.replace(/\[SOURCE_\d+\]/g, '');
}

function calculateCoverage(answer: string, citations: Citation[]): number {
  // Rough heuristic: treat one citation per ~10 words as full coverage
  const cleanAnswer = removeCitationMarkers(answer);
  const words = cleanAnswer.split(/\s+/).length;
  return citations.length > 0 ? Math.min(1, citations.length / (words / 10)) : 0;
}
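A worked example of the words-per-citation heuristic, with assumed numbers (a 40-word answer carrying three markers):

```typescript
// Assumed inputs for illustration
const words = 40;   // answer length after stripping markers
const markers = 3;  // citation markers found

// One citation per ~10 words counts as full coverage, capped at 1
const estimatedCoverage = Math.min(1, markers / (words / 10));
console.log(estimatedCoverage); // → 0.75
```

This density heuristic is cruder than the sentence-level coverage computed earlier, but it needs no sentence splitting, which helps for streaming or partially formatted output.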
NLI-Based Faithfulness Scoring
Use Natural Language Inference to validate answers:
interface NLIResult {
  entailment: number;    // 0-1, answer entailed by context
  contradiction: number; // 0-1, answer contradicted
  neutral: number;       // 0-1, neither
}
async function scoreWithNLI(
  answer: string,
  context: string[],
  // Wrapper around an NLI classifier (e.g. a DeBERTa MNLI checkpoint) that
  // returns label probabilities for a premise/hypothesis pair
  nliModel: (premise: string, hypothesis: string) => Promise<NLIResult>
): Promise<{
  faithfulness: number;
  issues: string[];
}> {
  // Check each sentence of the answer against the full context
  const sentences = answer.match(/[^.!?]+[.!?]+/g) || [];
  const premise = context.join(' ');
  const nliScores: NLIResult[] = [];

  for (const sentence of sentences) {
    // NLI: does the premise (context) entail the hypothesis (sentence)?
    nliScores.push(await nliModel(premise, sentence.trim()));
  }

  if (nliScores.length === 0) {
    return { faithfulness: 0, issues: ['Answer contained no checkable sentences'] };
  }

  // Aggregate scores
  const avgEntailment = nliScores.reduce((sum, s) => sum + s.entailment, 0) / nliScores.length;
  const contradictions = nliScores.filter(s => s.contradiction > 0.7);

  const issues: string[] = [];
  if (contradictions.length > 0) {
    issues.push(`${contradictions.length} sentences may contradict the context`);
  }
  if (avgEntailment < 0.6) {
    issues.push('Answer has weak grounding in provided context');
  }

  return {
    faithfulness: avgEntailment,
    issues,
  };
}
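The aggregation step can be checked without loading a model by mocking the per-sentence NLI scores (the values below are invented: one well-grounded sentence, one contradicted):

```typescript
// Invented per-sentence NLI probabilities, for illustration only
interface MockNLIScore {
  entailment: number;
  contradiction: number;
  neutral: number;
}

const mockScores: MockNLIScore[] = [
  { entailment: 0.9, contradiction: 0.05, neutral: 0.05 }, // grounded
  { entailment: 0.1, contradiction: 0.8, neutral: 0.1 },   // contradicted
];

const avgEntailment =
  mockScores.reduce((sum, s) => sum + s.entailment, 0) / mockScores.length;
const contradicted = mockScores.filter(s => s.contradiction > 0.7).length;

console.log(avgEntailment.toFixed(2), contradicted); // → "0.50 1"
```

With an average entailment of 0.50 (below the 0.6 threshold) and one contradiction above 0.7, this answer would be flagged on both counts.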
Confidence-Weighted Responses
Return answer quality metrics to users:
interface ConfidenceWeightedAnswer {
  answer: string;
  confidenceScore: number; // 0-1
  confidenceLevel: 'high' | 'medium' | 'low';
  warnings: string[];
  suggestions: string[];
}
async function generateConfidenceWeightedAnswer(
  query: string,
  context: Array<{ id: string; text: string }>,
  llm: LLMClient
): Promise<ConfidenceWeightedAnswer> {
  const withCitations = await generateAnswerWithCitations(query, context, llm);
  const faithfulness = await validateAllCitations(withCitations, llm);
  const hallucinations = await detectHallucinations(
    withCitations.answer,
    context.map(c => c.text),
    llm
  );

  // Calculate confidence
  let confidence = 1.0;

  // Reduce by missing citations
  confidence *= 1 - 0.2 * (1 - withCitations.citationCoverage);

  // Reduce by validation issues
  confidence *= faithfulness.overallFaithfulness;

  // Reduce by hallucinations
  if (hallucinations.hasHallucinations) {
    const severityWeight = hallucinations.hallucinations.reduce((sum, h) => {
      const weight = h.severity === 'high' ? 0.3 : h.severity === 'medium' ? 0.1 : 0.05;
      return sum + weight;
    }, 0);
    confidence *= Math.max(0, 1 - severityWeight);
  }

  // Determine confidence level
  let confidenceLevel: 'high' | 'medium' | 'low';
  if (confidence > 0.85) {
    confidenceLevel = 'high';
  } else if (confidence > 0.65) {
    confidenceLevel = 'medium';
  } else {
    confidenceLevel = 'low';
  }

  // Generate warnings and suggestions
  const warnings: string[] = [];
  const suggestions: string[] = [];

  if (withCitations.citationCoverage < 0.8) {
    warnings.push('Not all claims are properly cited');
    suggestions.push('Please review the sources for uncited information');
  }
  if (faithfulness.invalidCitations > 0) {
    warnings.push(`${faithfulness.invalidCitations} citations may not fully support the claims`);
  }
  if (hallucinations.hasHallucinations) {
    hallucinations.hallucinations.forEach(h => {
      warnings.push(`Potential ${h.type}: "${h.text}"`);
    });
    suggestions.push('Verify these claims against original sources');
  }
  if (confidenceLevel === 'low') {
    suggestions.push('Consider retrieving additional context or rephrasing your question');
  }

  return {
    answer: withCitations.answer,
    confidenceScore: confidence,
    confidenceLevel,
    warnings,
    suggestions,
  };
}
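To make the multiplicative confidence formula concrete, here is the arithmetic with assumed inputs: 75% citation coverage, 90% faithfulness, and one medium-severity hallucination flag.

```typescript
// Assumed signal values for illustration
const citationCoverage = 0.75; // 75% of the answer is cited
const faithfulness = 0.9;      // 90% of citations validated
const severityPenalty = 0.1;   // one medium-severity flag

let confidence = 1.0;
confidence *= 1 - 0.2 * (1 - citationCoverage); // coverage penalty → 0.95
confidence *= faithfulness;                     // → 0.855
confidence *= Math.max(0, 1 - severityPenalty); // → ~0.77

const level =
  confidence > 0.85 ? 'high' : confidence > 0.65 ? 'medium' : 'low';
console.log(confidence.toFixed(2), level); // → "0.77 medium"
```

Each signal only ever shrinks the score, so a single bad signal caps the final confidence no matter how good the others are; that asymmetry is deliberate for a trust metric.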
Source Ranking by Reliability
Weight sources by trustworthiness:
interface SourceReliability {
  sourceId: string;
  domain?: string;
  credibility: number; // 0-1
  recentness: number;  // 0-1
  authority: number;   // 0-1
  combinedScore: number;
}
function rankSourcesByReliability(
  sources: Array<{
    id: string;
    domain?: string;
    lastUpdated?: number;
    authority?: number; // Pre-computed authority score
  }>
): SourceReliability[] {
  const trustworthyDomains = ['scholar.google.com', '.edu', '.gov'];
  const now = Date.now();

  return sources.map(source => {
    // Domain credibility (illustrative heuristics; tune for your corpus)
    let credibility = 0.5; // Default
    if (source.domain) {
      if (trustworthyDomains.some(domain => source.domain!.includes(domain))) {
        credibility = 0.9;
      } else if (source.domain.includes('wikipedia')) {
        credibility = 0.8;
      } else if (source.domain.includes('blog')) {
        credibility = 0.4;
      }
    }

    // Recentness (linear decay to zero over 730 days = 2 years)
    let recentness = 1.0;
    if (source.lastUpdated) {
      const daysSince = (now - source.lastUpdated) / (1000 * 60 * 60 * 24);
      recentness = Math.max(0, 1 - daysSince / 730);
    }

    // Authority (pre-computed; ?? preserves a legitimate score of 0)
    const authority = source.authority ?? 0.5;

    // Combined score with weights
    const combinedScore = credibility * 0.4 + recentness * 0.3 + authority * 0.3;

    return {
      sourceId: source.id,
      domain: source.domain,
      credibility,
      recentness,
      authority,
      combinedScore,
    };
  });
}
// Use source reliability in citation weighting
function weightCitationsByReliability(
citations: Citation[],
sourceReliability: Map<string, SourceReliability>
): Citation[] {
return citations.map(citation => {
const reliability = sourceReliability.get(citation.sourceId);
const weight = reliability?.combinedScore || 0.5;
return {
...citation,
confidence: citation.confidence * weight,
};
});
}
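A quick check of the weighting arithmetic for a single assumed source (a one-year-old page on a trusted domain with a pre-computed authority of 0.7):

```typescript
// Assumed component scores for one source
const credibility = 0.9;                       // trusted domain
const recentness = Math.max(0, 1 - 365 / 730); // one year old → 0.5
const authority = 0.7;                         // pre-computed

const combinedScore =
  credibility * 0.4 + recentness * 0.3 + authority * 0.3;
console.log(combinedScore.toFixed(2)); // → "0.72"
```

A citation from this source with raw confidence 0.95 would be downweighted to roughly 0.68 (0.95 × 0.72), pulling borderline answers toward a lower confidence level.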
Checklist
- Implement forced citation prompting
- Parse and extract citations from LLM output
- Validate citations against source text
- Detect hallucinations via NLI or LLM-as-judge
- Calculate citation coverage metrics
- Format citations for user display
- Add confidence weighting to responses
- Track source reliability scores
- Monitor faithfulness in production
- Set alerts for low-confidence responses
Conclusion
Citations transform RAG from a black box into a verifiable system. Force citations through prompting, validate them programmatically, and expose confidence metrics to users. This builds trust and catches hallucinations before they reach customers. The key metric: citation validation score. Aim for >90% of citations supporting their claims. Everything else—confidence levels, warnings, suggestions—flows from that foundation.