Hallucination Mitigation — Techniques to Make LLMs More Truthful

Introduction

Your LLM confidently tells the user "The capital of France is London." It sounds authoritative. It's completely wrong. This is a hallucination: plausible but false content generated by the model.

Hallucinations erode trust faster than slow responses. This post covers techniques to reduce them: grounding via retrieval, self-consistency checks, and confidence thresholds.

Hallucination Types and Detection

Different hallucinations require different fixes:

type HallucinationType =
  | "factual" // Factually incorrect claim
  | "faithful" // Contradicts provided context
  | "extrinsic"; // Plausible but unverifiable claim

interface HallucinationDetection {
  isHallucination: boolean;
  type: HallucinationType;
  claim: string;
  confidence: number;
  reasoning: string;
}

class HallucinationDetector {
  async detectFactualHallucination(
    claim: string,
    knowledgeBase: string[]
  ): Promise<HallucinationDetection> {
    // Check if claim is contradicted by knowledge base
    const contradictions = knowledgeBase.filter((fact) =>
      this.contradicts(claim, fact)
    );

    if (contradictions.length > 0) {
      return {
        isHallucination: true,
        type: "factual",
        claim,
        confidence: Math.min(contradictions.length / 5, 1.0),
        reasoning: `Contradicted by: ${contradictions[0]}`,
      };
    }

    return {
      isHallucination: false,
      type: "factual",
      claim,
      confidence: 0.95,
      reasoning: "No contradictions found",
    };
  }

  async detectFaithfulnessHallucination(
    response: string,
    context: string
  ): Promise<HallucinationDetection> {
    // Check if response is supported by context
    const contextFacts = this.extractClaims(context);
    const responseFacts = this.extractClaims(response);

    const unsupported = responseFacts.filter(
      (fact) => !contextFacts.includes(fact)
    );

    if (unsupported.length > 0) {
      return {
        isHallucination: true,
        type: "faithful",
        claim: unsupported[0],
        confidence: 0.8,
        reasoning: `Claim not supported by context: "${unsupported[0]}"`,
      };
    }

    return {
      isHallucination: false,
      type: "faithful",
      claim: response,
      confidence: 0.95,
      reasoning: "All claims supported by context",
    };
  }

  private contradicts(claim: string, fact: string): boolean {
    // Simplified: check for opposite assertions
    return (
      claim.includes("is not") &&
      fact.replace("is not", "is") === claim.replace("is not", "is")
    );
  }

  private extractClaims(text: string): string[] {
    // Split by period, return non-empty claims
    return text
      .split(".")
      .map((s) => s.trim())
      .filter((s) => s.length > 10);
  }
}

export { HallucinationDetector, HallucinationType, HallucinationDetection };
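To make the faithfulness check concrete, here is a minimal standalone sketch of the same claim-splitting approach. It keeps the same deliberate simplifications as the detector above (naive sentence splitting, exact-match support):

```typescript
// Standalone sketch of the faithfulness check: split text into claims,
// then flag response claims that never appear in the context.
function extractClaims(text: string): string[] {
  return text
    .split(".")
    .map((s) => s.trim())
    .filter((s) => s.length > 10);
}

function findUnsupportedClaims(response: string, context: string): string[] {
  const contextClaims = extractClaims(context);
  return extractClaims(response).filter(
    (claim) => !contextClaims.includes(claim)
  );
}

const context = "The Eiffel Tower is in Paris. It was completed in 1889.";
const response = "The Eiffel Tower is in Paris. It is painted bright green.";

const unsupported = findUnsupportedClaims(response, context);
// unsupported === ["It is painted bright green"]
```

In production you would replace exact string matching with an entailment model or an LLM judge, since a supported claim is rarely a verbatim copy of a context sentence.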

Grounding via RAG (Retrieval-Augmented Generation)

The most effective way to reduce hallucinations is to retrieve relevant documents and ground the response in them:

interface RAGContext {
  query: string;
  retrievedDocuments: string[];
  sourceUrls: string[];
}

interface GroundedResponse {
  response: string;
  citations: Array<{ claim: string; source: string }>;
  confidence: number;
  isGrounded: boolean;
}

class RAGGrounding {
  async generateGroundedResponse(
    ragContext: RAGContext,
    apiKey: string
  ): Promise<GroundedResponse> {
    const context = ragContext.retrievedDocuments.join("\n\n");

    const { Anthropic } = require("@anthropic-ai/sdk");
    const client = new Anthropic({ apiKey });

    const message = await client.messages.create({
      model: "claude-3-5-sonnet-20241022",
      max_tokens: 1024,
      system: `You are a truthful AI assistant. You MUST ground every claim in the provided context.
      
If a claim cannot be supported by the context, say "I don't have information about that" instead of guessing.
Every statement must cite which document it came from.`,
      messages: [
        {
          role: "user",
          content: `Context:
${context}

Question: ${ragContext.query}

Answer only using the context provided. Cite your sources.`,
        },
      ],
    });

    const responseText =
      message.content[0].type === "text" ? message.content[0].text : "";

    // Extract citations from response
    const citationPattern = /\[(.+?)\]\((.+?)\)/g;
    const citations: Array<{ claim: string; source: string }> = [];
    let match: RegExpExecArray | null;

    while ((match = citationPattern.exec(responseText)) !== null) {
      citations.push({ claim: match[1], source: match[2] });
    }

    return {
      response: responseText,
      citations,
      confidence: citations.length > 0 ? 0.95 : 0.6,
      isGrounded: citations.length > 0,
    };
  }
}

export { RAGGrounding, RAGContext, GroundedResponse };
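The citation format assumed above is markdown-style `[claim](source)` links. The extraction step can be exercised on its own, independent of any API call:

```typescript
// Extract markdown-style [claim](source) citations from a response.
function extractCitations(
  responseText: string
): Array<{ claim: string; source: string }> {
  const citationPattern = /\[(.+?)\]\((.+?)\)/g;
  const citations: Array<{ claim: string; source: string }> = [];
  let match: RegExpExecArray | null;
  while ((match = citationPattern.exec(responseText)) !== null) {
    citations.push({ claim: match[1], source: match[2] });
  }
  return citations;
}

const citations = extractCitations(
  "The tower opened in 1889 [completed in 1889](doc-1) and is 330m tall [height](doc-2)."
);
// citations.length === 2; citations[0].source === "doc-1"
```

Note that the model will only emit this format reliably if the system prompt demands it, as the grounding prompt above does.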

Self-Consistency Sampling

Run the LLM multiple times on the same prompt and accept only consistent answers:

interface ConsistencyCheckResult {
  mostCommonResponse: string;
  consistency: number; // 0-1
  samples: string[];
  isConsistent: boolean;
}

class SelfConsistencySampler {
  async checkConsistency(
    prompt: string,
    apiKey: string,
    numSamples: number = 5,
    consistencyThreshold: number = 0.6
  ): Promise<ConsistencyCheckResult> {
    const { Anthropic } = require("@anthropic-ai/sdk");
    const client = new Anthropic({ apiKey });

    const samples: string[] = [];

    // Sample multiple completions at nonzero temperature
    for (let i = 0; i < numSamples; i++) {
      const message = await client.messages.create({
        model: "claude-3-5-sonnet-20241022",
        max_tokens: 256,
        temperature: 0.7, // Higher temperature = more variation
        messages: [{ role: "user", content: prompt }],
      });

      const text =
        message.content[0].type === "text" ? message.content[0].text : "";
      samples.push(text);
    }

    // Find most common response
    const responseCounts = new Map<string, number>();
    samples.forEach((s) => {
      responseCounts.set(s, (responseCounts.get(s) || 0) + 1);
    });

    const mostCommon = Array.from(responseCounts.entries()).sort(
      (a, b) => b[1] - a[1]
    )[0][0];

    const consistency = responseCounts.get(mostCommon)! / numSamples;
    const isConsistent = consistency >= consistencyThreshold;

    return {
      mostCommonResponse: mostCommon,
      consistency,
      samples,
      isConsistent,
    };
  }
}

export { SelfConsistencySampler, ConsistencyCheckResult };
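One caveat: the sampler above compares raw strings, and free-form completions rarely match byte-for-byte. A common refinement (an addition here, not part of the sampler above) is to normalize answers before voting:

```typescript
// Majority vote over sampled answers, with light normalization so that
// "Paris.", "paris", and "Paris!" all count as the same answer.
function normalizeAnswer(answer: string): string {
  return answer.toLowerCase().replace(/[^\w\s]/g, "").trim();
}

function majorityVote(samples: string[]): {
  answer: string;
  consistency: number;
} {
  const counts = new Map<string, number>();
  for (const sample of samples) {
    const key = normalizeAnswer(sample);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  // Sort normalized answers by frequency, descending
  const [answer, count] = Array.from(counts.entries()).sort(
    (a, b) => b[1] - a[1]
  )[0];
  return { answer, consistency: count / samples.length };
}

const result = majorityVote(["Paris.", "paris", "Paris", "London.", "Paris!"]);
// result.answer === "paris", result.consistency === 0.8
```

For longer answers, normalization alone is not enough; clustering samples by semantic similarity (e.g. embedding distance) is the usual next step.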

Uncertainty Expression

Train the model to express uncertainty rather than hallucinate:

interface UncertaintyResponse {
  response: string;
  confidenceScore: number;
  uncertaintyExpressed: boolean;
  recommendation: "proceed" | "ask_user_to_clarify" | "use_fallback";
}

class UncertaintyExpressionFilter {
  private uncertaintyIndicators = [
    "I'm not sure",
    "I don't know",
    "I don't have information",
    "It's unclear",
    "I cannot determine",
    "I might be wrong",
  ];

  async evaluateUncertainty(response: string): Promise<UncertaintyResponse> {
    const hasUncertaintyMarker = this.uncertaintyIndicators.some((indicator) =>
      response.toLowerCase().includes(indicator.toLowerCase())
    );

    const confidenceScore = hasUncertaintyMarker ? 0.3 : 0.8;

    let recommendation: "proceed" | "ask_user_to_clarify" | "use_fallback" =
      "proceed";

    if (hasUncertaintyMarker && response.length < 100) {
      recommendation = "ask_user_to_clarify";
    } else if (!hasUncertaintyMarker && response.length > 500) {
      // Long confident response might be hallucinating
      recommendation = "use_fallback";
    }

    return {
      response,
      confidenceScore,
      uncertaintyExpressed: hasUncertaintyMarker,
      recommendation,
    };
  }

  buildUncertaintyPrompt(userQuery: string): string {
    return `${userQuery}

If you're not certain about your answer, say so explicitly. It's better to admit uncertainty than to guess.`;
  }
}

export { UncertaintyExpressionFilter, UncertaintyResponse };

Chain-of-Thought for Factual Accuracy

Explicit reasoning reduces hallucinations:

interface ChainOfThoughtResult {
  reasoning: string;
  finalAnswer: string;
  steps: string[];
  hasLogicalFlaws: boolean;
}

class ChainOfThoughtReasoner {
  async reasonThroughProblem(
    problem: string,
    apiKey: string
  ): Promise<ChainOfThoughtResult> {
    const { Anthropic } = require("@anthropic-ai/sdk");
    const client = new Anthropic({ apiKey });

    const message = await client.messages.create({
      model: "claude-3-5-sonnet-20241022",
      max_tokens: 1024,
      messages: [
        {
          role: "user",
          content: `Solve this step-by-step. Show your reasoning.

${problem}

Format: 
Step 1: [First step]
Step 2: [Second step]
...
Final Answer: [Answer]`,
        },
      ],
    });

    const responseText =
      message.content[0].type === "text" ? message.content[0].text : "";

    // Parse steps
    const stepPattern = /^Step \d+: (.+)$/gm;
    const steps: string[] = [];
    let match: RegExpExecArray | null;

    while ((match = stepPattern.exec(responseText)) !== null) {
      steps.push(match[1]);
    }

    // Extract final answer
    const finalAnswerMatch = /Final Answer:\s*(.+?)(?:\n|$)/i.exec(responseText);
    const finalAnswer = finalAnswerMatch ? finalAnswerMatch[1] : "";

    // Check for logical consistency
    const hasFlaws = this.detectLogicalFlaws(steps);

    return {
      reasoning: responseText,
      finalAnswer,
      steps,
      hasLogicalFlaws: hasFlaws,
    };
  }

  private detectLogicalFlaws(steps: string[]): boolean {
    // Check for contradictory statements within steps
    const stepText = steps.join(" ").toLowerCase();

    const contradictions = [
      /(?:but not|however not|nevertheless not)/,
      /(?:always|never).*(?:sometimes|occasionally)/,
    ];

    return contradictions.some((pattern) => pattern.test(stepText));
  }
}

export { ChainOfThoughtReasoner, ChainOfThoughtResult };

Wikipedia Lookup Tool for Verification

Give the LLM a tool to verify facts:

interface WikipediaLookupResult {
  found: boolean;
  summary: string;
  url: string;
}

class WikipediaVerificationTool {
  async lookupFact(topic: string): Promise<WikipediaLookupResult> {
    // Uses the Wikipedia REST API summary endpoint.
    // Error handling here is simplified for illustration.

    const wikiApiUrl = `https://en.wikipedia.org/api/rest_v1/page/summary/${encodeURIComponent(topic)}`;

    try {
      const response = await fetch(wikiApiUrl);

      // The REST API returns a 404 status for missing pages
      if (!response.ok) {
        return {
          found: false,
          summary: "",
          url: "",
        };
      }

      const data = await response.json();

      return {
        found: true,
        summary: data.extract || "",
        url: data.content_urls.desktop.page,
      };
    } catch {
      return {
        found: false,
        summary: "",
        url: "",
      };
    }
  }

  buildVerificationPrompt(claim: string): string {
    return `Verify this claim by looking up relevant facts:
"${claim}"

If the claim is contradicted by reliable sources, say so explicitly.`;
  }
}

export { WikipediaVerificationTool, WikipediaLookupResult };

Conclusion

Hallucinations aren't a bug in LLMs; they're fundamental. You can't eliminate them, but you can dramatically reduce their impact:

  1. Ground in facts: RAG is your most powerful weapon
  2. Express uncertainty: Teach models to admit when they don't know
  3. Check consistency: Multiple samples catch outliers
  4. Explicit reasoning: Chain-of-thought surfaces logical flaws
  5. Verify with tools: Give LLMs access to knowledge sources
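These techniques compose. As a sketch (the interface, thresholds, and function names here are illustrative, not taken from the sections above), a defense-in-depth gate might check grounding, consistency, and uncertainty in turn before delivering a response:

```typescript
// Illustrative defense-in-depth gate: deliver a response only if it is
// grounded, self-consistent, and not hedged below a confidence floor.
interface ResponseChecks {
  isGrounded: boolean; // from RAG citation check
  consistency: number; // from self-consistency sampling (0-1)
  confidenceScore: number; // from uncertainty evaluation (0-1)
}

type Verdict = "deliver" | "regenerate" | "fallback";

function gateResponse(checks: ResponseChecks): Verdict {
  if (!checks.isGrounded) return "fallback"; // never deliver ungrounded claims
  if (checks.consistency < 0.6) return "regenerate"; // unstable answer: resample
  if (checks.confidenceScore < 0.5) return "fallback"; // model itself is unsure
  return "deliver";
}

const verdict = gateResponse({
  isGrounded: true,
  consistency: 0.8,
  confidenceScore: 0.9,
});
// verdict === "deliver"
```

Tune the thresholds per application: a medical assistant should fall back far more eagerly than a brainstorming tool.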

Combine these techniques, and you'll build LLM applications users can trust.