Published on

Human-in-the-Loop AI — When and How to Involve Humans in AI Workflows

Authors
  • Name
    Twitter

Introduction

Not every request should be handled by AI alone. Some require human judgment, expertise, or approval. Rather than treating humans as fallback, design them as first-class citizens in your workflow. This post covers when to route to humans, how to build review queues, implement escalation tiers, and use human corrections to improve your AI system continuously.

Confidence-Based Human Routing

Route requests to humans based on AI confidence scores. High confidence goes to users, low confidence goes to humans.

import Anthropic from "@anthropic-ai/sdk";

interface AIResponse {
  content: string;
  confidenceScore: number; // 0-1
  reasoning: string;
}

interface RoutingDecision {
  routeToHuman: boolean;
  confidenceThreshold: number;
  reason: string;
}

class ConfidenceRouter {
  async generateResponseWithConfidence(
    userInput: string,
    confidenceThreshold: number = 0.7
  ): Promise<{ response: AIResponse; routing: RoutingDecision }> {
    const client = new Anthropic();

    // Generate initial response
    const response = await client.messages.create({
      model: "claude-3-5-sonnet-20241022",
      max_tokens: 1024,
      messages: [{ role: "user", content: userInput }],
    });

    const aiContent =
      response.content[0].type === "text" ? response.content[0].text : "";

    // Generate confidence assessment
    const confidenceAssessment = await client.messages.create({
      model: "claude-3-5-sonnet-20241022",
      max_tokens: 256,
      messages: [
        {
          role: "user",
          content: `Rate confidence in this response (0-1 scale) and explain why:\n\nQuestion: ${userInput}\n\nResponse: ${aiContent}`,
        },
      ],
    });

    const confidenceText =
      confidenceAssessment.content[0].type === "text"
        ? confidenceAssessment.content[0].text
        : "";

    const confidenceScore = this.extractConfidenceScore(confidenceText);

    const aiResponse: AIResponse = {
      content: aiContent,
      confidenceScore,
      reasoning: confidenceText,
    };

    const routing: RoutingDecision = {
      routeToHuman: confidenceScore < confidenceThreshold,
      confidenceThreshold,
      reason:
        confidenceScore < confidenceThreshold
          ? "Confidence below threshold"
          : "High confidence, ready for user",
    };

    return { response: aiResponse, routing };
  }

  private extractConfidenceScore(confidenceText: string): number {
    // Extract score from confidence assessment
    const match = confidenceText.match(/(\d+(?:\.\d+)?)/);
    if (match) {
      const score = parseFloat(match[1]);
      return Math.min(Math.max(score / 100, 0), 1); // Normalize to 0-1
    }
    return 0.5; // Default if extraction fails
  }
}

Human Approval Before AI Action Execution

For high-stakes operations, require human approval before executing AI-recommended actions.

interface ApprovalRequest {
  requestId: string;
  userId: string;
  aiRecommendation: string;
  reason: string;
  createdAt: Date;
  expiresAt: Date;
  priority: "low" | "medium" | "high";
}

interface ApprovalResponse {
  requestId: string;
  approvedBy: string;
  decision: "approved" | "rejected" | "needs_revision";
  comments?: string;
  respondedAt: Date;
}

class ApprovalWorkflow {
  private pendingRequests = new Map<string, ApprovalRequest>();
  private approvalResponses = new Map<string, ApprovalResponse>();

  async submitForApproval(
    userId: string,
    recommendation: string,
    priority: "low" | "medium" | "high" = "medium"
  ): Promise<ApprovalRequest> {
    const request: ApprovalRequest = {
      requestId: `approval-${Date.now()}`,
      userId,
      aiRecommendation: recommendation,
      reason: `AI-generated recommendation for user ${userId}`,
      createdAt: new Date(),
      expiresAt: new Date(Date.now() + 24 * 60 * 60 * 1000), // 24h expiration
      priority,
    };

    this.pendingRequests.set(request.requestId, request);

    // Send to approval queue (Slack, email, dashboard, etc.)
    await this.notifyApprovers(request);

    return request;
  }

  processApproval(
    requestId: string,
    approver: string,
    decision: "approved" | "rejected" | "needs_revision",
    comments?: string
  ): ApprovalResponse {
    const request = this.pendingRequests.get(requestId);

    if (!request) {
      throw new Error(`Request not found: ${requestId}`);
    }

    const response: ApprovalResponse = {
      requestId,
      approvedBy: approver,
      decision,
      comments,
      respondedAt: new Date(),
    };

    this.approvalResponses.set(requestId, response);
    this.pendingRequests.delete(requestId); // Remove from pending

    return response;
  }

  async executeIfApproved(
    requestId: string,
    executeAction: () => Promise<unknown>
  ): Promise<{ executed: boolean; result?: unknown; error?: string }> {
    const approval = this.approvalResponses.get(requestId);

    if (!approval) {
      return { executed: false, error: "Approval not found" };
    }

    if (approval.decision === "approved") {
      try {
        const result = await executeAction();
        return { executed: true, result };
      } catch (error) {
        return {
          executed: false,
          error: error instanceof Error ? error.message : String(error),
        };
      }
    }

    return {
      executed: false,
      error: `Request was ${approval.decision}${
        approval.comments ? ": " + approval.comments : ""
      }`,
    };
  }

  getApprovalStatus(requestId: string): ApprovalRequest | ApprovalResponse | null {
    return (
      this.pendingRequests.get(requestId) ||
      this.approvalResponses.get(requestId) ||
      null
    );
  }

  private async notifyApprovers(request: ApprovalRequest): Promise<void> {
    // Implementation: send to Slack, email, or approval dashboard
    console.log(`Approval request created: ${request.requestId}`);
  }
}

Review Queue Implementation

Manage human review work with queues prioritized by urgency and type.

interface ReviewItem {
  itemId: string;
  type: "ai_response" | "content" | "moderation" | "feature";
  content: string;
  submittedAt: Date;
  priority: "low" | "medium" | "high" | "critical";
  assignee?: string;
  status: "pending" | "in_review" | "completed";
  metadata: Record<string, unknown>;
}

interface ReviewResult {
  itemId: string;
  reviewer: string;
  decision: "approved" | "rejected" | "revision_needed";
  feedback: string;
  completedAt: Date;
}

class ReviewQueue {
  private queue: ReviewItem[] = [];
  private results: ReviewResult[] = [];

  addToQueue(item: Omit<ReviewItem, "status" | "submittedAt">): ReviewItem {
    const reviewItem: ReviewItem = {
      ...item,
      status: "pending",
      submittedAt: new Date(),
    };

    this.queue.push(reviewItem);
    this.sortQueue();

    return reviewItem;
  }

  private sortQueue(): void {
    // Sort by priority, then by submission time
    const priorityOrder = { critical: 0, high: 1, medium: 2, low: 3 };

    this.queue.sort((a, b) => {
      const priorityDiff =
        priorityOrder[a.priority] - priorityOrder[b.priority];
      if (priorityDiff !== 0) return priorityDiff;
      return a.submittedAt.getTime() - b.submittedAt.getTime();
    });
  }

  claimForReview(reviewer: string): ReviewItem | null {
    const nextItem = this.queue.find((item) => item.status === "pending");

    if (!nextItem) return null;

    nextItem.assignee = reviewer;
    nextItem.status = "in_review";

    return nextItem;
  }

  submitReview(
    itemId: string,
    reviewer: string,
    decision: "approved" | "rejected" | "revision_needed",
    feedback: string
  ): ReviewResult {
    const item = this.queue.find((i) => i.itemId === itemId);

    if (!item) {
      throw new Error(`Review item not found: ${itemId}`);
    }

    const result: ReviewResult = {
      itemId,
      reviewer,
      decision,
      feedback,
      completedAt: new Date(),
    };

    this.results.push(result);
    item.status = "completed";

    return result;
  }

  getQueueStats(): {
    total: number;
    pending: number;
    inReview: number;
    completed: number;
    avgReviewTimeMinutes: number;
  } {
    const pendingCount = this.queue.filter(
      (i) => i.status === "pending"
    ).length;
    const inReviewCount = this.queue.filter(
      (i) => i.status === "in_review"
    ).length;
    const completedCount = this.results.length;

    const reviewTimes = this.results.map(
      (r) =>
        this.queue.find((i) => i.itemId === r.itemId)?.submittedAt.getTime() ||
        0
    );
    const avgTime =
      reviewTimes.length > 0
        ? reviewTimes.reduce((a, b) => a + b, 0) / reviewTimes.length / 60000
        : 0;

    return {
      total: this.queue.length,
      pending: pendingCount,
      inReview: inReviewCount,
      completed: completedCount,
      avgReviewTimeMinutes: avgTime,
    };
  }
}

Async Human Feedback Collection

Collect human feedback asynchronously without blocking user experience.

interface FeedbackRequest {
  feedbackId: string;
  aiResponseId: string;
  userId: string;
  question: string;
  aiResponse: string;
  sentAt: Date;
  responseDeadline: Date;
}

interface HumanFeedback {
  feedbackId: string;
  type: "rating" | "text" | "choice";
  rating?: number; // 1-5
  text?: string;
  choice?: string;
  respondedAt: Date;
  respondent?: string;
}

class AsyncFeedbackCollector {
  private sentRequests = new Map<string, FeedbackRequest>();
  private receivedFeedback = new Map<string, HumanFeedback>();

  async sendFeedbackRequest(
    userId: string,
    aiResponseId: string,
    question: string,
    aiResponse: string
  ): Promise<FeedbackRequest> {
    const request: FeedbackRequest = {
      feedbackId: `feedback-${Date.now()}`,
      aiResponseId,
      userId,
      question,
      aiResponse,
      sentAt: new Date(),
      responseDeadline: new Date(Date.now() + 7 * 24 * 60 * 60 * 1000), // 7 days
    };

    this.sentRequests.set(request.feedbackId, request);

    // Send via email, in-app notification, etc.
    await this.deliverFeedbackRequest(request);

    return request;
  }

  recordFeedback(
    feedbackId: string,
    feedback: HumanFeedback
  ): void {
    const request = this.sentRequests.get(feedbackId);

    if (!request) {
      throw new Error(`Feedback request not found: ${feedbackId}`);
    }

    this.receivedFeedback.set(feedbackId, feedback);
  }

  getBatchFeedback(
    limit: number = 100
  ): Array<{ request: FeedbackRequest; feedback: HumanFeedback }> {
    const results = [];
    let count = 0;

    for (const [feedbackId, feedback] of this.receivedFeedback.entries()) {
      const request = this.sentRequests.get(feedbackId);

      if (request && count < limit) {
        results.push({ request, feedback });
        count++;
      }
    }

    return results;
  }

  getFeedbackRate(): {
    sent: number;
    received: number;
    responseRate: number;
  } {
    const sent = this.sentRequests.size;
    const received = this.receivedFeedback.size;
    const responseRate = sent > 0 ? (received / sent) * 100 : 0;

    return { sent, received, responseRate };
  }

  private async deliverFeedbackRequest(request: FeedbackRequest): Promise<void> {
    // Implementation: send via email, Slack, push notification, etc.
    console.log(`Feedback request sent: ${request.feedbackId}`);
  }
}

Active Learning from Human Corrections

Use human corrections to improve model behavior on similar future requests.

interface CorrectionEvent {
  aiResponseId: string;
  originalResponse: string;
  humanCorrection: string;
  userQuery: string;
  timestamp: Date;
  similarityToOthers: number; // How similar to other corrections
}

interface LearningPattern {
  pattern: string;
  frequency: number;
  affectedQueries: string[];
  suggestedFix: string;
}

class ActiveLearner {
  private corrections: CorrectionEvent[] = [];

  recordCorrection(
    query: string,
    aiResponse: string,
    humanCorrection: string
  ): CorrectionEvent {
    const event: CorrectionEvent = {
      aiResponseId: `response-${Date.now()}`,
      originalResponse: aiResponse,
      humanCorrection,
      userQuery: query,
      timestamp: new Date(),
      similarityToOthers: 0,
    };

    this.corrections.push(event);

    return event;
  }

  identifyLearningPatterns(): LearningPattern[] {
    const patterns: Map<string, LearningPattern> = new Map();

    // Group corrections by similarity
    this.corrections.forEach((correction) => {
      const key = this.extractPattern(correction.originalResponse);

      if (!patterns.has(key)) {
        patterns.set(key, {
          pattern: key,
          frequency: 0,
          affectedQueries: [],
          suggestedFix: "",
        });
      }

      const pattern = patterns.get(key)!;
      pattern.frequency++;
      pattern.affectedQueries.push(correction.userQuery);
    });

    // Filter to high-frequency patterns worth learning from
    return Array.from(patterns.values())
      .filter((p) => p.frequency >= 5) // Minimum 5 corrections
      .sort((a, b) => b.frequency - a.frequency);
  }

  private extractPattern(text: string): string {
    // Simple pattern extraction: look for common phrases
    const words = text.split(" ").slice(0, 5);
    return words.join(" ");
  }

  async generateImprovedPrompt(
    pattern: LearningPattern
  ): Promise<string> {
    const client = new Anthropic();

    const prompt = `Given these corrections:\n${pattern.affectedQueries.join(
      "\n"
    )}\n\nGenerate an improved system prompt or instruction to avoid this error.`;

    const response = await client.messages.create({
      model: "claude-3-5-sonnet-20241022",
      max_tokens: 512,
      messages: [{ role: "user", content: prompt }],
    });

    return response.content[0].type === "text" ? response.content[0].text : "";
  }

  getCorrectionStats(): {
    totalCorrections: number;
    uniquePatterns: number;
    correctionRate: number;
  } {
    const patterns = this.identifyLearningPatterns();

    return {
      totalCorrections: this.corrections.length,
      uniquePatterns: patterns.length,
      correctionRate: (patterns.length / this.corrections.length) * 100,
    };
  }
}

Escalation Tiers

Route complex requests through escalation tiers: AI → human agent → expert.

type EscalationLevel = "ai" | "agent" | "expert";

interface EscalationRequest {
  requestId: string;
  originalQuery: string;
  currentLevel: EscalationLevel;
  previousContext: string[];
  escalatedAt: Date;
  reason: string;
}

class EscalationManager {
  private escalationQueue: EscalationRequest[] = [];
  private assignedTo = new Map<string, string>(); // requestId -> assignee

  shouldEscalate(
    response: string,
    confidenceScore: number,
    userFeedback?: string
  ): boolean {
    // Escalate if confidence too low
    if (confidenceScore < 0.3) return true;

    // Escalate if user explicitly requests it
    if (userFeedback && userFeedback.includes("escalate")) return true;

    // Escalate if response indicates complexity
    if (response.includes("I'm unable to") || response.includes("unsure")) {
      return true;
    }

    return false;
  }

  escalate(
    requestId: string,
    originalQuery: string,
    currentLevel: EscalationLevel,
    previousContext: string[],
    reason: string
  ): EscalationRequest {
    const nextLevel: EscalationLevel =
      currentLevel === "ai" ? "agent" : "expert";

    const request: EscalationRequest = {
      requestId,
      originalQuery,
      currentLevel: nextLevel,
      previousContext: [...previousContext, `Escalated from ${currentLevel}`],
      escalatedAt: new Date(),
      reason,
    };

    this.escalationQueue.push(request);

    // Route to appropriate team
    if (nextLevel === "agent") {
      this.routeToAgent(request);
    } else {
      this.routeToExpert(request);
    }

    return request;
  }

  assignEscalation(requestId: string, handler: string): void {
    this.assignedTo.set(requestId, handler);
  }

  resolveEscalation(
    requestId: string,
    resolution: string
  ): { requestId: string; resolution: string; resolvedAt: Date } {
    this.escalationQueue = this.escalationQueue.filter(
      (r) => r.requestId !== requestId
    );

    return {
      requestId,
      resolution,
      resolvedAt: new Date(),
    };
  }

  private routeToAgent(request: EscalationRequest): void {
    // Send to agent queue
    console.log(`Routing to agent: ${request.requestId}`);
  }

  private routeToExpert(request: EscalationRequest): void {
    // Send to expert queue
    console.log(`Routing to expert: ${request.requestId}`);
  }
}

Labeling Interface Design

Design user-friendly interfaces for collecting high-quality labels.

type LabelType = "text" | "choice" | "rating" | "span" | "bounding-box";

interface LabelingTask {
  taskId: string;
  content: string;
  labelType: LabelType;
  options?: string[]; // For choice type
  instructions: string;
  reward?: number; // In cents
}

interface Label {
  labelId: string;
  taskId: string;
  labeledBy: string;
  value: unknown;
  confidence: number;
  labeledAt: Date;
}

class LabelingInterface {
  private tasks: LabelingTask[] = [];
  private labels: Label[] = [];

  createLabelingTask(
    content: string,
    labelType: LabelType,
    instructions: string,
    options?: string[]
  ): LabelingTask {
    const task: LabelingTask = {
      taskId: `task-${Date.now()}`,
      content,
      labelType,
      options,
      instructions,
      reward: 5, // Default 5 cents
    };

    this.tasks.push(task);
    return task;
  }

  getTaskForLabeler(labelerId: string): LabelingTask | null {
    // Return next unlabeled task
    return this.tasks.find(
      (task) =>
        !this.labels.some(
          (label) => label.taskId === task.taskId && label.labeledBy === labelerId
        )
    ) || null;
  }

  submitLabel(
    taskId: string,
    labelerId: string,
    value: unknown,
    confidence: number
  ): Label {
    const label: Label = {
      labelId: `label-${Date.now()}`,
      taskId,
      labeledBy: labelerId,
      value,
      confidence,
      labeledAt: new Date(),
    };

    this.labels.push(label);
    return label;
  }

  getAggregatedLabels(taskId: string): {
    consensusValue: unknown;
    agreementScore: number;
  } {
    const taskLabels = this.labels.filter((l) => l.taskId === taskId);

    if (taskLabels.length === 0) {
      return { consensusValue: null, agreementScore: 0 };
    }

    // For text labels, find most common
    const valueCounts = new Map<string, number>();
    taskLabels.forEach((label) => {
      const val = String(label.value);
      valueCounts.set(val, (valueCounts.get(val) || 0) + 1);
    });

    const consensusValue = Array.from(valueCounts.entries()).sort(
      ([, a], [, b]) => b - a
    )[0][0];

    const agreementScore = Math.max(...valueCounts.values()) / taskLabels.length;

    return { consensusValue, agreementScore };
  }
}

Checklist

  • Route low-confidence requests to humans automatically
  • Require human approval for high-stakes AI recommendations
  • Implement prioritized review queues for human work
  • Collect feedback asynchronously without blocking users
  • Use human corrections to identify patterns and improve prompts
  • Design escalation tiers: AI → agent → expert
  • Build user-friendly labeling interfaces with clear instructions
  • Continuously learn from human feedback to improve AI behavior

Conclusion

Human-in-the-loop systems shouldn''t treat humans as a fallback—they''re essential to building trustworthy AI products. Route requests based on confidence scores, require approvals for high-impact decisions, and manage review queues efficiently. Collect feedback asynchronously so it doesn''t block users. Most importantly, close the loop by analyzing human corrections to identify patterns and improve your AI system''s behavior on similar future requests. With these patterns in place, you''ll build AI systems that scale while maintaining human judgment for the decisions that matter most.