Building LLM Agents With Tool Use: Reliable Agentic Workflows for Production

Introduction

LLM agents are powerful but fragile. Tools fail, agents hallucinate tool calls, and chains break silently. This guide covers production patterns for building reliable agents with validation, error recovery, and observability.

Tool Definition Design
Parallel Tool Calling
Tool Result Validation
Agent Loop Termination Conditions
Human-in-the-Loop Checkpoints
State Management Across Tool Calls
Error Recovery and Fallbacks
Observability for Multi-Step Agent Traces
Checklist
Conclusion

Tool Definition Design

Define tools with clear descriptions, typed parameters, and strict validation.

interface ToolParameter {
  name: string;
  type: 'string' | 'number' | 'boolean' | 'array' | 'object';
  description: string;
  required: boolean;
  enum?: unknown[];
}

interface ToolDefinition {
  name: string;
  description: string;
  parameters: ToolParameter[];
  handler: (args: Record<string, unknown>) => Promise<string>;
}

class ToolRegistry {
  private tools: Map<string, ToolDefinition> = new Map();

  registerTool(tool: ToolDefinition): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool ${tool.name} already registered`);
    }

    this.tools.set(tool.name, tool);
  }

  getTool(name: string): ToolDefinition | undefined {
    return this.tools.get(name);
  }

  listTools(): ToolDefinition[] {
    return Array.from(this.tools.values());
  }

  async executeTool(name: string, args: Record<string, unknown>): Promise<string> {
    const tool = this.tools.get(name);
    if (!tool) {
      throw new Error(`Tool ${name} not found`);
    }

    // Validate parameters
    for (const param of tool.parameters) {
      if (param.required && !(param.name in args)) {
        throw new Error(`Missing required parameter: ${param.name}`);
      }

      if (param.name in args) {
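        const value = args[param.name];
        // typeof reports 'object' for arrays, so arrays need their own check
        const actualType = Array.isArray(value) ? 'array' : typeof value;
        if (actualType !== param.type) {
          throw new Error(`Parameter ${param.name} should be ${param.type}, got ${actualType}`);
        }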

        if (param.enum && !param.enum.includes(value)) {
          throw new Error(`Parameter ${param.name} must be one of: ${param.enum.join(', ')}`);
        }
      }
    }

    return tool.handler(args);
  }
}

const registry = new ToolRegistry();

registry.registerTool({
  name: 'search_database',
  description: 'Search company database for information',
  parameters: [
    { name: 'query', type: 'string', description: 'Search query', required: true },
    { name: 'limit', type: 'number', description: 'Max results', required: false },
  ],
  handler: async (args) => {
    const query = args.query as string;
    // Use ?? rather than || so an explicit limit of 0 is not replaced by the default
    const limit = (args.limit as number | undefined) ?? 10;
    return `Found ${limit} results for "${query}"`;
  },
});

registry.registerTool({
  name: 'send_email',
  description: 'Send an email message',
  parameters: [
    { name: 'to', type: 'string', description: 'Recipient email', required: true },
    { name: 'subject', type: 'string', description: 'Email subject', required: true },
    { name: 'body', type: 'string', description: 'Email body', required: true },
  ],
  handler: async (args) => {
    console.log(`Sending email to ${args.to}`);
    return 'Email sent successfully';
  },
});

const result = await registry.executeTool('search_database', { query: 'customers', limit: 5 });
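
Function-calling APIs commonly expect each tool's parameters as a JSON Schema object rather than a flat list. As a rough sketch, the `ToolParameter` list above could be converted like this. `toJsonSchema` and the `JsonSchema*` types are hypothetical names introduced for illustration, and the outer envelope that a given provider requires is not shown.

```typescript
// Re-declared so the sketch is self-contained; mirrors the interface above.
interface ToolParameter {
  name: string;
  type: 'string' | 'number' | 'boolean' | 'array' | 'object';
  description: string;
  required: boolean;
  enum?: unknown[];
}

// Hypothetical types covering only the JSON Schema keywords used here.
interface JsonSchemaProperty {
  type: string;
  description: string;
  enum?: unknown[];
}

interface JsonSchemaObject {
  type: 'object';
  properties: Record<string, JsonSchemaProperty>;
  required: string[];
}

// Hypothetical helper: turns the flat parameter list into one object schema.
function toJsonSchema(parameters: ToolParameter[]): JsonSchemaObject {
  const properties: Record<string, JsonSchemaProperty> = {};
  const required: string[] = [];

  for (const param of parameters) {
    const property: JsonSchemaProperty = { type: param.type, description: param.description };
    if (param.enum) property.enum = param.enum;
    properties[param.name] = property;
    if (param.required) required.push(param.name);
  }

  return { type: 'object', properties, required };
}

const searchSchema = toJsonSchema([
  { name: 'query', type: 'string', description: 'Search query', required: true },
  { name: 'limit', type: 'number', description: 'Max results', required: false },
]);

console.log(JSON.stringify(searchSchema, null, 2));
```

Deriving the schema from the same definition the registry validates against keeps the description the model sees and the checks the runtime enforces from drifting apart.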

Parallel Tool Calling

Execute independent tool calls concurrently so that total latency approaches that of the slowest call rather than the sum of all calls.

class ParallelToolExecutor {
  constructor(private registry: ToolRegistry) {}

  async executeParallel(
    calls: Array<{ toolName: string; args: Record<string, unknown> }>
  ): Promise<Array<{ toolName: string; result: string; error?: string }>> {
    const promises = calls.map(async (call) => {
      try {
        const result = await this.registry.executeTool(call.toolName, call.args);
        return { toolName: call.toolName, result };
      } catch (error) {
        return {
          toolName: call.toolName,
          result: '',
          error: error instanceof Error ? error.message : String(error),
        };
      }
    });

    return Promise.all(promises);
  }

  async executeDependencyGraph(
    graph: Map<string, { toolName: string; args: Record<string, unknown>; dependsOn?: string[] }>
  ): Promise<Map<string, string>> {
    const results = new Map<string, string>();
    const executing = new Set<string>();
    const completed = new Set<string>();

    const execute = async (key: string): Promise<string> => {
      if (completed.has(key)) {
        return results.get(key) || '';
      }

      if (executing.has(key)) {
        throw new Error(`Circular dependency detected at ${key}`);
      }

      const node = graph.get(key);
      if (!node) throw new Error(`Unknown key: ${key}`);

      executing.add(key);

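      // Execute dependencies first. They run one at a time here; the `executing`
      // set used for cycle detection assumes nodes are not entered concurrently.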
      if (node.dependsOn) {
        for (const dep of node.dependsOn) {
          await execute(dep);
        }
      }

      try {
        const result = await this.registry.executeTool(node.toolName, node.args);
        results.set(key, result);
        completed.add(key);
        executing.delete(key);
        return result;
      } catch (error) {
        executing.delete(key);
        throw error;
      }
    };

    for (const key of graph.keys()) {
      await execute(key);
    }

    return results;
  }
}

const executor = new ParallelToolExecutor(registry);

const results = await executor.executeParallel([
  { toolName: 'search_database', args: { query: 'customers' } },
  { toolName: 'search_database', args: { query: 'invoices' } },
]);

console.log('Parallel results:', results);
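
The `executeDependencyGraph` method above awaits each node in turn, so independent nodes never overlap. One way to recover parallelism, sketched here under simplifying assumptions, is to run the graph in waves: every task whose dependencies are already resolved runs concurrently, then the next wave is computed. `GraphTask` and `runGraphInWaves` are hypothetical names, and each task is a plain async function rather than a registry lookup.

```typescript
interface GraphTask {
  dependsOn?: string[];
  // Receives the results gathered so far, keyed by task name
  run: (results: Map<string, string>) => Promise<string>;
}

async function runGraphInWaves(graph: Map<string, GraphTask>): Promise<Map<string, string>> {
  const results = new Map<string, string>();
  const pending = new Set<string>(Array.from(graph.keys()));

  while (pending.size > 0) {
    // A task is ready once every one of its dependencies has a result
    const ready = Array.from(pending).filter((key) =>
      (graph.get(key)!.dependsOn ?? []).every((dep) => results.has(dep))
    );

    if (ready.length === 0) {
      // Nothing can make progress: a cycle, or a dependency on an unknown key
      throw new Error(`Unresolvable dependencies: ${Array.from(pending).join(', ')}`);
    }

    // Tasks in the same wave do not depend on each other, so run them concurrently
    const outputs = await Promise.all(ready.map((key) => graph.get(key)!.run(results)));
    ready.forEach((key, i) => {
      results.set(key, outputs[i]);
      pending.delete(key);
    });
  }

  return results;
}

const demoGraph = new Map<string, GraphTask>([
  ['customers', { run: async () => 'c-data' }],
  ['invoices', { run: async () => 'i-data' }],
  [
    'report',
    {
      dependsOn: ['customers', 'invoices'],
      run: async (r) => `${r.get('customers')} + ${r.get('invoices')}`,
    },
  ],
]);

runGraphInWaves(demoGraph).then((r) => console.log(r.get('report'))); // logs "c-data + i-data"
```

Because `Promise.all` rejects as soon as any task fails, one failure aborts the whole wave. Wrapping each task in a try/catch, as `executeParallel` does, would let the remaining tasks report their results.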

Tool Result Validation

Validate tool outputs before passing them to the next agent step.

interface ValidationResult {
  valid: boolean;
  errors: string[];
  warnings: string[];
}

class ToolResultValidator {
  validate(
    toolName: string,
    result: string,
    schema?: { type: string; pattern?: RegExp; minLength?: number; maxLength?: number }
  ): ValidationResult {
    const errors: string[] = [];
    const warnings: string[] = [];

    if (!result || result.trim() === '') {
      errors.push(`Tool ${toolName} returned empty result`);
    }

    // Compare case-insensitively so 'Error' and 'FAILED' are caught as well
    const lowered = result.toLowerCase();
    if (lowered.includes('error') || lowered.includes('failed')) {
      warnings.push('Result indicates potential failure');
    }

    if (schema) {
      if (schema.type === 'string') {
        if (schema.minLength && result.length < schema.minLength) {
          errors.push(`Result is too short (${result.length} < ${schema.minLength})`);
        }

        if (schema.maxLength && result.length > schema.maxLength) {
          errors.push(`Result is too long (${result.length} > ${schema.maxLength})`);
        }

        if (schema.pattern && !schema.pattern.test(result)) {
          errors.push(`Result does not match expected pattern`);
        }
      }
    }

    return {
      valid: errors.length === 0,
      errors,
      warnings,
    };
  }

  async validateWithRetry(
    executor: (attempt: number) => Promise<string>,
    toolName: string,
    maxRetries: number = 3
  ): Promise<{ result: string; attempts: number }> {
    let lastFailure = 'no attempts were made';

    for (let attempt = 1; attempt <= maxRetries; attempt++) {
      try {
        const result = await executor(attempt);
        const validation = this.validate(toolName, result);

        if (validation.valid) {
          return { result, attempts: attempt };
        }
        lastFailure = validation.errors.join(', ');
      } catch (error) {
        // A thrown executor error counts as a failed attempt and is kept for the final message
        lastFailure = error instanceof Error ? error.message : String(error);
      }

      if (attempt < maxRetries) {
        console.log(`Attempt failed (${lastFailure}), retrying... (${attempt}/${maxRetries})`);
      }
    }

    throw new Error(`Validation failed after ${maxRetries} attempts: ${lastFailure}`);
  }
}

const validator = new ToolResultValidator();

const validation = validator.validate('search_database', 'Found 5 customers', {
  type: 'string',
  minLength: 5,
});

console.log('Validation result:', validation);
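
The validator above treats every result as an opaque string. When a tool returns JSON, checking the parsed structure catches more problems than length or pattern checks. The following is a minimal sketch under that assumption; `FieldSpec` and `validateJsonResult` are hypothetical names, and only top-level primitive fields are checked.

```typescript
interface FieldSpec {
  field: string;
  type: 'string' | 'number' | 'boolean';
}

interface JsonValidation {
  valid: boolean;
  errors: string[];
}

// Hypothetical helper: parses a tool result as JSON and checks top-level fields.
function validateJsonResult(raw: string, spec: FieldSpec[]): JsonValidation {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (error) {
    return { valid: false, errors: ['Result is not valid JSON'] };
  }

  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    return { valid: false, errors: ['Result must be a JSON object'] };
  }

  const record = parsed as Record<string, unknown>;
  const errors: string[] = [];
  for (const { field, type } of spec) {
    if (!(field in record)) {
      errors.push(`Missing field: ${field}`);
    } else if (typeof record[field] !== type) {
      errors.push(`Field ${field} should be ${type}, got ${typeof record[field]}`);
    }
  }

  return { valid: errors.length === 0, errors };
}

const customerSpec: FieldSpec[] = [
  { field: 'id', type: 'number' },
  { field: 'email', type: 'string' },
];

const goodCheck = validateJsonResult('{"id": 123, "email": "john@example.com"}', customerSpec);
const badCheck = validateJsonResult('{"id": "123"}', customerSpec);

console.log(goodCheck.valid); // true
console.log(badCheck.errors); // one type error for id, one missing-field error for email
```

For nested or more complex results, a dedicated schema validation library is likely a better fit than hand-written checks like these.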

Agent Loop Termination Conditions

Define clear stopping conditions for agent loops to prevent infinite execution.

interface AgentState {
  history: Array<{ role: 'user' | 'assistant' | 'tool'; content: string; timestamp: Date }>;
  toolCalls: number;
  lastAction?: string;
  stepCount: number;
}

class TerminationManager {
  private maxSteps = 20;
  private maxToolCalls = 50;
  private timeoutMs = 300000; // 5 minutes

  checkTermination(
    state: AgentState,
    startTime: number,
    successCondition?: (state: AgentState) => boolean
  ): { shouldTerminate: boolean; reason?: string } {
    // Check max steps
    if (state.stepCount >= this.maxSteps) {
      return { shouldTerminate: true, reason: `Max steps reached (${this.maxSteps})` };
    }

    // Check max tool calls
    if (state.toolCalls >= this.maxToolCalls) {
      return { shouldTerminate: true, reason: `Max tool calls exceeded (${this.maxToolCalls})` };
    }

    // Check timeout
    if (Date.now() - startTime > this.timeoutMs) {
      return { shouldTerminate: true, reason: `Timeout exceeded (${this.timeoutMs}ms)` };
    }

    // Check success condition
    if (successCondition && successCondition(state)) {
      return { shouldTerminate: true, reason: 'Success condition met' };
    }

    // Check for repeated actions (infinite loop)
    if (state.history.length >= 3) {
      const recent = state.history.slice(-3).map((h) => h.content);
      if (recent[0] === recent[1] && recent[1] === recent[2]) {
        return { shouldTerminate: true, reason: 'Repetitive action detected' };
      }
    }

    return { shouldTerminate: false };
  }

  resetLimits(maxSteps?: number, maxToolCalls?: number, timeoutMs?: number): void {
    if (maxSteps !== undefined) this.maxSteps = maxSteps;
    if (maxToolCalls !== undefined) this.maxToolCalls = maxToolCalls;
    if (timeoutMs !== undefined) this.timeoutMs = timeoutMs;
  }
}

const terminator = new TerminationManager();
const state: AgentState = {
  history: [
    { role: 'user', content: 'Find a customer', timestamp: new Date() },
    { role: 'assistant', content: 'Searching...', timestamp: new Date() },
  ],
  toolCalls: 5,
  stepCount: 2,
};

const check = terminator.checkTermination(state, Date.now());
console.log('Termination check:', check);
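
The repetition check above only fires when the last three history entries are identical, so an agent that alternates between two calls slips through. A possible refinement, sketched here, is to fingerprint each tool call and count repeats within a sliding window. `ToolCall`, `fingerprint`, and `isStuck` are hypothetical names, and the default window size and repeat threshold are illustrative values.

```typescript
interface ToolCall {
  toolName: string;
  args: Record<string, unknown>;
}

// Builds a stable string for a call. Only top-level keys are sorted, so nested
// objects with different key order still produce different fingerprints.
function fingerprint(call: ToolCall): string {
  const sortedArgs = Object.keys(call.args)
    .sort()
    .map((key) => [key, call.args[key]]);
  return `${call.toolName}:${JSON.stringify(sortedArgs)}`;
}

// Returns true when any single call appears maxRepeats times within the window.
function isStuck(calls: ToolCall[], windowSize = 6, maxRepeats = 3): boolean {
  const counts = new Map<string, number>();
  for (const call of calls.slice(-windowSize)) {
    const key = fingerprint(call);
    const seen = (counts.get(key) ?? 0) + 1;
    if (seen >= maxRepeats) return true;
    counts.set(key, seen);
  }
  return false;
}

const searchCall: ToolCall = { toolName: 'search_database', args: { query: 'customers', limit: 5 } };
const sameCallReordered: ToolCall = { toolName: 'search_database', args: { limit: 5, query: 'customers' } };
const otherCall: ToolCall = { toolName: 'search_database', args: { query: 'invoices' } };

// Alternating calls never produce three identical consecutive entries,
// but the same call still shows up three times within the window.
console.log(isStuck([searchCall, otherCall, sameCallReordered, otherCall, searchCall])); // true
console.log(isStuck([searchCall, otherCall])); // false
```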

Human-in-the-Loop Checkpoints

Pause agent execution for human approval at critical steps.

interface ApprovalRequest {
  id: string;
  action: string;
  toolName: string;
  parameters: Record<string, unknown>;
  risk: 'low' | 'medium' | 'high';
  timestamp: Date;
  approvedAt?: Date;
  approvedBy?: string;
}

class HumanApprovalGate {
  private pendingRequests = new Map<string, ApprovalRequest>();
  private requiresApproval: Set<string> = new Set(['send_email', 'delete_record', 'transfer_funds']);

  needsApproval(toolName: string): boolean {
    return this.requiresApproval.has(toolName);
  }

  async requestApproval(
    toolName: string,
    parameters: Record<string, unknown>
  ): Promise<ApprovalRequest> {
    const request: ApprovalRequest = {
      id: `approval_${Date.now()}_${Math.random()}`,
      action: `Execute ${toolName}`,
      toolName,
      parameters,
      risk: this.assessRisk(toolName, parameters),
      timestamp: new Date(),
    };

    this.pendingRequests.set(request.id, request);
    console.log(`[APPROVAL REQUIRED] ${request.action} (Risk: ${request.risk})`);

    // In production, this would notify a human reviewer
    // and wait for response via API
    return request;
  }

  async waitForApproval(requestId: string, timeoutMs: number = 300000): Promise<boolean> {
    const startTime = Date.now();

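    while (Date.now() - startTime < timeoutMs) {
      const request = this.pendingRequests.get(requestId);
      // rejectRequest() deletes the entry, so a missing request means it was rejected
      if (!request) {
        return false;
      }
      if (request.approvedAt) {
        return true;
      }

      await new Promise((resolve) => setTimeout(resolve, 1000));
    }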

    throw new Error(`Approval timeout for request ${requestId}`);
  }

  approveRequest(requestId: string, approvedBy: string): void {
    const request = this.pendingRequests.get(requestId);
    if (!request) throw new Error(`Request ${requestId} not found`);

    request.approvedAt = new Date();
    request.approvedBy = approvedBy;
  }

  rejectRequest(requestId: string): void {
    this.pendingRequests.delete(requestId);
  }

  private assessRisk(toolName: string, parameters: Record<string, unknown>): 'low' | 'medium' | 'high' {
    if (['send_email', 'update_record'].includes(toolName)) return 'medium';
    if (['delete_record', 'transfer_funds'].includes(toolName)) return 'high';
    return 'low';
  }
}

const approvalGate = new HumanApprovalGate();

if (approvalGate.needsApproval('send_email')) {
  const request = await approvalGate.requestApproval('send_email', {
    to: 'user@example.com',
    subject: 'Important notification',
  });

  // Simulate approval
  setTimeout(() => {
    approvalGate.approveRequest(request.id, 'admin@example.com');
  }, 1000);

  await approvalGate.waitForApproval(request.id);
}
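
Polling once per second adds latency and keeps a timer loop alive for every pending request. An alternative, sketched below, is to store a resolver per request so the waiting promise settles the moment a decision arrives. `PromiseApprovalGate`, `ApprovalDecision`, `request`, and `decide` are hypothetical names, and how the reviewer's decision reaches `decide` (webhook, UI, CLI) is left open.

```typescript
interface ApprovalDecision {
  approved: boolean;
  decidedBy: string;
}

class PromiseApprovalGate {
  private waiters = new Map<string, (decision: ApprovalDecision) => void>();

  // Returns a promise that resolves when decide() is called, or rejects on timeout
  request(id: string, timeoutMs: number): Promise<ApprovalDecision> {
    return new Promise<ApprovalDecision>((resolve, reject) => {
      const timer = setTimeout(() => {
        this.waiters.delete(id);
        reject(new Error(`Approval timeout for request ${id}`));
      }, timeoutMs);

      this.waiters.set(id, (decision) => {
        clearTimeout(timer);
        this.waiters.delete(id);
        resolve(decision);
      });
    });
  }

  // Called by whatever receives the reviewer's response
  decide(id: string, approved: boolean, decidedBy: string): void {
    const waiter = this.waiters.get(id);
    if (!waiter) throw new Error(`Request ${id} not found or already settled`);
    waiter({ approved, decidedBy });
  }
}

const gate = new PromiseApprovalGate();
const pendingDecision = gate.request('req_1', 5000);

// Simulate a reviewer rejecting the request
gate.decide('req_1', false, 'admin@example.com');

pendingDecision.then((decision) => {
  console.log(decision.approved ? 'approved' : 'rejected', 'by', decision.decidedBy);
});
```

Approval and rejection travel through the same path here, so the caller always receives an explicit decision instead of inferring rejection from a missing record.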

State Management Across Tool Calls

Maintain and update agent state reliably across multiple tool executions.

interface StepResult {
  toolName: string;
  input: Record<string, unknown>;
  output: string;
  timestamp: Date;
  duration: number;
  success: boolean;
}

class AgentStateManager {
  private state: AgentState;
  private stepResults: StepResult[] = [];

  constructor(initialPrompt: string) {
    this.state = {
      history: [{ role: 'user', content: initialPrompt, timestamp: new Date() }],
      toolCalls: 0,
      stepCount: 0,
    };
  }

  recordToolCall(
    toolName: string,
    input: Record<string, unknown>,
    output: string,
    durationMs: number,
    success: boolean
  ): void {
    const result: StepResult = {
      toolName,
      input,
      output,
      timestamp: new Date(),
      duration: durationMs,
      success,
    };

    this.stepResults.push(result);
    this.state.toolCalls++;
    this.state.stepCount++;
    this.state.lastAction = toolName;

    this.state.history.push({
      role: 'tool',
      content: `${toolName} result: ${output}`,
      timestamp: new Date(),
    });
  }

  recordAgentThought(thought: string): void {
    this.state.history.push({
      role: 'assistant',
      content: thought,
      timestamp: new Date(),
    });
  }

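  getState(): AgentState {
    // Copy the history array too, so callers cannot mutate internal state
    return { ...this.state, history: [...this.state.history] };
  }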

  getSummary(): string {
    return `Steps: ${this.state.stepCount}, Tool calls: ${this.state.toolCalls}, Success rate: ${this.getSuccessRate().toFixed(2)}%`;
  }

  getSuccessRate(): number {
    if (this.stepResults.length === 0) return 100;
    const successful = this.stepResults.filter((r) => r.success).length;
    return (successful / this.stepResults.length) * 100;
  }

  getToolCallSequence(): string[] {
    return this.stepResults.map((r) => r.toolName);
  }
}

const stateManager = new AgentStateManager('Find a customer by email');
stateManager.recordToolCall('search_database', { query: 'john@example.com' }, 'Found customer ID 123', 150, true);
stateManager.recordAgentThought('Found the customer, now fetching details');
stateManager.recordToolCall('get_customer', { id: '123' }, 'Customer details loaded', 200, true);

console.log('Agent summary:', stateManager.getSummary());
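
State that lives only in memory is lost if the process restarts in the middle of a run. A simple option, sketched here, is to serialize the state to JSON after each step and restore it on startup. `Checkpoint`, `saveCheckpoint`, and `loadCheckpoint` are hypothetical names, the `Checkpoint` shape mirrors `AgentState`, and where the string is stored (file, database, cache) is left open.

```typescript
interface HistoryEntry {
  role: 'user' | 'assistant' | 'tool';
  content: string;
  timestamp: Date;
}

interface Checkpoint {
  history: HistoryEntry[];
  toolCalls: number;
  stepCount: number;
  lastAction?: string;
}

function saveCheckpoint(state: Checkpoint): string {
  // JSON.stringify converts Date objects to ISO 8601 strings
  return JSON.stringify(state);
}

function loadCheckpoint(serialized: string): Checkpoint {
  const raw = JSON.parse(serialized) as Checkpoint;
  return {
    ...raw,
    // Timestamps come back as strings, so rebuild them as Date objects
    history: raw.history.map((entry) => ({ ...entry, timestamp: new Date(entry.timestamp) })),
  };
}

const original: Checkpoint = {
  history: [
    { role: 'user', content: 'Find a customer by email', timestamp: new Date('2024-01-01T00:00:00.000Z') },
    { role: 'tool', content: 'search_database result: Found customer ID 123', timestamp: new Date('2024-01-01T00:00:01.000Z') },
  ],
  toolCalls: 1,
  stepCount: 1,
  lastAction: 'search_database',
};

const restored = loadCheckpoint(saveCheckpoint(original));
console.log(restored.history[1].timestamp instanceof Date); // true
console.log(restored.lastAction); // search_database
```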

Error Recovery and Fallbacks

Handle tool failures with retries, fallbacks, and graceful degradation.

class ErrorRecoveryStrategy {
  async executeWithFallback<T>(
    primary: () => Promise<T>,
    fallback?: () => Promise<T>,
    onError?: (error: Error) => void
  ): Promise<T> {
    try {
      return await primary();
    } catch (error) {
      const err = error instanceof Error ? error : new Error(String(error));
      if (onError) onError(err);

      if (fallback) {
        console.log('Primary failed, using fallback...');
        return fallback();
      }

      throw error;
    }
  }

  async executeWithExponentialBackoff<T>(
    fn: () => Promise<T>,
    maxRetries: number = 3,
    initialDelayMs: number = 1000
  ): Promise<T> {
    let lastError: Error | null = null;

    for (let attempt = 0; attempt < maxRetries; attempt++) {
      try {
        return await fn();
      } catch (error) {
        lastError = error instanceof Error ? error : new Error(String(error));

        if (attempt < maxRetries - 1) {
          const delay = initialDelayMs * Math.pow(2, attempt) + Math.random() * 100;
          console.log(`Attempt ${attempt + 1} failed, retrying in ${delay}ms...`);
          await new Promise((resolve) => setTimeout(resolve, delay));
        }
      }
    }

    throw lastError || new Error('All retries failed');
  }

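  // Breaker state must outlive a single call, so it lives on the instance.
  // Note: one instance therefore guards one service.
  private consecutiveFailures = 0;
  private circuitOpenedAt: number | null = null;

  async executeWithCircuitBreaker<T>(
    fn: () => Promise<T>,
    failureThreshold: number = 5,
    cooldownMs: number = 30000 // assumed default; tune per service
  ): Promise<T> {
    if (this.circuitOpenedAt !== null) {
      if (Date.now() - this.circuitOpenedAt < cooldownMs) {
        throw new Error('Circuit breaker is open');
      }
      // Cooldown elapsed: allow one trial call through (half-open)
      this.circuitOpenedAt = null;
    }

    try {
      const result = await fn();
      this.consecutiveFailures = 0;
      return result;
    } catch (error) {
      this.consecutiveFailures++;
      if (this.consecutiveFailures >= failureThreshold) {
        this.circuitOpenedAt = Date.now();
      }
      throw error;
    }
  }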
}

const recovery = new ErrorRecoveryStrategy();

const result = await recovery.executeWithFallback(
  async () => {
    throw new Error('Primary failed');
  },
  async () => {
    return 'Fallback result';
  }
);

console.log('Result:', result);
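
Retrying only helps with transient failures. A malformed request fails the same way on every attempt, so retrying it wastes time and tool-call budget. The sketch below assumes, purely for illustration, that tool handlers attach an HTTP-style `statusCode` property to the errors they throw. `getStatusCode`, `isRetryable`, and `retryIfTransient` are hypothetical names, and backoff delays are omitted to keep the example short.

```typescript
// Reads a numeric statusCode from an unknown thrown value, if one is present.
function getStatusCode(error: unknown): number | undefined {
  if (typeof error === 'object' && error !== null && 'statusCode' in error) {
    const status = (error as { statusCode: unknown }).statusCode;
    return typeof status === 'number' ? status : undefined;
  }
  return undefined;
}

// Assumed policy: timeouts (408), rate limits (429), and 5xx responses are transient.
// Errors without a status code (for example validation errors) are not retried.
function isRetryable(error: unknown): boolean {
  const status = getStatusCode(error);
  if (status === undefined) return false;
  return status === 408 || status === 429 || status >= 500;
}

async function retryIfTransient<T>(fn: () => Promise<T>, maxAttempts = 3): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Give up immediately on permanent errors or once attempts run out
      if (attempt >= maxAttempts || !isRetryable(error)) throw error;
    }
  }
}

const rateLimited = Object.assign(new Error('Too many requests'), { statusCode: 429 });
const badRequest = Object.assign(new Error('Bad request'), { statusCode: 400 });

console.log(isRetryable(rateLimited)); // true
console.log(isRetryable(badRequest)); // false
console.log(isRetryable(new Error('Missing required parameter: query'))); // false
```

In practice the same predicate could gate `executeWithExponentialBackoff`, so that delays are only spent on failures that have a chance of succeeding later.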

Observability for Multi-Step Agent Traces

Log and trace agent execution for debugging and analysis.

class AgentTracer {
  private traces: Array<{ agentId: string; steps: Array<{ timestamp: Date; action: string; data: unknown }> }> = [];

  startTrace(agentId: string): string {
    const trace = { agentId, steps: [] };
    this.traces.push(trace);
    return agentId;
  }

  recordStep(agentId: string, action: string, data: unknown): void {
    const trace = this.traces.find((t) => t.agentId === agentId);
    if (!trace) return;

    trace.steps.push({ timestamp: new Date(), action, data });
    console.log(`[${agentId}] ${action}`, data);
  }

  getTrace(agentId: string): unknown {
    return this.traces.find((t) => t.agentId === agentId);
  }

  generateReport(agentId: string): string {
    const trace = this.traces.find((t) => t.agentId === agentId);
    if (!trace) return 'Trace not found';

    const report = [
      `Agent Execution Report: ${agentId}`,
      `Total steps: ${trace.steps.length}`,
      `Duration: ${trace.steps.length > 0 ? trace.steps[trace.steps.length - 1].timestamp.getTime() - trace.steps[0].timestamp.getTime() : 0}ms`,
      '',
      'Step-by-step:',
      ...trace.steps.map((step, i) => `  ${i + 1}. [${step.timestamp.toISOString()}] ${step.action}`),
    ];

    return report.join('\n');
  }
}

const tracer = new AgentTracer();
const agentId = tracer.startTrace('agent_001');
tracer.recordStep(agentId, 'initialize', { model: 'gpt-4' });
tracer.recordStep(agentId, 'tool_call', { toolName: 'search', args: { query: 'test' } });

console.log(tracer.generateReport(agentId));
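
The tracer above records when a step was logged, but not how long it took or whether it succeeded. A small wrapper can capture both around any async operation. This is a sketch only; `SpanRecord`, `SpanRecorder`, and `withSpan` are hypothetical names and do not correspond to any particular tracing library.

```typescript
interface SpanRecord {
  label: string;
  durationMs: number;
  ok: boolean;
  error?: string;
}

class SpanRecorder {
  readonly spans: SpanRecord[] = [];

  // Runs fn, recording how long it took and whether it threw
  async withSpan<T>(label: string, fn: () => Promise<T>): Promise<T> {
    const startedAt = Date.now();
    try {
      const value = await fn();
      this.spans.push({ label, durationMs: Date.now() - startedAt, ok: true });
      return value;
    } catch (error) {
      this.spans.push({
        label,
        durationMs: Date.now() - startedAt,
        ok: false,
        error: error instanceof Error ? error.message : String(error),
      });
      // Re-throw so the caller's own error handling still runs
      throw error;
    }
  }
}

const recorder = new SpanRecorder();

recorder
  .withSpan('search_database', async () => 'Found 5 results')
  .then(() =>
    recorder.withSpan('send_email', async () => {
      throw new Error('SMTP unavailable');
    })
  )
  .catch(() => undefined) // the failure is already recorded on its span
  .then(() => {
    console.log(recorder.spans.map((span) => `${span.label}: ${span.ok ? 'ok' : 'failed'}`));
  });
```

Per-step durations and failure flags are the raw material for the success-rate and latency views that a dashboard would aggregate.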

Checklist

Define tools with clear descriptions, typed parameters, and validation
Implement parallel execution for independent tool calls
Validate tool outputs before passing them to the next step
Set hard limits on steps, tool calls, and execution time
Require human approval for high-risk actions (emails, deletes, transfers)
Track agent state through complete execution
Implement exponential backoff for transient failures
Use circuit breakers for failing services
Trace all agent steps for debugging
Monitor success rate and tool call patterns
Test agent loops with golden datasets
Build observability dashboards for agent performance
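
For the golden-dataset item, one lightweight approach is to record the tool sequence expected for each prompt and compare it with what the agent actually called, for example the output of `getToolCallSequence()`. The sketch below assumes exact, ordered matching; `GoldenCase`, `GoldenResult`, and `checkAgainstGolden` are hypothetical names.

```typescript
interface GoldenCase {
  prompt: string;
  expectedTools: string[];
}

interface GoldenResult {
  prompt: string;
  pass: boolean;
  differences: string[];
}

// Compares the expected and actual tool sequences position by position.
function checkAgainstGolden(golden: GoldenCase, actualTools: string[]): GoldenResult {
  const differences: string[] = [];
  const steps = Math.max(golden.expectedTools.length, actualTools.length);

  for (let i = 0; i < steps; i++) {
    const expected = golden.expectedTools[i] ?? '(none)';
    const actual = actualTools[i] ?? '(none)';
    if (expected !== actual) {
      differences.push(`step ${i + 1}: expected ${expected}, got ${actual}`);
    }
  }

  return { prompt: golden.prompt, pass: differences.length === 0, differences };
}

const lookupCase: GoldenCase = {
  prompt: 'Find a customer by email',
  expectedTools: ['search_database', 'get_customer'],
};

const matching = checkAgainstGolden(lookupCase, ['search_database', 'get_customer']);
const drifted = checkAgainstGolden(lookupCase, ['search_database', 'search_database', 'get_customer']);

console.log(matching.pass); // true
console.log(drifted.differences);
```

Exact matching is strict. Model output varies between runs, so a real suite may need looser criteria, such as checking only that required tools appear or that forbidden ones do not.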

Conclusion

Reliable LLM agents require careful engineering: structured tool definitions, parallel execution, result validation, human oversight, and comprehensive observability. Start with basic execution and validation, add human-in-the-loop gates for risky actions, then layer observability for production insights. Test against realistic scenarios before deploying to users.