Vercel AI SDK Deep Dive — Building Production AI Features in Next.js

By Sanjeev Sharma (@webcoderspeed1)
Introduction
The Vercel AI SDK simplifies building AI-powered applications in Next.js. It provides primitives for generating text, streaming responses, managing conversation state, and calling tools. This post covers the SDK's architecture, provider switching, multi-step tool use, and production patterns for reliable AI features.
- AI SDK Core vs UI vs RSC
- generateText vs streamText vs generateObject
- Tool Calling With tool() Helper
- Multi-Step Tool Use With maxSteps
- useChat Hook for Conversation Management
- Streaming UI With streamUI (RSC)
- Provider Switching
- Middleware for Logging and Caching
- Structured Output With Zod Schemas
- Error Handling and Retries
- Rate Limiting With AI SDK Middleware
- Building a Complete Chat Application
- Checklist
- Conclusion
AI SDK Core vs UI vs RSC
The Vercel AI SDK consists of three layers:
- Core: LLM abstraction for generateText, streamText, and generateObject
- UI: React hooks like useChat, useCompletion, useObject
- RSC: Server Components support for streaming responses
import {
  generateText,
  streamText,
  generateObject,
} from 'ai';
import { openai } from '@ai-sdk/openai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';
// Core: Generate plain text
async function generateTextExample() {
const { text } = await generateText({
model: openai('gpt-4o'),
prompt: 'Explain quantum computing in one sentence.',
});
return text;
}
// Core: Stream text tokens
async function streamTextExample() {
const { textStream } = await streamText({
model: anthropic('claude-opus-4-1'),
prompt: 'Write a blog post about AI agents.',
});
for await (const chunk of textStream) {
console.log(chunk);
}
}
// Core: Generate structured output
async function generateObjectExample() {
const { object } = await generateObject({
model: openai('gpt-4o'),
schema: z.object({
title: z.string(),
sections: z.array(z.object({ heading: z.string(), content: z.string() })),
}),
prompt: 'Create an outline for a blog post about LLMs.',
});
return object;
}
generateText vs streamText vs generateObject
Choose based on your use case:
generateText: Simple prompts, no streaming, quick responses
async function classifyEmail(email: string): Promise<'spam' | 'important' | 'other'> {
  const { text } = await generateText({
    model: openai('gpt-3.5-turbo'),
    system: 'Classify emails as exactly one word: spam, important, or other.',
    prompt: email,
  });
  // Normalise the raw completion before the cast — models may add whitespace or punctuation
  return text.trim().toLowerCase() as 'spam' | 'important' | 'other';
}
streamText: Long responses, real-time tokens, streaming to client
async function generateArticle(topic: string) {
return await streamText({
model: openai('gpt-4o'),
system: 'Write engaging technical articles.',
prompt: `Write a comprehensive article about: ${topic}`,
});
}
// In API route
export async function POST(req: Request) {
const { topic } = await req.json();
const stream = await generateArticle(topic);
return stream.toDataStreamResponse();
}
generateObject: Structured output, validation, parsed results
import { z } from 'zod';
const productSchema = z.object({
name: z.string(),
price: z.number(),
description: z.string(),
inStock: z.boolean(),
});
async function extractProductInfo(productText: string) {
const { object } = await generateObject({
model: openai('gpt-4o'),
schema: productSchema,
prompt: `Extract product information from: ${productText}`,
});
return object;
}
Tool Calling With tool() Helper
Define tools that the model can invoke:
import { tool } from 'ai';
import { z } from 'zod';
const searchTool = tool({
  description: 'Search the web for information',
  parameters: z.object({
    query: z.string().describe('Search query'),
  }),
  execute: async ({ query }) => {
    // Hypothetical search endpoint — swap in your real search API
    const response = await fetch(`https://api.search.com?q=${encodeURIComponent(query)}`);
    return response.json();
  },
});
const calculatorTool = tool({
  description: 'Perform mathematical calculations',
  parameters: z.object({
    expression: z.string().describe('Math expression to evaluate'),
  }),
  execute: async ({ expression }) => {
    // Warning: never eval() model-generated input in production — use a
    // proper expression parser (e.g. mathjs) instead. Shown here for brevity.
    return eval(expression);
  },
});
async function respondWithTools(userMessage: string) {
const { text } = await generateText({
model: openai('gpt-4o'),
tools: {
search: searchTool,
calculate: calculatorTool,
},
prompt: userMessage,
});
return text;
}
Multi-Step Tool Use With maxSteps
Allow models to chain tool calls:
async function executeComplexTask(task: string) {
const { text, toolResults } = await generateText({
model: openai('gpt-4o'),
tools: {
search: searchTool,
calculate: calculatorTool,
summarise: tool({
description: 'Summarise text',
parameters: z.object({ text: z.string() }),
execute: async ({ text }) => {
const { text: summary } = await generateText({
model: openai('gpt-3.5-turbo'),
prompt: `Summarise: ${text}`,
});
return summary;
},
}),
},
prompt: task,
maxSteps: 5, // Allow up to 5 tool calls
});
console.log('Final response:', text);
console.log('Tool calls made:', toolResults.length);
return text;
}
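Conceptually, maxSteps caps a loop: at each step the model either requests a tool call (whose result is fed back into the conversation) or emits a final answer, and the SDK stops after at most maxSteps iterations. A simplified sketch with a mock model — not the SDK's internals — to show how the step budget bounds chained tool calls:

```typescript
type StepOutput = { toolCall?: { name: string; args: unknown }; text?: string };

// Mock model: requests a tool on the first two steps, then answers.
// A real model decides this from the conversation history so far.
function mockModelStep(step: number): StepOutput {
  if (step < 2) {
    return { toolCall: { name: 'search', args: { query: `step ${step}` } } };
  }
  return { text: 'final answer' };
}

function runWithMaxSteps(maxSteps: number): { text: string; toolCalls: number } {
  let toolCalls = 0;
  for (let step = 0; step < maxSteps; step++) {
    const out = mockModelStep(step);
    if (out.toolCall) {
      toolCalls++; // execute the tool and feed its result back to the model
      continue;
    }
    return { text: out.text ?? '', toolCalls };
  }
  // Step budget exhausted before the model produced a final answer
  return { text: '', toolCalls };
}
```

With a budget of 5 the mock finishes after two tool calls; with a budget of 2 the loop runs out of steps before any final text is produced.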
useChat Hook for Conversation Management
Build chat interfaces with state management:
'use client';
import { useChat } from 'ai/react';
export function ChatComponent() {
const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat({
api: '/api/chat',
onError: (error) => {
console.error('Chat error:', error);
},
});
return (
<div>
<div className="messages">
{messages.map((message) => (
<div key={message.id} className={`message ${message.role}`}>
{message.content}
</div>
))}
</div>
<form onSubmit={handleSubmit}>
<input
value={input}
onChange={handleInputChange}
placeholder="Say something..."
disabled={isLoading}
/>
<button type="submit" disabled={isLoading}>
Send
</button>
</form>
</div>
);
}
Server-side handler:
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
export async function POST(req: Request) {
const { messages } = await req.json();
const result = await streamText({
model: openai('gpt-4o'),
system: 'You are a helpful assistant.',
messages,
});
return result.toDataStreamResponse();
}
Streaming UI With streamUI (RSC)
Build streaming interfaces with React Server Components:
import { streamUI } from 'ai/rsc';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

export async function generateUI(userMessage: string) {
  const result = await streamUI({
    model: openai('gpt-4o'),
    prompt: userMessage,
    text: ({ content }) => <p>{content}</p>,
    tools: {
      // streamUI tools are plain objects with a generate function
      // (rather than the tool() helper's execute)
      showChart: {
        description: 'Display a chart',
        parameters: z.object({
          type: z.enum(['bar', 'line', 'pie']),
          data: z.array(z.object({ label: z.string(), value: z.number() })),
        }),
        generate: async function* ({ type, data }) {
          yield <p>Generating {type} chart...</p>;
          // Simulate async work (e.g. fetching chart data)
          await new Promise((resolve) => setTimeout(resolve, 1000));
          // BarChart, LineChart, and PieChart are assumed app components
          yield (
            <div className="chart">
              {type === 'bar' && <BarChart data={data} />}
              {type === 'line' && <LineChart data={data} />}
              {type === 'pie' && <PieChart data={data} />}
            </div>
          );
        },
      },
    },
  });
  return result.value;
}
Provider Switching
Switch between OpenAI, Anthropic, Google, and more:
import { openai } from '@ai-sdk/openai';
import { anthropic } from '@ai-sdk/anthropic';
import { google } from '@ai-sdk/google';

const providers = {
  openai: openai('gpt-4o'),
  anthropic: anthropic('claude-opus-4-1'),
  google: google('gemini-2.0-flash'),
};
async function generateWithProvider(
provider: keyof typeof providers,
prompt: string
) {
const model = providers[provider];
const { text } = await generateText({
model,
prompt,
});
return text;
}
// Usage
const response = await generateWithProvider('anthropic', 'Explain quantum computing.');
Cost-based provider selection:
interface ProviderCost {
costPer1mInputTokens: number;
costPer1mOutputTokens: number;
}
const costs: Record<string, ProviderCost> = {
openai: { costPer1mInputTokens: 5, costPer1mOutputTokens: 15 },
anthropic: { costPer1mInputTokens: 3, costPer1mOutputTokens: 15 },
google: { costPer1mInputTokens: 0.5, costPer1mOutputTokens: 1.5 },
};
function selectCheapestProvider(estimatedInputTokens: number): string {
  let cheapest = 'google';
  let lowestCost = Infinity;
  for (const [provider, pricing] of Object.entries(costs)) {
    // Simplified: compares input cost only; include expected output
    // tokens for a more realistic estimate
    const cost =
      (estimatedInputTokens * pricing.costPer1mInputTokens) / 1_000_000;
    if (cost < lowestCost) {
      lowestCost = cost;
      cheapest = provider;
    }
  }
  return cheapest;
}
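With the rates above, Google always wins on input cost alone, so the selector becomes more useful once expected output tokens (or quality constraints) enter the picture. A standalone variant that also weighs output tokens — rates duplicated here so the snippet runs on its own:

```typescript
const costTable: Record<string, { inPerM: number; outPerM: number }> = {
  openai: { inPerM: 5, outPerM: 15 },
  anthropic: { inPerM: 3, outPerM: 15 },
  google: { inPerM: 0.5, outPerM: 1.5 },
};

// Same selection logic as above, extended to include expected output tokens
function cheapestFor(inputTokens: number, expectedOutputTokens: number): string {
  let best = '';
  let lowest = Infinity;
  for (const [name, p] of Object.entries(costTable)) {
    const cost =
      (inputTokens * p.inPerM + expectedOutputTokens * p.outPerM) / 1_000_000;
    if (cost < lowest) {
      lowest = cost;
      best = name;
    }
  }
  return best;
}
```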
Middleware for Logging and Caching
Add cross-cutting concerns. The example below hand-rolls a small wrapper to keep the pattern visible; the AI SDK also ships language-model middleware (wrapLanguageModel) for the same job:
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
interface Middleware {
  name: string;
  // before may short-circuit the call by returning a cached result
  before?: (params: { prompt: string }) => unknown;
  after?: (result: { text: string; prompt: string }) => void;
}

const loggingMiddleware: Middleware = {
  name: 'logging',
  before: (params) => {
    console.log('Prompt:', params.prompt.substring(0, 80));
  },
  after: (result) => {
    console.log('Generated:', result.text.substring(0, 100) + '...');
  },
};

const cache = new Map<string, { text: string; prompt: string }>();

const cachingMiddleware: Middleware = {
  name: 'caching',
  before: (params) => cache.get(params.prompt),
  after: (result) => {
    cache.set(result.prompt, result);
  },
};

async function generateTextWithMiddleware(
  prompt: string,
  middlewares: Middleware[]
) {
  // Execute before middleware; a truthy return short-circuits (cache hit)
  for (const mw of middlewares) {
    const early = mw.before?.({ prompt });
    if (early) {
      return early as { text: string; prompt: string };
    }
  }
  const { text } = await generateText({
    model: openai('gpt-4o'),
    prompt,
  });
  const result = { text, prompt };
  // Execute after middleware
  for (const mw of middlewares) {
    mw.after?.(result);
  }
  return result;
}
Structured Output With Zod Schemas
Validate and type responses:
import { z } from 'zod';
const userSchema = z.object({
name: z.string().min(1),
email: z.string().email(),
age: z.number().int().min(0).max(150),
interests: z.array(z.string()),
});
async function extractUserInfo(text: string) {
const { object, finishReason } = await generateObject({
model: openai('gpt-4o'),
schema: userSchema,
prompt: `Extract user information from: ${text}`,
});
if (finishReason === 'error') {
throw new Error('Failed to extract user info');
}
return object;
}
// Type-safe result
const user = await extractUserInfo('My name is John, john@example.com, 30 years old');
console.log(user.email); // TypeScript knows this is a string
Error Handling and Retries
Graceful error handling with retry logic:
async function generateWithRetry(
prompt: string,
maxRetries: number = 3
): Promise<string> {
for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
const { text } = await generateText({
model: openai('gpt-4o'),
prompt,
});
return text;
} catch (error) {
if (attempt === maxRetries) {
throw error;
}
const backoffMs = Math.pow(2, attempt - 1) * 1000;
console.log(`Attempt ${attempt} failed. Retrying in ${backoffMs}ms...`);
await new Promise((resolve) => setTimeout(resolve, backoffMs));
}
}
throw new Error('All retry attempts exhausted');
}
// In API route
export async function POST(req: Request) {
try {
const { prompt } = await req.json();
const text = await generateWithRetry(prompt);
return Response.json({ text });
} catch (error) {
return Response.json(
{ error: (error as Error).message },
{ status: 500 }
);
}
}
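The delay expression in generateWithRetry produces a 1 s, 2 s, 4 s schedule over three attempts. Pulling it into a small helper makes the schedule easy to test and extend (for instance with jitter); backoffMs here is a hypothetical extraction, not an SDK API:

```typescript
// Exponential backoff delay for a 1-indexed attempt, mirroring the
// Math.pow(2, attempt - 1) * 1000 expression in generateWithRetry
function backoffMs(attempt: number, baseMs: number = 1000): number {
  return Math.pow(2, attempt - 1) * baseMs;
}

// Delays for attempts 1..3: 1000 ms, 2000 ms, 4000 ms
```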
Rate Limiting With AI SDK Middleware
Prevent abuse and manage quotas:
interface RateLimit {
tokensPerMinute: number;
requestsPerMinute: number;
}
class RateLimiter {
private tokenUsage: number = 0;
private requestCount: number = 0;
private lastReset: Date = new Date();
constructor(private limits: RateLimit) {}
canMakeRequest(estimatedTokens: number): boolean {
const now = new Date();
const minutesElapsed =
(now.getTime() - this.lastReset.getTime()) / (1000 * 60);
if (minutesElapsed > 1) {
this.tokenUsage = 0;
this.requestCount = 0;
this.lastReset = now;
}
return (
this.tokenUsage + estimatedTokens <= this.limits.tokensPerMinute &&
this.requestCount < this.limits.requestsPerMinute
);
}
recordUsage(tokens: number): void {
this.tokenUsage += tokens;
this.requestCount += 1;
}
}
const limiter = new RateLimiter({
tokensPerMinute: 10000,
requestsPerMinute: 60,
});
async function generateTextWithRateLimit(prompt: string) {
const estimatedTokens = Math.ceil(prompt.length / 4);
if (!limiter.canMakeRequest(estimatedTokens)) {
throw new Error('Rate limit exceeded');
}
const { text, usage } = await generateText({
model: openai('gpt-4o'),
prompt,
});
  limiter.recordUsage(
    // Usage field names vary by SDK version (promptTokens/completionTokens
    // in v4, inputTokens/outputTokens in v5) — adjust to your installed version
    (usage?.inputTokens || 0) + (usage?.outputTokens || 0)
  );
return text;
}
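Because the limiter resets on a fixed one-minute window, its behaviour is deterministic inside a window and easy to sanity-check. A standalone sketch (the class is trimmed and duplicated here so it runs on its own):

```typescript
class WindowLimiter {
  private tokenUsage = 0;
  private requestCount = 0;
  private lastReset = Date.now();

  constructor(
    private tokensPerMinute: number,
    private requestsPerMinute: number
  ) {}

  canMakeRequest(estimatedTokens: number): boolean {
    // Reset counters once the one-minute window has elapsed
    if (Date.now() - this.lastReset > 60_000) {
      this.tokenUsage = 0;
      this.requestCount = 0;
      this.lastReset = Date.now();
    }
    return (
      this.tokenUsage + estimatedTokens <= this.tokensPerMinute &&
      this.requestCount < this.requestsPerMinute
    );
  }

  recordUsage(tokens: number): void {
    this.tokenUsage += tokens;
    this.requestCount += 1;
  }
}
```

Note that a fixed window lets bursts straddle the reset boundary; a sliding window or token bucket smooths that out at the cost of a little more bookkeeping.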
Building a Complete Chat Application
Full-stack chat with streaming and tools:
// app/api/chat/route.ts
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { tool } from 'ai';
import { z } from 'zod';
const searchTool = tool({
description: 'Search for information',
parameters: z.object({ query: z.string() }),
execute: async ({ query }) => {
return `Search results for "${query}"`;
},
});
export async function POST(req: Request) {
const { messages } = await req.json();
const result = await streamText({
model: openai('gpt-4o'),
system:
'You are a helpful assistant. Use tools when needed to search for information.',
messages,
tools: { search: searchTool },
maxSteps: 5,
});
return result.toDataStreamResponse();
}
// app/components/chat.tsx
'use client';
import { useChat } from 'ai/react';
import { useEffect, useRef } from 'react';
export function Chat() {
const { messages, input, handleInputChange, handleSubmit, isLoading } =
useChat();
const containerRef = useRef<HTMLDivElement>(null);
useEffect(() => {
containerRef.current?.scrollIntoView({ behavior: 'smooth' });
}, [messages]);
return (
<div className="flex flex-col h-screen">
<div className="flex-1 overflow-auto p-4">
{messages.map((message) => (
<div
key={message.id}
className={`mb-4 p-3 rounded ${
message.role === 'user'
? 'bg-blue-500 text-white ml-auto'
: 'bg-gray-200'
} max-w-sm`}
>
{message.content}
</div>
))}
<div ref={containerRef} />
</div>
<form onSubmit={handleSubmit} className="p-4 border-t">
<div className="flex gap-2">
<input
value={input}
onChange={handleInputChange}
placeholder="Message..."
className="flex-1 px-4 py-2 border rounded"
disabled={isLoading}
/>
<button
type="submit"
disabled={isLoading}
className="px-4 py-2 bg-blue-500 text-white rounded disabled:opacity-50"
>
Send
</button>
</div>
</form>
</div>
);
}
Checklist
- Understand AI SDK Core, UI, and RSC layers
- Choose between generateText, streamText, and generateObject
- Implement tool calling with the tool() helper
- Support multi-step tool use with maxSteps
- Build chat interfaces with the useChat hook
- Stream responses with streamUI (RSC)
- Switch providers (OpenAI, Anthropic, Google)
- Add logging and caching middleware
- Use Zod schemas for structured output
- Implement error handling and rate limiting
Conclusion
The Vercel AI SDK simplifies building production AI features in Next.js. Start with simple text generation, add streaming for better UX, then introduce tool calling for complex workflows. Use structured output for reliable data extraction, switch providers based on cost or capability, and add middleware for observability. As your AI features grow more sophisticated, the SDK's abstractions keep implementation complexity manageable.