OpenAI Assistants API — Build AI Agents
Introduction
The OpenAI Assistants API enables building AI agents that can interact with external tools, maintain state across conversations, and perform complex workflows. Unlike basic chat completions, Assistants handle conversation state management, tool integration, and multi-step processes automatically. This guide covers building production-ready AI agents using the Assistants API.
- Understanding Assistants vs Chat Completions
- Creating Your First Assistant
- Managing Conversations with Threads
- Implementing Tool Calling
- File Search and Knowledge Base
- Building a Multi-Step Workflow
- Error Handling and Retries
- Cost Considerations
- Practical Examples
- Limitations
- Conclusion
- FAQ
Understanding Assistants vs Chat Completions
The Assistants API handles state and tool calling differently from basic chat completions:
Chat Completions API: You manage conversation history, tool execution, and retry logic manually. Simpler but requires more code.
Assistants API: Manages conversation threads, tool calling, and retry logic automatically. More complex setup but cleaner for multi-turn interactions.
Use Assistants API when building:
- Stateful conversational agents
- Tools that call external APIs or functions
- Applications requiring persistent conversation history
- Complex workflows with multiple steps
Use Chat Completions API when building:
- Simpler applications where stateless operation is acceptable
- High-volume applications where managed state adds latency
- Simple code generation or analysis tasks
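To see the difference concretely, here is a minimal sketch of the manual state management the Chat Completions path requires: you own the history list and must resend the whole thing on every turn. (The actual API call is shown as a comment; the helper name is illustrative.)

```python
# With Chat Completions, conversation state lives in your own list.
history = [{"role": "system", "content": "You are a helpful assistant."}]

def add_turn(history, role, content):
    """Append one conversation turn in the Chat Completions message format."""
    history.append({"role": role, "content": content})
    return history

add_turn(history, "user", "Review this code: def add(a, b): return a+b")

# Each request must send the full list, then you append the reply yourself:
# response = client.chat.completions.create(model="gpt-4o", messages=history)
# add_turn(history, "assistant", response.choices[0].message.content)
```

With Assistants, the thread object replaces this bookkeeping entirely.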
Creating Your First Assistant
An Assistant requires:
- A system prompt (instructions for behavior)
- Optional tools (functions the assistant can call)
- Optional knowledge files (documents to reference)
from openai import OpenAI

client = OpenAI(api_key="your-api-key")

# Step 1: Create an Assistant
assistant = client.beta.assistants.create(
    name="Code Review Assistant",
    description="Analyzes code for quality and security issues",
    instructions="""You are an expert code reviewer. Your job is to:
1. Identify security vulnerabilities
2. Suggest performance improvements
3. Point out violations of best practices
4. Recommend refactoring opportunities
Be constructive and explain your reasoning.""",
    model="gpt-4o",
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_code_metrics",
                "description": "Get complexity metrics for code",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "code": {
                            "type": "string",
                            "description": "The code to analyze"
                        }
                    },
                    "required": ["code"]
                }
            }
        }
    ]
)

print(f"Assistant created: {assistant.id}")
Managing Conversations with Threads
Threads maintain conversation history automatically. Instead of managing messages yourself, you create a thread and add messages to it:
import time

from openai import OpenAI

client = OpenAI()

# Create or retrieve an assistant
assistant_id = "asst_abc123"

# Create a new thread for each conversation
thread = client.beta.threads.create()

# Add a message to the thread
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Review this code: def add(a, b): return a+b"
)

# Run the assistant on the thread
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant_id
)

# Poll until the run reaches a terminal state
while run.status not in ("completed", "failed", "cancelled", "expired"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(
        thread_id=thread.id,
        run_id=run.id
    )

if run.status != "completed":
    print(f"Run ended with status {run.status}: {run.last_error}")

# Get messages from the thread (newest first by default)
messages = client.beta.threads.messages.list(thread_id=thread.id)
for msg in messages.data:
    if msg.role == "assistant":
        print(f"Assistant: {msg.content[0].text.value}")
Implementing Tool Calling
When an Assistant needs to call a tool, the run enters a requires_action status. You must execute the tool and submit the results:
import json
import time

from openai import OpenAI

client = OpenAI()

def get_weather(location: str) -> str:
    # Simulate a weather API call
    weather_data = {
        "New York": "72°F, Sunny",
        "Los Angeles": "85°F, Clear",
        "London": "55°F, Rainy"
    }
    return weather_data.get(location, "Unknown location")

def handle_tool_calls(tool_calls):
    """Execute tools and return results"""
    results = []
    for tool_call in tool_calls:
        if tool_call.function.name == "get_weather":
            args = json.loads(tool_call.function.arguments)
            result = get_weather(args.get("location"))
            results.append({
                "tool_call_id": tool_call.id,
                "output": result
            })
    return results

def run_agent_with_tools(thread_id, assistant_id, user_message):
    """Run assistant and handle tool calls"""
    # Add user message
    client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=user_message
    )

    # Start run
    run = client.beta.threads.runs.create(
        thread_id=thread_id,
        assistant_id=assistant_id
    )

    # Handle tool-calling loop
    while run.status in ("queued", "in_progress", "requires_action"):
        if run.status == "requires_action":
            # Execute required tools
            tool_results = handle_tool_calls(
                run.required_action.submit_tool_outputs.tool_calls
            )
            # Submit tool results
            run = client.beta.threads.runs.submit_tool_outputs(
                thread_id=thread_id,
                run_id=run.id,
                tool_outputs=tool_results
            )
        else:
            # Wait, then poll for the next status
            time.sleep(1)
            run = client.beta.threads.runs.retrieve(
                thread_id=thread_id,
                run_id=run.id
            )

    # Get final response (newest message first)
    messages = client.beta.threads.messages.list(thread_id=thread_id)
    return messages.data[0].content[0].text.value

# Usage
thread = client.beta.threads.create()
response = run_agent_with_tools(
    thread.id,
    "asst_abc123",
    "What's the weather in New York?"
)
print(response)
File Search and Knowledge Base
Assistants can search through files you provide:
from openai import OpenAI

client = OpenAI()

# Upload a file
file = client.files.create(
    file=open("documentation.pdf", "rb"),
    purpose="assistants"
)

# Put the file into a vector store so file_search can use it
vector_store = client.beta.vector_stores.create(
    name="Documentation",
    file_ids=[file.id]
)

# Create assistant with file search
assistant = client.beta.assistants.create(
    name="Documentation Assistant",
    model="gpt-4o",
    instructions="Answer questions based on the provided documentation.",
    tools=[{"type": "file_search"}],
    tool_resources={
        "file_search": {
            "vector_store_ids": [vector_store.id]
        }
    }
)

# Now the assistant can search the documents when answering questions
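File search answers typically carry annotations pointing back at the source files. Here is a small sketch of pulling those citations out of a message's text block; attribute names follow the Python SDK's message objects, so treat this as illustrative rather than definitive:

```python
def list_citations(text_block):
    """Collect (quoted_text, file_id) pairs from a message text block's annotations."""
    citations = []
    for ann in getattr(text_block, "annotations", []):
        file_citation = getattr(ann, "file_citation", None)
        if file_citation is not None:
            citations.append((ann.text, file_citation.file_id))
    return citations

# Usage with a retrieved message:
# msg = client.beta.threads.messages.list(thread_id=thread.id).data[0]
# for quoted, file_id in list_citations(msg.content[0].text):
#     print(f"{quoted} -> {file_id}")
```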
Building a Multi-Step Workflow
Combine tools to create complex workflows:
from openai import OpenAI

def create_data_analysis_assistant():
    """Create assistant for multi-step data analysis"""
    client = OpenAI()
    return client.beta.assistants.create(
        name="Data Analysis Agent",
        model="gpt-4o",
        instructions="""You are a data analyst. When given a dataset:
1. Use load_dataset to retrieve the data
2. Use analyze_data to compute statistics
3. Use generate_report to create a summary
4. Provide insights and recommendations""",
        tools=[
            {
                "type": "function",
                "function": {
                    "name": "load_dataset",
                    "description": "Load a dataset",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "file_path": {"type": "string"}
                        },
                        "required": ["file_path"]
                    }
                }
            },
            {
                "type": "function",
                "function": {
                    "name": "analyze_data",
                    "description": "Analyze data and return statistics",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "data_id": {"type": "string"}
                        },
                        "required": ["data_id"]
                    }
                }
            },
            {
                "type": "function",
                "function": {
                    "name": "generate_report",
                    "description": "Generate analysis report",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "analysis_results": {"type": "string"}
                        },
                        "required": ["analysis_results"]
                    }
                }
            }
        ]
    )
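To serve those three tools from your side, a dispatch table keeps the requires_action handler generic. The implementations below are stubs; a real agent would load files and run actual analysis:

```python
import json

# Stub tool implementations (placeholders for real logic)
def load_dataset(file_path):
    return f"loaded:{file_path}"

def analyze_data(data_id):
    return f"stats-for:{data_id}"

def generate_report(analysis_results):
    return f"report:{analysis_results}"

# Map tool names (as declared on the assistant) to handlers
TOOLS = {
    "load_dataset": lambda args: load_dataset(args["file_path"]),
    "analyze_data": lambda args: analyze_data(args["data_id"]),
    "generate_report": lambda args: generate_report(args["analysis_results"]),
}

def dispatch(tool_name, arguments_json):
    """Route one tool call to its implementation; arguments arrive as a JSON string."""
    args = json.loads(arguments_json)
    handler = TOOLS.get(tool_name)
    if handler is None:
        return f"error: unknown tool {tool_name}"
    return handler(args)
```

This slots directly into a requires_action loop like the handle_tool_calls function shown earlier, replacing the per-tool if/elif chain.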
Error Handling and Retries
Production Assistants require robust error handling:
import time

from openai import OpenAI

def run_with_retry(thread_id, assistant_id, max_retries=3):
    """Run assistant with retry logic"""
    client = OpenAI()
    for attempt in range(max_retries):
        try:
            run = client.beta.threads.runs.create(
                thread_id=thread_id,
                assistant_id=assistant_id
            )

            # Wait for completion with timeout
            start_time = time.time()
            timeout = 120  # 2 minutes

            while run.status != "completed":
                if time.time() - start_time > timeout:
                    raise TimeoutError("Assistant run timed out")
                if run.status in ("failed", "cancelled", "expired"):
                    raise Exception(f"Run ended with {run.status}: {run.last_error}")
                time.sleep(1)
                run = client.beta.threads.runs.retrieve(
                    thread_id=thread_id,
                    run_id=run.id
                )

            # Success: newest message is the assistant's reply
            messages = client.beta.threads.messages.list(
                thread_id=thread_id
            )
            return messages.data[0].content[0].text.value

        except Exception as e:
            if attempt < max_retries - 1:
                print(f"Attempt {attempt + 1} failed: {e}. Retrying...")
                time.sleep(2 ** attempt)  # Exponential backoff
            else:
                raise
Cost Considerations
The Assistants API charges per token like the Chat Completions API, and adds costs for:
- File storage: $0.20 per GB per day
- Vector store usage: for file search, managed vector stores cost $0.10 per GB per day
- API calls: standard per-token charges apply
Rates change; check OpenAI's pricing page for current figures.
To optimize costs:
- Delete unused files and vector stores
- Reuse threads instead of creating new ones for related conversations
- Use cheaper models when possible
- Batch operations
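As a rough sanity check, the storage rates above can be turned into a daily estimate. The constants below are the rates quoted in this article, not authoritative pricing; verify against OpenAI's pricing page before budgeting:

```python
# Rates as quoted above, in USD per GB per day (verify current pricing)
FILE_STORAGE_PER_GB_DAY = 0.20
VECTOR_STORE_PER_GB_DAY = 0.10

def daily_storage_cost(file_gb, vector_store_gb):
    """Estimated daily storage cost in USD for files plus vector stores."""
    return (file_gb * FILE_STORAGE_PER_GB_DAY
            + vector_store_gb * VECTOR_STORE_PER_GB_DAY)

# e.g. 5 GB of files plus a 2 GB vector store:
cost = daily_storage_cost(5, 2)  # 5*0.20 + 2*0.10 = 1.20 USD/day
```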
Practical Examples
Customer Support Bot: Create an assistant trained on your documentation that answers customer questions and escalates to humans when needed.
Code Analysis Tool: Build an agent that analyzes repositories, identifies issues, and suggests improvements.
Research Assistant: Create an agent that searches papers, summarizes findings, and synthesizes conclusions.
Limitations
- A single requires_action step may require several tool calls, and all outputs must be submitted together before the run continues
- Responses arrive only after a run completes unless you opt into streaming runs
- File search has a 20MB file size limit per file
- Threads can grow large if not managed
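On the last point, thread growth can be bounded per run: runs.create accepts a truncation strategy that sends only the most recent messages as context. The parameter shape below follows the Python SDK; the ids are placeholders, and you should confirm the field names against current docs:

```python
# Limit context to the last 10 messages for this run (illustrative values)
run_kwargs = {
    "thread_id": "thread_abc123",    # placeholder id
    "assistant_id": "asst_abc123",   # placeholder id
    "truncation_strategy": {
        "type": "last_messages",
        "last_messages": 10,
    },
}
# run = client.beta.threads.runs.create(**run_kwargs)
```

Older messages stay stored in the thread; they are simply excluded from the model's context for that run.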
Conclusion
The Assistants API enables building sophisticated AI agents that maintain state, call tools, and handle complex workflows. It's the right choice for production conversational applications where reliability and state management matter. Start simple with basic assistants, then add tools and complexity as your use case demands.
FAQ
Q: Should I use Assistants API or Chat Completions API? A: Use Assistants for stateful multi-turn conversations. Use Chat Completions for simpler, stateless interactions. Assistants handle complexity but add latency.
Q: How do I delete a thread to save costs? A: Call client.beta.threads.delete(thread_id=thread.id). Deleting frees resources but you lose conversation history.
Q: Can Assistants handle real-time interactions? A: Assistants aren't ideal for real-time (sub-second response) applications due to latency. Use Chat Completions API for that.