Perplexity API Guide for Developers
Introduction
The Perplexity API lets developers combine real-time web search with AI reasoning in their applications. Where the ChatGPT and Claude APIs answer by default from what the model learned in training, the Perplexity API's online models retrieve current web pages and synthesize an answer from them. This guide covers integration, use cases, and best practices.
- API Access Requirements
- Installation
- Authentication
- Basic API Call
- Model Options
- Search Focus
- Conversation Context
- Streaming Responses
- Extracting Citations
- Error Handling
- Use Cases
- Building a Search Application
- Cost Optimization
- Rate Limiting
- Monitoring and Logging
- Limitations
- When to Use Perplexity API
- Conclusion
- FAQ
API Access Requirements
Perplexity API requires:
- Pro subscription ($20/month minimum)
- API key generation from account settings
- Billing information on file
- Rate limits (vary by plan)
Get started at perplexity.ai/settings/api
Installation
# Python SDK (if available)
pip install perplexity-sdk
# Or use requests directly
pip install requests
Authentication
import requests

API_KEY = "your-api-key-here"
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}
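A key pasted into source code is easy to commit by accident. The following is a minimal sketch of reading it from the environment instead. The variable name PERPLEXITY_API_KEY and the helper build_headers are assumptions made for this example, not names the API requires.

```python
import os


def build_headers(env=None):
    """Build request headers from an API key stored in the environment.

    PERPLEXITY_API_KEY is a hypothetical variable name chosen for this sketch.
    Pass a dict as `env` to override os.environ (useful in tests).
    """
    env = os.environ if env is None else env
    api_key = env.get("PERPLEXITY_API_KEY", "").strip()
    if not api_key:
        # Fail early with a clear message rather than sending an empty token
        raise RuntimeError("Set PERPLEXITY_API_KEY before calling the API")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Calling build_headers() once at startup surfaces a missing key immediately instead of as a 401 on the first request.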
Basic API Call
Simple request for current information:
import requests

def search_with_perplexity(query):
    """Get current information on a topic"""
    url = "https://api.perplexity.ai/chat/completions"
    payload = {
        "model": "pplx-7b-online",  # Online model with web search
        "messages": [
            {
                "role": "user",
                "content": query
            }
        ]
    }
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }
    # A timeout keeps a stalled connection from hanging the caller
    response = requests.post(url, json=payload, headers=headers, timeout=30)
    # Raise on 4xx/5xx instead of returning an error body as if it were a result
    response.raise_for_status()
    return response.json()

# Usage
result = search_with_perplexity("What are the latest AI developments?")
print(result['choices'][0]['message']['content'])
Model Options
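- pplx-7b-online: Search-enabled model that returns current information
- pplx-70b-online: Larger search-enabled model with better reasoning
- pplx-7b-chat: No search (faster, cheaper)
- pplx-70b-chat: Larger model, no search
For most use cases, choose an online variant, since web search is the main reason to use this API. Model names change over time, so check the current model list in the Perplexity documentation before hard-coding one.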
Search Focus
Narrow search scope for better results:
def search_with_focus(query, search_domain="general"):
    """
    search_domain options:
    - "general": All web content
    - "news": News articles only
    - "academic": Peer-reviewed papers
    - "reddit": Reddit discussions
    - "twitter": Twitter/X posts
    - "youtube": YouTube videos
    """
    payload = {
        "model": "pplx-7b-online",
        "messages": [
            {
                "role": "user",
                "content": query
            }
        ],
        "search_domain": search_domain
    }
    response = requests.post(
        "https://api.perplexity.ai/chat/completions",
        json=payload,
        headers=headers
    )
    return response.json()

# Academic search example
academic = search_with_focus(
    "Latest research on CRISPR gene editing",
    search_domain="academic"
)
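Because the focus value is passed through as a plain string, a typo such as "acadmic" would reach the API unchecked. The sketch below validates the value before building the payload. The names build_focus_payload and ALLOWED_DOMAINS are hypothetical helpers, and the option list is copied from this guide rather than verified against the API.

```python
# Options as listed in this guide; treat the set as an assumption
ALLOWED_DOMAINS = {"general", "news", "academic", "reddit", "twitter", "youtube"}


def build_focus_payload(query, search_domain="general", model="pplx-7b-online"):
    """Return a request payload, rejecting empty queries and unknown domains."""
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query must be a non-empty string")
    if search_domain not in ALLOWED_DOMAINS:
        options = ", ".join(sorted(ALLOWED_DOMAINS))
        raise ValueError(f"unknown search_domain {search_domain!r}; expected one of: {options}")
    return {
        "model": model,
        "messages": [{"role": "user", "content": query.strip()}],
        "search_domain": search_domain,
    }
```

Validating locally costs nothing and turns a silent fallback or a remote error into an immediate, readable exception.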
Conversation Context
To keep context across turns, send the full message history with each request:
def multi_turn_search():
    """Maintain conversation context"""
    messages = []

    # First turn
    query1 = "What's the current state of quantum computing?"
    messages.append({"role": "user", "content": query1})
    response1 = requests.post(
        "https://api.perplexity.ai/chat/completions",
        json={
            "model": "pplx-7b-online",
            "messages": messages
        },
        headers=headers
    ).json()
    answer1 = response1['choices'][0]['message']['content']
    # Store the reply so the follow-up question is answered in context
    messages.append({"role": "assistant", "content": answer1})

    # Follow-up turn
    query2 = "Which companies are leading in quantum computing?"
    messages.append({"role": "user", "content": query2})
    response2 = requests.post(
        "https://api.perplexity.ai/chat/completions",
        json={
            "model": "pplx-7b-online",
            "messages": messages
        },
        headers=headers
    ).json()
    answer2 = response2['choices'][0]['message']['content']
    return answer1, answer2
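Since every follow-up resends the whole history, the request grows with each turn. One way to bound it is to keep only the most recent turns, as in the sketch below. The Conversation class and its max_turns parameter are hypothetical names for this example, and dropping older turns trades some context for a smaller request.

```python
class Conversation:
    """Keeps chat history and trims it to the most recent turns."""

    def __init__(self, max_turns=5):
        self.max_turns = max_turns
        self.messages = []

    def add_user(self, content):
        self.messages.append({"role": "user", "content": content})

    def add_assistant(self, content):
        self.messages.append({"role": "assistant", "content": content})
        self._trim()

    def _trim(self):
        # One turn is a user message plus the assistant reply that follows it
        excess = len(self.messages) - 2 * self.max_turns
        if excess > 0:
            self.messages = self.messages[excess:]

    def payload(self, model="pplx-7b-online"):
        # Copy the list so later appends do not mutate a payload already built
        return {"model": model, "messages": list(self.messages)}
```

Trimming only after an assistant reply keeps the history starting on a user message, so the alternating user/assistant order is preserved.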
Streaming Responses
For real-time results:
import requests

def stream_search(query):
    """Stream results as they arrive"""
    payload = {
        "model": "pplx-7b-online",
        "messages": [
            {"role": "user", "content": query}
        ],
        "stream": True
    }
    response = requests.post(
        "https://api.perplexity.ai/chat/completions",
        json=payload,
        headers=headers,
        stream=True
    )
    for line in response.iter_lines():
        if line:
            data = line.decode('utf-8')
            if data.startswith("data: "):
                chunk = data[6:]  # Remove "data: " prefix
                if chunk:
                    print(chunk, end="", flush=True)
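The loop above prints each raw event payload, which is usually a JSON object rather than readable text. The sketch below extracts just the text from one line. It assumes OpenAI-style chunks of the form {"choices": [{"delta": {"content": "..."}}]} and a closing "data: [DONE]" line; both are assumptions, so compare them with an actual streamed response before relying on this. The function name extract_stream_text is hypothetical.

```python
import json


def extract_stream_text(line):
    """Return the text carried by one server-sent-event line, or "" if none.

    Assumes OpenAI-style streaming chunks and a "[DONE]" sentinel (unverified).
    """
    if isinstance(line, bytes):
        line = line.decode("utf-8")
    line = line.strip()
    if not line.startswith("data:"):
        return ""  # blank lines, comments, or other SSE fields
    data = line[len("data:"):].strip()
    if not data or data == "[DONE]":
        return ""
    try:
        event = json.loads(data)
    except json.JSONDecodeError:
        return ""  # skip anything that is not valid JSON
    if not isinstance(event, dict):
        return ""
    choices = event.get("choices") or []
    if not choices or not isinstance(choices[0], dict):
        return ""
    delta = choices[0].get("delta") or {}
    return delta.get("content") or ""
```

Inside the streaming loop this would replace the print of the raw chunk with print(extract_stream_text(line), end="", flush=True).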
Extracting Citations
Get sources for provided information:
def search_with_citations(query):
    """Extract citations and sources"""
    response = requests.post(
        "https://api.perplexity.ai/chat/completions",
        json={
            "model": "pplx-7b-online",
            "messages": [{"role": "user", "content": query}]
        },
        headers=headers
    ).json()
    answer = response['choices'][0]['message']['content']

    # Extract citations if available
    citations = []
    if 'citations' in response:
        citations = response['citations']

    return {
        "answer": answer,
        "sources": citations
    }

# Usage
result = search_with_citations("Latest Node.js features")
print(f"Answer: {result['answer']}")
for source in result['sources']:
    print(f"Source: {source}")
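For display, it often helps to number the sources and drop repeats. The sketch below assumes, as the code above does, that the response carries a top-level "citations" list, and further assumes that each entry is a URL string. The helper name format_sources is hypothetical.

```python
def format_sources(response):
    """Return numbered source lines, skipping duplicates and non-string entries."""
    seen = set()
    lines = []
    for url in response.get("citations") or []:
        if not isinstance(url, str) or url in seen:
            continue
        seen.add(url)
        # Number by position in the de-duplicated list
        lines.append(f"[{len(lines) + 1}] {url}")
    return lines
```

If the answer text contains bracketed markers such as [1], note that de-duplicating can shift the numbering, so only remove repeats when the markers are not shown to the user.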
Error Handling
Robust error handling for production:
import time
from requests.exceptions import RequestException

def search_with_retry(query, max_retries=3):
    """Search with retry logic"""
    for attempt in range(max_retries):
        try:
            response = requests.post(
                "https://api.perplexity.ai/chat/completions",
                json={
                    "model": "pplx-7b-online",
                    "messages": [{"role": "user", "content": query}]
                },
                headers=headers,
                timeout=30
            )
            if response.status_code == 200:
                return response.json()
            elif response.status_code == 429:  # Rate limited
                wait_time = 2 ** attempt
                print(f"Rate limited. Waiting {wait_time}s...")
                time.sleep(wait_time)
            elif response.status_code == 401:  # Auth error
                raise ValueError("Invalid API key")
            else:
                print(f"Error: {response.status_code}")
                raise RequestException(response.text)
        except RequestException as e:
            if attempt < max_retries - 1:
                print("Connection error, retrying...")
                time.sleep(2 ** attempt)
            else:
                raise
    return None
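A fixed 2 ** attempt delay means clients that were rate limited at the same moment also retry at the same moment. Adding random jitter spreads them out. The sketch below computes a delay with "full jitter" and honors a numeric Retry-After value if the server sends one. Whether the Perplexity API sends that header is an assumption, and backoff_delay is a hypothetical helper name.

```python
import random


def backoff_delay(attempt, retry_after=None, base=1.0, cap=30.0, rng=random.random):
    """Return seconds to wait before retry number `attempt` (0-based).

    `retry_after` is the raw Retry-After header value, if any. `rng` is
    injectable so the function can be tested deterministically.
    """
    if retry_after is not None:
        try:
            return min(cap, max(0.0, float(retry_after)))
        except (TypeError, ValueError):
            pass  # e.g. an HTTP-date value; fall back to computed backoff
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick uniformly between 0 and the exponential ceiling
    return rng() * ceiling
```

In the retry loop above, time.sleep(backoff_delay(attempt, response.headers.get("Retry-After"))) would take the place of the fixed wait.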
Use Cases
Real-Time News Aggregation
def get_news_summary(topic):
    """Get current news summary"""
    return search_with_focus(
        f"Latest news about {topic}",
        search_domain="news"
    )

news = get_news_summary("AI regulation")
Academic Research
def find_research(topic):
    """Find academic papers"""
    return search_with_focus(
        f"Recent research on {topic}",
        search_domain="academic"
    )

research = find_research("quantum computing")
Community Insights
def get_community_opinion(topic):
    """Get discussions from Reddit"""
    return search_with_focus(
        f"Opinions and discussions about {topic}",
        search_domain="reddit"
    )

opinions = get_community_opinion("best web framework")
Building a Search Application
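from flask import Flask, request, jsonify
import requests

app = Flask(__name__)

@app.route('/api/search', methods=['POST'])
def search():
    """REST endpoint for searching"""
    # silent=True returns None instead of raising on a missing or invalid JSON body
    data = request.get_json(silent=True) or {}
    query = data.get('query')
    domain = data.get('domain', 'general')
    if not query:
        # Reject bad input as a client error rather than a 500
        return jsonify({"success": False, "error": "Missing 'query'"}), 400
    try:
        result = search_with_focus(query, domain)
        return jsonify({
            "success": True,
            "answer": result['choices'][0]['message']['content'],
            "citations": result.get('citations', [])
        })
    except Exception as e:
        return jsonify({
            "success": False,
            "error": str(e)
        }), 500

if __name__ == '__main__':
    # debug=True is for local development only
    app.run(debug=True)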
Cost Optimization
Strategies to manage API costs:
Cache results: Don't repeat identical searches
from functools import lru_cache

@lru_cache(maxsize=100)
def search_cached(query):
    """Cache search results (entries never expire, so answers can go stale)"""
    return search_with_perplexity(query)
Use an appropriate model: pplx-7b-online is cheaper than pplx-70b-online
Batch queries: Collect related searches and process them in one pass
def batch_search(queries):
    """Run several queries one after another and collect the results"""
    results = []
    for query in queries:
        result = search_with_perplexity(query)
        results.append(result)
    return results
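A cache without expiry sits awkwardly with an API whose value is current information: a cached answer about "latest news" is served unchanged for as long as the process runs. One option is a small time-to-live cache, sketched below. TTLCache, its five-minute default, and the whitespace/case normalization of the key are all choices made for this example.

```python
import time


class TTLCache:
    """Caches results per query and refetches once an entry is older than the TTL."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so tests can control time
        self._store = {}

    def get_or_fetch(self, query, fetch):
        # Normalize case and whitespace so trivial variants share one entry
        key = " ".join(query.lower().split())
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        value = fetch(query)
        self._store[key] = (now, value)
        return value
```

Usage would look like cache.get_or_fetch(query, search_with_perplexity), with a short TTL for news-style queries and a longer one for topics that change slowly.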
Rate Limiting
Handle rate limits gracefully:
from tenacity import retry, stop_after_attempt, wait_exponential

# Retry up to 3 times, waiting between 4 and 10 seconds with exponential growth
@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=4, max=10)
)
def search_with_rate_limit(query):
    """Search with rate limit handling"""
    response = requests.post(
        "https://api.perplexity.ai/chat/completions",
        json={
            "model": "pplx-7b-online",
            "messages": [{"role": "user", "content": query}]
        },
        headers=headers
    )
    response.raise_for_status()
    return response.json()
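Retrying reacts to a 429 after it happens. A complementary approach is to pace requests on the client so the limit is rarely hit. The sketch below enforces a minimum interval between calls. MinIntervalThrottle is a hypothetical helper, and the 30 requests per minute in the usage note is a placeholder, not a documented limit for any plan.

```python
import time


class MinIntervalThrottle:
    """Spaces calls so that at most `max_per_minute` are made per minute."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / max_per_minute
        self.clock = clock  # injectable for tests
        self.sleep = sleep
        self._next_allowed = float("-inf")  # first call is never delayed

    def wait(self):
        """Block until the next call is allowed; return the seconds waited."""
        now = self.clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self.sleep(delay)
        # Schedule the following slot from whichever is later: now or the old slot
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return delay
```

Create one shared instance, for example throttle = MinIntervalThrottle(max_per_minute=30), and call throttle.wait() immediately before each request. The class is not thread-safe as written.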
Monitoring and Logging
Track API usage:
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def search_with_logging(query):
    """Search with logging"""
    logger.info(f"Searching: {query}")
    try:
        result = search_with_perplexity(query)
        logger.info("Search successful")
        return result
    except Exception as e:
        logger.error(f"Search failed: {str(e)}")
        raise
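Log lines show that calls happened, but not how much they consumed. The sketch below keeps running totals of token usage across responses. It assumes each response may include an OpenAI-style "usage" object with prompt_tokens, completion_tokens, and total_tokens; that shape is an assumption to check against a real response. UsageTracker is a hypothetical helper name.

```python
from collections import Counter

# Field names assumed to follow the OpenAI-style "usage" object
FIELDS = ("prompt_tokens", "completion_tokens", "total_tokens")


class UsageTracker:
    """Accumulates request counts and token usage across API responses."""

    def __init__(self):
        self.requests = 0
        self.tokens = Counter()

    def record(self, response):
        self.requests += 1
        usage = response.get("usage") or {}
        for field in FIELDS:
            value = usage.get(field)
            if isinstance(value, int):
                self.tokens[field] += value

    def summary(self):
        totals = {"requests": self.requests}
        for field in FIELDS:
            totals[field] = self.tokens[field]
        return totals
```

Calling tracker.record(result) after each successful search and logging tracker.summary() periodically gives a rough basis for cost estimates.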
Limitations
- Accuracy: Answers draw on current web content, so there is no fixed knowledge cutoff, but accuracy is not guaranteed
- Rate limits: Depend on plan
- Cost: More expensive than models that answer from training data alone
- Latency: Web search adds latency compared to local inference
When to Use Perplexity API
Best for:
- Current events and news
- Real-time market data
- Latest technology developments
- Recent research findings
Not ideal for:
- Private or historical data
- Low-latency applications
- High-volume queries (expensive)
- Topics with a high risk of hallucination
Conclusion
The Perplexity API bridges the gap between current information and AI reasoning. For applications that need up-to-date answers with sources, it is a powerful option. Use it where current information is critical and the added search latency is acceptable.
FAQ
Q: How much does Perplexity API cost? A: Pricing varies by plan and model. Generally $0.01-0.10 per query depending on model and usage.
Q: Can I use Perplexity API without Pro subscription? A: No, Pro subscription is required for API access currently.
Q: How accurate are Perplexity API results? A: Generally accurate for current information, but always verify important data by checking provided sources.