Perplexity API Guide for Developers

Sanjeev Sharma

Introduction

The Perplexity API lets developers combine real-time web search with AI reasoning in their applications. Unlike the ChatGPT or Claude APIs, which by default answer from training data, the Perplexity API fetches current web information and synthesizes it into an answer. This guide covers integration, use cases, and best practices.

API Access Requirements

API access requires:

  1. Pro subscription ($20/month minimum)
  2. API key generated from account settings
  3. Billing information on file

Rate limits apply and vary by plan.

Get started at perplexity.ai/settings/api

Installation

# Python SDK (if available)
pip install perplexity-sdk

# Or use requests directly
pip install requests

Authentication

import requests

API_KEY = "your-api-key-here"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}
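
Hard-coding the key works for a quick test, but a common alternative is to read it from the environment so it stays out of source control. A minimal sketch, assuming an environment variable named `PERPLEXITY_API_KEY` (a name chosen here for illustration, not one the API mandates) and a hypothetical helper `build_headers`:

```python
import os

def build_headers(api_key=None):
    """Return request headers, reading the key from the environment if not given."""
    # PERPLEXITY_API_KEY is an assumed variable name; use whatever your deployment defines.
    key = api_key or os.environ.get("PERPLEXITY_API_KEY")
    if not key:
        raise RuntimeError("Set PERPLEXITY_API_KEY or pass api_key explicitly")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

With this in place, `headers = build_headers()` can stand in for the literal dictionary above.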

Basic API Call

Simple request for current information:

import requests

def search_with_perplexity(query):
    """Get current information on a topic"""
    url = "https://api.perplexity.ai/chat/completions"

    payload = {
        "model": "pplx-7b-online",  # Online model with web search
        "messages": [
            {
                "role": "user",
                "content": query
            }
        ]
    }

    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }

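    response = requests.post(url, json=payload, headers=headers, timeout=30)
    response.raise_for_status()  # Raise on 4xx/5xx instead of failing later on a missing key
    return response.json()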

# Usage
result = search_with_perplexity("What are the latest AI developments?")
print(result['choices'][0]['message']['content'])
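
Indexing straight into `result['choices'][0]` raises a bare `KeyError` if the API returns an error payload. The sketch below shows one way to fail with a clearer message. `extract_answer` is a hypothetical helper, and the sample payload only mirrors the response shape used above, with made-up values:

```python
def extract_answer(response):
    """Return the assistant's text, or raise a clear error if the shape is unexpected."""
    # An error payload may lack "choices"; surface it instead of raising KeyError.
    choices = response.get("choices") or []
    if not choices:
        raise ValueError(f"No choices in response: {response.get('error', response)}")
    return choices[0].get("message", {}).get("content", "")

# Sample payload shaped like the response used above (values are made up)
sample = {"choices": [{"message": {"role": "assistant", "content": "Hello"}}]}
print(extract_answer(sample))  # prints: Hello
```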

Model Options

pplx-7b-online: Search-enabled model, current information

pplx-70b-online: Larger model with search, better reasoning

pplx-7b-chat: No search (faster, cheaper)

pplx-70b-chat: Larger, no search

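For most use cases, pick one of the online variants, since live search is the main reason to use this API. Model names are retired and replaced over time, so confirm the current list in Perplexity's documentation before hard-coding one.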
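
The four names above follow a size-and-mode pattern, so the choice can be made in one place rather than scattered through the code. A small sketch, where `choose_model` and its two flags are hypothetical and the names are only the ones listed in this guide:

```python
def choose_model(needs_search=True, prefer_quality=False):
    """Pick one of the four model names listed above."""
    size = "70b" if prefer_quality else "7b"
    mode = "online" if needs_search else "chat"
    return f"pplx-{size}-{mode}"

print(choose_model())                     # pplx-7b-online
print(choose_model(prefer_quality=True))  # pplx-70b-online
print(choose_model(needs_search=False))   # pplx-7b-chat
```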

Search Focus

Narrow search scope for better results:

def search_with_focus(query, search_domain="general"):
    """
    search_domain options:
    - "general": All web content
    - "news": News articles only
    - "academic": Peer-reviewed papers
    - "reddit": Reddit discussions
    - "twitter": Twitter/X posts
    - "youtube": YouTube videos
    """

    payload = {
        "model": "pplx-7b-online",
        "messages": [
            {
                "role": "user",
                "content": query
            }
        ],
        "search_domain": search_domain  # Field name as used in this guide; confirm it in the API reference
    }

    response = requests.post(
        "https://api.perplexity.ai/chat/completions",
        json=payload,
        headers=headers
    )

    return response.json()

# Academic search example
academic = search_with_focus(
    "Latest research on CRISPR gene editing",
    search_domain="academic"
)
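
A typo in the domain string would otherwise travel all the way to the API before anyone notices. The sketch below checks the value locally first. `ALLOWED_DOMAINS` and `validate_domain` are hypothetical names, and the option list is copied from the docstring above rather than taken from an API reference:

```python
# Options as listed in the search_with_focus docstring above
ALLOWED_DOMAINS = {"general", "news", "academic", "reddit", "twitter", "youtube"}

def validate_domain(search_domain):
    """Return the domain unchanged if it is one of the listed options, else raise."""
    if search_domain not in ALLOWED_DOMAINS:
        raise ValueError(
            f"Unknown search_domain {search_domain!r}; "
            f"expected one of {sorted(ALLOWED_DOMAINS)}"
        )
    return search_domain

print(validate_domain("academic"))  # academic
```

Calling `validate_domain(search_domain)` at the top of `search_with_focus` would reject a bad value before any request is sent.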

Conversation Context

Multi-turn searches keep context by resending the full message history with each request:

def multi_turn_search():
    """Maintain conversation context"""
    messages = []

    # First turn
    query1 = "What's the current state of quantum computing?"
    messages.append({"role": "user", "content": query1})

    response1 = requests.post(
        "https://api.perplexity.ai/chat/completions",
        json={
            "model": "pplx-7b-online",
            "messages": messages
        },
        headers=headers
    ).json()

    answer1 = response1['choices'][0]['message']['content']
    messages.append({"role": "assistant", "content": answer1})

    # Follow-up turn
    query2 = "Which companies are leading in quantum computing?"
    messages.append({"role": "user", "content": query2})

    response2 = requests.post(
        "https://api.perplexity.ai/chat/completions",
        json={
            "model": "pplx-7b-online",
            "messages": messages
        },
        headers=headers
    ).json()

    answer2 = response2['choices'][0]['message']['content']

    return answer1, answer2
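
Because the whole history is resent on every turn, the payload grows with each exchange. One way to bound it is to wrap the history in a small class that trims old turns. This is a sketch only: `Conversation`, `send`, and `max_messages` are hypothetical names, and `send` stands in for the `requests.post` call shown above so the example runs offline.

```python
class Conversation:
    """Keeps the message history and caps its length (hypothetical helper)."""

    def __init__(self, send, max_messages=10):
        # `send` is any callable that takes the message list and returns the reply text,
        # for example a thin wrapper around the requests.post call shown above.
        self.send = send
        self.max_messages = max_messages
        self.messages = []

    def ask(self, query):
        self.messages.append({"role": "user", "content": query})
        answer = self.send(self.messages)
        self.messages.append({"role": "assistant", "content": answer})
        # Drop the oldest turns so the payload does not grow without bound.
        # Trimming in pairs keeps the user/assistant alternation intact.
        while len(self.messages) > self.max_messages:
            del self.messages[:2]
        return answer

# Usage with a stand-in sender
chat = Conversation(send=lambda msgs: f"reply to: {msgs[-1]['content']}", max_messages=4)
chat.ask("first")
chat.ask("second")
chat.ask("third")
print(len(chat.messages))  # 4
```

Trimming loses early context, so a larger `max_messages` suits conversations where follow-ups refer far back.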

Streaming Responses

For real-time results:

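import json
import requests

def stream_search(query):
    """Stream results as they arrive"""
    payload = {
        "model": "pplx-7b-online",
        "messages": [
            {"role": "user", "content": query}
        ],
        "stream": True
    }

    response = requests.post(
        "https://api.perplexity.ai/chat/completions",
        json=payload,
        headers=headers,
        stream=True
    )
    response.raise_for_status()

    for line in response.iter_lines():
        if not line:
            continue  # Skip keep-alive blank lines
        data = line.decode('utf-8')
        if not data.startswith("data: "):
            continue
        chunk = data[6:]  # Remove "data: " prefix
        if chunk.strip() == "[DONE]":  # End-of-stream sentinel, if the server sends one
            break
        try:
            event = json.loads(chunk)
        except json.JSONDecodeError:
            continue  # Ignore anything that is not a JSON event
        # Assumes OpenAI-style chunks carrying the new text in choices[0].delta.content
        choices = event.get("choices") or [{}]
        text = (choices[0].get("delta") or {}).get("content")
        if text:
            print(text, end="", flush=True)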

Extracting Citations

Retrieve the sources behind an answer:

def search_with_citations(query):
    """Extract citations and sources"""
    response = requests.post(
        "https://api.perplexity.ai/chat/completions",
        json={
            "model": "pplx-7b-online",
            "messages": [{"role": "user", "content": query}]
        },
        headers=headers
    ).json()

    answer = response['choices'][0]['message']['content']

    # Citations may be absent, so default to an empty list
    citations = response.get('citations', [])

    return {
        "answer": answer,
        "sources": citations
    }

# Usage
result = search_with_citations("Latest Node.js features")
print(f"Answer: {result['answer']}")
for source in result['sources']:
    print(f"Source: {source}")
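
For display, it can help to fold the sources into the answer as a numbered list. The sketch below assumes `citations` is a list of URL strings, which matches how the loop above prints them but is not confirmed here. `format_with_sources` is a hypothetical helper and the sample values are placeholders:

```python
def format_with_sources(answer, citations):
    """Append a numbered source list to the answer text."""
    if not citations:
        return answer
    lines = [answer, "", "Sources:"]
    for i, url in enumerate(citations, start=1):
        lines.append(f"[{i}] {url}")
    return "\n".join(lines)

# Placeholder values, not real API output
print(format_with_sources(
    "Example answer text.",
    ["https://example.com/a", "https://example.com/b"],
))
```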

Error Handling

Robust error handling for production:

import time
from requests.exceptions import RequestException

def search_with_retry(query, max_retries=3):
    """Search with retry logic"""
    for attempt in range(max_retries):
        try:
            response = requests.post(
                "https://api.perplexity.ai/chat/completions",
                json={
                    "model": "pplx-7b-online",
                    "messages": [{"role": "user", "content": query}]
                },
                headers=headers,
                timeout=30
            )

            if response.status_code == 200:
                return response.json()

            elif response.status_code == 429:  # Rate limited
                wait_time = 2 ** attempt
                print(f"Rate limited. Waiting {wait_time}s...")
                time.sleep(wait_time)

            elif response.status_code == 401:  # Auth error
                raise ValueError("Invalid API key")

            else:
                print(f"Error: {response.status_code}")
                raise RequestException(response.text)

        except RequestException as e:
            if attempt < max_retries - 1:
                print(f"Request failed ({e}), retrying...")
                time.sleep(2 ** attempt)
            else:
                raise

    # Reached only when the final attempt was still rate limited
    return None
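
The fixed `2 ** attempt` wait makes many clients retry at the same moment. A common refinement is to add random jitter and to honor a server-supplied `Retry-After` value when one is present. The sketch below is a hedged illustration: `retry_delay` is a hypothetical helper, and whether this API actually sends `Retry-After` is an assumption to verify.

```python
import random

def retry_delay(attempt, retry_after=None, base=1.0, cap=30.0):
    """Seconds to wait before retry number `attempt` (0-based)."""
    # Prefer the server's hint when it is present and numeric.
    if retry_after is not None:
        try:
            return max(0.0, min(float(retry_after), cap))
        except (TypeError, ValueError):
            pass  # For example an HTTP-date value; fall back to backoff
    # Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**attempt)]
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

Inside the retry loop this would replace the fixed wait, for example `time.sleep(retry_delay(attempt, response.headers.get("Retry-After")))`.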

Use Cases

Real-Time News Aggregation

def get_news_summary(topic):
    """Get current news summary"""
    return search_with_focus(
        f"Latest news about {topic}",
        search_domain="news"
    )

news = get_news_summary("AI regulation")

Academic Research

def find_research(topic):
    """Find academic papers"""
    return search_with_focus(
        f"Recent research on {topic}",
        search_domain="academic"
    )

research = find_research("quantum computing")

Community Insights

def get_community_opinion(topic):
    """Get discussions from Reddit"""
    return search_with_focus(
        f"Opinions and discussions about {topic}",
        search_domain="reddit"
    )

opinions = get_community_opinion("best web framework")

Building a Search Application

from flask import Flask, request, jsonify
import requests

app = Flask(__name__)

@app.route('/api/search', methods=['POST'])
def search():
    """REST endpoint for searching"""
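    data = request.get_json(silent=True) or {}
    query = data.get('query')
    domain = data.get('domain', 'general')

    # Reject requests without a query instead of sending None upstream
    if not query:
        return jsonify({"success": False, "error": "Missing 'query'"}), 400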

    try:
        result = search_with_focus(query, domain)

        return jsonify({
            "success": True,
            "answer": result['choices'][0]['message']['content'],
            "citations": result.get('citations', [])
        })

    except Exception as e:
        return jsonify({
            "success": False,
            "error": str(e)
        }), 500

if __name__ == '__main__':
    app.run(debug=True)  # debug=True is for local development only

Cost Optimization

Strategies to manage API costs:

Cache Results: Don't repeat the same searches

from functools import lru_cache

@lru_cache(maxsize=100)
def search_cached(query):
    """Cache search results"""
    return search_with_perplexity(query)
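
`lru_cache` never expires an entry, which sits awkwardly with an API whose value is current information. A time-limited cache is one alternative. The sketch below is a minimal version under stated assumptions: `TTLCache` is a hypothetical helper, it is not thread-safe, and the 300-second default is an arbitrary example value.

```python
import time

class TTLCache:
    """Tiny time-based cache (hypothetical helper, not thread-safe)."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self.fetch = fetch   # Function that performs the real search
        self.ttl = ttl_seconds
        self.clock = clock   # Injectable so expiry can be tested without waiting
        self._store = {}     # query -> (expiry_time, result)

    def get(self, query):
        now = self.clock()
        hit = self._store.get(query)
        if hit is not None and hit[0] > now:
            return hit[1]               # Still fresh
        result = self.fetch(query)      # Missing or expired: fetch again
        self._store[query] = (now + self.ttl, result)
        return result

# Stand-in fetch function so the example runs offline
cache = TTLCache(fetch=lambda q: f"result for {q}", ttl_seconds=60)
print(cache.get("python release"))  # result for python release
```

In the guide's terms, `TTLCache(search_with_perplexity, ttl_seconds=600)` would serve repeated queries from memory for ten minutes and then refresh them.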

Use Appropriate Model: pplx-7b-online is cheaper than pplx-70b-online

Batch Queries: Process multiple searches in batches

def batch_search(queries):
    """Run several queries in sequence and collect the results"""
    results = []
    for query in queries:
        result = search_with_perplexity(query)
        results.append(result)
    return results
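
The loop above issues one request at a time, so total time grows linearly with the number of queries. Since each call mostly waits on the network, a thread pool can overlap them. A sketch using the standard library, where `batch_search_concurrent` and `search_fn` are hypothetical names and a stand-in function replaces the real search so it runs offline:

```python
from concurrent.futures import ThreadPoolExecutor

def batch_search_concurrent(queries, search_fn, max_workers=4):
    """Run search_fn over queries in parallel threads; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as `queries`.
        return list(pool.map(search_fn, queries))

# Stand-in search function
print(batch_search_concurrent(["a", "b", "c"], lambda q: q.upper()))  # ['A', 'B', 'C']
```

This reduces wall-clock time, not cost, and a small `max_workers` is safer given the plan-dependent rate limits.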

Rate Limiting

Handle rate limits gracefully:

from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=4, max=10)
)
def search_with_rate_limit(query):
    """Search with rate limit handling"""
    response = requests.post(
        "https://api.perplexity.ai/chat/completions",
        json={
            "model": "pplx-7b-online",
            "messages": [{"role": "user", "content": query}]
        },
        headers=headers
    )
    response.raise_for_status()
    return response.json()
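
Retrying reacts to a rate limit after it has been hit. A complementary approach is to pace requests on the client so the limit is reached less often. The sketch below enforces a minimum gap between calls. `MinIntervalLimiter` is a hypothetical helper for single-threaded use, and the right interval depends on your plan's limits, which are not stated here.

```python
import time

class MinIntervalLimiter:
    """Ensure at least `interval` seconds between calls (hypothetical, single-threaded)."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self.clock = clock   # Injectable for testing
        self.sleep = sleep   # Injectable for testing
        self._last = None    # Time of the previous call, None before the first

    def wait(self):
        now = self.clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self._last = now

limiter = MinIntervalLimiter(interval=0.01)
limiter.wait()  # First call returns immediately
limiter.wait()  # Second call pauses until 0.01 s have passed since the first
```

Calling `limiter.wait()` immediately before each `requests.post` spaces the requests out.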

Monitoring and Logging

Track API usage:

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def search_with_logging(query):
    """Search with logging"""
    logger.info("Searching: %s", query)

    try:
        result = search_with_perplexity(query)
        logger.info("Search successful")
        return result

    except Exception as e:
        logger.error("Search failed: %s", e)
        raise
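
Success and failure messages alone say little about usage. Recording latency and token counts per call gives numbers that can be tracked over time. The sketch below assumes the response includes an OpenAI-style `usage` object with `prompt_tokens` and `completion_tokens`. That field is an assumption here, so the code tolerates its absence. `timed_call` is a hypothetical helper.

```python
import time

def timed_call(search_fn, query):
    """Call search_fn(query) and return (result, metrics)."""
    start = time.perf_counter()
    result = search_fn(query)
    elapsed = time.perf_counter() - start
    # "usage" with token counts is assumed; fall back to an empty dict if absent.
    usage = result.get("usage", {}) if isinstance(result, dict) else {}
    metrics = {
        "latency_s": round(elapsed, 3),
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
    }
    return result, metrics

# Stand-in search function returning a made-up payload
fake = lambda q: {"choices": [], "usage": {"prompt_tokens": 5, "completion_tokens": 7}}
_, metrics = timed_call(fake, "test")
print(metrics["prompt_tokens"], metrics["completion_tokens"])  # 5 7
```

The `metrics` dictionary can then be passed to `logger.info` or sent to whatever monitoring system is in use.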

Limitations

  • Accuracy: Answers draw on live web content and are not guaranteed to be correct
  • Rate limits: Depend on plan
  • Cost: More expensive than models that answer from training data alone
  • Latency: Web search adds latency compared to models without search

When to Use Perplexity API

Best for:

  • Current events and news
  • Real-time market data
  • Latest technology developments
  • Recent research findings

Not ideal for:

  • Private or historical data
  • Low-latency applications
  • High-volume queries (expensive)
  • Topics where the risk of hallucination is high

Conclusion

The Perplexity API bridges the gap between current information and AI reasoning. For applications that need real-time information with sources, it is a powerful solution. Use it where current information is critical and some added latency is acceptable.

FAQ

Q: How much does Perplexity API cost? A: Pricing varies by plan and model. Generally $0.01-0.10 per query depending on model and usage.

Q: Can I use the Perplexity API without a Pro subscription? A: No, a Pro subscription is currently required for API access.

Q: How accurate are Perplexity API results? A: Generally accurate for current information, but always verify important data by checking provided sources.

Written by

Sanjeev Sharma

Full Stack Engineer · E-mopro