Pinecone Vector Database — Complete Tutorial

Sanjeev Sharma
4 min read


Introduction

Pinecone is a fully managed vector database built for scale. This tutorial covers everything from creating your first index to building production applications with semantic search and filtering.

Getting Started with Pinecone

Setup and Installation

pip install pinecone  # the package was renamed; the legacy "pinecone-client" package is deprecated

from pinecone import Pinecone

# Initialize with API key
pc = Pinecone(api_key="your-api-key")

# List existing indexes
print(pc.list_indexes())

Creating Your First Index

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="your-api-key")

# Create a serverless index (recommended for most use cases)
pc.create_index(
    name="products",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",
        region="us-east-1"
    )
)

# Get the index
index = pc.Index("products")
print(index.describe_index_stats())
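Note that create_index can return before the index is actually ready to serve traffic, so polling for readiness before the first upsert avoids a race. A minimal sketch (wait_until_ready is a hypothetical helper; it assumes only the SDK's describe_index call and its status["ready"] field):

```python
import time

def wait_until_ready(pc, name: str, timeout: float = 120.0, poll: float = 2.0) -> None:
    """Block until the named index reports ready, or raise TimeoutError."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        # describe_index exposes a status object with a "ready" flag
        if pc.describe_index(name).status["ready"]:
            return
        time.sleep(poll)
    raise TimeoutError(f"index {name!r} not ready after {timeout}s")

# wait_until_ready(pc, "products")  # call after create_index, before first use
```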

Upserting and Querying Vectors

Basic Upsert

from openai import OpenAI

client = OpenAI()

def get_embedding(text: str):
    """Get OpenAI embedding."""
    response = client.embeddings.create(
        input=text,
        model="text-embedding-3-small"
    )
    return response.data[0].embedding

# Generate embeddings and upsert
products = [
    {"id": "p1", "name": "Laptop", "price": 999},
    {"id": "p2", "name": "Mouse", "price": 29},
    {"id": "p3", "name": "Keyboard", "price": 79},
]

vectors_to_upsert = []
for product in products:
    embedding = get_embedding(product["name"])
    vectors_to_upsert.append((
        product["id"],
        embedding,
        {"name": product["name"], "price": product["price"]}
    ))

index.upsert(vectors=vectors_to_upsert)

Querying

# Query by vector
query_text = "wireless input device"
query_embedding = get_embedding(query_text)

results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True
)

for match in results['matches']:
    print(f"ID: {match['id']}")
    print(f"Score: {match['score']}")
    print(f"Metadata: {match['metadata']}")

Metadata and Filtering

Filtering with Metadata

Pinecone supports filtering query results by metadata. Attach metadata key-value pairs at upsert time, then pass a filter expression to query:

# Query with metadata filter
results = index.query(
    vector=query_embedding,
    top_k=5,
    filter={"price": {"$lt": 100}},  # Price less than 100
    include_metadata=True
)

Complex Filters

# Multiple conditions
results = index.query(
    vector=query_embedding,
    top_k=5,
    filter={
        "$and": [
            {"price": {"$gte": 20}},
            {"price": {"$lt": 100}},
            {"category": {"$eq": "electronics"}}
        ]
    },
    include_metadata=True
)

Batch Operations

Batch Upsert

import time

# Prepare batch
batch = []
for i, product in enumerate(products):
    embedding = get_embedding(product["name"])
    batch.append((
        f"p{i}",
        embedding,
        {"name": product["name"]}
    ))

# Upsert in batches (recommended for large datasets)
batch_size = 100
for i in range(0, len(batch), batch_size):
    index.upsert(vectors=batch[i:i+batch_size])
    time.sleep(1)  # Rate limiting

Batch Delete

# Delete by ID
ids_to_delete = ["p1", "p2"]
index.delete(ids=ids_to_delete)

# Delete by metadata filter (pod-based indexes only; serverless indexes do not
# support delete-by-filter — list the matching IDs and delete those instead)
index.delete(filter={"price": {"$lt": 10}})

Hybrid Search

Hybrid search combines dense vector similarity with sparse keyword matching.

# The LangChain integration below combines semantic search with a metadata
# filter; true sparse-dense hybrid search requires a dotproduct index

from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
vectorstore = PineconeVectorStore(
    index=index,
    embedding=embeddings,
    text_key="text"
)

# Search with metadata filter
docs = vectorstore.similarity_search_with_score(
    "electronics",
    k=5,
    filter={"price": {"$lt": 500}}
)
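For true sparse-dense hybrid search, Pinecone's query call accepts a sparse_vector alongside the dense vector on dotproduct indexes. A sketch of building the sparse part from term weights (counts_to_sparse is a hypothetical helper; the query shown is illustrative and assumes a dotproduct index):

```python
def counts_to_sparse(term_counts: dict) -> dict:
    """Convert {token_id: weight} into Pinecone's sparse vector format."""
    indices = sorted(term_counts)
    return {"indices": indices, "values": [float(term_counts[i]) for i in indices]}

sparse = counts_to_sparse({7: 2.0, 3: 1.0})
# → {"indices": [3, 7], "values": [1.0, 2.0]}

# Hypothetical hybrid query against a dotproduct index:
# results = index.query(
#     vector=query_embedding,   # dense, semantic part
#     sparse_vector=sparse,     # sparse, keyword part
#     top_k=5,
#     include_metadata=True,
# )
```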

Managing Indexes at Scale

Index Statistics

# Get index stats
stats = index.describe_index_stats()
print(f"Dimension: {stats['dimension']}")
print(f"Index fullness: {stats['index_fullness']}")
print(f"Total vector count: {stats['total_vector_count']}")
print(f"Namespaces: {stats['namespaces']}")

Namespaces

Organize data by namespace:

# Upsert to specific namespace
index.upsert(
    vectors=vectors_to_upsert,
    namespace="products-2024"
)

# Query from namespace
results = index.query(
    vector=query_embedding,
    top_k=3,
    namespace="products-2024"
)

Integration with LangChain

from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings
from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA

# Create vector store
embeddings = OpenAIEmbeddings()
vectorstore = PineconeVectorStore(
    index=index,
    embedding=embeddings
)

# Create QA chain
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)

# Query (RetrievalQA's .run method is deprecated in favor of .invoke)
answer = qa.invoke({"query": "What products are under 100 dollars?"})["result"]

Cost Optimization

Strategies

  1. Use appropriate index type: Serverless for variable load, Pod for consistent load
  2. Delete old data: Remove outdated vectors to reduce storage
  3. Batch operations: Combine multiple operations
  4. Monitor usage: Track vector count and queries

# Monitor index stats (the per-vector rate below is illustrative only —
# check current Pinecone pricing for real rates)
stats = index.describe_index_stats()
total_vectors = stats['total_vector_count']
print(f"Storage estimate: {total_vectors * 0.05} cents/month")

Production Best Practices

  1. Use API key rotation: Rotate keys regularly
  2. Implement retry logic: Handle network failures
  3. Monitor latency: Track query performance
  4. Use namespaces: Isolate data by environment
  5. Track vector age: store timestamps in metadata and delete expired vectors with a scheduled job (Pinecone has no built-in TTL)

# Tag vectors with creation and expiry timestamps
import time

metadata_with_expiry = {
    "text": "product info",
    "created_at": time.time(),
    "expires_at": time.time() + 86400 * 30  # 30 days from now
}

# A periodic job can then delete vectors whose "expires_at" has passed
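The retry advice above can be sketched as a small wrapper with exponential backoff (backoff_delay and with_retries are hypothetical helpers, not part of the Pinecone SDK):

```python
import time

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff: base * 2^attempt, capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))

def with_retries(fn, attempts: int = 5, base: float = 0.5):
    """Call fn(), retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries, surface the last error
            time.sleep(backoff_delay(attempt, base=base))

# Example: retry a flaky upsert
# with_retries(lambda: index.upsert(vectors=vectors_to_upsert))
```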

Conclusion

Pinecone provides a production-grade vector database with minimal operational overhead. Its serverless architecture, filtering capabilities, and hybrid search make it ideal for RAG applications at scale.

FAQ

Q: What's the pricing model? A: Serverless indexes are pay-as-you-go (storage plus read and write units); pod-based indexes have a fixed hourly cost per pod.

Q: Can I export data from Pinecone? A: Yes, you can query all vectors and export them, but there's no built-in bulk export. Plan accordingly.
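A paginated export can be sketched with the SDK's list and fetch calls (export_all is a hypothetical helper; recent SDK versions yield batches of IDs from index.list):

```python
def export_all(index, namespace: str = ""):
    """Yield (id, values, metadata) for every vector in a namespace."""
    for id_batch in index.list(namespace=namespace):
        # fetch returns a response whose .vectors maps id -> vector record
        fetched = index.fetch(ids=list(id_batch), namespace=namespace)
        for vec_id, vec in fetched.vectors.items():
            yield vec_id, vec.values, vec.metadata

# records = list(export_all(index))  # materialize, then write to disk
```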

Q: How does Pinecone handle data redundancy? A: Automatically replicated across availability zones for high availability and durability.


Written by Sanjeev Sharma

Full Stack Engineer · E-mopro