Pinecone Vector Database — Complete Tutorial

Sanjeev Sharma
4 min read


Introduction

Pinecone is a fully managed vector database built for scale. This tutorial covers everything from creating your first index to building production applications with semantic search and filtering.

Getting Started with Pinecone

Setup and Installation

pip install pinecone  # the package was renamed; the legacy "pinecone-client" package is deprecated

from pinecone import Pinecone

# Initialize with API key
pc = Pinecone(api_key="your-api-key")

# List existing indexes
print(pc.list_indexes())

Creating Your First Index

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="your-api-key")

# Create a serverless index (recommended for most use cases)
pc.create_index(
    name="products",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",
        region="us-east-1"
    )
)

# Get the index
index = pc.Index("products")
print(index.describe_index_stats())
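Note that create_index can return before the index is actually ready to serve traffic, so polling for readiness before the first upsert avoids a race. A minimal sketch (wait_until_ready is a hypothetical helper; it assumes only the SDK's describe_index call and its status["ready"] field):

```python
import time

def wait_until_ready(pc, name: str, timeout: float = 120.0, poll: float = 2.0) -> None:
    """Block until the named index reports ready, or raise TimeoutError."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        # describe_index exposes a status object with a "ready" flag
        if pc.describe_index(name).status["ready"]:
            return
        time.sleep(poll)
    raise TimeoutError(f"index {name!r} not ready after {timeout}s")

# wait_until_ready(pc, "products")  # call after create_index, before first use
```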

Upserting and Querying Vectors

Basic Upsert

from openai import OpenAI

client = OpenAI()

def get_embedding(text: str):
    """Get OpenAI embedding."""
    response = client.embeddings.create(
        input=text,
        model="text-embedding-3-small"
    )
    return response.data[0].embedding

# Generate embeddings and upsert
products = [
    {"id": "p1", "name": "Laptop", "price": 999},
    {"id": "p2", "name": "Mouse", "price": 29},
    {"id": "p3", "name": "Keyboard", "price": 79},
]

vectors_to_upsert = []
for product in products:
    embedding = get_embedding(product["name"])
    vectors_to_upsert.append((
        product["id"],
        embedding,
        {"name": product["name"], "price": product["price"]}
    ))

index.upsert(vectors=vectors_to_upsert)

Querying

# Query by vector
query_text = "wireless input device"
query_embedding = get_embedding(query_text)

results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True
)

for match in results['matches']:
    print(f"ID: {match['id']}")
    print(f"Score: {match['score']}")
    print(f"Metadata: {match['metadata']}")

Metadata and Filtering

Filtering with Metadata

Pinecone supports filtering query results by metadata. Attach metadata key-value pairs at upsert time, then pass a filter expression to query:

# Query with metadata filter
results = index.query(
    vector=query_embedding,
    top_k=5,
    filter={"price": {"$lt": 100}},  # Price less than 100
    include_metadata=True
)

Complex Filters

# Multiple conditions
results = index.query(
    vector=query_embedding,
    top_k=5,
    filter={
        "$and": [
            {"price": {"$gte": 20}},
            {"price": {"$lt": 100}},
            {"category": {"$eq": "electronics"}}
        ]
    },
    include_metadata=True
)

Batch Operations

Batch Upsert

import time

# Prepare batch
batch = []
for i, product in enumerate(products):
    embedding = get_embedding(product["name"])
    batch.append((
        f"p{i}",
        embedding,
        {"name": product["name"]}
    ))

# Upsert in batches (recommended for large datasets)
batch_size = 100
for i in range(0, len(batch), batch_size):
    index.upsert(vectors=batch[i:i+batch_size])
    time.sleep(1)  # Rate limiting

Batch Delete

# Delete by ID
ids_to_delete = ["p1", "p2"]
index.delete(ids=ids_to_delete)

# Delete by metadata filter (pod-based indexes only; serverless indexes do not
# support delete-by-filter — list the matching IDs and delete those instead)
index.delete(filter={"price": {"$lt": 10}})

Hybrid Search

Hybrid search combines dense vector similarity with sparse keyword matching.

# The LangChain integration below combines semantic search with a metadata
# filter; true sparse-dense hybrid search requires a dotproduct index

from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
vectorstore = PineconeVectorStore(
    index=index,
    embedding=embeddings,
    text_key="text"
)

# Search with metadata filter
docs = vectorstore.similarity_search_with_score(
    "electronics",
    k=5,
    filter={"price": {"$lt": 500}}
)
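For true sparse-dense hybrid search, Pinecone's query call accepts a sparse_vector alongside the dense vector on dotproduct indexes. A sketch of building the sparse part from term weights (counts_to_sparse is a hypothetical helper; the query shown is illustrative and assumes a dotproduct index):

```python
def counts_to_sparse(term_counts: dict) -> dict:
    """Convert {token_id: weight} into Pinecone's sparse vector format."""
    indices = sorted(term_counts)
    return {"indices": indices, "values": [float(term_counts[i]) for i in indices]}

sparse = counts_to_sparse({7: 2.0, 3: 1.0})
# → {"indices": [3, 7], "values": [1.0, 2.0]}

# Hypothetical hybrid query against a dotproduct index:
# results = index.query(
#     vector=query_embedding,   # dense, semantic part
#     sparse_vector=sparse,     # sparse, keyword part
#     top_k=5,
#     include_metadata=True,
# )
```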

Managing Indexes at Scale

Index Statistics

# Get index stats
stats = index.describe_index_stats()
print(f"Dimension: {stats['dimension']}")
print(f"Index fullness: {stats['index_fullness']}")
print(f"Total vector count: {stats['total_vector_count']}")
print(f"Namespaces: {stats['namespaces']}")

Namespaces

Organize data by namespace:

# Upsert to specific namespace
index.upsert(
    vectors=vectors_to_upsert,
    namespace="products-2024"
)

# Query from namespace
results = index.query(
    vector=query_embedding,
    top_k=3,
    namespace="products-2024"
)

Integration with LangChain

from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings
from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA

# Create vector store
embeddings = OpenAIEmbeddings()
vectorstore = PineconeVectorStore(
    index=index,
    embedding=embeddings
)

# Create QA chain
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)

# Query (RetrievalQA's .run method is deprecated in favor of .invoke)
answer = qa.invoke({"query": "What products are under 100 dollars?"})["result"]

Cost Optimization

Strategies

  1. Use appropriate index type: Serverless for variable load, Pod for consistent load
  2. Delete old data: Remove outdated vectors to reduce storage
  3. Batch operations: Combine multiple operations
  4. Monitor usage: Track vector count and queries

# Monitor index stats (the per-vector rate below is illustrative only —
# check current Pinecone pricing for real rates)
stats = index.describe_index_stats()
total_vectors = stats['total_vector_count']
print(f"Storage estimate: {total_vectors * 0.05} cents/month")

Production Best Practices

  1. Use API key rotation: Rotate keys regularly
  2. Implement retry logic: Handle network failures
  3. Monitor latency: Track query performance
  4. Use namespaces: Isolate data by environment
  5. Track vector age: store timestamps in metadata and delete expired vectors with a scheduled job (Pinecone has no built-in TTL)

# Tag vectors with creation and expiry timestamps
import time

metadata_with_expiry = {
    "text": "product info",
    "created_at": time.time(),
    "expires_at": time.time() + 86400 * 30  # 30 days from now
}

# A periodic job can then delete vectors whose "expires_at" has passed
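The retry advice above can be sketched as a small wrapper with exponential backoff (backoff_delay and with_retries are hypothetical helpers, not part of the Pinecone SDK):

```python
import time

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff: base * 2^attempt, capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))

def with_retries(fn, attempts: int = 5, base: float = 0.5):
    """Call fn(), retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries, surface the last error
            time.sleep(backoff_delay(attempt, base=base))

# Example: retry a flaky upsert
# with_retries(lambda: index.upsert(vectors=vectors_to_upsert))
```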

Conclusion

Pinecone provides a production-grade vector database with minimal operational overhead. Its serverless architecture, filtering capabilities, and hybrid search make it ideal for RAG applications at scale.

FAQ

Q: What's the pricing model? A: Serverless indexes are pay-as-you-go (storage plus read and write units); pod-based indexes have a fixed hourly cost per pod.

Q: Can I export data from Pinecone? A: Yes, you can query all vectors and export them, but there's no built-in bulk export. Plan accordingly.
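A paginated export can be sketched with the SDK's list and fetch calls (export_all is a hypothetical helper; recent SDK versions yield batches of IDs from index.list):

```python
def export_all(index, namespace: str = ""):
    """Yield (id, values, metadata) for every vector in a namespace."""
    for id_batch in index.list(namespace=namespace):
        # fetch returns a response whose .vectors maps id -> vector record
        fetched = index.fetch(ids=list(id_batch), namespace=namespace)
        for vec_id, vec in fetched.vectors.items():
            yield vec_id, vec.values, vec.metadata

# records = list(export_all(index))  # materialize, then write to disk
```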

Q: How does Pinecone handle data redundancy? A: Automatically replicated across availability zones for high availability and durability.


Written by Sanjeev Sharma

Full Stack Engineer · E-mopro