Pinecone Vector Database — Complete Tutorial
Introduction
Pinecone is a fully managed vector database built for scale. This tutorial covers everything from creating your first index to building production applications with semantic search and filtering.
- Getting Started with Pinecone
- Setup and Installation
- Creating Your First Index
- Upserting and Querying Vectors
- Basic Upsert
- Querying
- Metadata and Filtering
- Filtering with Metadata
- Complex Filters
- Batch Operations
- Batch Upsert
- Batch Delete
- Hybrid Search
- Managing Indexes at Scale
- Index Statistics
- Namespaces
- Integration with LangChain
- Cost Optimization
- Strategies
- Production Best Practices
- Conclusion
- FAQ
Getting Started with Pinecone
Setup and Installation
pip install pinecone  # the older "pinecone-client" package name is deprecated
from pinecone import Pinecone
# Initialize with API key
pc = Pinecone(api_key="your-api-key")
# List existing indexes
print(pc.list_indexes())
Creating Your First Index
from pinecone import Pinecone, ServerlessSpec
pc = Pinecone(api_key="your-api-key")
# Create a serverless index (recommended for most use cases)
pc.create_index(
    name="products",
    dimension=1536,  # must match your embedding model's output dimension
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",
        region="us-east-1"
    )
)
# Get the index
index = pc.Index("products")
print(index.describe_index_stats())
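Index creation is asynchronous, so it is safer to poll until the index reports ready before upserting. A minimal sketch (the `wait_until_ready` helper name is our own; it assumes the `describe_index` status shape exposed by the SDK):

```python
import time

def wait_until_ready(pc, name, timeout=60, interval=1.0):
    """Poll describe_index until the index reports ready, or raise on timeout."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if pc.describe_index(name).status["ready"]:
            return True
        time.sleep(interval)
    raise TimeoutError(f"index {name!r} not ready after {timeout}s")
```

Call `wait_until_ready(pc, "products")` right after `create_index` and before the first upsert.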
Upserting and Querying Vectors
Basic Upsert
from openai import OpenAI
client = OpenAI()
def get_embedding(text: str):
    """Get OpenAI embedding."""
    response = client.embeddings.create(
        input=text,
        model="text-embedding-3-small"
    )
    return response.data[0].embedding
# Generate embeddings and upsert
products = [
    {"id": "p1", "name": "Laptop", "price": 999},
    {"id": "p2", "name": "Mouse", "price": 29},
    {"id": "p3", "name": "Keyboard", "price": 79},
]
vectors_to_upsert = []
for product in products:
    embedding = get_embedding(product["name"])
    vectors_to_upsert.append((
        product["id"],
        embedding,
        {"name": product["name"], "price": product["price"]}
    ))
index.upsert(vectors=vectors_to_upsert)
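Calling the embeddings endpoint once per product is slow; the OpenAI embeddings API also accepts a list of inputs, so one request can embed the whole batch. A sketch (the `get_embeddings` helper is our own, with `client` passed in explicitly so it is easy to test):

```python
def get_embeddings(client, texts, model="text-embedding-3-small"):
    """Embed a batch of texts in one API call; results come back in input order."""
    response = client.embeddings.create(input=texts, model=model)
    return [item.embedding for item in response.data]
```

Usage: `embeddings = get_embeddings(client, [p["name"] for p in products])`.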
Querying
# Query by vector
query_text = "wireless input device"
query_embedding = get_embedding(query_text)
results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True
)
for match in results['matches']:
    print(f"ID: {match['id']}")
    print(f"Score: {match['score']}")
    print(f"Metadata: {match['metadata']}")
Metadata and Filtering
Filtering with Metadata
Pinecone supports filtering query results by metadata, using a MongoDB-style operator syntax ($eq, $lt, $gte, $in, and so on):
# Query with metadata filter
results = index.query(
    vector=query_embedding,
    top_k=5,
    filter={"price": {"$lt": 100}},  # price less than 100
    include_metadata=True
)
Complex Filters
# Multiple conditions
results = index.query(
    vector=query_embedding,
    top_k=5,
    filter={
        "$and": [
            {"price": {"$gte": 20}},
            {"price": {"$lt": 100}},
            {"category": {"$eq": "electronics"}}
        ]
    },
    include_metadata=True
)
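Nested filter dictionaries like the one above are easy to mistype. A small builder keeps the operator syntax in one place (the `price_filter` helper is hypothetical, not part of the SDK):

```python
def price_filter(min_price=None, max_price=None, category=None):
    """Build a Pinecone metadata filter from optional price bounds and category."""
    clauses = []
    if min_price is not None:
        clauses.append({"price": {"$gte": min_price}})
    if max_price is not None:
        clauses.append({"price": {"$lt": max_price}})
    if category is not None:
        clauses.append({"category": {"$eq": category}})
    if not clauses:
        return {}  # no filter: match everything
    if len(clauses) == 1:
        return clauses[0]
    return {"$and": clauses}
```

Then `index.query(..., filter=price_filter(min_price=20, max_price=100, category="electronics"))` produces exactly the filter shown above.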
Batch Operations
Batch Upsert
import time
# Prepare batch
batch = []
for i, product in enumerate(products):
    embedding = get_embedding(product["name"])
    batch.append((
        f"p{i}",
        embedding,
        {"name": product["name"]}
    ))
# Upsert in batches (recommended for large datasets)
batch_size = 100
for i in range(0, len(batch), batch_size):
    index.upsert(vectors=batch[i:i+batch_size])
    time.sleep(1)  # simple rate limiting
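The slicing logic above can be factored into a reusable generator, which also works for iterables that are not lists:

```python
from itertools import islice

def chunked(iterable, size):
    """Yield successive lists of up to `size` items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk
```

Usage: `for chunk in chunked(batch, 100): index.upsert(vectors=chunk)`.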
Batch Delete
# Delete by ID
ids_to_delete = ["p1", "p2"]
index.delete(ids=ids_to_delete)
# Delete by metadata filter (pod-based indexes only;
# serverless indexes do not support delete-by-filter)
index.delete(filter={"price": {"$lt": 10}})
Hybrid Search
Native hybrid search in Pinecone combines a dense vector with a sparse vector in a single query, blending semantic and keyword relevance. The LangChain example below is a simpler pattern that is often sufficient in practice: dense similarity search narrowed by a metadata filter.
from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
vectorstore = PineconeVectorStore(
    index=index,
    embedding=embeddings,
    text_key="text"
)
# Dense search with a metadata filter
docs = vectorstore.similarity_search_with_score(
    "electronics",
    k=5,
    filter={"price": {"$lt": 500}}
)
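For true hybrid queries, `index.query` accepts both a `vector` and a `sparse_vector` argument, and the two are commonly weighted by a convex combination: dense values scaled by alpha, sparse values by 1 - alpha. A sketch of that weighting (this mirrors the scheme described in Pinecone's hybrid-search guide; verify the exact query parameters against current docs):

```python
def hybrid_scale(dense, sparse, alpha):
    """Scale a dense vector and a sparse {'indices': ..., 'values': ...} dict
    by alpha and 1 - alpha respectively, for a weighted hybrid query."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": sparse["indices"],
        "values": [v * (1 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse
```

Then query with both parts, e.g. `dense, sparse = hybrid_scale(query_embedding, sparse_query, alpha=0.8)` followed by `index.query(vector=dense, sparse_vector=sparse, top_k=5)`.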
Managing Indexes at Scale
Index Statistics
# Get index stats
stats = index.describe_index_stats()
print(f"Dimension: {stats['dimension']}")
print(f"Index fullness: {stats['index_fullness']}")
print(f"Total vector count: {stats['total_vector_count']}")
print(f"Namespaces: {stats['namespaces']}")
Namespaces
Organize data by namespace:
# Upsert to specific namespace
index.upsert(
    vectors=vectors_to_upsert,
    namespace="products-2024"
)
# Query from namespace
results = index.query(
    vector=query_embedding,
    top_k=3,
    namespace="products-2024"
)
Integration with LangChain
from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings
from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA
# Create vector store
embeddings = OpenAIEmbeddings()
vectorstore = PineconeVectorStore(
    index=index,
    embedding=embeddings
)
# Create QA chain
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)
# Query (invoke replaces the deprecated .run method)
answer = qa.invoke({"query": "What products are under 100 dollars?"})["result"]
Cost Optimization
Strategies
- Use appropriate index type: Serverless for variable load, Pod for consistent load
- Delete old data: Remove outdated vectors to reduce storage
- Batch operations: Combine multiple operations
- Monitor usage: Track vector count and queries
# Monitor index stats (the per-vector rate below is illustrative only;
# check Pinecone's pricing page for current rates)
stats = index.describe_index_stats()
total_vectors = stats['total_vector_count']
print(f"Storage estimate: {total_vectors * 0.05} cents/month")
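A raw storage-size estimate is another useful sanity check: each float32 dimension is 4 bytes, so size scales with vector count times dimension (the helper below is our own and excludes metadata and index overhead):

```python
def storage_gb(total_vectors, dimension, bytes_per_value=4):
    """Approximate raw vector storage in GB (float32 values by default)."""
    return total_vectors * dimension * bytes_per_value / 1e9
```

For example, one million 1536-dimensional vectors is roughly 6.1 GB of raw vector data.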
Production Best Practices
- Use API key rotation: Rotate keys regularly
- Implement retry logic: Handle network failures
- Monitor latency: Track query performance
- Use namespaces: Isolate data by environment
- Track expiry in metadata: Pinecone has no built-in TTL, so store timestamps and delete expired vectors yourself
# Set metadata with a timestamp for expiration
import time
metadata_with_expiry = {
    "text": "product info",
    "created_at": time.time(),
    "ttl": time.time() + 86400 * 30  # 30 days
}
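The retry-logic bullet above can be sketched as a small exponential-backoff wrapper (a generic pattern, not a Pinecone SDK feature):

```python
import random
import time

def with_retries(fn, attempts=5, base_delay=0.5):
    """Call fn(), retrying on exception with exponential backoff plus jitter."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
```

Usage: `with_retries(lambda: index.upsert(vectors=batch))`. In production you would typically narrow the `except` clause to transient error types rather than all exceptions.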
Conclusion
Pinecone provides a production-grade vector database with minimal operational overhead. Its serverless architecture, filtering capabilities, and hybrid search make it ideal for RAG applications at scale.
FAQ
Q: What's the pricing model? A: Pinecone charges per vector per month plus query costs. Serverless is pay-as-you-go; Pod indexes have fixed costs.
Q: Can I export data from Pinecone? A: Yes, you can query all vectors and export them, but there's no built-in bulk export. Plan accordingly.
Q: How does Pinecone handle data redundancy? A: Data is automatically replicated across availability zones for high availability and durability.