AI Recommendation Systems — Embedding-Based Collaborative Filtering at Scale
Author: Sanjeev Sharma (@webcoderspeed1)
Introduction
Recommendation systems drive revenue and engagement. This guide covers embedding-based collaborative filtering, two-tower architectures, and production techniques for personalization at scale.
- Content-Based Filtering with Embeddings
- Collaborative Filtering with User Embeddings
- Two-Tower Model Architecture
- Real-Time vs Batch Recommendations
- Cold Start Problem Solutions
- Diversity in Recommendations
- Contextual Recommendations
- A/B Testing Recommendation Algorithms
- Embedding Freshness and Updates
- Checklist
- Conclusion
Content-Based Filtering with Embeddings
Recommend items similar to ones a user has interacted with:
interface Item {
id: string;
title: string;
embedding: number[];
metadata: { category: string; price: number };
}
interface UserProfile {
userId: string;
interactedItems: string[];
preferences: number[]; // Aggregated embedding
}
interface Recommendation {
itemId: string;
score: number;
reason: string;
}
function cosineSimilarity(a: number[], b: number[]): number {
let dotProduct = 0;
let normA = 0;
let normB = 0;
for (let i = 0; i < a.length; i++) {
dotProduct += a[i] * b[i];
normA += a[i] * a[i];
normB += b[i] * b[i];
}
return dotProduct / (Math.sqrt(normA) * Math.sqrt(normB) || 1); // guard against zero vectors
}
async function contentBasedRecommendations(
user: UserProfile,
items: Item[],
topK: number = 5
): Promise<Recommendation[]> {
// Create aggregated user preference vector
if (items.length === 0) return [];
const userEmbedding = Array(items[0].embedding.length).fill(0);
user.interactedItems.forEach(itemId => {
const item = items.find(i => i.id === itemId);
if (item) {
item.embedding.forEach((val, idx) => {
userEmbedding[idx] += val;
});
}
});
// Normalize
const norm = Math.sqrt(userEmbedding.reduce((sum, v) => sum + v * v, 0));
userEmbedding.forEach((_, i) => {
userEmbedding[i] /= norm || 1;
});
// Find similar items
const recommendations: Recommendation[] = items
.filter(item => !user.interactedItems.includes(item.id))
.map(item => ({
itemId: item.id,
score: cosineSimilarity(userEmbedding, item.embedding),
reason: `Similar to items you liked`
}))
.sort((a, b) => b.score - a.score)
.slice(0, topK);
return recommendations;
}
// Strengths: Works for new items immediately (no training needed)
// Weaknesses: Can't recommend item types the user hasn't interacted with before
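A minimal, self-contained toy run of this flow, using hypothetical 3-dimensional embeddings and made-up item ids (real embeddings typically have hundreds of dimensions):

```typescript
// Toy run of content-based filtering with hypothetical 3-dim embeddings.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Hypothetical catalog: two sci-fi items close together, one cooking item.
const catalog = [
  { id: 'scifi-1', embedding: [0.9, 0.1, 0.0] },
  { id: 'scifi-2', embedding: [0.8, 0.2, 0.1] },
  { id: 'cooking-1', embedding: [0.0, 0.1, 0.9] },
];

// The user interacted only with scifi-1, so their profile is its embedding.
const userEmbedding = catalog[0].embedding;

const recs = catalog
  .filter(item => item.id !== 'scifi-1') // exclude already-seen items
  .map(item => ({ itemId: item.id, score: cosine(userEmbedding, item.embedding) }))
  .sort((a, b) => b.score - a.score);

console.log(recs.map(r => r.itemId)); // ['scifi-2', 'cooking-1']
```

The nearby sci-fi item scores ~0.98 while the cooking item scores near 0, which is exactly the "similar to items you liked" behavior the section describes.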
Collaborative Filtering with User Embeddings
Learn user preferences from their interaction patterns:
interface UserInteraction {
userId: string;
itemId: string;
interaction: 'view' | 'click' | 'purchase' | 'like';
timestamp: number;
weight: number; // Purchase > Like > Click > View
}
interface CollaborativeModel {
userEmbeddings: Map<string, number[]>;
itemEmbeddings: Map<string, number[]>;
embeddingDim: number;
}
async function trainCollaborativeFiltering(
interactions: UserInteraction[],
embeddingDim: number = 128,
epochs: number = 10
): Promise<CollaborativeModel> {
const userIds = new Set(interactions.map(i => i.userId));
const itemIds = new Set(interactions.map(i => i.itemId));
// Initialize random embeddings
const userEmbeddings = new Map<string, number[]>();
const itemEmbeddings = new Map<string, number[]>();
userIds.forEach(userId => {
userEmbeddings.set(
userId,
Array(embeddingDim)
.fill(0)
.map(() => (Math.random() - 0.5) * 0.1)
);
});
itemIds.forEach(itemId => {
itemEmbeddings.set(
itemId,
Array(embeddingDim)
.fill(0)
.map(() => (Math.random() - 0.5) * 0.1)
);
});
// Gradient descent training (simplified)
const learningRate = 0.001;
for (let epoch = 0; epoch < epochs; epoch++) {
let totalLoss = 0;
interactions.forEach(interaction => {
const userEmb = userEmbeddings.get(interaction.userId)!;
const itemEmb = itemEmbeddings.get(interaction.itemId)!;
// Predict: dot product of embeddings
let prediction = 0;
for (let i = 0; i < embeddingDim; i++) {
prediction += userEmb[i] * itemEmb[i];
}
// Squash with tanh (output range -1 to 1; interaction weights live in 0 to 1)
prediction = Math.tanh(prediction);
// Target: interaction weight (0.25 view, 0.5 click, 0.75 like, 1.0 purchase)
const target = interaction.weight;
const error = target - prediction;
totalLoss += error * error;
// Update embeddings (simplified SGD; the tanh derivative is omitted for brevity)
for (let i = 0; i < embeddingDim; i++) {
const userGrad = -2 * error * itemEmb[i];
const itemGrad = -2 * error * userEmb[i];
userEmb[i] -= learningRate * userGrad;
itemEmb[i] -= learningRate * itemGrad;
}
});
if (epoch % 2 === 0) {
console.log(`Epoch ${epoch}: Loss = ${totalLoss.toFixed(4)}`);
}
}
return {
userEmbeddings,
itemEmbeddings,
embeddingDim
};
}
async function predictWithCollaborative(
model: CollaborativeModel,
userId: string,
itemIds: string[]
): Promise<Array<{ itemId: string; score: number }>> {
const userEmb = model.userEmbeddings.get(userId);
if (!userEmb) return [];
return itemIds
.map(itemId => {
const itemEmb = model.itemEmbeddings.get(itemId);
if (!itemEmb) return { itemId, score: 0 };
// Dot product (inner product)
let score = 0;
for (let i = 0; i < model.embeddingDim; i++) {
score += userEmb[i] * itemEmb[i];
}
// Squash with tanh so scores are comparable across users
return { itemId, score: Math.tanh(score) };
})
.sort((a, b) => b.score - a.score);
}
// Strengths: Discovers new items based on similar user patterns
// Weaknesses: Cold start for new users/items, matrix factorization limitations
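The training loop can be exercised on a tiny hand-built dataset. The sketch below mirrors the simplified SGD above but swaps random initialization for fixed starting embeddings so the run is reproducible; the dimensions, learning rate, and epoch count are illustrative choices, not tuned values:

```typescript
// Deterministic miniature of the SGD loop above: 2 users, 2 items, 4 dims.
const dim = 4;
const users: Record<string, number[]> = {
  u1: [0.05, -0.02, 0.03, 0.01],
  u2: [-0.03, 0.04, -0.01, 0.02],
};
const items: Record<string, number[]> = {
  i1: [0.02, 0.03, -0.04, 0.05],
  i2: [-0.01, -0.05, 0.02, 0.03],
};
// Same weight scale as above: 1.0 purchase, 0.25 view
const data = [
  { u: 'u1', i: 'i1', w: 1.0 },
  { u: 'u2', i: 'i2', w: 1.0 },
  { u: 'u1', i: 'i2', w: 0.25 },
];
const lr = 0.05;

const totalLoss = (): number =>
  data.reduce((loss, { u, i, w }) => {
    let pred = 0;
    for (let d = 0; d < dim; d++) pred += users[u][d] * items[i][d];
    const err = w - Math.tanh(pred);
    return loss + err * err;
  }, 0);

const lossBefore = totalLoss();
for (let epoch = 0; epoch < 200; epoch++) {
  for (const { u, i, w } of data) {
    const ue = users[u];
    const ie = items[i];
    let pred = 0;
    for (let d = 0; d < dim; d++) pred += ue[d] * ie[d];
    const err = w - Math.tanh(pred);
    for (let d = 0; d < dim; d++) {
      const ug = -2 * err * ie[d]; // same simplified gradients as above
      const ig = -2 * err * ue[d];
      ue[d] -= lr * ug;
      ie[d] -= lr * ig;
    }
  }
}
const lossAfter = totalLoss();
console.log(lossBefore > lossAfter); // training reduced the squared error
```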
Two-Tower Model Architecture
Separate user and item towers for scalable inference:
interface TwoTowerModel {
userTower: (features: number[]) => number[]; // User encoder
itemTower: (features: number[]) => number[]; // Item encoder
embeddingDim: number;
}
async function buildTwoTowerModel(
userFeatureDim: number,
itemFeatureDim: number,
embeddingDim: number = 256
): Promise<TwoTowerModel> {
// In production, use TensorFlow.js or a served model. The "layers" below
// apply throwaway random weights purely to illustrate the two-tower shape;
// real towers multiply by trained weight matrices.
const userTower = (features: number[]): number[] => {
// Dense layer 1: 64 neurons
const hidden1 = Array(64)
.fill(0)
.map((_, i) =>
features.reduce((sum, f, j) => sum + f * Math.random(), 0)
);
// Dense layer 2: embedding dimension
return Array(embeddingDim)
.fill(0)
.map((_, i) =>
hidden1.reduce((sum, h, j) => sum + h * Math.random(), 0)
);
};
const itemTower = (features: number[]): number[] => {
const hidden1 = Array(64)
.fill(0)
.map((_, i) =>
features.reduce((sum, f, j) => sum + f * Math.random(), 0)
);
return Array(embeddingDim)
.fill(0)
.map((_, i) =>
hidden1.reduce((sum, h, j) => sum + h * Math.random(), 0)
);
};
return {
userTower,
itemTower,
embeddingDim
};
}
interface UserFeatures {
pastCategories: number[]; // Embedding of user's category preferences
seasonality: number[]; // Time-based features
demographic: number[]; // Age, location, etc.
}
interface ItemFeatures {
contentEmbedding: number[];
popularity: number;
freshness: number; // How recent the item is
}
async function scoreWithTwoTower(
model: TwoTowerModel,
userFeatures: UserFeatures,
itemFeatures: ItemFeatures
): Promise<number> {
// Combine all user features
const userInput = [
...userFeatures.pastCategories,
...userFeatures.seasonality,
...userFeatures.demographic
];
// Combine all item features
const itemInput = [
...itemFeatures.contentEmbedding,
itemFeatures.popularity,
itemFeatures.freshness
];
// Get embeddings from towers
const userEmb = model.userTower(userInput);
const itemEmb = model.itemTower(itemInput);
// Dot product for scoring
let score = 0;
for (let i = 0; i < model.embeddingDim; i++) {
score += userEmb[i] * itemEmb[i];
}
return Math.tanh(score); // Squash with tanh (range -1 to 1)
}
// Advantages:
// - User embedding computed once, compared against pre-computed item embeddings
// - Can pre-compute and cache all item embeddings
// - Scale to millions of items with single forward pass per user
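That serving pattern looks roughly like this. The item index and 2-dimensional embeddings are made up for brevity, and at real scale the brute-force scan would be replaced by an approximate nearest-neighbor index such as HNSW or ScaNN:

```typescript
// Serving-side sketch: item embeddings are precomputed offline by the item
// tower; each request runs one user-tower pass, then a top-K scan over the
// index.
type Scored = { itemId: string; score: number };

function topKByDotProduct(
  userEmb: number[],
  itemIndex: Map<string, number[]>,
  k: number
): Scored[] {
  const scored: Scored[] = [];
  itemIndex.forEach((itemEmb, itemId) => {
    let score = 0;
    for (let i = 0; i < userEmb.length; i++) score += userEmb[i] * itemEmb[i];
    scored.push({ itemId, score });
  });
  return scored.sort((a, b) => b.score - a.score).slice(0, k);
}

// Hypothetical precomputed index.
const itemIndex = new Map<string, number[]>([
  ['a', [1, 0]],
  ['b', [0, 1]],
  ['c', [0.7, 0.7]],
]);
const top2 = topKByDotProduct([1, 0.1], itemIndex, 2);
console.log(top2.map(t => t.itemId)); // ['a', 'c']
```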
Real-Time vs Batch Recommendations
interface RecommendationStrategy {
latencyMs: number;
computeFrequency: 'real-time' | 'hourly' | 'daily';
scalability: 'millions' | 'billions';
model: 'two-tower' | 'collaborative' | 'hybrid';
}
async function realtimeRecommendations(
userId: string,
candidateItems: Item[],
userModel: CollaborativeModel
): Promise<Recommendation[]> {
// Compute user embedding on-demand
// Compare against pre-computed item embeddings
// Return top-K within 100ms SLA
const userEmb = userModel.userEmbeddings.get(userId);
if (!userEmb) return [];
const scores = candidateItems.map(item => {
const itemEmb = userModel.itemEmbeddings.get(item.id);
if (!itemEmb) return { item, score: 0 };
let score = 0;
for (let i = 0; i < userModel.embeddingDim; i++) {
score += userEmb[i] * itemEmb[i];
}
return { item, score };
});
return scores
.sort((a, b) => b.score - a.score)
.slice(0, 10)
.map(s => ({
itemId: s.item.id,
score: s.score,
reason: 'Recommended for you'
}));
}
async function batchRecommendations(
userIds: string[],
items: Item[],
model: CollaborativeModel
): Promise<Map<string, Recommendation[]>> {
// Precompute recommendations for all users
// Store in cache, serve from cache in real-time
// Update every hour or daily
const recommendations = new Map<string, Recommendation[]>();
for (const userId of userIds) {
const recs = await realtimeRecommendations(
userId,
items,
model
);
recommendations.set(userId, recs);
}
// Store in Redis or similar for fast retrieval
return recommendations;
}
// Real-time: 10-100ms latency, computed on demand
// Batch: ~1ms latency (cache lookup), refreshed hourly or daily, needs storage
// Hybrid: cache real-time results, expire and refresh periodically
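A minimal sketch of the hybrid strategy, assuming an in-process map stands in for Redis and an illustrative 1-hour TTL:

```typescript
// Fresh entries come from the cache; misses and stale entries recompute
// on demand.
type CacheEntry<T> = { value: T; cachedAt: number };

class TtlRecCache<T> {
  private store = new Map<string, CacheEntry<T>>();
  constructor(private ttlMs: number) {}

  get(userId: string, now: number, compute: () => T): T {
    const hit = this.store.get(userId);
    if (hit && now - hit.cachedAt < this.ttlMs) return hit.value; // fresh hit
    const value = compute(); // miss or expired: recompute and refresh
    this.store.set(userId, { value, cachedAt: now });
    return value;
  }
}

const HOUR = 60 * 60 * 1000;
const cache = new TtlRecCache<string[]>(HOUR);
let computeCalls = 0;
const compute = () => {
  computeCalls++;
  return ['item-1', 'item-2'];
};
cache.get('u1', 0, compute);        // miss: computes
cache.get('u1', HOUR - 1, compute); // fresh: served from cache
cache.get('u1', 2 * HOUR, compute); // expired: recomputes
console.log(computeCalls); // 2
```

Passing `now` explicitly keeps the cache testable; production code would use `Date.now()` and a shared store such as Redis with its built-in key expiration.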
Cold Start Problem Solutions
interface ColdStartStrategy {
situation: 'new_user' | 'new_item';
approach: string;
effectiveness: '0-6-months' | 'permanent';
}
async function handleNewUser(
userId: string,
userFeatures: Partial<UserFeatures>,
items: Item[]
): Promise<Recommendation[]> {
// Approach 1: Content-based on user features (if provided)
if (userFeatures.demographic) {
// Find users similar in demographics
// Return recommendations from similar users
}
// Approach 2: Popularity-based fallback
// (Item carries no popularity field here, so price is a stand-in proxy;
// in practice, rank by a real popularity signal such as interaction counts)
const popularItems = items
.sort((a, b) => (b.metadata.price || 0) - (a.metadata.price || 0))
.slice(0, 10);
return popularItems.map(item => ({
itemId: item.id,
score: 0.5,
reason: 'Popular right now'
}));
}
async function handleNewItem(
item: Item,
users: string[],
itemEmbeddings: Map<string, number[]>,
userEmbeddings: Map<string, number[]>
): Promise<{ userId: string; score: number }[]> {
// Find users interested in similar items
const similarItems = Array.from(itemEmbeddings.entries())
.filter(([id, _]) => id !== item.id)
.map(([id, emb]) => ({
id,
similarity: cosineSimilarity(item.embedding, emb)
}))
.sort((a, b) => b.similarity - a.similarity)
.slice(0, 10);
const targetUsers: { userId: string; score: number }[] = [];
for (const similarItem of similarItems) {
// Look up users who interacted with similarItem.id in the interaction
// log and push them onto targetUsers with a score (left as pseudocode)
}
return targetUsers;
}
// New users: Show popular items, ask for preferences
// New items: Recommend to fans of similar items, use content similarity
// Typical: Cold start period < 2 weeks with active feedback
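One way to fill in the empty targeting loop in handleNewItem, sketched against a hypothetical interaction log whose fields mirror the UserInteraction shape from earlier:

```typescript
// Score each user by how heavily they interacted with the items most
// similar to the new one, then target the top scorers.
type LogEntry = { userId: string; itemId: string; weight: number };

function usersForNewItem(
  similarItemIds: string[],
  log: LogEntry[],
  maxUsers: number
): { userId: string; score: number }[] {
  const similar = new Set(similarItemIds);
  const scores: Record<string, number> = {};
  for (const { userId, itemId, weight } of log) {
    if (!similar.has(itemId)) continue; // only interactions with similar items count
    scores[userId] = (scores[userId] ?? 0) + weight;
  }
  return Object.entries(scores)
    .map(([userId, score]) => ({ userId, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, maxUsers);
}

// u1 purchased a similar item (1.0), u2 only viewed it (0.25), u3 is unrelated.
const log: LogEntry[] = [
  { userId: 'u1', itemId: 'sim-1', weight: 1.0 },
  { userId: 'u2', itemId: 'sim-1', weight: 0.25 },
  { userId: 'u3', itemId: 'other', weight: 1.0 },
];
const targets = usersForNewItem(['sim-1'], log, 10);
console.log(targets.map(t => t.userId)); // ['u1', 'u2']
```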
Diversity in Recommendations
interface DiversityMetrics {
categorySpread: number; // % of top 10 in different categories
noveltyScore: number; // How different from user's past items
popularityDistribution: number; // Spread of item popularity
}
async function diversifyRecommendations(
recommendations: Recommendation[],
userHistory: Item[]
): Promise<Recommendation[]> {
// Start with top recommendations
const diversified: Recommendation[] = [];
const categoriesSeen = new Set<string>();
const maxPerCategory = 3;
// First: Strict best matches
recommendations
.slice(0, 3)
.forEach(rec => diversified.push(rec));
// Then: Balance across categories
for (const rec of recommendations.slice(3)) {
const category = 'unknown'; // Would fetch from item metadata
if (!categoriesSeen.has(category) || categoriesSeen.size < 5) {
diversified.push(rec);
categoriesSeen.add(category);
if (diversified.length >= 10) break;
}
}
// Shuffle middle recommendations (avoid predictable order)
const shuffled = [
...diversified.slice(0, 2),
...diversified.slice(2).sort(() => Math.random() - 0.5)
];
return shuffled;
}
// Target: 70% similar + 30% diverse
// Prevents recommendation echo chamber
// Increases user discovery and satisfaction
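A more principled alternative to the category heuristic is maximal marginal relevance (MMR), which greedily trades off each candidate's score against its similarity to items already selected. This sketch assumes each candidate carries its embedding; lambda = 0.7 loosely mirrors the 70/30 target above:

```typescript
// MMR: pick the candidate maximizing lambda*score - (1-lambda)*maxSim,
// where maxSim is similarity to anything already selected.
type Candidate = { itemId: string; score: number; embedding: number[] };

function mmrSelect(cands: Candidate[], k: number, lambda = 0.7): Candidate[] {
  const cosine = (a: number[], b: number[]): number => {
    let dot = 0, na = 0, nb = 0;
    for (let i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      na += a[i] * a[i];
      nb += b[i] * b[i];
    }
    return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
  };
  const selected: Candidate[] = [];
  const pool = cands.slice();
  while (selected.length < k && pool.length > 0) {
    let bestIdx = 0;
    let bestVal = -Infinity;
    pool.forEach((c, idx) => {
      const maxSim = selected.length > 0
        ? Math.max(...selected.map(s => cosine(c.embedding, s.embedding)))
        : 0;
      const val = lambda * c.score - (1 - lambda) * maxSim;
      if (val > bestVal) {
        bestVal = val;
        bestIdx = idx;
      }
    });
    selected.push(pool.splice(bestIdx, 1)[0]);
  }
  return selected;
}

// 'b' nearly duplicates 'a', so MMR prefers the distinct 'c' in slot two.
const picked = mmrSelect([
  { itemId: 'a', score: 0.9, embedding: [1, 0] },
  { itemId: 'b', score: 0.89, embedding: [1, 0] },
  { itemId: 'c', score: 0.6, embedding: [0, 1] },
], 2);
console.log(picked.map(p => p.itemId)); // ['a', 'c']
```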
Contextual Recommendations
interface Context {
timeOfDay: number; // 0-24
dayOfWeek: number; // 0-6
season: 'spring' | 'summer' | 'fall' | 'winter';
location?: string;
device: 'mobile' | 'desktop' | 'tablet';
}
async function contextualRecommendations(
userId: string,
context: Context,
baseRecommendations: Recommendation[]
): Promise<Recommendation[]> {
// Rerank based on context
// Morning: News, productivity apps
// Evening: Entertainment, social
// Weekend: Different from weekday
let contextBoost: { [itemId: string]: number } = {};
if (context.timeOfDay >= 6 && context.timeOfDay < 12) {
// Morning: boost productivity items
contextBoost = { 'news_1': 0.2, 'email_app': 0.1 };
} else if (context.timeOfDay >= 18) {
// Evening: boost entertainment
contextBoost = { 'netflix': 0.3, 'spotify': 0.25 };
}
return baseRecommendations.map(rec => ({
...rec,
score: rec.score + (contextBoost[rec.itemId] || 0)
}));
}
// Location: Recommend nearby restaurants, stores
// Time: Morning=news, afternoon=work, evening=entertainment
// Device: Mobile optimizes for quick choices
A/B Testing Recommendation Algorithms
interface ExperimentMetrics {
algorithm: string;
clickThroughRate: number;
conversionRate: number;
avgSessionLength: number;
userSatisfaction: number;
}
async function runRecommendationABTest(
users: string[],
controlModel: CollaborativeModel,
treatmentModel: TwoTowerModel
): Promise<{ winner: string; improvement: number }> {
// Split users 50/50
const testSize = Math.floor(users.length / 2);
const controlUsers = users.slice(0, testSize);
const treatmentUsers = users.slice(testSize);
// Run the test for at least one week, then collect metrics
// (the numbers below are illustrative placeholders, not live data)
const controlMetrics: ExperimentMetrics = {
algorithm: 'collaborative',
clickThroughRate: 0.045, // 4.5%
conversionRate: 0.015,
avgSessionLength: 8.5,
userSatisfaction: 4.2 // out of 5
};
const treatmentMetrics: ExperimentMetrics = {
algorithm: 'two-tower',
clickThroughRate: 0.052, // 5.2%
conversionRate: 0.018,
avgSessionLength: 9.1,
userSatisfaction: 4.4
};
const improvement =
((treatmentMetrics.clickThroughRate - controlMetrics.clickThroughRate) /
controlMetrics.clickThroughRate) *
100;
return {
winner: improvement > 5 ? 'treatment' : 'control', // in practice, also require statistical significance
improvement
};
}
// Metrics to track:
// - Click-through rate: % clicking on recommendation
// - Conversion rate: % purchasing after click
// - Session length: Time spent on site
// - User satisfaction: NPS or rating surveys
// - Serendipity: Discovering new categories
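The improvement > 5 rule above ignores sample size. A two-proportion z-test is a minimal significance check before declaring a winner; the impression counts below are hypothetical, chosen to match the example CTRs:

```typescript
// Two-proportion z-test on CTRs: |z| > 1.96 corresponds to p < 0.05 (two-sided).
function twoProportionZ(
  clicksA: number, impressionsA: number,
  clicksB: number, impressionsB: number
): number {
  const pA = clicksA / impressionsA;
  const pB = clicksB / impressionsB;
  const pooled = (clicksA + clicksB) / (impressionsA + impressionsB);
  const se = Math.sqrt(
    pooled * (1 - pooled) * (1 / impressionsA + 1 / impressionsB)
  );
  return (pB - pA) / se;
}

// 4.5% vs 5.2% CTR with a hypothetical 100k impressions per arm.
const z = twoProportionZ(4500, 100000, 5200, 100000);
console.log(z > 1.96); // significant at this sample size
```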
Embedding Freshness and Updates
interface EmbeddingIndex {
userEmbeddings: Map<string, { embedding: number[]; timestamp: number }>;
itemEmbeddings: Map<string, { embedding: number[]; timestamp: number }>;
lastFullRetrainingDate: Date;
}
async function updateUserEmbeddingsIncremental(
userId: string,
newInteraction: UserInteraction,
model: CollaborativeModel,
learningRate: number = 0.01
): Promise<void> {
const userEmb = model.userEmbeddings.get(userId);
if (!userEmb) return;
const itemEmb = model.itemEmbeddings.get(newInteraction.itemId);
if (!itemEmb) return;
// Update user embedding based on new interaction (gradient step)
let prediction = 0;
for (let i = 0; i < model.embeddingDim; i++) {
prediction += userEmb[i] * itemEmb[i];
}
const target = newInteraction.weight;
const error = target - Math.tanh(prediction);
for (let i = 0; i < model.embeddingDim; i++) {
const grad = -2 * error * itemEmb[i];
userEmb[i] -= learningRate * grad;
}
// Update timestamp
// Serve updated embeddings immediately
}
async function fullRetrainingSchedule(): Promise<void> {
// Nightly: Retrain entire model
// Improves embeddings with 24 hours of fresh data
// Can take 30-60 minutes for large datasets
console.log('Starting full model retraining...');
// trainCollaborativeFiltering(...);
console.log('Completed. New embeddings live.');
}
// Strategy:
// Real-time: Gradient step updates to user embedding
// Hourly: Recompute user embeddings from interaction logs
// Daily: Full model retraining with 24h of data
// Monthly: Hyperparameter tuning and experimentation
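One complementary freshness technique not shown above: decay interaction weights by age, so stale signals fade between retrains. The 30-day half-life here is an illustrative assumption, not a recommendation from the article:

```typescript
// Exponential decay of interaction weights by age.
const HALF_LIFE_DAYS = 30;

function decayedWeight(baseWeight: number, ageDays: number): number {
  return baseWeight * Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
}

console.log(decayedWeight(1.0, 0));  // 1: today's purchase at full strength
console.log(decayedWeight(1.0, 30)); // 0.5: one half-life old
console.log(decayedWeight(1.0, 60)); // 0.25: two half-lives old
```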
Checklist
- Implemented content-based filtering with item embeddings
- Trained collaborative filtering model on interaction data
- Built two-tower architecture for scalable inference
- Set up real-time recommendation serving (<100ms)
- Handled cold start for new users with popularity fallback
- Addressed new items with content similarity
- Added diversity to prevent echo chambers
- Implemented context-aware reranking (time, device, location)
- Set up A/B test framework for algorithm comparison
- Configured incremental embedding updates
- Scheduled nightly full model retraining
Conclusion
Start with collaborative filtering for personalization, then evolve to a two-tower model for scalability. Combine with content-based filtering to cover cold start, diversify recommendations, and run continuous A/B tests. Target a 5%+ CTR with <100ms serving latency.