AI Personalization at Scale — User Profiles, Preference Learning, and Context Injection

Introduction

Generic AI responses feel impersonal. Responses that account for a user's preferences, history, and context feel crafted. This post covers building personalization systems that scale from thousands to millions of users: explicit preference storage, embedding-based similarity search, preference learning from feedback, and privacy-preserving techniques.

User Preference Storage

Store both explicit ratings and implicit signals from user behavior. Build preference profiles over time.

interface UserPreference {
  userId: string;
  preferenceId: string;
  category: string;
  value: string;
  rating: number; // -1 to 1 (dislike to like)
  source: "explicit" | "implicit";
  timestamp: Date;
  weight: number; // Importance multiplier
}

interface UserPreferenceProfile {
  userId: string;
  preferences: UserPreference[];
  lastUpdated: Date;
  totalRatings: number;
  avgRatingScore: number;
}

class PreferenceStore {
  private preferences = new Map<string, UserPreference[]>();

  recordExplicitPreference(
    userId: string,
    category: string,
    value: string,
    rating: number
  ): void {
    const pref: UserPreference = {
      userId,
      preferenceId: `${userId}:${category}:${value}`,
      category,
      value,
      rating, // User explicitly rated this
      source: "explicit",
      timestamp: new Date(),
      weight: 1.0,
    };

    const userPrefs = this.preferences.get(userId) || [];
    userPrefs.push(pref);
    this.preferences.set(userId, userPrefs);
  }

  recordImplicitPreference(
    userId: string,
    category: string,
    value: string,
    engagement: "click" | "like" | "share" | "save"
  ): void {
    // Map engagement to implicit preference
    const ratingMap = {
      click: 0.3,
      like: 0.6,
      share: 0.8,
      save: 0.9,
    };

    const pref: UserPreference = {
      userId,
      preferenceId: `${userId}:${category}:${value}`,
      category,
      value,
      rating: ratingMap[engagement],
      source: "implicit",
      timestamp: new Date(),
      weight: 0.5, // Weight implicit signals less than explicit
    };

    const userPrefs = this.preferences.get(userId) || [];

    // Check if we already have a preference for this item
    const existing = userPrefs.find(
      (p) =>
        p.category === category &&
        p.value === value &&
        p.source === "implicit"
    );

    if (existing) {
      // Update existing implicit preference (average with new signal)
      existing.rating = (existing.rating + pref.rating) / 2;
      existing.timestamp = new Date();
    } else {
      userPrefs.push(pref);
    }

    this.preferences.set(userId, userPrefs);
  }

  getPreferenceProfile(userId: string): UserPreferenceProfile {
    const userPrefs = this.preferences.get(userId) || [];

    const totalRating = userPrefs.reduce((sum, p) => sum + p.rating, 0);
    const avgRating = userPrefs.length > 0 ? totalRating / userPrefs.length : 0;

    return {
      userId,
      preferences: userPrefs,
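      // Latest timestamp across all preferences (updates can touch older entries)
      lastUpdated: userPrefs.reduce(
        (latest, p) => (p.timestamp > latest ? p.timestamp : latest),
        userPrefs[0]?.timestamp ?? new Date()
      ),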
      totalRatings: userPrefs.length,
      avgRatingScore: avgRating,
    };
  }

  getTopPreferences(
    userId: string,
    category?: string,
    limit: number = 5
  ): UserPreference[] {
    let userPrefs = this.preferences.get(userId) || [];

    if (category) {
      userPrefs = userPrefs.filter((p) => p.category === category);
    }

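    // Copy before sorting: Array.prototype.sort reorders in place and would
    // otherwise reorder the stored preferences
    return [...userPrefs]
      .sort((a, b) => (b.rating * b.weight) - (a.rating * a.weight))
      .slice(0, limit);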
  }
}
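
Because profiles are built up over time, older signals usually deserve less influence than recent ones. The sketch below is one possible way to do that with exponential decay. `decayedScore` and the 30-day half-life are illustrative assumptions, not part of the `PreferenceStore` above.

```typescript
// Hypothetical helper: applies exponential time decay to a preference score.
interface ScoredPreference {
  rating: number; // -1 to 1 (dislike to like)
  weight: number; // Importance multiplier
  timestamp: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// A signal loses half of its influence every `halfLifeDays` days.
function decayedScore(
  pref: ScoredPreference,
  now: Date,
  halfLifeDays: number = 30 // assumed default; tune per product
): number {
  const ageDays = Math.max(
    0,
    (now.getTime() - pref.timestamp.getTime()) / MS_PER_DAY
  );
  const decay = Math.pow(0.5, ageDays / halfLifeDays);
  return pref.rating * pref.weight * decay;
}

// Rank preferences by decayed score without mutating the input array.
function rankByDecayedScore<T extends ScoredPreference>(
  prefs: T[],
  now: Date,
  halfLifeDays: number = 30
): T[] {
  return [...prefs].sort(
    (a, b) =>
      decayedScore(b, now, halfLifeDays) - decayedScore(a, now, halfLifeDays)
  );
}
```

With this scoring, a strong but stale preference can rank below a weaker recent one, which may or may not be what a given product wants.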

Embedding User History for Personalization

Convert user history into vector embeddings for similarity-based retrieval and recommendations.

import Anthropic from "@anthropic-ai/sdk";

interface UserHistoryItem {
  id: string;
  content: string;
  category: string;
  timestamp: Date;
  embedding?: number[];
}

interface EmbeddingConfig {
  model: string;
  dimension: number;
}

class UserHistoryEmbedder {
  private client: Anthropic;

  constructor() {
    this.client = new Anthropic();
  }

  async embedUserHistory(
    history: UserHistoryItem[]
  ): Promise<UserHistoryItem[]> {
    // Note: the Anthropic API doesn't provide an embeddings endpoint.
    // In production, use a dedicated embedding model like text-embedding-3-small

    const embeddedItems: UserHistoryItem[] = [];

    for (const item of history) {
      // Simplified: in production, call your embedding service
      const embedding = this.generateMockEmbedding(item.content);

      embeddedItems.push({
        ...item,
        embedding,
      });
    }

    return embeddedItems;
  }

  private generateMockEmbedding(text: string): number[] {
    // Mock embedding for demonstration
    // In production, use real embedding model
    const dimension = 1536;
    const embedding = new Array<number>(dimension);

    // Hash the text once, outside the loop
    const hash = text
      .split("")
      .reduce((acc, char) => acc + char.charCodeAt(0), 0);

    for (let i = 0; i < dimension; i++) {
      embedding[i] = Math.sin(hash + i) * 0.5 + 0.5;
    }

    return embedding;
  }

  calculateSimilarity(emb1: number[], emb2: number[]): number {
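    // Cosine similarity; returns 0 for mismatched lengths or zero vectors
    if (emb1.length !== emb2.length) return 0;

    let dotProduct = 0;
    let mag1 = 0;
    let mag2 = 0;

    for (let i = 0; i < emb1.length; i++) {
      dotProduct += emb1[i] * emb2[i];
      mag1 += emb1[i] * emb1[i];
      mag2 += emb2[i] * emb2[i];
    }

    const denominator = Math.sqrt(mag1) * Math.sqrt(mag2);
    return denominator === 0 ? 0 : dotProduct / denominator;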
  }

  findSimilarItems(
    targetEmbedding: number[],
    historyEmbeddings: UserHistoryItem[],
    topK: number = 5
  ): UserHistoryItem[] {
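    // Skip items that have not been embedded yet
    const similarities = historyEmbeddings
      .filter((item) => item.embedding !== undefined)
      .map((item) => ({
        item,
        similarity: this.calculateSimilarity(
          targetEmbedding,
          item.embedding as number[]
        ),
      }));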

    return similarities
      .sort((a, b) => b.similarity - a.similarity)
      .slice(0, topK)
      .map((s) => s.item);
  }
}

// Usage: Build user preference vector from history
async function buildUserPreferenceVector(
  userId: string,
  history: UserHistoryItem[]
): Promise<number[]> {
  const embedder = new UserHistoryEmbedder();
  const embeddedHistory = await embedder.embedUserHistory(history);

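  // Average embeddings to create user preference vector
  const vectors = embeddedHistory
    .map((item) => item.embedding)
    .filter((e): e is number[] => e !== undefined);

  // With no history there is nothing to average; return an empty vector
  // instead of dividing by zero
  if (vectors.length === 0) return [];

  const userVector = new Array<number>(vectors[0].length).fill(0);

  vectors.forEach((embedding) => {
    embedding.forEach((val, idx) => {
      userVector[idx] += val;
    });
  });

  return userVector.map((val) => val / vectors.length);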
}
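
A plain average treats every history item equally. As a variation, the sketch below computes a weighted centroid (for example, weighting recent or highly rated items more) and L2-normalizes the result so that dot products behave like cosine similarity. `weightedCentroid` and `l2Normalize` are hypothetical helper names, and how the weights are chosen is left as an assumption.

```typescript
// Weighted mean of equally sized vectors. Vectors with a non-positive weight
// or a mismatched dimension are skipped. Returns [] when nothing contributes.
function weightedCentroid(vectors: number[][], weights: number[]): number[] {
  if (vectors.length === 0 || vectors.length !== weights.length) return [];

  const dim = vectors[0].length;
  const sum = new Array<number>(dim).fill(0);
  let totalWeight = 0;

  vectors.forEach((vector, i) => {
    const w = weights[i];
    if (vector.length !== dim || w <= 0) return;
    totalWeight += w;
    for (let d = 0; d < dim; d++) {
      sum[d] += vector[d] * w;
    }
  });

  if (totalWeight === 0) return [];
  return sum.map((x) => x / totalWeight);
}

// Scale a vector to unit length; a zero vector is returned unchanged (copied).
function l2Normalize(vector: number[]): number[] {
  const norm = Math.sqrt(vector.reduce((acc, x) => acc + x * x, 0));
  return norm === 0 ? vector.slice() : vector.map((x) => x / norm);
}
```

For example, `weightedCentroid([[1, 0], [0, 1]], [3, 1])` leans three quarters of the way toward the first vector.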

User Profile Vector as Query Context

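Summarize the user profile as text and inject it into the system prompt for personalized responses. The profile vector itself is best kept for retrieval and ranking: raw embedding values pasted into a prompt carry no meaning for the LLM.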

interface PersonalizedContext {
  userId: string;
  userVector: number[];
  preferences: string[];
  history: string;
  demographics?: {
    region?: string;
    industry?: string;
    experience?: "beginner" | "intermediate" | "expert";
  };
}

function buildPersonalizedPrompt(
  basePrompt: string,
  context: PersonalizedContext
): string {
  const contextSection = `
User Profile Context:
- User ID: ${context.userId}
- Preferences: ${context.preferences.join(", ")}
- Recent interactions: ${context.history}
- Experience level: ${context.demographics?.experience || "unknown"}
- Region: ${context.demographics?.region || "unknown"}

Tailor your response to match the user's interests and expertise level.
`;

  return basePrompt + "\n" + contextSection;
}

async function generatePersonalizedResponse(
  userQuery: string,
  userContext: PersonalizedContext
): Promise<string> {
  const basePrompt = `You are a helpful assistant providing personalized recommendations.`;
  const fullPrompt = buildPersonalizedPrompt(basePrompt, userContext);

  const client = new Anthropic();

  const response = await client.messages.create({
    model: "claude-3-5-sonnet-20241022",
    max_tokens: 1024,
    system: fullPrompt,
    messages: [{ role: "user", content: userQuery }],
  });

  return response.content[0].type === "text" ? response.content[0].text : "";
}
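
The `preferences: string[]` field above has to come from somewhere, and it should stay small so the prompt does not grow with the profile. The following is a minimal sketch of one way to render it. `renderPreferenceSummary`, `sanitizeForPrompt`, and the item and character limits are all illustrative assumptions.

```typescript
// Hypothetical shape: the subset of a stored preference needed for prompting.
interface PromptPreference {
  category: string;
  value: string;
  rating: number; // -1 to 1
  weight: number;
}

// Collapse line breaks so a stored value cannot add extra lines to the prompt.
function sanitizeForPrompt(text: string): string {
  return text.replace(/[\r\n]+/g, " ").trim();
}

// Render the top liked preferences as a single bounded string.
function renderPreferenceSummary(
  prefs: PromptPreference[],
  maxItems: number = 5, // assumed limit
  maxChars: number = 300 // assumed budget
): string {
  const entries = [...prefs]
    .filter((p) => p.rating > 0) // only things the user likes
    .sort((a, b) => b.rating * b.weight - a.rating * a.weight)
    .slice(0, maxItems)
    .map((p) => `${sanitizeForPrompt(p.category)}: ${sanitizeForPrompt(p.value)}`);

  // Add entries until the next one would exceed the character budget.
  let summary = "";
  for (const entry of entries) {
    const next = summary ? `${summary}; ${entry}` : entry;
    if (next.length > maxChars) break;
    summary = next;
  }
  return summary;
}
```

Collapsing line breaks is only a small precaution; it does not make untrusted profile text safe against prompt injection on its own.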

Preference Learning from Feedback

Continuously learn user preferences from explicit feedback (thumbs up/down) and implicit signals.

interface FeedbackSignal {
  userId: string;
  responseId: string;
  type: "thumbs-up" | "thumbs-down" | "report" | "edit";
  metadata?: Record<string, unknown>;
  timestamp: Date;
}

class PreferenceLearner {
  private feedbackHistory: FeedbackSignal[] = [];

  recordFeedback(signal: FeedbackSignal): void {
    this.feedbackHistory.push(signal);
  }

  async updatePreferencesFromFeedback(
    userId: string,
    preferences: UserPreference[]
  ): Promise<UserPreference[]> {
    // Get recent feedback for this user
    const recentFeedback = this.feedbackHistory.filter(
      (f) => f.userId === userId &&
        f.timestamp > new Date(Date.now() - 30 * 24 * 60 * 60 * 1000) // Last 30 days
    );

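    // Update preference weights based on feedback.
    // Copy each preference so the caller's objects are not mutated.
    const updated = preferences.map((p) => ({ ...p }));

    recentFeedback.forEach((feedback) => {
      // Assumption: metadata.category, when present, names the preference
      // category the rated response drew on. Scaling every weight by the same
      // factor would be cancelled out by the normalization below.
      const category = feedback.metadata?.category;
      if (typeof category !== "string") return;

      const factor =
        feedback.type === "thumbs-up"
          ? 1.1 // 10% increase
          : feedback.type === "thumbs-down"
            ? 0.9 // 10% decrease
            : 1; // "report" and "edit" leave weights unchanged

      updated.forEach((pref) => {
        if (pref.category === category) pref.weight *= factor;
      });
    });

    // Normalize weights to 1.0 max (skip when there is nothing to scale)
    const maxWeight = Math.max(0, ...updated.map((p) => p.weight));
    if (maxWeight === 0) return updated;
    return updated.map((p) => ({ ...p, weight: p.weight / maxWeight }));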
  }

  // Detect preference drift (user preferences changing over time)
  detectPreferenceDrift(
    userId: string,
    period: number = 30
  ): {
    drifted: boolean;
    direction: "positive" | "negative" | "stable";
    changePercent: number;
  } {
    const oldFeedback = this.feedbackHistory.filter(
      (f) =>
        f.userId === userId &&
        f.timestamp > new Date(Date.now() - period * 2 * 24 * 60 * 60 * 1000) &&
        f.timestamp < new Date(Date.now() - period * 24 * 60 * 60 * 1000)
    );

    const newFeedback = this.feedbackHistory.filter(
      (f) =>
        f.userId === userId &&
        f.timestamp > new Date(Date.now() - period * 24 * 60 * 60 * 1000)
    );

    const oldPositive = oldFeedback.filter(
      (f) => f.type === "thumbs-up"
    ).length;
    const newPositive = newFeedback.filter(
      (f) => f.type === "thumbs-up"
    ).length;

    const changePercent =
      oldPositive > 0
        ? ((newPositive - oldPositive) / oldPositive) * 100
        : 0;

    return {
      drifted: Math.abs(changePercent) > 20, // >20% change triggers drift
      direction:
        changePercent > 0 ? "positive" : changePercent < 0 ? "negative" : "stable",
      changePercent: Math.abs(changePercent),
    };
  }
}
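
Comparing raw thumbs-up counts can report drift when a user is simply more or less active. One alternative, sketched here under the assumption that both thumbs-up and thumbs-down votes are available per window, is to compare the positive rate between the two periods. `positiveRate`, `rateDrift`, and the 0.2 threshold are hypothetical.

```typescript
type Vote = "thumbs-up" | "thumbs-down";

// Share of positive votes, or null when the window has no votes at all.
function positiveRate(votes: Vote[]): number | null {
  if (votes.length === 0) return null;
  const positives = votes.filter((v) => v === "thumbs-up").length;
  return positives / votes.length;
}

interface RateDrift {
  drifted: boolean;
  direction: "positive" | "negative" | "stable";
  delta: number; // newRate - oldRate, in the range -1 to 1
}

// Flag drift when the positive rate moves by more than `threshold`.
function rateDrift(
  oldVotes: Vote[],
  newVotes: Vote[],
  threshold: number = 0.2 // assumed cut-off
): RateDrift {
  const oldRate = positiveRate(oldVotes);
  const newRate = positiveRate(newVotes);

  // Without votes in both windows there is nothing to compare.
  if (oldRate === null || newRate === null) {
    return { drifted: false, direction: "stable", delta: 0 };
  }

  const delta = newRate - oldRate;
  return {
    drifted: Math.abs(delta) > threshold,
    direction: delta > 0 ? "positive" : delta < 0 ? "negative" : "stable",
    delta,
  };
}
```

A user who goes from 4 of 5 positive votes to 2 of 5 shows a delta of about -0.4, which this sketch reports as negative drift regardless of how many votes were cast.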

Privacy-Preserving Personalization

Reduce privacy exposure with on-device processing and data minimization: send the LLM only the context it needs, and keep user data no longer than necessary.

interface PrivacyConfig {
  processOnDevice: boolean;
  minimalDataRetention: boolean;
  encryptUserProfiles: boolean;
  anonymizeLogging: boolean;
  dataRetentionDaysCap?: number;
}

class PrivacyAwarePersonalizer {
  private config: PrivacyConfig;

  constructor(config: PrivacyConfig) {
    this.config = config;
  }

  async generatePersonalizedResponsePrivacy(
    userQuery: string,
    userProfile: UserPreferenceProfile,
    privacyConfig: PrivacyConfig
  ): Promise<string> {
    if (privacyConfig.processOnDevice) {
      // Generate response locally without sending user profile to LLM
      return this.generateResponseLocally(userQuery, userProfile);
    } else {
      // Send minimal context to LLM
      const minimalContext = this.extractMinimalContext(userProfile);
      return this.generateResponseWithMinimalData(userQuery, minimalContext);
    }
  }

  private generateResponseLocally(
    userQuery: string,
    profile: UserPreferenceProfile
  ): string {
    // Use local rules and templates based on preferences
    // Copy before sorting so the caller's profile is not reordered
    const topPrefs = [...profile.preferences]
      .sort((a, b) => b.rating - a.rating)
      .slice(0, 3);

    const contextString = topPrefs.map((p) => `${p.category}: ${p.value}`).join(", ");

    return `Based on your interests in ${contextString}, here's what I recommend...`;
  }

  private extractMinimalContext(
    profile: UserPreferenceProfile
  ): Record<string, string> {
    // Only include aggregate statistics, not raw preferences
    return {
      preferenceCount: String(profile.totalRatings),
      avgRating: profile.avgRatingScore.toFixed(2),
      lastActiveDate: profile.lastUpdated.toISOString().split("T")[0],
    };
  }

  private async generateResponseWithMinimalData(
    userQuery: string,
    minimalContext: Record<string, string>
  ): Promise<string> {
    const client = new Anthropic();

    const prompt = `User context: ${JSON.stringify(minimalContext)}. User query: ${userQuery}`;

    const response = await client.messages.create({
      model: "claude-3-5-sonnet-20241022",
      max_tokens: 1024,
      messages: [{ role: "user", content: prompt }],
    });

    return response.content[0].type === "text" ? response.content[0].text : "";
  }

  async deleteUserData(userId: string): Promise<void> {
    if (this.config.minimalDataRetention) {
      // Delete user profile after short retention period
      console.log(`Deleting profile for user ${userId} per retention policy`);
    }
  }
}
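
The `dataRetentionDaysCap` option in `PrivacyConfig` is declared but never applied above. The sketch below shows one way such a cap might be enforced before a profile is used. `applyRetentionCap` is a hypothetical helper, and treating an undefined cap as "keep everything" is an assumption.

```typescript
interface Timestamped {
  timestamp: Date;
}

const RETENTION_MS_PER_DAY = 24 * 60 * 60 * 1000;

// Return only the items recorded within the last `capDays` days.
// An undefined cap keeps everything (assumed policy). The input is not mutated.
function applyRetentionCap<T extends Timestamped>(
  items: T[],
  capDays: number | undefined,
  now: Date = new Date()
): T[] {
  if (capDays === undefined) return items.slice();
  const cutoff = now.getTime() - capDays * RETENTION_MS_PER_DAY;
  return items.filter((item) => item.timestamp.getTime() >= cutoff);
}
```

Filtering at read time only hides expired records. A real retention policy would also need to delete them from storage, for instance in a scheduled job.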

Cold-Start Personalization

Personalize for new users without historical data using cohort analysis and default recommendations.

interface UserCohort {
  cohortId: string;
  name: string;
  description: string;
  defaultPreferences: UserPreference[];
  demographics?: Record<string, string>;
}

class ColdStartPersonalizer {
  private cohorts: Record<string, UserCohort> = {
    "tech-enthusiasts": {
      cohortId: "tech-enthusiasts",
      name: "Tech Enthusiasts",
      description: "Users interested in technology and engineering",
      defaultPreferences: [
        {
          userId: "default",
          preferenceId: "category:technology",
          category: "technology",
          value: "software-development",
          rating: 0.8,
          source: "explicit",
          timestamp: new Date(),
          weight: 1.0,
        },
      ],
    },

    "business-users": {
      cohortId: "business-users",
      name: "Business Users",
      description: "Users focused on business and productivity",
      defaultPreferences: [
        {
          userId: "default",
          preferenceId: "category:business",
          category: "business",
          value: "productivity",
          rating: 0.8,
          source: "explicit",
          timestamp: new Date(),
          weight: 1.0,
        },
      ],
    },
  };

  assignCohort(userSignupData: {
    industry?: string;
    role?: string;
    companySize?: string;
  }): UserCohort {
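    // Simple heuristic to assign cohort. Compare case-insensitively because
    // the onboarding options are capitalized ("Technology", "Engineer").
    const industry = userSignupData.industry?.toLowerCase();
    const role = userSignupData.role?.toLowerCase();

    if (industry === "technology" || role === "engineer") {
      return this.cohorts["tech-enthusiasts"];
    }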

    return this.cohorts["business-users"];
  }

  getOnboardingQuestions(): Array<{
    questionId: string;
    question: string;
    options: string[];
  }> {
    return [
      {
        questionId: "industry",
        question: "What industry are you in?",
        options: ["Technology", "Finance", "Healthcare", "Education", "Other"],
      },
      {
        questionId: "role",
        question: "What's your role?",
        options: ["Engineer", "Product Manager", "Designer", "Other"],
      },
      {
        questionId: "experience",
        question: "How many years of experience do you have?",
        options: ["Less than 1", "1-3", "3-5", "5+"],
      },
    ];
  }

  async getPersonalizedOnboardingContent(
    userResponses: Record<string, string>
  ): Promise<string> {
    const client = new Anthropic();

    const prompt = `Based on user profile: ${JSON.stringify(userResponses)}, provide personalized onboarding tips.`;

    const response = await client.messages.create({
      model: "claude-3-5-sonnet-20241022",
      max_tokens: 512,
      messages: [{ role: "user", content: prompt }],
    });

    return response.content[0].type === "text" ? response.content[0].text : "";
  }
}
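
Cohort defaults are a starting point, and a system still needs some rule for handing over to the user's own signals as they accumulate. One common pattern, sketched here as an assumption rather than as this post's method, is to blend the two with a weight that grows with the number of observed signals. `blendRating` and `priorStrength` are hypothetical names.

```typescript
// Blend a cohort default with the user's own rating for the same item.
// With 0 signals the result is the cohort rating; as signals accumulate,
// the user's rating dominates. `priorStrength` is how many signals it takes
// for the two sources to count equally (assumed default: 5).
function blendRating(
  cohortRating: number,
  userRating: number,
  userSignalCount: number,
  priorStrength: number = 5
): number {
  const n = Math.max(0, userSignalCount);
  if (priorStrength <= 0) return n > 0 ? userRating : cohortRating;

  const userShare = n / (n + priorStrength);
  return userShare * userRating + (1 - userShare) * cohortRating;
}
```

With the default `priorStrength`, a user with five recorded signals gets an even mix of cohort and personal ratings.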

A/B Testing Personalization Algorithms

Compare personalization strategies to measure impact on user satisfaction and engagement.

type PersonalizationStrategy = "embedding-based" | "rule-based" | "control";

interface PersonalizationABTest {
  testId: string;
  strategyA: PersonalizationStrategy;
  strategyB: PersonalizationStrategy;
  splitPercent: number;
}

interface PersonalizationMetrics {
  strategy: PersonalizationStrategy;
  userSatisfaction: number; // 0-1 scale
  engagementRate: number;
  clickThroughRate: number;
  conversionRate: number;
}

class PersonalizationABTester {
  assignStrategy(
    userId: string,
    test: PersonalizationABTest
  ): PersonalizationStrategy {
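    // Deterministic split: hash the test ID together with the user ID so
    // each test buckets users independently of other tests
    const key = `${test.testId}:${userId}`;
    let hash = 0;
    for (let i = 0; i < key.length; i++) {
      hash = (Math.imul(hash, 31) + key.charCodeAt(i)) >>> 0;
    }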

    if ((hash % 100) < test.splitPercent) {
      return test.strategyA;
    }
    return test.strategyB;
  }

  async evaluateStrategy(
    strategy: PersonalizationStrategy,
    metrics: PersonalizationMetrics[]
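  ): Promise<{
    averageSatisfaction: number;
    statisticallySignificant: boolean;
  }> {
    const strategyMetrics = metrics.filter(
      (m) => m.strategy === strategy
    );
    const otherMetrics = metrics.filter((m) => m.strategy !== strategy);

    // Mean satisfaction, or 0 when there are no samples (avoids NaN)
    const mean = (ms: PersonalizationMetrics[]): number =>
      ms.length > 0
        ? ms.reduce((sum, m) => sum + m.userSatisfaction, 0) / ms.length
        : 0;

    const avgSatisfaction = mean(strategyMetrics);

    // Rough heuristic, not a real significance test: require at least 30
    // samples per arm and a gap of more than 0.15 versus the other arms
    const isSignificant =
      strategyMetrics.length >= 30 &&
      otherMetrics.length >= 30 &&
      Math.abs(avgSatisfaction - mean(otherMetrics)) > 0.15;

    return {
      averageSatisfaction: avgSatisfaction,
      statisticallySignificant: isSignificant,
    };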
  }
}
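
A fixed sample-size and difference threshold ignores how noisy the satisfaction scores are. As a somewhat more principled sketch, the code below computes Welch's t statistic for two arms and compares it with 1.96, the two-sided 5% critical value of the normal distribution. The normal approximation and the 30-sample minimum are assumptions that only hold for reasonably large samples, and `welchT` and `isSignificantLargeSample` are hypothetical names.

```typescript
function sampleMean(xs: number[]): number {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// Unbiased sample variance (divides by n - 1). Requires xs.length >= 2.
function sampleVariance(xs: number[], mean: number): number {
  const squared = xs.reduce((sum, x) => sum + (x - mean) * (x - mean), 0);
  return squared / (xs.length - 1);
}

// Welch's t statistic for two independent samples with unequal variances.
// Returns null when either sample is too small or there is no variance.
function welchT(a: number[], b: number[]): number | null {
  if (a.length < 2 || b.length < 2) return null;

  const meanA = sampleMean(a);
  const meanB = sampleMean(b);
  const standardError = Math.sqrt(
    sampleVariance(a, meanA) / a.length + sampleVariance(b, meanB) / b.length
  );

  if (standardError === 0) return null;
  return (meanA - meanB) / standardError;
}

// Large-sample check: treats t as approximately normal, so it is only
// applied when both arms have at least `minSamples` observations.
function isSignificantLargeSample(
  a: number[],
  b: number[],
  critical: number = 1.96, // two-sided 5% level under the normal approximation
  minSamples: number = 30
): boolean {
  if (a.length < minSamples || b.length < minSamples) return false;
  const t = welchT(a, b);
  return t !== null && Math.abs(t) > critical;
}
```

Here `a` and `b` would be the per-user `userSatisfaction` values for the two strategies. For small samples or many simultaneous tests, a proper statistics library is the safer choice.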

GDPR and User Data Deletion

Implement right-to-be-forgotten and data minimization from the start.

interface DataDeletionRequest {
  userId: string;
  requestId: string;
  requestedAt: Date;
  reason?: string;
  completedAt?: Date;
}

class GDPRCompliantPersonalizer {
  private deletionRequests: DataDeletionRequest[] = [];

  async handleUserDeletionRequest(
    userId: string
  ): Promise<DataDeletionRequest> {
    const request: DataDeletionRequest = {
      userId,
      requestId: `deletion-${Date.now()}`,
      requestedAt: new Date(),
    };

    // Delete all user data
    await this.deleteUserPreferences(userId);
    await this.deleteUserHistory(userId);
    await this.deleteUserFeedback(userId);
    await this.deleteUserEmbeddings(userId);

    request.completedAt = new Date();
    this.deletionRequests.push(request);

    return request;
  }

  private async deleteUserPreferences(userId: string): Promise<void> {
    console.log(`Deleting preferences for user ${userId}`);
  }

  private async deleteUserHistory(userId: string): Promise<void> {
    console.log(`Deleting history for user ${userId}`);
  }

  private async deleteUserFeedback(userId: string): Promise<void> {
    console.log(`Deleting feedback for user ${userId}`);
  }

  private async deleteUserEmbeddings(userId: string): Promise<void> {
    console.log(`Deleting embeddings for user ${userId}`);
  }

  getDataDeletionStatus(requestId: string): DataDeletionRequest | null {
    return (
      this.deletionRequests.find((r) => r.requestId === requestId) || null
    );
  }

  // Export user data per GDPR
  async exportUserData(userId: string): Promise<object> {
    return {
      userId,
      preferences: [], // Get from store
      history: [], // Get from store
      feedback: [], // Get from store
      exportedAt: new Date(),
    };
  }
}
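
The deletion and export methods above only log or return empty arrays. To make the flow concrete, here is a minimal in-memory sketch in which deletion reports how many records were removed per store, which is useful for an audit trail. `InMemoryUserData` is a hypothetical stand-in for real databases and vector stores, not a compliance implementation.

```typescript
type StoreName = "preferences" | "history" | "feedback" | "embeddings";

const STORE_NAMES: StoreName[] = [
  "preferences",
  "history",
  "feedback",
  "embeddings",
];

class InMemoryUserData {
  // One map per store, keyed by user ID.
  private stores = new Map<StoreName, Map<string, unknown[]>>();

  constructor() {
    for (const name of STORE_NAMES) {
      this.stores.set(name, new Map());
    }
  }

  add(store: StoreName, userId: string, record: unknown): void {
    const byUser = this.stores.get(store)!;
    byUser.set(userId, [...(byUser.get(userId) ?? []), record]);
  }

  // Return a copy of everything held for the user, grouped by store.
  exportUser(userId: string): Record<StoreName, unknown[]> {
    const out: Record<StoreName, unknown[]> = {
      preferences: [],
      history: [],
      feedback: [],
      embeddings: [],
    };
    for (const name of STORE_NAMES) {
      out[name] = [...(this.stores.get(name)!.get(userId) ?? [])];
    }
    return out;
  }

  // Remove the user from every store and report how many records were removed.
  deleteUser(userId: string): Record<StoreName, number> {
    const removed: Record<StoreName, number> = {
      preferences: 0,
      history: 0,
      feedback: 0,
      embeddings: 0,
    };
    for (const name of STORE_NAMES) {
      const byUser = this.stores.get(name)!;
      removed[name] = byUser.get(userId)?.length ?? 0;
      byUser.delete(userId);
    }
    return removed;
  }
}
```

In a real system, backups, caches, logs, and third-party processors would also need to be covered by the deletion path.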

Checklist

  • Store both explicit (user-rated) and implicit (engagement-based) preferences
  • Convert user history to embeddings for similarity-based retrieval
  • Inject a concise text summary of user preferences into system prompts
  • Learn from feedback signals and detect preference drift
  • Implement privacy-preserving personalization with on-device processing
  • Use cohort analysis and onboarding questions for cold-start users
  • A/B test personalization strategies to measure business impact
  • Implement GDPR-compliant data deletion and export features

Conclusion

Personalization at scale requires infrastructure from day one. Start by recording explicit preferences and implicit signals. Convert historical interactions to embeddings for similarity search. Inject a concise summary of the user's preferences into prompts for personalized responses. Continuously learn from user feedback and detect preference drift. Build privacy preservation into the core architecture, not as an afterthought. For new users, lean on cohort analysis and onboarding surveys. Finally, support GDPR rights with straightforward data deletion and export. With these patterns in place, you'll deliver personalized AI experiences that users prefer while respecting their privacy.