GDPR Data Deletion Panic — The "Right to Be Forgotten" Request That Takes Six Weeks

Introduction

GDPR's "right to be forgotten", formally the right to erasure (Article 17), requires you to erase a user's personal data across all systems without undue delay. Under Article 12(3) you must act on the request within one month of receiving it, extendable by two further months for complex or numerous requests. For companies that didn't design for this, the actual scope is terrifying: main database, analytics warehouse, caching layer, object storage, CDN, logs, email service, CRM, third-party tools, and backups. The teams that handle this gracefully built deletion into their data model from the start. Everyone else spends that month on an audit.
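
To make the deadline concrete, here is a minimal sketch of a calendar-month deadline calculation. The helper names `addCalendarMonths` and `erasureDeadline` are hypothetical, and clamping to the last day of a shorter month is an assumption about how to count "one month"; confirm the exact counting rule with your legal team.

```typescript
// Hypothetical helper: add calendar months in UTC, clamping to the last day
// of the target month (Jan 31 + 1 month gives Feb 28/29, not early March).
function addCalendarMonths(date: Date, months: number): Date {
  const result = new Date(date.getTime())
  const day = result.getUTCDate()
  result.setUTCDate(1) // avoid overflow while the month changes
  result.setUTCMonth(result.getUTCMonth() + months)
  const lastDay = new Date(
    Date.UTC(result.getUTCFullYear(), result.getUTCMonth() + 1, 0)
  ).getUTCDate()
  result.setUTCDate(Math.min(day, lastDay))
  return result
}

// One month by default; three months in total if the extension applies.
function erasureDeadline(requestedAt: Date, extended = false): Date {
  return addCalendarMonths(requestedAt, extended ? 3 : 1)
}

// erasureDeadline(new Date('2024-01-31T10:00:00Z')).toISOString()
//   -> '2024-02-29T10:00:00.000Z'
```

Storing the computed deadline with the request makes it possible to alert well before it passes, rather than discovering an overdue request afterwards.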

Where User Data Actually Lives

Complete data footprint for a typical user:

Primary systems (usually discovered):
- Main application database: users table, orders, sessions, preferences
- Redis: session cache, user preferences cache

Secondary systems (often missed):
- Analytics database: user events, funnels, cohorts
- S3: uploaded files, profile photos, exported reports
- CloudWatch/logging: user IDs in request logs, error traces
- CDN: cached user-generated content

Third-party (easily forgotten):
- Email service (SendGrid, Mailchimp): contact list, email history
- CRM (Salesforce, HubSpot): lead/contact record
- Customer support (Intercom, Zendesk): tickets, conversations
- Payment processor (Stripe): customer record, payment history
- Analytics (Mixpanel, Amplitude): user events
- Error tracking (Sentry): error events with user context

Backups (the hardest):
- Daily RDS snapshots: user exists in all backups for retention period
- Incremental WAL archives: user transactions in WAL files
- S3 versioning: deleted files still in old versions
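
One way to keep this footprint from being rediscovered on every request is to record it as data. The sketch below is illustrative only: the `DataStore` shape, the strategy names, and the example entries are assumptions, not a description of any particular stack.

```typescript
type StoreCategory = 'primary' | 'secondary' | 'third_party' | 'backup'
// 'expire' means the data ages out through a retention policy rather than
// being removed on request.
type ErasureStrategy = 'delete' | 'anonymize' | 'expire'

interface DataStore {
  name: string
  category: StoreCategory
  strategy: ErasureStrategy
  automated: boolean // false = someone has to act by hand
}

// Example entries only; a real inventory lists every system you have connected.
const DATA_STORES: DataStore[] = [
  { name: 'postgres', category: 'primary', strategy: 'delete', automated: true },
  { name: 'redis', category: 'primary', strategy: 'delete', automated: true },
  { name: 's3', category: 'secondary', strategy: 'delete', automated: true },
  { name: 'request_logs', category: 'secondary', strategy: 'expire', automated: false },
  { name: 'sendgrid', category: 'third_party', strategy: 'delete', automated: true },
  { name: 'stripe', category: 'third_party', strategy: 'anonymize', automated: true },
  { name: 'zendesk', category: 'third_party', strategy: 'delete', automated: false },
  { name: 'rds_snapshots', category: 'backup', strategy: 'expire', automated: false },
]

// Systems that need a human for each request: not automated, and not simply
// waiting for a retention window to run out.
function manualFollowUps(stores: DataStore[]): string[] {
  return stores
    .filter(s => !s.automated && s.strategy !== 'expire')
    .map(s => s.name)
}
```

With the example entries above, `manualFollowUps(DATA_STORES)` returns only `zendesk`, which is the kind of gap worth closing before the first request arrives.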

Fix 1: Deletion Request Handler

// deletion-request.ts — orchestrate deletion across all systems
interface DeletionRequest {
  userId: string
  email: string
  requestedAt: Date
  deadline: Date  // one month from receipt of the request (GDPR Art. 12(3))
  status: 'pending' | 'in_progress' | 'completed' | 'failed'
}

async function processGDPRDeletion(userId: string): Promise<void> {
  const user = await db.query('SELECT * FROM users WHERE id = $1', [userId])
  if (!user.rows[0]) throw new Error(`User ${userId} not found`)

  const { email } = user.rows[0]

  // Log the user ID only; writing the email here adds PII to logs you must purge later
  logger.info({ userId }, 'Starting GDPR deletion request')

  // 1. Soft delete first — mark as deleted, stop processing
  await db.query(`
    UPDATE users SET
      deletion_requested_at = NOW(),
      email = 'deleted_' || id || '@deleted.invalid',
      name = 'Deleted User',
      phone = NULL,
      address = NULL,
      is_deleted = true
    WHERE id = $1
  `, [userId])

  // 2. Delete from main application tables
  await deletePrimaryData(userId)

  // 3. Delete from caches
  await deleteFromCache(userId, email)

  // 4. Delete uploaded files
  await deleteFromStorage(userId)

  // 5. Delete from third-party services
  await deleteFromThirdParties(userId, email)

  // 6. Mark completion
  await db.query(`
    INSERT INTO deletion_completions (user_id, original_email, completed_at)
    VALUES ($1, $2, NOW())
  `, [userId, email])

  logger.info({ userId }, 'GDPR deletion completed')
}
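
As written, the handler stops at the first step that throws, and it records completion even when a step reports only partial success. A minimal sketch of one alternative is to run every step, record a per-system result, and derive the overall status from those results. The names `DeletionStep`, `runDeletionSteps`, and `overallStatus` are hypothetical.

```typescript
interface DeletionStep {
  system: string
  run: () => Promise<void>
}

interface StepResult {
  system: string
  status: 'completed' | 'failed'
  error?: string
}

// Run each step in order. A failure is recorded and the remaining steps still
// run, so one unavailable system does not block the others.
async function runDeletionSteps(steps: DeletionStep[]): Promise<StepResult[]> {
  const results: StepResult[] = []
  for (const step of steps) {
    try {
      await step.run()
      results.push({ system: step.system, status: 'completed' })
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err)
      results.push({ system: step.system, status: 'failed', error: message })
    }
  }
  return results
}

// The request counts as completed only when every system succeeded.
function overallStatus(results: StepResult[]): 'completed' | 'failed' {
  return results.every(r => r.status === 'completed') ? 'completed' : 'failed'
}
```

The per-system results can then be stored with the request, so a failed run can be retried for just the systems that did not finish.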

Fix 2: Systematic Data Deletion Across All Tables

async function deletePrimaryData(userId: string): Promise<void> {
  // Delete in dependency order (children before parents)
  await db.transaction(async (trx) => {
    // Orders inside the retention window must be kept for legal/financial reasons,
    // so delete only the orders older than the window (order items first).
    // The 7-year window is an example; the actual period depends on jurisdiction.
    await trx.query(`
      DELETE FROM order_items WHERE order_id IN (
        SELECT id FROM orders
        WHERE user_id = $1 AND created_at <= NOW() - INTERVAL '7 years'
      )
    `, [userId])
    await trx.query(`
      DELETE FROM orders
      WHERE user_id = $1 AND created_at <= NOW() - INTERVAL '7 years'
    `, [userId])

    // Retained orders: anonymize instead of delete
    await trx.query(`
      UPDATE orders SET
        customer_name = 'Deleted User',
        customer_email = 'deleted@deleted.invalid',
        shipping_address = NULL
      WHERE user_id = $1
      AND created_at > NOW() - INTERVAL '7 years'  -- Tax retention requirement
    `, [userId])

    await trx.query('DELETE FROM user_addresses WHERE user_id = $1', [userId])
    await trx.query('DELETE FROM user_payment_methods WHERE user_id = $1', [userId])
    await trx.query('DELETE FROM sessions WHERE user_id = $1', [userId])
    await trx.query('DELETE FROM audit_log WHERE actor_id = $1', [userId])
  })
}

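// KEYS walks the whole keyspace in one blocking call; SCAN iterates in batches.
// Assumes an ioredis-style client, where scan() resolves to [nextCursor, keys].
async function deleteByPattern(pattern: string): Promise<void> {
  let cursor = '0'
  do {
    const [next, keys] = await redis.scan(cursor, 'MATCH', pattern, 'COUNT', 100)
    if (keys.length > 0) await redis.del(...keys)
    cursor = next
  } while (cursor !== '0')
}

async function deleteFromCache(userId: string, email: string): Promise<void> {
  await deleteByPattern(`user:${userId}:*`)
  await deleteByPattern(`email:${email}:*`)
  await redis.del(`session:${userId}`)
}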
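
The handler calls `deleteFromStorage`, which is not shown. The detail that matters is versioning: on a versioned bucket, a plain delete only adds a delete marker and the older versions remain. The sketch below works against a hypothetical `VersionedStore` interface so it can run without cloud credentials. With the AWS SDK, the corresponding S3 operations are `ListObjectVersions` and `DeleteObject` with a `VersionId`. The `users/<id>/` key layout is an assumption.

```typescript
interface ObjectVersion {
  key: string
  versionId: string
}

// Hypothetical abstraction over a versioned object store.
interface VersionedStore {
  listVersions(prefix: string): Promise<ObjectVersion[]>
  deleteVersion(key: string, versionId: string): Promise<void>
}

// Delete every version of every object under the user's prefix and return
// how many versions were removed.
async function deleteUserObjects(store: VersionedStore, userId: string): Promise<number> {
  const prefix = `users/${userId}/` // trailing slash keeps u1 from matching u10
  const versions = await store.listVersions(prefix)
  for (const v of versions) {
    await store.deleteVersion(v.key, v.versionId)
  }
  return versions.length
}

// In-memory stand-in used to exercise the logic.
class InMemoryStore implements VersionedStore {
  constructor(public objects: ObjectVersion[]) {}

  async listVersions(prefix: string): Promise<ObjectVersion[]> {
    return this.objects.filter(o => o.key.startsWith(prefix))
  }

  async deleteVersion(key: string, versionId: string): Promise<void> {
    this.objects = this.objects.filter(
      o => !(o.key === key && o.versionId === versionId)
    )
  }
}
```

A real implementation would also need to page through large listings and remove delete markers. A lifecycle rule that expires noncurrent versions is another option when immediate removal is not required.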

Fix 3: Third-Party Service Deletion

async function deleteFromThirdParties(userId: string, email: string): Promise<void> {
  const tasks = [
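    // SendGrid: the delete endpoint takes contact IDs (the `ids` query parameter),
    // not emails, so look the contact up first. Verify the response shape in the API docs.
    async () => {
      const [, found] = await sgClient.request({
        method: 'POST',
        url: '/v3/marketing/contacts/search/emails',
        body: { emails: [email] },
      })
      const contactId = found?.result?.[email]?.contact?.id
      if (contactId) {
        await sgClient.request({
          method: 'DELETE',
          url: '/v3/marketing/contacts',
          qs: { ids: contactId },
        })
      }
    },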

    // Stripe: this is tricky. Payment records often have to be retained for
    // financial reasons, so this example anonymizes the customer instead of deleting it
    async () => {
      const customers = await stripe.customers.list({ email })
      for (const customer of customers.data) {
        await stripe.customers.update(customer.id, {
          email: `deleted-${customer.id}@deleted.invalid`,
          name: 'Deleted Customer',
          metadata: { gdpr_deleted: 'true', deletion_date: new Date().toISOString() },
        })
        // Note: payment history retained per financial regulations
      }
    },

    // Intercom: delete contact
    async () => {
      const contact = await intercom.contacts.find({ email })
      if (contact) {
        await intercom.contacts.delete({ id: contact.id })
      }
    },

    // Mixpanel: delete user profile
    async () => {
      await mixpanel.deleteUser(userId)
    },
  ]

  // Run in parallel, collect failures
  const results = await Promise.allSettled(tasks.map(t => t()))
  const failures = results.filter((r): r is PromiseRejectedResult => r.status === 'rejected')

  if (failures.length > 0) {
    // reason is not guaranteed to be an Error, so fall back to its string form
    logger.error({ userId, failures: failures.map(f => String(f.reason?.message ?? f.reason)) },
      'Some third-party deletions failed')
    // Don't throw: partial success is better than stopping.
    // Log for manual follow-up; the request is not complete until these succeed.
  }
}
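
Third-party APIs fail transiently (rate limits, timeouts), so many of the failures logged above would succeed on a second attempt. A minimal retry sketch with exponential backoff follows; `withRetry` and its default values are assumptions, not part of any SDK.

```typescript
// Retry an async operation with exponential backoff: the wait is baseDelayMs,
// then 2x, then 4x, and so on. The last error is rethrown once all attempts fail.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn()
    } catch (err) {
      lastError = err
      if (i < attempts - 1) {
        await new Promise<void>(resolve => setTimeout(resolve, baseDelayMs * 2 ** i))
      }
    }
  }
  throw lastError
}
```

Each task could then be wrapped before it is run, for example `tasks.map(t => withRetry(t))`, so that only persistent failures reach the manual follow-up list. Retrying is safe here only because deleting something that is already gone is harmless; confirm that each provider's delete call behaves that way.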

Fix 4: Design for Deletion From the Start

-- Data architecture that makes GDPR deletion feasible

-- ✅ User PII in a separate table — easy to delete/anonymize
CREATE TABLE user_pii (
  user_id UUID PRIMARY KEY REFERENCES users(id),
  full_name VARCHAR(255),
  email VARCHAR(255),
  phone VARCHAR(50),
  date_of_birth DATE,
  -- All PII in one place, deleting this row removes most PII
  created_at TIMESTAMP DEFAULT NOW()
);

-- ✅ Pseudonymization for analytics: no real PII in analytics tables
-- Map each user_id to a random pseudonym_id; the mapping row is the only link between them
-- When user deletes: delete the mapping, so events can no longer be tied back to the user
CREATE TABLE user_pseudonym_map (
  user_id UUID PRIMARY KEY,
  pseudonym_id UUID NOT NULL UNIQUE DEFAULT gen_random_uuid(),
  created_at TIMESTAMP DEFAULT NOW()
);

-- Analytics events use pseudonym_id, not user_id
CREATE TABLE user_events (
  id UUID DEFAULT gen_random_uuid(),
  pseudonym_id UUID NOT NULL,  -- NOT user_id
  event_type VARCHAR(100),
  metadata JSONB,
  created_at TIMESTAMP DEFAULT NOW()
);

-- GDPR deletion: delete user_pii + user_pseudonym_map
-- Analytics events remain but can no longer be linked to the user,
-- provided the metadata column itself holds no identifiers (emails, IPs, names)
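
The effect of deleting the mapping can be shown with a small in-memory model. This is a sketch under assumptions: `PseudonymMap` and `AnalyticsEvent` are hypothetical stand-ins for the two tables, and the events are assumed to carry no other identifying fields.

```typescript
import { randomUUID } from 'node:crypto'

interface AnalyticsEvent {
  pseudonymId: string // never the real user ID
  eventType: string
}

// Stand-in for user_pseudonym_map: the only place where a user ID and a
// pseudonym appear together.
class PseudonymMap {
  private byUser = new Map<string, string>()

  // Return the user's pseudonym, creating a random one on first use.
  pseudonymFor(userId: string): string {
    let pseudonym = this.byUser.get(userId)
    if (!pseudonym) {
      pseudonym = randomUUID()
      this.byUser.set(userId, pseudonym)
    }
    return pseudonym
  }

  lookup(userId: string): string | undefined {
    return this.byUser.get(userId)
  }

  // Erasure: drop the mapping row. Existing events are left untouched.
  forget(userId: string): boolean {
    return this.byUser.delete(userId)
  }
}
```

Because the pseudonym is random rather than derived from the user ID, it cannot be recomputed once the mapping row is gone. That is the property a hash of the ID would lack.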

Fix 5: Deletion Audit Trail

// GDPR's accountability principle (Article 5(2)) expects you to be able to
// demonstrate compliance, so keep evidence of what was deleted, when, and where

await db.query(`
  CREATE TABLE IF NOT EXISTS gdpr_deletion_log (
    id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
    user_id UUID NOT NULL,
    original_email TEXT NOT NULL,  -- Only this, not full profile
    requested_at TIMESTAMP NOT NULL,
    completed_at TIMESTAMP,
    systems_deleted JSONB,  -- List of systems where data was deleted
    backup_retention_note TEXT,  -- Explain backup retention policy
    legal_basis TEXT,            -- Why any data was retained
    deleted_by TEXT NOT NULL,    -- Automated system or admin who processed
    created_at TIMESTAMP DEFAULT NOW()
  )
`)

// Record what was deleted
await db.query(`
  INSERT INTO gdpr_deletion_log (
    user_id, original_email, requested_at, completed_at,
    systems_deleted, backup_retention_note, legal_basis, deleted_by
  ) VALUES ($1, $2, $3, NOW(), $4, $5, $6, 'automated_gdpr_processor')
`, [
  userId,
  originalEmail,
  requestedAt,
  JSON.stringify(['main_db', 'redis', 's3', 'sendgrid', 'intercom', 'mixpanel']),
  'Data may persist in database backups for up to 30 days per backup retention policy',
  'Financial transaction records retained 7 years per tax regulations',
])
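
The log is also useful after a backup restore. A restored snapshot brings back users whose deletion completed after the snapshot was taken, and those deletions need to be run again. A minimal sketch follows, assuming the log is stored somewhere that survives the restore. `DeletionLogEntry` and `deletionsToReplay` are hypothetical names.

```typescript
interface DeletionLogEntry {
  userId: string
  completedAt: Date | null // null while the deletion is still in progress
}

// Return the users whose deletion finished after the backup was taken.
// Their data is present again in the restored database.
function deletionsToReplay(log: DeletionLogEntry[], backupTakenAt: Date): string[] {
  return log
    .filter(e => e.completedAt !== null && e.completedAt.getTime() > backupTakenAt.getTime())
    .map(e => e.userId)
}
```

Running this as a fixed step in the restore runbook keeps the backup retention note in the log truthful: data may persist in backups, but it does not return to production.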

GDPR Deletion Checklist

  • ✅ Deletion request intake: logged, deadline tracked, acknowledged to user
  • ✅ Primary database: user PII deleted or anonymized
  • ✅ Caches (Redis, CDN): user data cleared
  • ✅ File storage (S3): user uploads deleted
  • ✅ Third-party services: documented and automated where possible
  • ✅ Analytics data: pseudonymized or deleted
  • ✅ Backups: policy documented — new backups won't include the user after deletion
  • ✅ Deletion audit log: proof of completion stored
  • ✅ User notified of completion (required by GDPR)

Conclusion

GDPR deletion compliance is a data architecture problem masquerading as a legal problem. The companies that handle it easily designed their systems with a clear separation between PII and non-PII data, used pseudonymization in analytics, and built deletion automation from the start. Everyone else faces a month-long treasure hunt through every system they've ever connected to. The fix for the future is to make deletion a first-class operation in your data model: PII in a dedicated table, analytics using pseudonyms, and a deletion orchestrator that knows about every data store. The fix for today is to build the orchestrator before the first request arrives.