Product Launch With No Load Testing — When the Press Release Causes the Outage

Sanjeev Sharma
7 min read

Introduction

A product launch is the most dangerous moment in a startup's engineering life. Marketing has promised the world, press has been lined up, and the team is exhausted from shipping the feature. What nobody has done is ask: "What happens if 50,000 people try to sign up in the first hour?" Load testing before a launch is not a luxury. It is the difference between a successful launch story and a "company couldn't handle demand" headline that follows you for years.

The Launch Traffic Pattern

Typical product launch traffic curve:

T-0:     TechCrunch / Product Hunt article goes live
T+5min:  First wave from social sharing, 5x normal
T+15min: Second wave from retweets, 15x normal
T+45min: Third wave from email newsletters, 30x normal
T+2hr:   Hacker News front page pickup (if you're lucky/unlucky), 50x normal
T+6hr:   European audience wakes up, sustained 20x
T+24hr:  Traffic settles at 3-5x the pre-launch level, which becomes the new baseline

What each spike tests:
- Signup endpoint: absolutely hammered (everyone tries to sign up at once)
- Marketing pages: highest absolute traffic
- Email verification: 50,000 emails in 1 hour → SendGrid rate limits
- Database: every new user = row write + N reads
- Session storage: 50,000 new sessions simultaneously
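
To turn the multipliers above into numbers a load test can target, here is a minimal sketch. The 20 requests/second baseline is a made-up example value, and `LaunchPhase`, `projectLoad`, and `signupsPerSecond` are hypothetical names; only the multipliers and the 50,000-signup figure come from the text.

```typescript
// Hypothetical helper: converts the launch curve's multipliers into absolute request rates.
interface LaunchPhase {
  label: string
  multiplier: number // traffic relative to the normal (pre-launch) baseline
}

// Multipliers taken from the traffic curve above.
const LAUNCH_CURVE: LaunchPhase[] = [
  { label: 'T+5min social sharing', multiplier: 5 },
  { label: 'T+15min retweets', multiplier: 15 },
  { label: 'T+45min newsletters', multiplier: 30 },
  { label: 'T+2hr Hacker News', multiplier: 50 },
  { label: 'T+6hr European audience', multiplier: 20 },
]

function projectLoad(baselineRps: number, curve: LaunchPhase[]): { label: string; rps: number }[] {
  return curve.map(phase => ({ label: phase.label, rps: baselineRps * phase.multiplier }))
}

// Average signup rate if a given number of signups arrive within a window.
function signupsPerSecond(totalSignups: number, windowHours: number): number {
  return totalSignups / (windowHours * 3600)
}

const projected = projectLoad(20, LAUNCH_CURVE) // 20 rps baseline is an assumed example value
const peakRps = Math.max(...projected.map(p => p.rps))
const avgSignupRate = signupsPerSecond(50_000, 1)
```

With a 20 rps baseline the peak works out to 1,000 rps, and 50,000 signups in an hour averages roughly 14 signups per second. Real arrivals are burstier than an hourly average, so the signup figure is best read as a floor.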

Fix 1: Load Test at Launch Scale, Not Normal Scale

// launch-loadtest.js — k6 test modeling launch day traffic
import http from 'k6/http'
import { sleep, check, group } from 'k6'

export const options = {
  scenarios: {
    // Simulate the TechCrunch spike
    launch_spike: {
      executor: 'ramping-vus',
      startVUs: 10,
      stages: [
        { duration: '2m', target: 500 },   // Rapid spike
        { duration: '5m', target: 2000 },  // Ramp to peak (50x normal)
        { duration: '5m', target: 2000 },  // Sustained peak
        { duration: '3m', target: 500 },   // Drop
      ],
    },
  },
  // Thresholds are a top-level option in k6, not a per-scenario setting
  thresholds: {
    http_req_duration: ['p(99)<3000'],  // 99% of requests under 3s
    http_req_failed: ['rate<0.01'],     // <1% errors
  },
}

export default function () {
  group('Signup flow (most critical at launch)', () => {
    // Step 1: Load landing page
    const landing = http.get('https://staging.myapp.com/')
    check(landing, { 'landing: status 200': r => r.status === 200 })

    // Step 2: Submit signup
    // slice(2) drops the "0." prefix of the random string so the address has no stray dot
    const signup = http.post('https://staging.myapp.com/api/auth/signup', JSON.stringify({
      email: `loadtest+${Date.now()}-${Math.random().toString(36).slice(2)}@example.com`,
      password: 'TestPass123!',
    }), { headers: { 'Content-Type': 'application/json' } })

    check(signup, {
      'signup: status 201': r => r.status === 201,
      'signup: fast': r => r.timings.duration < 2000,
    })
  })

  sleep(Math.random() * 2)
}
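
The script above fixes the peak at 2,000 virtual users, but the right number depends on the request rate you want to reach. A rough way to size it is Little's Law for a closed-loop test: concurrent users equal iterations per second times seconds per iteration. The sketch below is an assumption-laden estimate, not part of k6; `IterationProfile` and `requiredVUs` are hypothetical names, and the 0.5s response time is a placeholder to replace with a staging measurement.

```typescript
// Hypothetical sizing helper based on Little's Law for a closed-loop load test:
// concurrent VUs = iterations per second x seconds per iteration.
interface IterationProfile {
  requestsPerIteration: number // HTTP requests each VU makes per loop
  avgResponseSeconds: number   // assumed average response time per request
  avgThinkSeconds: number      // average sleep() per loop
}

function requiredVUs(targetRps: number, profile: IterationProfile): number {
  const iterationSeconds =
    profile.requestsPerIteration * profile.avgResponseSeconds + profile.avgThinkSeconds
  const iterationsPerSecond = targetRps / profile.requestsPerIteration
  return Math.ceil(iterationsPerSecond * iterationSeconds)
}

// The script above makes 2 requests per loop and sleeps Math.random() * 2, i.e. 1s on average.
// The 0.5s response time is an assumption; measure it on staging.
const vus = requiredVUs(1000, {
  requestsPerIteration: 2,
  avgResponseSeconds: 0.5,
  avgThinkSeconds: 1,
})
```

Under these assumptions, reaching 1,000 requests/second takes about 1,000 virtual users. If responses slow down under load, each loop takes longer and the same VU count produces less traffic, so re-check the achieved request rate in the k6 summary rather than trusting the estimate.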

Fix 2: The Launch Checklist — 2 Weeks Before

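// launch-readiness.ts: audit before every major launch
// The is*/has*/check* helpers below are project-specific and assumed to exist; each resolves to a boolean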
interface LaunchCheck {
  category: string
  check: string
  status: 'pass' | 'fail' | 'warn'
  notes?: string
}

async function runLaunchReadinessChecks(): Promise<LaunchCheck[]> {
  return [
    // Capacity checks
    {
      category: 'Capacity',
      check: 'Auto-scaling configured and tested',
      status: await isAutoScalingConfigured() ? 'pass' : 'fail',
    },
    {
      category: 'Capacity',
      check: 'Pre-warm scheduled for launch time',
      status: await hasPreWarmScheduled() ? 'pass' : 'warn',
      notes: 'Scale to 5x capacity 30 minutes before launch',
    },
    {
      category: 'Database',
      check: 'Connection pooler (PgBouncer) configured',
      status: await isPgBouncerRunning() ? 'pass' : 'fail',
    },
    {
      category: 'Database',
      check: 'Slow query log reviewed, indexes in place',
      status: await hasReviewedSlowQueries() ? 'pass' : 'warn',
    },
    {
      category: 'Email',
      check: 'SendGrid rate limit will not be hit at launch scale',
      status: await checkEmailRateLimit() ? 'pass' : 'fail',
      notes: 'SendGrid free tier: 100/day. Paid-tier limits vary by plan; confirm yours and compare against expected signups per hour.',
    },
    {
      category: 'Monitoring',
      check: 'Alerts configured for error rate, latency, DB connections',
      status: await hasAlertsConfigured() ? 'pass' : 'fail',
    },
    {
      category: 'Operations',
      check: 'War room scheduled — engineers available during launch window',
      status: await hasWarRoomScheduled() ? 'pass' : 'warn',
    },
    {
      category: 'Rollback',
      check: 'Rollback plan documented and tested',
      status: await hasRollbackPlan() ? 'pass' : 'fail',
    },
  ]
}
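
A list of check results is only useful if something acts on it. One option is to reduce the results to a go/no-go decision that a CI job or launch script can enforce. The sketch below is one way to do that; `summarizeReadiness` and `ReadinessSummary` are hypothetical names, and the result type is redeclared so the block stands on its own.

```typescript
type CheckStatus = 'pass' | 'fail' | 'warn'

// Same shape as the LaunchCheck interface above, redeclared to keep this sketch self-contained.
interface LaunchCheckResult {
  category: string
  check: string
  status: CheckStatus
  notes?: string
}

interface ReadinessSummary {
  ready: boolean      // false if any check failed; warnings alone do not block
  failures: string[]
  warnings: string[]
}

function summarizeReadiness(checks: LaunchCheckResult[]): ReadinessSummary {
  const describe = (c: LaunchCheckResult): string => `[${c.category}] ${c.check}`
  const failures = checks.filter(c => c.status === 'fail').map(describe)
  const warnings = checks.filter(c => c.status === 'warn').map(describe)
  return { ready: failures.length === 0, failures, warnings }
}

// Example input with one failure and one warning.
const summary = summarizeReadiness([
  { category: 'Capacity', check: 'Auto-scaling configured and tested', status: 'pass' },
  { category: 'Email', check: 'SendGrid rate limit will not be hit at launch scale', status: 'fail' },
  { category: 'Operations', check: 'War room scheduled', status: 'warn' },
])
```

Treating `warn` as non-blocking mirrors the checks above, where items such as the war room degrade to a warning rather than a failure. A stricter team could block on warnings too.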

Fix 3: Email Rate Limit Planning

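// Email is the most-missed launch scaling bottleneck
// 10,000 signups in 1 hour ≈ 167 emails/minute (about 2.8/second)

// Check your email provider's limits before launch.
// The plan figures below are illustrative; confirm the current limits for your account:
// SendGrid free: 100/day → exhausted within the first minute at this rate
// SendGrid Essentials: 100/hour → throttles long before 2,000 signups/hour
// SendGrid Pro: 1,500/hour → still short of 10,000 signups/hour; ~90,000/hour would need 1,500/minute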
// Queue emails and rate-limit your own sending
import Bull from 'bull'

const emailQueue = new Bull('email', {
  redis: process.env.REDIS_URL,
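  limiter: {
    // Keep this just under your provider's limit. At ~167 signups/minute,
    // a 100/minute cap builds a backlog that only drains after the spike.
    max: 100,        // Max 100 jobs processed
    duration: 60000, // Per 60 seconds
  },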
})

// Also: defer non-critical emails during the spike
async function sendWelcomeEmail(userId: string, email: string): Promise<void> {
  // Don't block signup on email — queue it
  await emailQueue.add(
    { type: 'welcome', userId, email },
    {
      delay: Math.random() * 30000,  // Spread over 30 seconds to flatten burst
      attempts: 3,
      backoff: { type: 'exponential', delay: 5000 },
    }
  )
}

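// Critical emails (email verification): send immediately, bypass queue.
// Bypassing the queue also bypasses its limiter, so confirm the provider limit covers the peak signup rate.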
// Non-critical (onboarding series): queue with 24h delay
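
A rate-limited queue protects the provider limit, but it also delays delivery. It helps to estimate the delay before launch. The sketch below assumes emails arrive evenly across the burst and that nothing new arrives afterwards; `estimateBacklog` and its types are hypothetical names, and the inputs reuse the 10,000 signups/hour and 100-per-minute limiter figures from above.

```typescript
interface EmailBurstPlan {
  totalEmails: number        // emails generated during the burst
  burstMinutes: number       // how long the burst lasts
  sendLimitPerMinute: number // the queue limiter's max per minute
}

interface BacklogEstimate {
  backlogAtBurstEnd: number // emails still queued when the burst ends
  drainMinutes: number      // extra minutes to clear them if no new emails arrive
}

// Simplified model: uniform arrivals, limiter always saturated while a backlog exists.
function estimateBacklog(plan: EmailBurstPlan): BacklogEstimate {
  const sentDuringBurst = plan.sendLimitPerMinute * plan.burstMinutes
  const backlogAtBurstEnd = Math.max(0, plan.totalEmails - sentDuringBurst)
  return {
    backlogAtBurstEnd,
    drainMinutes: backlogAtBurstEnd / plan.sendLimitPerMinute,
  }
}

// 10,000 signups in one hour against a 100/minute limiter.
const estimate = estimateBacklog({
  totalEmails: 10_000,
  burstMinutes: 60,
  sendLimitPerMinute: 100,
})
```

In this scenario 4,000 emails are still queued when the hour ends and take another 40 minutes to send. That delay is tolerable for a welcome email and much less so for anything the user is waiting on, which is a reason to size the limiter and the provider plan together.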

Fix 4: Feature Flag to Shed Load at Peak

// Launch day: protect the critical path (signup, login)
// by shedding non-critical features under high load

import { getFlag } from './feature-flags'

async function getHomepageData(userId?: string): Promise<HomepageData> {
  const isUnderHighLoad = await getFlag('high-load-mode')

  // Under high load: return minimal page, no recommendations
  if (isUnderHighLoad) {
    return {
      hero: await getHeroContent(),   // Critical: cached
      features: STATIC_FEATURES,       // Critical: static
      // No personalized recommendations — DB too busy
      // No social proof — third-party API bypassed
      // No A/B test variants — feature flag returns default
    }
  }

  return {
    hero: await getHeroContent(),
    features: await getFeatures(userId),
    recommendations: await getRecommendations(userId),
    socialProof: await getSocialProof(),
  }
}

// Flip 'high-load-mode' to true if things get rough — instant effect
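
Flipping the flag by hand works, but deciding when to flip it back is where teams tend to oscillate. One option is hysteresis: enter high-load mode at one threshold and leave only once metrics are well below it. The sketch below is a suggestion, not the article's method. `nextHighLoadMode`, `LoadSignal`, and `ShedThresholds` are hypothetical names, and every threshold value except the 2% error rate is an assumption.

```typescript
interface LoadSignal {
  errorRate: number    // fraction of failed requests, e.g. 0.02 for 2%
  p95LatencyMs: number
}

interface ShedThresholds {
  enterErrorRate: number // turn shedding on above this error rate...
  enterP95Ms: number     // ...or above this p95 latency
  exitErrorRate: number  // turn shedding off only below this error rate...
  exitP95Ms: number      // ...and below this p95 latency
}

// Assumed example values; tune them from load test results.
const THRESHOLDS: ShedThresholds = {
  enterErrorRate: 0.02,
  enterP95Ms: 2000,
  exitErrorRate: 0.005,
  exitP95Ms: 800,
}

// Returns the value 'high-load-mode' should have after observing the latest signal.
function nextHighLoadMode(current: boolean, signal: LoadSignal, t: ShedThresholds): boolean {
  if (!current) {
    return signal.errorRate > t.enterErrorRate || signal.p95LatencyMs > t.enterP95Ms
  }
  const recovered = signal.errorRate < t.exitErrorRate && signal.p95LatencyMs < t.exitP95Ms
  return !recovered
}

const turnsOn = nextHighLoadMode(false, { errorRate: 0.03, p95LatencyMs: 500 }, THRESHOLDS)
const staysOn = nextHighLoadMode(true, { errorRate: 0.01, p95LatencyMs: 500 }, THRESHOLDS)
const turnsOff = nextHighLoadMode(true, { errorRate: 0.001, p95LatencyMs: 500 }, THRESHOLDS)
```

The gap between the enter and exit thresholds keeps the flag from flapping when the error rate hovers around 2%. The function only recommends a state; whether a human or an automated job writes the flag is a separate decision.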

Fix 5: War Room Protocol for Launch Day

// launch-war-room.ts: real-time status log for launch day
// getMetric and alerting are project-specific helpers assumed to exist
async function launchDayMonitor() {
  const LAUNCH_TIME = new Date('2026-03-15T14:00:00Z')
  const MONITOR_DURATION_HOURS = 4

  const interval = setInterval(async () => {
    try {
      const [signups, errors, latency, sessions, dbConns] = await Promise.all([
        getMetric('signups', '1m'),
        getMetric('error_rate', '1m'),
        getMetric('p95_latency', '5m'),
        getMetric('active_sessions', 'current'),
        getMetric('db_connections', 'current'),
      ])

      // Log status every minute, with minutes elapsed since launch
      const minutesSinceLaunch = Math.round((Date.now() - LAUNCH_TIME.getTime()) / 60_000)
      console.log(`[${new Date().toISOString()}] Launch Status (T+${minutesSinceLaunch}min):`)
      console.log(`  Signups/min:    ${signups}`)
      console.log(`  Error rate:     ${(errors * 100).toFixed(2)}%`)
      console.log(`  P95 latency:    ${latency}ms`)
      console.log(`  Active users:   ${sessions}`)
      console.log(`  DB connections: ${dbConns}`)

      // Alert a human if the error rate climbs; this does not scale anything by itself
      if (errors > 0.02) {  // >2% error rate
        await alerting.critical('Launch error rate > 2%. Consider activating high-load-mode')
      }
    } catch (err) {
      // A failed metrics fetch must not kill the monitor mid-launch
      console.error('Launch monitor tick failed:', err)
    }
  }, 60_000)

  setTimeout(() => clearInterval(interval), MONITOR_DURATION_HOURS * 3600 * 1000)
}
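
A monitor that checks the error rate every minute will page on every tick while the rate stays above 2%, and on every one-minute blip. A common mitigation is to require several consecutive breaches and to alert once per episode. The sketch below shows that idea; `createBreachTracker` is a hypothetical helper, and the choice of three consecutive minutes is an assumption.

```typescript
// Returns an observe() function that reports true only when an alert should fire:
// once the value has exceeded the threshold for `requiredConsecutive` observations in a row,
// and not again until the value has dropped back to or below the threshold.
function createBreachTracker(threshold: number, requiredConsecutive: number): (value: number) => boolean {
  let streak = 0
  let alerted = false

  return function observe(value: number): boolean {
    if (value > threshold) {
      streak += 1
      if (streak >= requiredConsecutive && !alerted) {
        alerted = true
        return true
      }
      return false
    }
    // Back under the threshold: reset so the next episode can alert again.
    streak = 0
    alerted = false
    return false
  }
}

// 2% error-rate threshold from the monitor above; 3 consecutive minutes is an assumed setting.
const shouldAlert = createBreachTracker(0.02, 3)
const firedPerMinute = [0.03, 0.03, 0.03, 0.03, 0.01, 0.05].map(rate => shouldAlert(rate))
```

In the sample sequence the alert fires only on the third consecutive bad minute and stays quiet on the fourth. The trade-off is a slower first page, which matters during a launch window, so a shorter streak may suit teams already watching the dashboard.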

Launch Readiness Checklist

  • ✅ Load tested at 5x and 10x expected launch traffic — with real endpoints
  • ✅ Email provider plan supports expected signup volume per hour
  • ✅ Auto-scaling configured and pre-warm scheduled 30 minutes before launch
  • ✅ Database connection pool sized for launch-scale concurrent users
  • ✅ Feature flag ready to shed non-critical features under load
  • ✅ War room scheduled — engineers on standby during peak window
  • ✅ Rollback plan exists and is tested (can you roll back at launch peak?)
  • ✅ Status page ready — communicate proactively if issues occur
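
The connection pool item deserves a quick calculation, because auto-scaling multiplies pool sizes: every new app instance opens its own pool. A minimal sketch of the arithmetic follows. `maxPoolSizePerInstance` and `PoolPlan` are hypothetical names, and the instance count and reserve are example values. With a pooler such as PgBouncer in front of the database, the relevant ceiling comes from the pooler's configuration instead.

```typescript
interface PoolPlan {
  maxDbConnections: number    // server-side connection limit
  reservedConnections: number // kept free for migrations, admin sessions, monitoring
  maxAppInstances: number     // instance count at the top of the auto-scaling range
}

// Largest per-instance pool size that cannot exhaust the database at full scale-out.
function maxPoolSizePerInstance(plan: PoolPlan): number {
  const usable = plan.maxDbConnections - plan.reservedConnections
  if (usable <= 0 || plan.maxAppInstances <= 0) {
    throw new Error('No usable connections for the given plan')
  }
  return Math.floor(usable / plan.maxAppInstances)
}

// Example values: a 100-connection limit, 10 held in reserve, up to 12 instances.
const poolSize = maxPoolSizePerInstance({
  maxDbConnections: 100,
  reservedConnections: 10,
  maxAppInstances: 12,
})
```

Here each instance can safely hold 7 connections. Sizing against the maximum instance count rather than the current one matters most on launch day, since the pre-warm step deliberately scales out before traffic arrives.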

Conclusion

A product launch is a planned traffic spike. The teams that execute launches well treat each one like a deployment: they load test at the expected scale, keep engineers on standby, pre-warm capacity, and agree on a clear protocol for when things go wrong. The outage that happens at launch isn't bad luck; it's the result of never asking "what if 50,000 people arrive in the first hour?" Ask that question two weeks before launch, when there's still time to fix the answers.

Written by

Sanjeev Sharma

Full Stack Engineer · E-mopro