Serverless Patterns in Production — Cold Starts, State Management, and When Lambda Fails You

Introduction

AWS Lambda is convenient but hides complexity. Cold starts typically cost 100-500ms on Java and 50-100ms on Node.js. State management is tricky without persistent connections. This post covers cold start optimization, idempotent design, SQS integration, and cost pitfalls.

Lambda Cold Start Optimization

Cold starts happen when Lambda instantiates a new execution environment. Minimize them by:

  1. Reduce bundle size (< 50MB)
  2. Initialize expensive resources outside handler
  3. Use provisioned concurrency (warm lambdas, costs money)
  4. Use Lambda SnapStart (Java only, reuses JVM state)
// handler.ts - Optimized cold start pattern
import { APIGatewayProxyHandler, APIGatewayProxyResult } from 'aws-lambda';

// GOOD: Initialize connections once, reuse across invocations
let dbConnection: any = null;
let redisClient: any = null;

async function getDbConnection() {
  if (!dbConnection) {
    // This runs once per warm container
    dbConnection = await initializeDatabase();
  }
  return dbConnection;
}

async function getRedisClient() {
  if (!redisClient) {
    redisClient = await initializeRedis();
  }
  return redisClient;
}

// Handler runs on every invocation
export const handler: APIGatewayProxyHandler = async (
  event,
  context
): Promise<APIGatewayProxyResult> => {
  try {
    // Reuse initialized connections (warm container advantage)
    const db = await getDbConnection();
    const redis = await getRedisClient();

    // Log remaining execution time (cold starts themselves show up as
    // "Init Duration" on the REPORT log line, not here)
    console.log(`Remaining time: ${context.getRemainingTimeInMillis()}ms`);

    // Handler logic
    const result = await processRequest(event, db, redis);

    return {
      statusCode: 200,
      body: JSON.stringify(result),
    };
  } catch (error) {
    console.error('Error:', error);
    return {
      statusCode: 500,
      body: JSON.stringify({ error: 'Internal server error' }),
    };
  }
};

async function initializeDatabase() {
  // Expensive initialization
  return { query: async (sql: string) => {} };
}

async function initializeRedis() {
  // Expensive initialization
  return { get: async (key: string) => null };
}

async function processRequest(event: any, db: any, redis: any) {
  return { message: 'OK' };
}

Bundle optimization:

# Minify and tree-shake dependencies
esbuild src/handler.ts --bundle --minify --platform=node --format=esm --outfile=dist/handler.js

# Result: 2MB bundle (vs 15MB unoptimized)

Provisioned Concurrency:

# Keep N instances warm 24/7
# Costs: ~$0.015 per GB-hour
# Use for: critical paths, consistent traffic

# Note: provisioned concurrency attaches to an alias or published
# version (--qualifier), never to $LATEST
aws lambda put-provisioned-concurrency-config \
  --function-name my-function \
  --qualifier live \
  --provisioned-concurrent-executions 50
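If you deploy with the Serverless Framework, the same setting can be expressed per function via its `provisionedConcurrency` shorthand (function name here is illustrative):

```yml
functions:
  api:
    handler: handler.handler
    provisionedConcurrency: 50 # keeps 50 execution environments initialized
```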

Lambda Execution Environment Reuse

Lambda reuses execution environments: code outside the handler runs once per environment, while code inside the handler runs on every invocation. But connections must be properly pooled.

// WRONG: New connection per invocation (slow, exhausts DB connection slots)
import { Client as PgClient } from 'pg';

export const badHandler: APIGatewayProxyHandler = async (event) => {
  const client = new PgClient();
  await client.connect(); // TCP + TLS + auth handshake on every request
  const result = await client.query('SELECT * FROM users');
  await client.end(); // Close after every request
  return { statusCode: 200, body: JSON.stringify(result.rows) };
};

// CORRECT: Connection pool reused across invocations
import { Pool } from 'pg';

// Pool created once per execution environment
const pool = new Pool({
  host: process.env.DB_HOST,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,
  database: process.env.DB_NAME,
  max: 10, // Max connections per environment; multiply by concurrency for total DB load
  idleTimeoutMillis: 30000,
});

// Handler reuses pooled connections
export const goodHandler: APIGatewayProxyHandler = async (event) => {
  // Get connection from pool (fast, no overhead)
  const client = await pool.connect();
  try {
    const result = await client.query('SELECT * FROM users LIMIT 10');
    return { statusCode: 200, body: JSON.stringify(result.rows) };
  } finally {
    client.release(); // Return to pool
  }
};

Idempotent Lambda Handlers

Lambda can be invoked multiple times (retries, duplicates). Handlers must be idempotent: same input → same result, regardless of invocation count.

// idempotent-handler.ts - Charge payment idempotently

interface ChargeRequest {
  customerId: string;
  amount: number;
  transactionId: string; // Unique per charge attempt
}

// Store of processed transactions (DynamoDB in production)
const processedTransactions = new Map<string, { status: string; result: any }>();

export async function chargePaymentHandler(event: ChargeRequest) {
  const idempotencyKey = `charge:${event.transactionId}`;

  // Check if already processed
  const cached = processedTransactions.get(idempotencyKey);
  if (cached) {
    console.log(`Idempotent return for ${idempotencyKey}`);
    return cached.result;
  }

  try {
    // Call payment provider with idempotency key
    const response = await chargeWithProvider({
      customerId: event.customerId,
      amount: event.amount,
      idempotencyKey: event.transactionId, // Provider deduplicates
    });

    const result = {
      status: 'success',
      transactionId: response.id,
      chargedAmount: response.amount,
    };

    // Cache result
    processedTransactions.set(idempotencyKey, {
      status: 'success',
      result,
    });

    return result;
  } catch (error) {
    // Don't cache errors; allow retry
    throw error;
  }
}

async function chargeWithProvider(params: any) {
  // Call payment API with idempotency key
  const response = await fetch('https://payment-provider.com/charge', {
    method: 'POST',
    headers: {
      'Idempotency-Key': params.idempotencyKey,
    },
    body: JSON.stringify({
      customerId: params.customerId,
      amount: params.amount,
    }),
  });

  if (!response.ok) throw new Error('Payment failed');
  return response.json();
}
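The in-memory Map above only survives within one warm container; in production the dedup record belongs in a shared store such as DynamoDB, where a conditional put (`attribute_not_exists(pk)`) closes the race between concurrent duplicates. A store-agnostic sketch of the same check-then-cache logic (all names here are illustrative, not an AWS API):

```typescript
interface IdempotencyStore {
  get(key: string): Promise<string | null>;
  put(key: string, value: string): Promise<void>;
}

// In-memory stand-in; a DynamoDB version would back get/put with
// GetItem and a conditional PutItem (attribute_not_exists(pk))
class InMemoryStore implements IdempotencyStore {
  private records = new Map<string, string>();
  async get(key: string): Promise<string | null> {
    return this.records.get(key) ?? null;
  }
  async put(key: string, value: string): Promise<void> {
    this.records.set(key, value);
  }
}

// Run an operation at most once per key: replay the cached result
// on duplicate invocations instead of repeating the side effect
async function runOnce<T>(
  store: IdempotencyStore,
  key: string,
  operation: () => Promise<T>
): Promise<T> {
  const prior = await store.get(key);
  if (prior !== null) return JSON.parse(prior) as T;

  const result = await operation();
  await store.put(key, JSON.stringify(result)); // only successes are cached
  return result;
}
```

Swapping `InMemoryStore` for a DynamoDB-backed implementation changes only the storage calls; the handler logic stays identical.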

Lambda with SQS (Batch Processing, Partial Failure)

Integrate Lambda with SQS for durable, scalable batch processing. Handle partial failures gracefully.

// sqs-batch-handler.ts - Process SQS messages, reporting partial failures
import { SQSBatchResponse, SQSEvent } from 'aws-lambda';

export const handler = async (event: SQSEvent): Promise<SQSBatchResponse> => {
  const results: {
    messageId: string;
    status: 'success' | 'failed';
  }[] = [];

  // Process each message
  for (const record of event.Records) {
    try {
      const message = JSON.parse(record.body);
      await processMessage(message);
      results.push({ messageId: record.messageId, status: 'success' });
    } catch (error) {
      console.error(`Failed to process message ${record.messageId}:`, error);
      results.push({ messageId: record.messageId, status: 'failed' });
      // Don't throw: keep processing remaining messages
    }
  }

  // Report only the failed messages; SQS re-delivers those and deletes
  // the rest (requires ReportBatchItemFailures on the event source mapping)
  const failedIds = results
    .filter(r => r.status === 'failed')
    .map(r => r.messageId);

  return {
    batchItemFailures: failedIds.map(id => ({
      itemIdentifier: id, // note: AWS expects "itemIdentifier", not "itemId"
    })),
  };
};

async function processMessage(message: any) {
  console.log(`Processing message: ${JSON.stringify(message)}`);
  // Simulate work
  await new Promise(r => setTimeout(r, 100));
}

SQS configuration:

# serverless.yml - SQS queue trigger
functions:
  batchProcessor:
    handler: sqs-batch-handler.handler
    events:
      - sqs:
          arn:
            Fn::GetAtt: [OrderQueue, Arn]
          batchSize: 10
          maximumBatchingWindow: 5 # Aggregate messages for up to 5 seconds
          functionResponseType: ReportBatchItemFailures # Enable partial batch responses
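Retry count for SQS-triggered functions is controlled by the queue's redrive policy (`maxReceiveCount`), not a function-level setting. A sketch of the OrderQueue referenced above with a dead-letter queue (resource names and values are illustrative):

```yml
resources:
  Resources:
    OrderQueue:
      Type: AWS::SQS::Queue
      Properties:
        QueueName: order-queue
        VisibilityTimeout: 60 # common guidance: at least 6x the function timeout
        RedrivePolicy:
          deadLetterTargetArn:
            Fn::GetAtt: [OrderDLQ, Arn]
          maxReceiveCount: 3 # after 3 failed receives, move to the DLQ
    OrderDLQ:
      Type: AWS::SQS::Queue
      Properties:
        QueueName: order-queue-dlq
```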

Lambda Power Tuning for Cost/Performance

Lambda bills in GB-seconds (memory × duration). More memory also buys more CPU, so execution gets faster while each millisecond gets pricier. Find the memory setting that minimizes cost:

// powertune-analysis.ts - Analyze cost/performance
// Use the AWS Lambda Power Tuning tool (a Step Functions state machine)

// Illustrative per-invocation costs for a CPU-bound task:
// 128 MB:  100 s -> 12.5 GB-s -> $0.000208
// 256 MB:   50 s -> 12.5 GB-s -> $0.000208
// 512 MB:   30 s -> 15.0 GB-s -> $0.000250
// 1024 MB:  20 s -> 20.0 GB-s -> $0.000333

// Sweet spot here: 256-512 MB (256 MB halves latency at no extra cost;
// 512 MB buys further speed for a modest premium)
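The arithmetic behind those curves is easy to script. A sketch using current us-east-1 list prices ($0.0000166667 per GB-second plus $0.20 per million requests; verify against the AWS pricing page before relying on the numbers):

```typescript
// Hypothetical helper: estimate monthly Lambda compute + request cost
const GB_SECOND_USD = 0.0000166667; // x86 duration price, us-east-1
const REQUEST_USD = 0.0000002; // $0.20 per 1M requests

function lambdaMonthlyCost(
  memoryMb: number,
  avgDurationMs: number,
  invocationsPerMonth: number
): number {
  const gbSeconds =
    invocationsPerMonth * (avgDurationMs / 1000) * (memoryMb / 1024);
  return gbSeconds * GB_SECOND_USD + invocationsPerMonth * REQUEST_USD;
}

// 1M requests/month at 100ms on 256 MB: ~$0.62 of compute
console.log(lambdaMonthlyCost(256, 100, 1_000_000).toFixed(2)); // "0.62"
```

Running this across your real memory/duration measurements reproduces the cost curve the Power Tuning tool plots for you.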

Lambda Layers for Shared Code

Lambda Layers bundle shared code, dependencies, or libraries. Deploy once, reference from multiple functions.

# Create layer (Node.js layers must unpack to nodejs/node_modules)
mkdir -p nodejs
npm install --prefix nodejs some-lib
zip -r lambda-layer.zip nodejs/

aws lambda publish-layer-version \
  --layer-name shared-dependencies \
  --zip-file fileb://lambda-layer.zip \
  --compatible-runtimes nodejs20.x

# Reference in function
aws lambda update-function-configuration \
  --function-name my-function \
  --layers arn:aws:lambda:us-east-1:123456789:layer:shared-dependencies:1

When Serverless Costs More Than EC2

Serverless has hidden costs:

  • Cold starts: Extra latency (100-500ms per cold invocation)
  • Execution time: Billed per 1ms of duration, rounded up
  • Data transfer: Egress costs spike with high volume
  • Provisioned concurrency: $0.015/GB-hour (often no cheaper than EC2)
  • Always-running services: compute bills at ~$0.0000167 per GB-second (≈ $0.06/GB-hour), with no always-on discount
// Cost comparison: Lambda vs EC2 (illustrative, us-east-1 list prices)

// Lambda: 100M requests/month, 100ms average, 256 MB
// Compute: 100M × 0.1s × 0.25 GB × $0.0000166667/GB-s ≈ $42/month
// Requests: 100M × $0.20 per 1M = $20/month
// + Data transfer: 500GB egress × $0.09 = $45/month
// Total: ~$107/month

// EC2 (t3.small, always-on):
// Instance: $0.021/hour × 720 hours = $15.12/month
// Data transfer: 500GB × $0.09 = $45/month
// Total: ~$60/month

// EC2 wins for always-running, high-volume workloads

Use Lambda for:

  • Occasional requests (< 100 req/day)
  • Variable traffic (spiky)
  • Event-driven (S3, DynamoDB, SNS)

Use EC2 for:

  • Always-running services
  • Consistent baseline traffic
  • Long-running processes (> 15 minutes)

Checklist

  • Optimize bundle size to < 50MB (minify, tree-shake)
  • Initialize expensive resources outside handler
  • Reuse connections via pooling
  • Implement idempotent handlers with request deduplication
  • Handle partial SQS failures gracefully
  • Use Power Tuning to find cost-optimal memory
  • Use Layers for shared dependencies
  • Monitor cold starts via CloudWatch metrics
  • Set appropriate timeouts (default 3s, often too short)
  • Calculate true costs: compute + data transfer + storage
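For the cold start monitoring item above: Lambda's REPORT log line carries an Init Duration field only when the invocation was a cold start, and CloudWatch Logs Insights exposes it as `@initDuration` (assuming the standard Lambda log format):

```
filter @type = "REPORT" and ispresent(@initDuration)
| stats count() as coldStarts, avg(@initDuration) as avgInitMs by bin(1h)
```

Run it against the function's /aws/lambda/<name> log group to see whether bundle slimming or provisioned concurrency is actually paying off.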

Conclusion

Lambda excels at event-driven, low-frequency workloads. Cold starts, connection management, and idempotency are critical in production. For always-running services, EC2 is often cheaper. Understand your traffic pattern before defaulting to serverless.