CockroachDB for Global Applications — Distributed SQL Without the Distributed Systems Expertise
- Author: Sanjeev Sharma (@webcoderspeed1)
Introduction
CockroachDB brings distributed SQL to applications without requiring deep distributed systems expertise. Unlike PostgreSQL (a single-primary database, so writes stay in one region) or ClickHouse (built for analytics, not transactions), CockroachDB provides ACID transactions across regions, automatic replication, and transparent failover. This guide covers production CockroachDB deployments: when to choose it, table localities for latency optimization, and handling transaction retries in application code.
- When to Choose CockroachDB vs PostgreSQL
- Multi-Region Table Localities
- Follower Reads for Low-Latency Reads
- Transaction Retries and Retry Loops
- Schema Design for Multi-Region
- Survivability Goals and Zone Configuration
- SHOW RANGES for Debugging Locality
- Checklist
- Conclusion
When to Choose CockroachDB vs PostgreSQL
Understanding the tradeoff between consistency and operational complexity guides technology selection.
Decision Matrix:
PostgreSQL:
- Single region, simple operations
- Strong consistency, ACID guaranteed
- Standard SQL, no learning curve
- Familiar tooling (pg_dump, pg_restore)
- Scaling: read replicas only (no write scaling)
CockroachDB:
- Multi-region, geographically distributed
- Strong consistency across regions (ACID)
- Distributed SQL (mostly compatible with Postgres)
- Automatic replication and failover
- Scaling: write scaling across regions
ClickHouse/Elasticsearch:
- Analytics/search, immutable data
- Eventual consistency acceptable
- Complex aggregations (ClickHouse) or relevance (ES)
- Separate from transactional database
Cloud Spanner (GCP):
- Multi-region, strong consistency
- Managed service (no ops burden)
- Expensive ($$$)
- Vendor lock-in
Vitess (MySQL sharding):
- MySQL compatibility
- Horizontal sharding (application-aware)
- Complex operational model
- Good for existing MySQL shops
Real-world scenarios:
Scenario 1: SaaS with European + US customers
- PostgreSQL: Read replicas in each region (eventual consistency for reads)
- CockroachDB: Seamless failover, strong consistency globally
- Decision: CockroachDB (ACID across regions is critical)
Scenario 2: Analytics pipeline (events → warehouse)
- PostgreSQL: OLTP database, separate analytics pipeline
- CockroachDB: Acceptable but over-engineered for analytics
- Decision: PostgreSQL + ClickHouse (separation of concerns)
Scenario 3: Fintech (accounts, transactions)
- PostgreSQL: Single region only (regulatory, latency requirements)
- CockroachDB: Multiple regions with strong consistency
- Decision: CockroachDB (ACID mandatory, global redundancy required)
Scenario 4: Startup MVP
- PostgreSQL: Simplicity, maturity
- CockroachDB: Complexity not needed yet
- Decision: PostgreSQL (migration path to CockroachDB later)
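The decision matrix above can be sketched as a tiny routing helper. The type and function names here are hypothetical, and real selection involves more dimensions (team skills, cost, compliance):

```typescript
// Hypothetical helper encoding the decision matrix above (illustrative only).
type Requirements = {
  multiRegion: boolean;   // users/data span regions
  needsAcid: boolean;     // transactional correctness required
  analyticsOnly: boolean; // immutable events, complex aggregations
};

function pickDatastore(req: Requirements): string {
  if (req.analyticsOnly) return "clickhouse";                 // eventual consistency acceptable
  if (req.multiRegion && req.needsAcid) return "cockroachdb"; // ACID across regions
  return "postgresql";                                        // single region, simple operations
}
```

For example, the fintech scenario (multi-region, ACID mandatory) routes to CockroachDB, while the startup MVP routes to PostgreSQL.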
Multi-Region Table Localities
CockroachDB's table localities determine data placement and latency.
-- LOCALITY 1: GLOBAL (replicated everywhere, always local reads)
-- Use for: Reference data, lookup tables (rare writes)
CREATE TABLE countries (
id INT PRIMARY KEY,
name STRING,
code STRING UNIQUE
) LOCALITY GLOBAL;
-- Explanation: Every region holds a full, readable copy
-- Read: Always served by a local replica (no cross-region hop)
-- Write: Coordinated across regions (higher latency)
-- LOCALITY 2: REGIONAL BY ROW (data partitioned by region)
-- Use for: User data, customer records (region-specific)
CREATE TABLE orders (
id UUID PRIMARY KEY,
crdb_region crdb_internal_region,
customer_id UUID,
amount DECIMAL,
created_at TIMESTAMP DEFAULT NOW()
) LOCALITY REGIONAL BY ROW;
-- Explanation: Each order lives in one region (via crdb_region column)
-- Read: Local when the reading region matches the row's home region
-- Write: Local to the row's home region
-- Failure: Under zone-level survivability, a region outage can make that region's rows unavailable
-- Tune replication for the table's ranges:
ALTER TABLE orders CONFIGURE ZONE USING num_replicas = 3;
-- LOCALITY 3: REGIONAL BY TABLE (pinned to one home region)
-- Use for: Tables accessed mostly from a single region
CREATE TABLE regional_configs (
id INT PRIMARY KEY,
config_key STRING,
config_value STRING
) LOCALITY REGIONAL BY TABLE IN PRIMARY REGION;
-- Explanation: Leaseholder and voting replicas live in the home region
-- Read: Fast from the home region, cross-region latency from others
-- Write: Coordinated within the home region
-- Failure: Availability under region loss depends on the database's survivability goal
-- View table locality settings (multi-region databases include a locality column)
SHOW TABLES;
-- Or inspect one table's DDL, including its LOCALITY clause:
SHOW CREATE TABLE orders;
-- Update table locality
ALTER TABLE orders SET LOCALITY REGIONAL BY ROW;
ALTER TABLE countries SET LOCALITY GLOBAL;
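As a rule of thumb, the three localities map onto access patterns. The sketch below encodes that mapping; the type and function names are hypothetical, not a CockroachDB API:

```typescript
// Hypothetical rule-of-thumb for choosing a table locality (not a CockroachDB API).
type AccessPattern = {
  writeFrequency: "rare" | "frequent";
  readFromAllRegions: boolean; // readers spread across regions
  rowsHaveHomeRegion: boolean; // each row naturally belongs to one region
};

function suggestLocality(p: AccessPattern): string {
  // Rarely-written reference data read everywhere: replicate globally
  if (p.writeFrequency === "rare" && p.readFromAllRegions) return "GLOBAL";
  // Rows with a natural home region (users, orders): partition per row
  if (p.rowsHaveHomeRegion) return "REGIONAL BY ROW";
  // Otherwise pin the whole table to its busiest region
  return "REGIONAL BY TABLE";
}
```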
Follower Reads for Low-Latency Reads
Follower reads serve stale data from local replicas, eliminating cross-region latency.
-- Default: Strong consistency (reads from leader/leaseholder, may cross region)
SELECT * FROM orders WHERE customer_id = '12345';
-- Follower read: Read from any replica (stale data, local latency)
SELECT * FROM orders AS OF SYSTEM TIME follower_read_timestamp()
WHERE customer_id = '12345';
-- Explanation:
-- Strong read: may pay a cross-region round trip (e.g., ~100ms) when the leaseholder is remote
-- Follower read: served by a nearby replica (e.g., ~5ms)
-- Tradeoff: Stale data (by default roughly 4-5 seconds behind the present)
-- Follower reads with bounded staleness (single-statement reads only)
SELECT * FROM orders AS OF SYSTEM TIME with_max_staleness('30s')
WHERE customer_id = '12345';
-- Use follower reads for:
-- - Analytics queries (freshness not critical)
-- - Read-heavy workloads (cache warming)
-- - Non-critical reads (reporting)
-- Avoid follower reads for:
-- - Transactional reads (must be fresh)
-- - Read-after-write consistency (must wait for commit)
-- Application-level follower read pattern
function getPreviousOrdersAnalytical(customerId) {
// Follower read: acceptable staleness for analytics
return db.query(
`SELECT * FROM orders AS OF SYSTEM TIME follower_read_timestamp()
WHERE customer_id = $1 AND created_at > NOW() - INTERVAL '90 days'`,
[customerId]
);
}
function getLatestOrder(customerId) {
// Strong read: must be current
return db.query(
`SELECT * FROM orders WHERE customer_id = $1 ORDER BY created_at DESC LIMIT 1`,
[customerId]
);
}
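A small builder can make the staleness decision explicit at call sites instead of duplicating two query variants. `ordersQuery` here is a hypothetical helper, not part of any driver:

```typescript
// Hypothetical builder: opt into a follower read only when staleness is acceptable.
function ordersQuery(opts: { allowStale: boolean }): string {
  const asOf = opts.allowStale
    ? " AS OF SYSTEM TIME follower_read_timestamp()"
    : "";
  return `SELECT * FROM orders${asOf} WHERE customer_id = $1`;
}
```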
Transaction Retries and Retry Loops
Network partitions and transaction conflicts cause retries. Proper retry logic is essential.
// Node.js: Handle CockroachDB transaction retries
import pg from 'pg';
const pool = new pg.Pool({
host: 'crdb-cluster',
database: 'myapp',
// Retry configuration
max: 20,
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 5000,
});
// Retry decorator with exponential backoff
function withRetry(fn: () => Promise<any>, maxRetries = 5) {
return async function executeWithRetry() {
let lastError: Error | null = null;
for (let attempt = 0; attempt < maxRetries; attempt++) {
try {
return await fn();
} catch (err: any) {
lastError = err;
// Check if error is retryable
if (!isRetryableError(err)) {
throw err; // Not retryable, fail immediately
}
// Exponential backoff: 100ms, 200ms, 400ms, 800ms, capped at 1s
const backoffMs = Math.min(1000, 100 * Math.pow(2, attempt));
console.log(`Retry attempt ${attempt + 1}/${maxRetries}, backoff ${backoffMs}ms`);
await new Promise(resolve => setTimeout(resolve, backoffMs));
}
}
throw new Error(`Failed after ${maxRetries} retries: ${lastError?.message}`);
};
}
function isRetryableError(err: any): boolean {
// CockroachDB retryable error codes (SQLSTATE)
const retryableErrors = [
'40001', // serialization_failure (transaction conflict; always safe to retry)
'40P01', // deadlock_detected
'57P03', // cannot_connect_now (node draining or restarting)
];
// Note: syntax errors (42601) and constraint violations are never retryable
return retryableErrors.includes(err.code);
}
// Transaction with automatic retry
async function transferFunds(fromId: string, toId: string, amount: number) {
const executeTransfer = async () => {
const client = await pool.connect();
try {
await client.query('BEGIN');
// Debit from account
await client.query(
`UPDATE accounts SET balance = balance - $1 WHERE id = $2`,
[amount, fromId]
);
// Credit to account
await client.query(
`UPDATE accounts SET balance = balance + $1 WHERE id = $2`,
[amount, toId]
);
// Log transaction
await client.query(
`INSERT INTO transfers (from_id, to_id, amount, status) VALUES ($1, $2, $3, 'completed')`,
[fromId, toId, amount]
);
await client.query('COMMIT');
return { success: true };
} catch (err) {
await client.query('ROLLBACK');
throw err;
} finally {
client.release();
}
};
return withRetry(executeTransfer)();
}
// Idempotent transaction (safe to retry)
async function idempotentTransfer(
transferId: string,
fromId: string,
toId: string,
amount: number
) {
const execute = async () => {
const client = await pool.connect();
try {
await client.query('BEGIN');
// Check if already processed (idempotency key)
const existing = await client.query(
`SELECT id FROM transfers WHERE transfer_id = $1`,
[transferId]
);
if (existing.rows.length > 0) {
// Already processed, safe to return
return { success: true, idempotent: true };
}
// Perform transfer
await client.query(
`UPDATE accounts SET balance = balance - $1 WHERE id = $2`,
[amount, fromId]
);
await client.query(
`UPDATE accounts SET balance = balance + $1 WHERE id = $2`,
[amount, toId]
);
// Record with idempotency key
await client.query(
`INSERT INTO transfers (transfer_id, from_id, to_id, amount, status)
VALUES ($1, $2, $3, $4, 'completed')`,
[transferId, fromId, toId, amount]
);
await client.query('COMMIT');
return { success: true };
} catch (err) {
await client.query('ROLLBACK');
throw err;
} finally {
client.release();
}
};
return withRetry(execute)();
}
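One way to obtain the idempotency key is to derive it deterministically from the request itself, so client-side retries of the same logical transfer always carry the same `transfer_id`. A sketch; the hashing scheme and `requestId` parameter are assumptions, not part of CockroachDB:

```typescript
import { createHash } from "node:crypto";

// Hypothetical: deterministic idempotency key derived from the request.
// requestId should be generated once per user action (e.g., on the client).
function transferKey(fromId: string, toId: string, amount: number, requestId: string): string {
  return createHash("sha256")
    .update(`${fromId}|${toId}|${amount}|${requestId}`)
    .digest("hex");
}
```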
Schema Design for Multi-Region
Table design determines transaction locality and latency.
-- MULTI-REGION SCHEMA: Global app with per-region data
-- Reference data (global, static)
CREATE TABLE currencies (
code STRING PRIMARY KEY,
name STRING,
symbol STRING
) LOCALITY GLOBAL;
-- User data (regional, partitioned by user's region)
CREATE TABLE users (
id UUID PRIMARY KEY,
crdb_region CRDB_INTERNAL_REGION,
email STRING UNIQUE,
name STRING,
home_region STRING,
created_at TIMESTAMP DEFAULT NOW()
) LOCALITY REGIONAL BY ROW;
-- Transactions (regional, lives in user's region)
CREATE TABLE transactions (
id UUID PRIMARY KEY,
crdb_region CRDB_INTERNAL_REGION,
user_id UUID,
amount DECIMAL,
description STRING,
created_at TIMESTAMP DEFAULT NOW(),
FOREIGN KEY (user_id) REFERENCES users(id)
) LOCALITY REGIONAL BY ROW;
-- Indexes (optimized for regional queries)
-- Note: the literal must match one of the cluster's region names
CREATE INDEX idx_transactions_user_created
ON transactions (user_id, created_at DESC)
WHERE crdb_region = 'us-east';
-- Avoid: rolling your own region_id foreign key for placement (joins and FK checks cross regions)
-- Prefer: the crdb_region column, which CockroachDB uses directly to place each row
-- Query locality analysis
EXPLAIN SELECT * FROM transactions WHERE user_id = '12345';
-- Check the plan's spans and distribution: a scan confined to ranges in the
-- local region avoids the cross-region latency penalty
-- View row counts per region
SELECT
crdb_region,
count(*) AS row_count
FROM transactions
GROUP BY crdb_region
ORDER BY row_count DESC;
Survivability Goals and Zone Configuration
Define how many region failures your application can survive.
-- SURVIVABILITY GOAL 1: Region Failure
-- Survive loss of 1 region (3-region cluster, replicas spread 1 per region)
ALTER DATABASE myapp CONFIGURE ZONE USING num_replicas = 3;
-- Multi-region databases can declare this goal directly:
ALTER DATABASE myapp SURVIVE REGION FAILURE;
-- Query: What zone configuration applies to our database?
SHOW ZONE CONFIGURATION FROM DATABASE myapp;
-- SURVIVABILITY GOAL 2: Node Failure
-- Survive loss of 1 node (each region has 3+ nodes)
-- Configuration: 3 replicas within each region
ALTER TABLE transactions CONFIGURE ZONE USING
num_replicas = 3,
constraints = '[+region=us-east]';
-- SURVIVABILITY GOAL 3: Zone Failure (within region)
-- Survive loss of 1 availability zone
-- Configuration: replicas spread across 3 AZs
-- Per-replica constraints: one replica in each AZ
ALTER TABLE transactions CONFIGURE ZONE USING
num_replicas = 3,
constraints = '{"+zone=us-east-1a": 1, "+zone=us-east-1b": 1, "+zone=us-east-1c": 1}';
-- REPLICATION FACTOR GUIDANCE
-- num_replicas = 1: No redundancy (development only)
-- num_replicas = 3: Survive 1 failure (standard)
-- num_replicas = 5: Survive 2 failures (high availability)
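This guidance follows directly from Raft quorum math: a range stays available while a majority of its replicas survive. A quick sketch:

```typescript
// A range needs a majority (quorum) of replicas to stay available,
// so n replicas tolerate floor((n - 1) / 2) simultaneous failures.
function maxTolerableFailures(numReplicas: number): number {
  return Math.floor((numReplicas - 1) / 2);
}
```

Hence 3 replicas survive 1 failure and 5 survive 2. An even count buys no extra fault tolerance (4 replicas still tolerate only 1 failure), which is why replication factors are odd.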
-- Verify zone configurations across the cluster
SELECT
zone_id,
target,
raw_config_sql
FROM crdb_internal.zones
ORDER BY target;
-- Monitor replica distribution
SELECT
range_id,
lease_holder,
replicas
FROM crdb_internal.ranges
WHERE database_name = 'myapp'
LIMIT 10;
SHOW RANGES for Debugging Locality
Understand how data is distributed with SHOW RANGES.
-- View range distribution (where data physically lives)
SHOW RANGES FROM TABLE orders;
-- Simplified illustrative output (exact columns vary by version):
-- range_id | lease_holder_locality | replica_localities
-- 1 | region=us-east | {region=us-east, region=eu-west, region=us-west}
-- 2 | region=eu-west | {region=eu-west, region=us-east, region=us-west}
-- Explanation:
-- - range_id: Unique identifier for a contiguous chunk of the keyspace
-- - lease_holder_locality: Where the leaseholder lives (it serves consistent reads and coordinates writes)
-- - replica_localities: All localities storing copies
-- - A leaseholder far from the workload's region adds a cross-region hop to every strong read
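When scripting over SHOW RANGES output, a tiny predicate can flag ranges led from outside a target region. A sketch assuming locality strings of the form `region=...,az=...`; the exact key=value match avoids confusing prefixes like `us-east` vs `us-east1`:

```typescript
// Hypothetical check over SHOW RANGES output: does the leaseholder's
// locality string place it in the target region?
function leaseholderInRegion(leaseHolderLocality: string, targetRegion: string): boolean {
  // Split "region=us-east,az=us-east-1a" into key=value pairs for an exact match
  return leaseHolderLocality
    .split(",")
    .some((kv) => kv.trim() === `region=${targetRegion}`);
}
```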
-- Debugging: Find ranges whose leaseholder sits outside the expected region
-- (on v23.1+ add WITH DETAILS to include leaseholder columns)
SELECT range_id, lease_holder, lease_holder_locality
FROM [SHOW RANGES FROM TABLE orders]
WHERE lease_holder_locality NOT LIKE '%region=us-east%'
LIMIT 10;
-- CockroachDB rebalances leaseholders automatically; monitor the distribution:
SELECT
lease_holder,
count(*) AS ranges_led
FROM crdb_internal.ranges
WHERE database_name = 'myapp'
GROUP BY lease_holder
ORDER BY ranges_led DESC;
Checklist
- Database replication factor set appropriately (3 for multi-region)
- Table localities configured (GLOBAL for reference, REGIONAL BY ROW for data)
- Zone constraints configured for survivability goals
- Follower reads implemented for analytical queries
- Transaction retry logic with exponential backoff implemented
- Idempotency keys used for critical operations
- Foreign keys reviewed for cross-region impact
- Schema partitioned to minimize cross-region queries
- Range distribution monitored (SHOW RANGES)
- Cross-region latency tested in staging
- Network partition resilience tested
- Failover procedures documented and tested
Conclusion
CockroachDB brings distributed SQL and operational simplicity to global applications. Design tables with appropriate localities (GLOBAL for reference data, REGIONAL BY ROW for user data), use follower reads for analytical workloads, and implement robust retry logic with idempotency keys. Monitor range distribution with SHOW RANGES to ensure data placement matches your latency requirements, and configure zone constraints based on your survivability goals. With careful schema design and proper handling of distributed transactions, CockroachDB delivers strong consistency and automatic failover across regions without the operational complexity of manual sharding.