Redis Cluster in Production — Sharding, Failover, and the Gotchas No One Warns You About

Introduction

Redis Cluster distributes data across multiple nodes by mapping every key to one of 16,384 hash slots (CRC16 of the key, modulo 16384) — a fixed-slot scheme, not classic consistent hashing. Cluster mode introduces subtle gotchas: CROSSSLOT errors when querying related data, resharding complexity, and replication lag during failovers. This guide covers production Cluster deployments, from slot allocation to handling multi-key operations safely.

Redis Cluster Architecture: 16,384 Hash Slots

Redis Cluster partitions the key space into 16,384 fixed hash slots. Each key maps deterministically to one slot, and each master node owns a subset of the slots.

Slot Distribution (Example: 6 nodes, 3 masters + 3 replicas)

Master-0: slots 0-5461       Replica-0: replica of Master-0
Master-1: slots 5462-10922   Replica-1: replica of Master-1
Master-2: slots 10923-16383  Replica-2: replica of Master-2

Hashing:
KEYSLOT = CRC16(key) % 16384

Example:
KEY "user:123"  → CRC16("user:123")  % 16384 → slot 7890  → stored on Master-1
KEY "order:456" → CRC16("order:456") % 16384 → slot 12000 → stored on Master-2
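Clients cache this slot→node map and route each command by slot. Under the example layout above, routing reduces to a range lookup — a sketch using the hypothetical node names from the diagram (a real client rebuilds this table from CLUSTER SLOTS responses and refreshes it on MOVED redirects):

```typescript
// Slot ranges from the example layout above (node names are illustrative)
const slotRanges: Array<{ start: number; end: number; node: string }> = [
  { start: 0, end: 5461, node: 'Master-0' },
  { start: 5462, end: 10922, node: 'Master-1' },
  { start: 10923, end: 16383, node: 'Master-2' },
];

// Find which master owns a given slot via a linear range scan
function ownerOfSlot(slot: number): string {
  const range = slotRanges.find((r) => slot >= r.start && slot <= r.end);
  if (!range) throw new Error(`slot ${slot} is outside 0-16383`);
  return range.node;
}

// Matches the examples above: slot 7890 → Master-1, slot 12000 → Master-2
```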
# redis.conf cluster configuration
port 6379
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 15000  # 15 seconds
cluster-migration-barrier 1
cluster-require-full-coverage yes  # require all slots assigned
timeout 0
tcp-backlog 511
daemonize no
databases 16

# Persistence
save 900 1        # save if at least 1 key changed within 900 seconds
save 300 10       # save if at least 10 keys changed within 300 seconds
save 60 10000     # save if at least 10000 keys changed within 60 seconds
appendonly yes
appendfsync everysec

# Replication
repl-diskless-sync no
repl-diskless-sync-delay 5

# Max memory policy
maxmemory 2gb
maxmemory-policy allkeys-lru  # evict LRU keys when full

# Networking
bind 127.0.0.1
protected-mode no
# cluster-announce-ip ACTUAL_IP (for Docker/K8s)
#!/bin/bash
# Create Redis Cluster with 6 nodes (3 masters, 3 replicas)

NODES=6
PORT_START=7000

# Start Redis instances
for i in $(seq 0 $((NODES-1))); do
  PORT=$((PORT_START + i))
  echo "Starting Redis on port $PORT"
  redis-server --port $PORT --cluster-enabled yes \
    --cluster-config-file /data/nodes-$PORT.conf \
    --daemonize yes \
    --logfile /var/log/redis-$PORT.log
done

# Wait for all nodes to start
sleep 2

# Create cluster (automatic replica assignment)
redis-cli --cluster create localhost:7000 localhost:7001 localhost:7002 \
  localhost:7003 localhost:7004 localhost:7005 \
  --cluster-replicas 1 \
  --cluster-yes

# Verify cluster status
redis-cli -p 7000 cluster info
redis-cli -p 7000 cluster nodes

Hash Slots and Slot Assignment

Understanding slot assignment is critical for resharding and rebalancing.

CRC16 Hash Space: 0 to 2^16-1 (65535)
Redis Cluster Slots: 16384 (2^14 — small enough that each node's slot bitmap fits in 2 KB of every heartbeat packet)

Slot Sharing Example:
Two different keys can map to the same slot (by design — a slot groups many keys, and hash tags exploit this to enable multi-key operations)
"user:123" → CRC16 % 16384 → slot 100
"user:124" → CRC16 % 16384 → slot 100 (same slot, different key)

Slot Ownership Query:
Every key belongs to exactly one slot.
Which node owns slot 100?
// ioredis automatically handles slot calculation
import Redis from 'ioredis';

// Single node (not cluster-aware)
const singleNodeRedis = new Redis({
  host: 'localhost',
  port: 6379,
});

// Cluster mode (cluster-aware client)
const cluster = new Redis.Cluster(
  [
    { host: 'redis-1', port: 6379 },
    { host: 'redis-2', port: 6379 },
    { host: 'redis-3', port: 6379 },
  ],
  {
    dnsLookup: (address, callback) => {
      // Pass node addresses through unresolved (important for K8s/Docker,
      // where nodes may announce internal IPs)
      callback(null, address);
    },
    enableReadyCheck: false,
    enableOfflineQueue: true,
    maxRedirections: 16,  // max MOVED/ASK redirects before giving up
    retryDelayOnFailover: 100,
    retryDelayOnClusterDown: 300,
    slotsRefreshTimeout: 2000,
  }
);

// ioredis calculates slot internally
async function setUser(userId: number, data: Record<string, any>) {
  // ioredis hashes "user:123" and routes to correct node
  await cluster.set(`user:${userId}`, JSON.stringify(data), 'EX', 3600);
}

async function getUser(userId: number) {
  const data = await cluster.get(`user:${userId}`);
  return data ? JSON.parse(data) : null;
}

// Check which slot a key belongs to
function getSlot(key: string): number {
  // Honor hash tags: "{user:123}.orders" hashes only "user:123"
  const open = key.indexOf('{');
  if (open !== -1) {
    const close = key.indexOf('}', open + 1);
    if (close > open + 1) key = key.substring(open + 1, close);
  }

  // Bitwise CRC16 (XMODEM variant, the one Redis uses) — no lookup table needed
  let crc = 0;
  for (let i = 0; i < key.length; i++) {
    crc ^= key.charCodeAt(i) << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = crc & 0x8000 ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
    }
  }
  return crc % 16384;
}

// Verify slot
console.log(`Key "user:123" belongs to slot ${getSlot('user:123')}`);

Multi-Key Operations and CROSSSLOT Errors

The Cluster's biggest gotcha: multi-key operations (MGET, MSET, pipelined operations) fail if keys are in different slots.

// CROSSSLOT ERROR: These keys are in different slots
async function exampleCROSSFAIL() {
  try {
    // FAILS: MGET across multiple slots
    await cluster.mget('user:1', 'order:2', 'product:3');
    // Error: CROSSSLOT Keys in request don't hash to the same slot
  } catch (err) {
    console.error(err.message);
  }

  try {
    // FAILS: pipelined commands across slots
    const pipe = cluster.pipeline();
    pipe.set('user:1', 'data1');
    pipe.set('order:2', 'data2');  // different slot
    await pipe.exec();
    // Error: CROSSSLOT in pipelined request
  } catch (err) {
    console.error(err.message);
  }
}

// SOLUTION 1: Use hash tags to force keys into same slot
// {tag} ensures keys with same tag go to same slot
async function useHashTags() {
  const userId = '123';

  // All keys with {123} tag go to same slot
  await cluster.set(`user:{${userId}}`, 'user_data');
  await cluster.set(`order:{${userId}}`, 'order_data');
  await cluster.set(`cart:{${userId}}`, 'cart_data');

  // Now these keys are in same slot, MGET works
  const [user, order, cart] = await cluster.mget(
    `user:{${userId}}`,
    `order:{${userId}}`,
    `cart:{${userId}}`
  );

  return { user, order, cart };
}

// SOLUTION 2: Separate commands per key (slower, but always works)
async function noMultiKeyOps() {
  const user = await cluster.get(`user:1`);
  const order = await cluster.get(`order:2`);
  const product = await cluster.get(`product:3`);
  return { user, order, product };
}

// SOLUTION 3: Use Lua script (executes on single node, safe)
async function luaScript() {
  const script = `
    local user = redis.call('GET', KEYS[1])
    local order = redis.call('GET', KEYS[2])
    return {user, order}
  `;

  // Hash tags ensure all keys execute on same node
  const userId = '123';
  const keys = [`user:{${userId}}`, `order:{${userId}}`];

  try {
    const result = await cluster.eval(script, keys.length, ...keys);
    console.log('Lua result:', result);
  } catch (err) {
    console.error('Lua error:', err.message);
  }
}

// SOLUTION 4: Redesign data model (best practice)
// Instead of separate user/order/cart keys, use HSET
async function betterDataModel() {
  const userId = '123';

  // Single HSET call (atomic, single slot)
  await cluster.hset(`user_data:{${userId}}`, 'info', 'user_json', 'orders', 'orders_json', 'cart', 'cart_json');

  // Read all related data in one command
  const { info, orders, cart } = await cluster.hgetall(`user_data:{${userId}}`);

  return { info, orders, cart };
}
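One subtlety with hash tags: per the Cluster specification, only the first `{` and the first `}` after it are considered, and an empty tag (`{}`) is ignored, so the whole key is hashed. A small sketch of the extraction rule:

```typescript
// Extract the hash tag from a key, following the Redis Cluster spec:
// hash only the substring between the first '{' and the next '}',
// unless the tag is empty or the braces are unmatched.
function hashTag(key: string): string {
  const open = key.indexOf('{');
  if (open === -1) return key;          // no '{' — hash the whole key
  const close = key.indexOf('}', open + 1);
  if (close === -1) return key;         // no closing '}' after it
  if (close === open + 1) return key;   // empty tag "{}"
  return key.substring(open + 1, close);
}

// "user:{123}" and "order:{123}" both hash "123" → same slot
```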

Hot Slot Detection and Resharding

Some slots receive disproportionate traffic (hotspots). Detecting and resharding them prevents bottlenecks.

#!/bin/bash
# Monitor slot traffic (Redis has no built-in per-slot counters — combine these commands with custom sampling)

# Basic cluster info
redis-cli -c -h redis-cluster info stats | grep total_commands_processed

# Find hot keys (CPU hotspots)
redis-cli -p 6379 --bigkeys
redis-cli -p 6379 --scan --pattern "user:*" | wc -l

# Monitor replication lag
redis-cli -p 6379 info replication | grep slave

# Check slot distribution
redis-cli -p 6379 cluster slots
# Output shows: slot_start slot_end node_id replica_ids

# Reshard single slot or range (careful operation)
redis-cli --cluster reshard CLUSTER_HOST:PORT --cluster-from SOURCE_NODE_ID \
  --cluster-to TARGET_NODE_ID --cluster-slots SLOT_COUNT --cluster-yes

# Reshard single hot slot
redis-cli --cluster reshard 127.0.0.1:7000 --cluster-from node-0 \
  --cluster-to node-1 --cluster-slots 1 --cluster-yes
// Node.js: Detect hot slots by monitoring key access patterns
import Redis from 'ioredis';

class HotSlotDetector {
  private cluster: Redis.Cluster;
  private slotAccessCount = new Map<number, number>();
  private updateInterval = 60000;  // Sample every 60 seconds

  constructor(cluster: Redis.Cluster) {
    this.cluster = cluster;
    this.startMonitoring();
  }

  private startMonitoring() {
    setInterval(() => {
      // Find the top 10 hot slots from the accesses recorded via recordAccess()
      const sorted = Array.from(this.slotAccessCount.entries())
        .sort((a, b) => b[1] - a[1])
        .slice(0, 10);

      console.log('Top 10 hot slots:', sorted);

      // Alert if one slot carries > 50% of sampled traffic
      const totalOps = Array.from(this.slotAccessCount.values()).reduce((a, b) => a + b, 0);
      const topSlot = sorted[0];
      if (topSlot && totalOps > 0 && topSlot[1] / totalOps > 0.5) {
        console.warn(`HOT SLOT DETECTED: slot ${topSlot[0]} has ${(topSlot[1] / totalOps * 100).toFixed(1)}% of traffic`);
      }

      this.slotAccessCount.clear();
    }, this.updateInterval);
  }

  recordAccess(key: string) {
    // CRC16 (XMODEM variant, matching Redis's slot mapping), then mod 16384
    let crc = 0;
    for (let i = 0; i < key.length; i++) {
      crc ^= key.charCodeAt(i) << 8;
      for (let bit = 0; bit < 8; bit++) {
        crc = crc & 0x8000 ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
      }
    }

    const slot = crc % 16384;
    this.slotAccessCount.set(slot, (this.slotAccessCount.get(slot) || 0) + 1);
  }
}

Cluster-Aware Client Configuration

ioredis is the production Redis client for Node.js. Proper configuration is critical.

import Redis from 'ioredis';
import { performance } from 'perf_hooks';

const cluster = new Redis.Cluster(
  [
    { host: 'redis-1.example.com', port: 6379 },
    { host: 'redis-2.example.com', port: 6379 },
    { host: 'redis-3.example.com', port: 6379 },
  ],
  {
    // Routing / redirection
    maxRedirections: 16,  // max MOVED/ASK redirects
    retryDelayOnFailover: 100,
    retryDelayOnClusterDown: 300,
    slotsRefreshTimeout: 2000,

    // Queueing and startup
    enableReadyCheck: false,
    enableOfflineQueue: true,
    lazyConnect: false,

    // Cluster-level retries: exponential backoff, capped at 2 seconds
    clusterRetryStrategy: (times) => {
      return Math.min(100 * Math.pow(2, times), 2000);
    },

    // Options applied to each per-node connection
    redisOptions: {
      connectTimeout: 5000,
      keepAlive: 30000,
    },
  }
);

// Event handling
cluster.on('error', (err) => {
  console.error('Cluster error:', err.message);
  // Alert ops team
});

cluster.on('node error', (err, node) => {
  console.error(`Node ${node} error:`, err.message);
});

cluster.on('ready', () => {
  console.log('Cluster ready');
});

cluster.on('connecting', () => {
  console.log('Cluster connecting');
});

// Health check endpoint
export async function healthCheck() {
  const start = performance.now();
  try {
    const pong = await cluster.ping();
    const duration = performance.now() - start;

    return {
      status: 'healthy',
      latency_ms: duration.toFixed(2),
      nodes: cluster.nodes('all').length,
    };
  } catch (err) {
    return {
      status: 'unhealthy',
      error: err.message,
    };
  }
}

// Graceful shutdown
export async function shutdownCluster() {
  await cluster.quit();
  console.log('Cluster connection closed');
}

Replication and Automatic Failover

Redis Cluster automatically promotes replicas to masters when masters fail.

Master: 192.168.1.10:6379 (master, owns slots 0-5461)
  ├── Replica-1: 192.168.1.11:6379 (synced)
  └── Replica-2: 192.168.1.12:6379 (synced)

Failure Scenario:
1. Master goes down
2. Cluster detects no response (cluster-node-timeout: 15 seconds)
3. Cluster marks master as FAIL
4. Quorum of masters votes to promote replica
5. Replica-1 becomes new master
6. Cluster reconfigures and continues operating
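During the detection-plus-election window (roughly cluster-node-timeout plus a few seconds), writes to the failed master's slots error with CLUSTERDOWN. One application-side pattern is retrying with capped exponential backoff — a sketch (the helper name and the error-message matching are my own assumptions; MOVED/ASK redirects are already handled by ioredis itself):

```typescript
// Retry an operation that may fail transiently during a failover.
// Backoff: 100ms, 200ms, 400ms, ... capped at 2s, mirroring a typical
// clusterRetryStrategy.
async function retryOnClusterDown<T>(
  op: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 100
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (err: any) {
      const msg = String(err && err.message);
      const transient = msg.includes('CLUSTERDOWN') || msg.includes('TRYAGAIN');
      if (!transient || attempt + 1 >= maxAttempts) throw err;
      const delayMs = Math.min(baseDelayMs * 2 ** attempt, 2000);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Usage: `await retryOnClusterDown(() => cluster.set('key', 'value'))` — non-transient errors still propagate immediately.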
# Monitor replication status
redis-cli -p 6379 info replication
# Output:
# role:master
# connected_slaves:2
# slave0:ip=192.168.1.11,port=6379,state=online,offset=12345,lag=0
# slave1:ip=192.168.1.12,port=6379,state=online,offset=12345,lag=0

# Check replica lag
redis-cli -p 6379 info replication | grep replication_offset
redis-cli -p 6380 info replication | grep replication_offset
# If offsets differ: replication is lagging

# Force failover for testing — run on a REPLICA; it negotiates with its
# master and gets promoted
redis-cli -p 6380 CLUSTER FAILOVER

# Monitor failover progress
redis-cli -p 6379 CLUSTER INFO | grep cluster_state
# Expected: cluster_state:ok (after failover complete)

# Forced takeover — the replica promotes itself WITHOUT the master's consent
# (for when the master is unreachable; this does NOT disable automatic failover)
redis-cli -p 6380 CLUSTER FAILOVER TAKEOVER
// Application-side: Handle failover gracefully
import Redis from 'ioredis';

export class FailoverAwareCache {
  private cluster: Redis.Cluster;
  private writeBuffer: Array<{ key: string; value: any; ttl?: number }> = [];
  private isFailingOver = false;

  constructor(cluster: Redis.Cluster) {
    this.cluster = cluster;
    this.cluster.on('error', () => {
      // Buffer writes while the cluster is failing over
      this.isFailingOver = true;
    });
    this.cluster.on('ready', () => {
      // Cluster recovered: resume direct writes and drain the buffer
      this.isFailingOver = false;
      void this.flushBufferedWrites();
    });
  }

  async set(key: string, value: any, ttl?: number) {
    if (this.isFailingOver) {
      // Buffer the write while the cluster is failing over
      this.writeBuffer.push({ key, value, ttl });
      return false;  // not yet persisted — flushed on recovery
    }

    try {
      if (ttl) {
        await this.cluster.setex(key, ttl, JSON.stringify(value));
      } else {
        await this.cluster.set(key, JSON.stringify(value));
      }
      return true;
    } catch (err) {
      if (err.message.includes('CLUSTERDOWN')) {
        // Failover in progress — buffer the write for a later flush
        this.writeBuffer.push({ key, value, ttl });
        return false;
      }
      throw err;
    }
  }

  async flushBufferedWrites() {
    if (this.writeBuffer.length === 0) return;

    const buffer = this.writeBuffer;
    this.writeBuffer = [];

    for (const { key, value, ttl } of buffer) {
      try {
        await this.set(key, value, ttl);
      } catch (err) {
        console.error(`Failed to flush write for ${key}:`, err);
        // Re-add to buffer for next attempt
        this.writeBuffer.push({ key, value, ttl });
      }
    }
  }
}

CLUSTER INFO Monitoring

Monitor cluster health with regular CLUSTER INFO queries.

# Real-time cluster status
redis-cli -p 6379 CLUSTER INFO

# Output interpretation:
cluster_state:ok              # green: all slots assigned
cluster_slots_assigned:16384  # all 16384 slots have owners
cluster_slots_ok:16384        # all slots functional
cluster_slots_pfail:0         # slots on nodes flagged PFAIL (possibly failing)
cluster_slots_fail:0          # slots on nodes confirmed FAIL
cluster_known_nodes:6         # total nodes in cluster
cluster_size:3                # masters serving at least one slot
cluster_current_epoch:5       # highest config epoch observed (cluster "version")
cluster_my_epoch:5            # this node's config epoch
cluster_stats_messages_sent:123456
cluster_stats_messages_received:123456

# If cluster_state:fail: investigate immediately
redis-cli -p 6379 CLUSTER NODES | grep fail
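For automated monitoring, the `key:value` lines parse trivially. A sketch of a parser plus a health predicate over the fields shown above (which fields constitute "healthy" is a judgment call, not an official definition):

```typescript
// Parse CLUSTER INFO's "key:value" lines into a lookup table
function parseClusterInfo(raw: string): Record<string, string> {
  const info: Record<string, string> = {};
  for (const line of raw.split('\n')) {
    const sep = line.indexOf(':');
    if (sep > 0) {
      info[line.slice(0, sep).trim()] = line.slice(sep + 1).trim();
    }
  }
  return info;
}

// Green only when the state is ok, all 16384 slots are assigned,
// and no slot is confirmed failing
function clusterHealthy(info: Record<string, string>): boolean {
  return (
    info['cluster_state'] === 'ok' &&
    info['cluster_slots_assigned'] === '16384' &&
    info['cluster_slots_fail'] === '0'
  );
}
```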

Lua Scripts in Cluster Mode

Lua scripts require all of their keys to live on the same node. Use hash tags or single-key scripts.

// Lua script: atomic multi-step operation
const incrementAndCheckQuota = `
  local current = redis.call('INCR', KEYS[1])
  -- GET returns false for a missing key; default the quota to 0
  local quota = tonumber(redis.call('GET', KEYS[2])) or 0

  if current > quota then
    redis.call('DECR', KEYS[1])
    return {0, current - 1}
  end

  return {1, current}
`;

async function checkRateLimit(userId: string, limit: number) {
  // Force both keys to same slot with hash tag
  const counterKey = `ratelimit_counter:{${userId}}`;
  const quotaKey = `ratelimit_quota:{${userId}}`;

  // Initialize the quota once (NX: set only if the key does not exist)
  await cluster.set(quotaKey, limit, 'EX', 3600, 'NX');

  // Check limit with atomic script
  const result = await cluster.eval(
    incrementAndCheckQuota,
    2,  // 2 keys
    counterKey,
    quotaKey
  );

  const [allowed, currentCount] = result as [number, number];
  return { allowed: allowed === 1, currentCount };
}
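Because bugs in Lua only surface at runtime on the server, it helps to model the script's decision logic as a pure function and unit-test that — a sketch mirroring the increment-and-rollback logic above (`quotaCheck` is a test-side model I'm introducing, not production code):

```typescript
// Pure model of the Lua script: 'count' is the counter value BEFORE
// this request, 'quota' is the configured limit.
function quotaCheck(count: number, quota: number): { allowed: boolean; count: number } {
  const incremented = count + 1;                        // INCR
  if (incremented > quota) {
    return { allowed: false, count: incremented - 1 };  // DECR rolls back
  }
  return { allowed: true, count: incremented };
}
```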

Checklist

  • Redis Cluster deployed with 3+ master nodes
  • Replicas configured (typically 1 replica per master)
  • Hash tags used for multi-key operations
  • ioredis configured with proper retry and redirection settings
  • Cluster timeout (cluster-node-timeout) set to 15+ seconds
  • Monitoring alerts for CROSSSLOT errors
  • Hot slot detection implemented
  • Resharding procedure documented and tested
  • Replication lag monitoring configured
  • Application handles CLUSTERDOWN gracefully
  • Lua scripts tested with hash-tagged keys
  • Failover testing performed (kill master, verify recovery)

Conclusion

Redis Cluster provides horizontal scaling but demands careful design to avoid CROSSSLOT errors and slot imbalances. Use hash tags to co-locate related keys, implement hot slot monitoring, and configure ioredis for proper failover handling. Lua scripts provide atomic multi-key operations when properly constrained to single slots. With proper configuration and awareness of Cluster's limitations, you can build high-throughput, fault-tolerant caching layers at scale.