Improper Sharding Strategy — When Your "Scalable" Database Isn't
By Sanjeev Sharma (@webcoderspeed1)
Introduction
Sharding splits your database across multiple nodes to scale writes and storage beyond what a single database can handle. But the sharding key determines everything — choose wrong and you get hot spots (some shards overwhelmed, others idle), cross-shard queries (expensive joins across nodes), or uneven growth (the current time shard fills up while historical shards sit empty).
- The Hot Shard Problem
- Choosing the Right Shard Key
- Consistent Hashing — Shard Remapping Without Full Rebalancing
- Avoiding Cross-Shard Joins
- Directory-Based Sharding (Most Flexible)
- Monitoring for Hot Shards
- Sharding Checklist
- Conclusion
The Hot Shard Problem
Sharding by user_id modulo 8 (8 shards):

- Shard 0: user_ids where id % 8 === 0 (0, 8, 16, ...)
- Shard 1: user_ids where id % 8 === 1 (1, 9, 17, ...)
- ...

Problem: your top 100 enterprise customers (each with 10,000 employees) all happen to land on shard 3.

- Shard 3: 100% CPU, 95% storage
- Other shards: 30% CPU, 40% storage

This isn't hypothetical — with modulo sharding, any clustering in the data distribution creates hot shards. And even when keys spread evenly, load doesn't: a few tenants that are orders of magnitude larger than the rest will overwhelm whichever shard they land on.
Choosing the Right Shard Key
Good shard keys:
- UUID v4 (random) — perfectly even distribution
- High-cardinality IDs with random component
- Hash of customer_id (if customer IDs are sequential, hash them first)
Bad shard keys:
- Sequential integer IDs (mod N → uneven for non-uniform write patterns)
- Status column ('pending', 'complete') — all new data hits 'pending' shard
- Date/timestamp — all writes hit current time shard
- Tenant size — small tenants on shard A, large tenants on shard B → uneven
- Country code — most traffic from US → US shard is hot
Consistent Hashing — Shard Remapping Without Full Rebalancing
```typescript
import * as crypto from 'crypto'

class ConsistentHashRing {
  private ring = new Map<number, string>() // hash position → shard name
  private sortedKeys: number[] = []
  private virtualNodes = 150 // more virtual nodes → more even distribution

  addShard(shardId: string): void {
    for (let i = 0; i < this.virtualNodes; i++) {
      const virtualNodeKey = `${shardId}:${i}`
      const hash = this.hash(virtualNodeKey)
      this.ring.set(hash, shardId)
    }
    this.sortedKeys = [...this.ring.keys()].sort((a, b) => a - b)
  }

  removeShard(shardId: string): void {
    for (let i = 0; i < this.virtualNodes; i++) {
      const hash = this.hash(`${shardId}:${i}`)
      this.ring.delete(hash)
    }
    this.sortedKeys = [...this.ring.keys()].sort((a, b) => a - b)
  }

  getShard(key: string): string {
    if (this.ring.size === 0) throw new Error('No shards available')
    const hash = this.hash(key)
    // Find the first virtual node clockwise from this hash (wrap to the start)
    const idx = this.sortedKeys.findIndex((k) => k >= hash)
    const position = idx === -1 ? this.sortedKeys[0] : this.sortedKeys[idx]
    return this.ring.get(position)!
  }

  private hash(key: string): number {
    const buf = crypto.createHash('md5').update(key).digest()
    return buf.readUInt32BE(0)
  }
}

const ring = new ConsistentHashRing()
ring.addShard('db-1')
ring.addShard('db-2')
ring.addShard('db-3')

// Consistent: the same userId always routes to the same shard.
// Adding a 4th shard moves only ~25% of keys — not 100%, as modulo would.
ring.addShard('db-4')

function getShardForUser(userId: string): string {
  return ring.getShard(userId)
}
```
Avoiding Cross-Shard Joins
The biggest operational cost of sharding is queries that need data from multiple shards:
```typescript
// ❌ Cross-shard join — requires querying every shard and merging results
async function getUserOrdersAndProfile(userId: string) {
  const userShard = getShardForUser(userId)
  const user = await shards[userShard].query('SELECT * FROM users WHERE id = $1', [userId])

  // But orders might be sharded differently!
  // If orders shard by order_id (not user_id), a user's orders can live on any
  // shard, so the query has to scatter to all of them.
  const orders = await Promise.all(
    Object.values(shards).map((shard) =>
      shard.query('SELECT * FROM orders WHERE user_id = $1', [userId])
    )
  )

  // Merge all shard results — expensive and complex
  return { user: user.rows[0], orders: orders.flatMap((r) => r.rows) }
}

// ✅ Co-locate related data on the same shard:
// shard orders by user_id (the same key as the users table), so a user AND
// their orders live on one shard — no cross-shard join needed.
async function getUserOrders(userId: string) {
  const shard = getShardForUser(userId)
  // Both queries go to the same shard
  const [user, orders] = await Promise.all([
    shards[shard].query('SELECT * FROM users WHERE id = $1', [userId]),
    shards[shard].query('SELECT * FROM orders WHERE user_id = $1', [userId]),
  ])
  return { user: user.rows[0], orders: orders.rows }
}
```
Directory-Based Sharding (Most Flexible)
```typescript
// Lookup table: entity → shard assignment.
// Allows explicit rebalancing and handles uneven data without algorithm changes.
class ShardDirectory {
  async getShard(entityType: string, entityId: string): Promise<string> {
    // Check the lookup table first
    const entry = await metaDb.query(
      'SELECT shard_id FROM shard_directory WHERE entity_type = $1 AND entity_id = $2',
      [entityType, entityId]
    )
    if (entry.rows.length > 0) {
      return entry.rows[0].shard_id
    }

    // New entity — assign to the least-loaded shard and record the mapping
    const shard = await this.assignToLeastLoadedShard(entityType)
    await metaDb.query(
      'INSERT INTO shard_directory (entity_type, entity_id, shard_id) VALUES ($1, $2, $3)',
      [entityType, entityId, shard]
    )
    return shard
  }

  private async assignToLeastLoadedShard(entityType: string): Promise<string> {
    const loads = await metaDb.query(
      'SELECT shard_id, COUNT(*) AS entity_count FROM shard_directory WHERE entity_type = $1 GROUP BY shard_id',
      [entityType]
    )
    // Return the shard with the fewest entities.
    // Note: node-postgres returns COUNT(*) as a string, so convert before comparing.
    return (
      loads.rows.sort((a, b) => Number(a.entity_count) - Number(b.entity_count))[0]?.shard_id ??
      'shard-1'
    )
  }
}
```

The trade-off: every routing decision costs a lookup (cache aggressively), and the directory itself becomes a critical dependency that must be highly available.
Monitoring for Hot Shards
```typescript
// Check per-shard storage size and row counts (Postgres shown here;
// pair this with per-shard query rates from your metrics system)
async function checkShardBalance() {
  const stats = await Promise.all(
    Object.entries(shards).map(async ([name, pool]) => {
      const result = await pool.query(`
        SELECT
          pg_database_size(current_database()) AS size_bytes,
          (SELECT sum(n_live_tup) FROM pg_stat_user_tables) AS row_count
      `)
      return { shard: name, ...result.rows[0] }
    })
  )

  const avgSize = stats.reduce((sum, s) => sum + Number(s.size_bytes), 0) / stats.length
  for (const stat of stats) {
    const ratio = Number(stat.size_bytes) / avgSize
    if (ratio > 1.5) {
      console.warn(`Hot shard detected: ${stat.shard} is ${ratio.toFixed(1)}x average size`)
    }
  }
}
```
Sharding Checklist
- ✅ Choose shard key that distributes writes evenly (prefer UUIDs over sequential IDs)
- ✅ Co-locate related data on the same shard — avoid cross-shard joins
- ✅ Use consistent hashing for online shard addition without full rebalancing
- ✅ Monitor per-shard query rates and storage size — alert on imbalance
- ✅ Consider directory-based sharding for fine-grained control
- ✅ Test shard distribution with production-like data before going live
- ✅ Plan for rebalancing from day one — you will need it eventually
Conclusion
Sharding is the right answer when you've exhausted vertical scaling and read replicas. But a bad sharding key turns a scaling solution into a hot-spot generator. The safest shard key is a uniformly distributed UUID. The most important constraint is co-locating data your queries join — sharding users by user_id means orders should also shard by user_id so they end up on the same node. Use consistent hashing to allow adding shards online, monitor for hot shards from day one, and have a rebalancing plan before you need one.