- Published on
Migration Locking the Table — The ALTER TABLE That Took Down Production
- Authors

- Name
- Sanjeev Sharma
- @webcoderspeed1
Introduction
Database migrations are the riskiest part of most deployments. A migration that adds a NOT NULL column with a DEFAULT rewrites every row. A migration that adds a regular index locks the table for writes. On a small table, this is milliseconds. On a 40-million row table, it's minutes — and every write to that table is blocked the entire time.
- The Problem
- Which ALTER TABLE Operations Are Dangerous
- Safe Pattern: Expand/Contract Migrations
- Step 1: Add column as nullable (no default, no constraint)
- Step 2: Backfill in batches (background job, not migration)
- Step 3: Add NOT NULL constraint without validation
- Safe Index Creation
- Safely Adding Foreign Key Constraints
- Lock Timeout as a Safety Net
- Detecting Long-Running Migrations
- Migration Checklist
- Conclusion
The Problem
ALTER TABLE orders ADD COLUMN priority VARCHAR(20) DEFAULT 'normal' NOT NULL;
Production table: 40 million rows
What PostgreSQL does: rewrites every single row to add the new column
Duration: ~8 minutes
Lock held during rewrite: ACCESS EXCLUSIVE (blocks ALL reads AND writes)
Result: Your app can't read or write orders for 8 minutes
Which ALTER TABLE Operations Are Dangerous
-- ❌ DANGEROUS — rewrites table (PostgreSQL < 11)
ALTER TABLE orders ADD COLUMN priority VARCHAR DEFAULT 'normal' NOT NULL;
-- PostgreSQL must update every row to set the default
-- ✅ Safe in PostgreSQL 11+ — no table rewrite
ALTER TABLE orders ADD COLUMN priority VARCHAR DEFAULT 'normal' NOT NULL;
-- PostgreSQL 11+: stores default in catalog, no row rewrite
-- ❌ DANGEROUS — any change to column type rewrites table
ALTER TABLE users ALTER COLUMN age TYPE BIGINT;
-- ❌ DANGEROUS — adding NOT NULL to existing column (even in PG 11+)
ALTER TABLE orders ALTER COLUMN user_id SET NOT NULL;
-- ❌ DANGEROUS — regular index creation (not concurrent)
CREATE INDEX ON orders (user_id);
-- ❌ DANGEROUS — adding constraints that need validation
ALTER TABLE orders ADD CONSTRAINT check_status CHECK (status IN ('pending', 'paid', 'cancelled'));
Safe Pattern: Expand/Contract Migrations
The "expand/contract" (or "parallel change") pattern lets you evolve schema without locking:
Step 1: Add column as nullable (no default, no constraint)
-- Deploy 1: Safe — just adds a null column, no rewrite
ALTER TABLE orders ADD COLUMN priority VARCHAR(20);
-- Instant, no lock, no row rewrite
Step 2: Backfill in batches (background job, not migration)
// Backfill 1000 rows at a time — no lock held between batches
async function backfillPriority() {
let lastId = '00000000-0000-0000-0000-000000000000'
while (true) {
const result = await db.query(`
UPDATE orders
SET priority = 'normal'
WHERE id > $1
AND priority IS NULL
ORDER BY id
LIMIT 1000
RETURNING id
`, [lastId])
if (result.rows.length === 0) break
lastId = result.rows[result.rows.length - 1].id
console.log(`Backfilled up to ${lastId}`)
await sleep(100) // brief pause to reduce database load
}
}
Step 3: Add NOT NULL constraint without validation
-- Deploy 2: Add constraint without checking existing rows
-- (Safe because backfill ensures all rows are filled)
ALTER TABLE orders
ALTER COLUMN priority SET DEFAULT 'normal',
ALTER COLUMN priority SET NOT NULL;
-- Or use NOT VALID to skip row-level validation scan:
ALTER TABLE orders ADD CONSTRAINT orders_priority_not_null
CHECK (priority IS NOT NULL) NOT VALID;
-- Later, validate in a separate migration (takes a SHARE UPDATE EXCLUSIVE lock — allows reads/writes)
ALTER TABLE orders VALIDATE CONSTRAINT orders_priority_not_null;
Safe Index Creation
-- ❌ Blocks writes for the entire index build (minutes on large table)
CREATE INDEX ON orders (user_id);
-- ✅ CONCURRENTLY — no write lock, builds in background
CREATE INDEX CONCURRENTLY ON orders (user_id);
-- Takes 2-3x longer but doesn't block anything
-- Can't run inside a transaction block
// In a migration file (knex example)
exports.up = async (knex) => {
// Can't use knex.schema helpers for concurrent index — use raw SQL
await knex.raw(`
CREATE INDEX CONCURRENTLY IF NOT EXISTS
orders_user_id_idx ON orders (user_id)
`)
}
// Note: knex transactions wrap migrations by default
// For CONCURRENT index, disable the transaction wrapper:
exports.config = { transaction: false }
Safely Adding Foreign Key Constraints
-- ❌ Validates every row while holding lock
ALTER TABLE orders ADD CONSTRAINT fk_user
FOREIGN KEY (user_id) REFERENCES users(id);
-- ✅ Step 1: Add constraint without validating existing rows
ALTER TABLE orders ADD CONSTRAINT fk_user
FOREIGN KEY (user_id) REFERENCES users(id)
NOT VALID;
-- ✅ Step 2: Validate separately (SHARE UPDATE EXCLUSIVE lock — allows writes)
ALTER TABLE orders VALIDATE CONSTRAINT fk_user;
Lock Timeout as a Safety Net
Set a lock timeout so that if your migration can't get the lock quickly, it fails fast rather than queueing and blocking all subsequent queries:
-- In your migration script — fail fast if table is busy
SET lock_timeout = '5s';
-- If it can't get the lock within 5 seconds, raises error:
-- ERROR: canceling statement due to lock timeout
-- Much better than silently queuing and blocking for minutes
ALTER TABLE orders ADD COLUMN ...;
// In Node.js migration framework
const client = await pool.connect()
try {
await client.query("SET lock_timeout = '5s'")
await client.query("SET statement_timeout = '60s'")
await client.query('ALTER TABLE orders ADD COLUMN ...')
} finally {
client.release()
}
Detecting Long-Running Migrations
-- Monitor active locks and waiting queries
SELECT
pid,
now() - query_start AS duration,
state,
wait_event_type,
wait_event,
left(query, 100) AS query_snippet
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY duration DESC;
-- Find what's blocking what
SELECT
blocked.pid,
blocked.query AS blocked_query,
blocking.pid AS blocking_pid,
blocking.query AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_stat_activity blocking
ON blocking.pid = ANY(pg_blocking_pids(blocked.pid));
Migration Checklist
- ✅ Use
ALTER TABLE ... ADD COLUMNwithout NOT NULL on PG < 11 - ✅ Backfill new columns in batches via background jobs, not migrations
- ✅ Use
NOT VALIDwhen adding constraints, thenVALIDATEseparately - ✅ Use
CREATE INDEX CONCURRENTLY— always - ✅ Set
lock_timeoutto fail fast rather than queue indefinitely - ✅ Test migrations on a production-sized data copy first
- ✅ Deploy schema migrations separately from code migrations (expand, then contract)
- ✅ Monitor
pg_stat_activityduring migration deploys
Conclusion
Schema changes on large tables require thinking in phases: expand first (add nullable columns, non-enforced constraints), backfill in the background, then contract (add NOT NULL, validate constraints). Never run CREATE INDEX without CONCURRENTLY. Always set lock_timeout so a long-queued migration doesn't block your entire app silently. The safest migrations are the ones that can be run during business hours without anyone noticing — and that's achievable with the expand/contract pattern.