- Published on
Missing Database Index — Why Your App Slows Down as It Grows
- Authors

- Name
- Sanjeev Sharma
- @webcoderspeed1
Introduction
Indexes are the single most impactful database optimization you can make. A well-placed index turns a 30-second query into a 3-millisecond one. But they're easy to miss: you add a foreign key column, forget to index it, and queries work fine — until the table has 10 million rows and every join becomes a full scan.
- How to Find Missing Indexes
- Common Missing Index Patterns
- 1. Foreign Keys Without Indexes
- 2. Timestamp Columns Used in Range Queries
- 3. Status/Enum Columns Used in WHERE Clauses
- 4. Columns Used in ORDER BY
- Multi-Column Indexes — Order Matters
- Adding Indexes Without Downtime
- Automatically Suggested Indexes
- Index Bloat — When Indexes Grow Stale
- Missing Index Checklist
- Conclusion
How to Find Missing Indexes
PostgreSQL tracks every sequential scan. Queries that do lots of seq scans on big tables are your index candidates:
-- Find tables with lots of sequential scans (potential missing indexes)
SELECT
schemaname,
tablename,
seq_scan,
seq_tup_read,
idx_scan,
n_live_tup AS row_count,
ROUND(seq_scan::numeric / NULLIF(idx_scan + seq_scan, 0) * 100, 2) AS seq_scan_pct
FROM pg_stat_user_tables
WHERE n_live_tup > 10000 -- only care about big tables
AND seq_scan > 100 -- being scanned frequently
ORDER BY seq_tup_read DESC -- worst first
LIMIT 20;
-- Find indexes that are never used (waste space, slow down writes)
SELECT
schemaname,
tablename,
indexname,
idx_scan AS index_scans
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexname NOT LIKE '%pkey%' -- skip primary keys
ORDER BY pg_relation_size(indexrelid) DESC;
Common Missing Index Patterns
1. Foreign Keys Without Indexes
-- ❌ Foreign key with no index — every JOIN does a seq scan on orders
CREATE TABLE orders (
id UUID PRIMARY KEY,
user_id UUID REFERENCES users(id), -- No index!
created_at TIMESTAMPTZ
);
-- Every query like this scans ALL orders to find the user's:
SELECT * FROM orders WHERE user_id = '123'
-- ✅ Index the foreign key
CREATE INDEX ON orders (user_id);
-- Query now: Index Scan — O(log n) instead of O(n)
2. Timestamp Columns Used in Range Queries
-- ❌ No index on created_at — date range query scans everything
SELECT * FROM events WHERE created_at > NOW() - INTERVAL '7 days'
-- ✅ Index the timestamp column
CREATE INDEX ON events (created_at);
-- Or if you always filter on both status AND date:
CREATE INDEX ON events (status, created_at);
-- The composite index covers: WHERE status = 'pending' AND created_at > ...
3. Status/Enum Columns Used in WHERE Clauses
-- ❌ Scanning millions of jobs to find the 100 that are 'pending'
SELECT * FROM jobs WHERE status = 'pending' ORDER BY created_at LIMIT 100
-- ✅ Partial index — only indexes rows where status = 'pending'
-- Much smaller than a full index, faster updates
CREATE INDEX ON jobs (created_at) WHERE status = 'pending';
4. Columns Used in ORDER BY
-- ❌ Sorting 5 million rows without an index — filesort in memory
SELECT * FROM articles ORDER BY published_at DESC LIMIT 20
-- ✅ Index the sort column
CREATE INDEX ON articles (published_at DESC);
-- Or with a filter:
CREATE INDEX ON articles (published_at DESC) WHERE published = true;
Multi-Column Indexes — Order Matters
-- Query: WHERE tenant_id = ? AND status = ? ORDER BY created_at DESC
-- ❌ Wrong index — can't use it for the full query
CREATE INDEX ON orders (created_at, tenant_id, status);
-- ✅ Right order: equality columns first, range/sort column last
CREATE INDEX ON orders (tenant_id, status, created_at DESC);
-- Rules: equality filters first, then range/ORDER BY columns
The "leftmost prefix" rule: a composite index (a, b, c) can be used for queries filtering on a, a + b, or a + b + c — but not b alone or c alone.
Adding Indexes Without Downtime
In PostgreSQL, a regular CREATE INDEX locks the table for writes during the build. On a large table, this can take minutes:
-- ❌ Locks table for writes during index build (minutes on large table)
CREATE INDEX ON orders (user_id);
-- ✅ CONCURRENTLY — builds index without write lock
-- Slower, can't run in a transaction, but doesn't block
CREATE INDEX CONCURRENTLY ON orders (user_id);
// In migrations — always use CONCURRENTLY for large tables
// knex migration example:
exports.up = async (knex) => {
// Run concurrently to avoid locking production
await knex.raw('CREATE INDEX CONCURRENTLY IF NOT EXISTS orders_user_id_idx ON orders (user_id)')
}
Automatically Suggested Indexes
PostgreSQL can tell you what indexes it wishes existed:
-- After running a slow query, check pg_stat_user_tables
-- But better: ask the query planner directly
EXPLAIN (FORMAT JSON, BUFFERS)
SELECT * FROM orders WHERE user_id = '123' AND status = 'pending';
-- If you see "Seq Scan" with high cost, the planner is telling you
-- it would prefer an index here
-- Enable pg_stat_statements and look for high mean_exec_time:
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
WHERE mean_exec_time > 100 -- queries averaging > 100ms
ORDER BY mean_exec_time DESC;
Index Bloat — When Indexes Grow Stale
-- Dead rows from UPDATE/DELETE accumulate in indexes
-- Check index bloat:
SELECT
tablename,
indexname,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
idx_scan
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;
-- Rebuild bloated indexes (CONCURRENTLY to avoid locks):
REINDEX INDEX CONCURRENTLY orders_user_id_idx;
-- Or autovacuum handles this automatically if configured correctly
Missing Index Checklist
- ✅ Index every foreign key column
- ✅ Index columns used in
WHERE,ORDER BY,GROUP BYon large tables - ✅ Use partial indexes for low-cardinality status columns
- ✅ Use composite indexes with equality columns first, range/sort last
- ✅ Use
CREATE INDEX CONCURRENTLYin production migrations - ✅ Run
EXPLAIN ANALYZEon slow queries to confirm index usage - ✅ Check
pg_stat_user_tablesregularly for highseq_scancounts - ✅ Drop unused indexes (
idx_scan = 0) to reduce write overhead
Conclusion
Missing indexes are the most common cause of "it was fast in dev, slow in prod" — the data volume difference hides the problem. The fix is usually two SQL lines: find the missing index with EXPLAIN ANALYZE or pg_stat_user_tables, then CREATE INDEX CONCURRENTLY. The hardest part is knowing which columns to index: foreign keys always, timestamp columns used in range queries, status columns with partial indexes, and ORDER BY columns. Set up slow query logging and check pg_stat_statements weekly — you'll catch problems long before users do.