PostgreSQL Performance Tuning — From Slow Queries to Sub-Millisecond Responses
Author: Sanjeev Sharma (@webcoderspeed1)
Introduction
PostgreSQL performance tuning separates operators who understand their infrastructure from those running blind. This guide walks through real-world tuning strategies: reading execution plans, configuring memory parameters, optimizing autovacuum for write-heavy tables, and pooling connections to eliminate connection overhead. Every optimization is backed by measurable metrics.
- Reading EXPLAIN ANALYZE: Seq Scan vs Index Scan vs Hash Join
- work_mem: Sorting and Aggregation Performance
- Checkpoint and WAL Tuning
- Autovacuum Configuration for High-Write Tables
- pg_stat_statements: Finding Slow Queries
- Connection Overhead vs PgBouncer Pooling
- shared_buffers and effective_cache_size Tuning
- Checklist
- Conclusion
Reading EXPLAIN ANALYZE: Seq Scan vs Index Scan vs Hash Join
EXPLAIN ANALYZE is your primary diagnostic tool. Understanding what the planner decides—and why—is foundational.
-- Enable timing and verbose output
EXPLAIN (ANALYZE, BUFFERS, TIMING, VERBOSE)
SELECT o.id, o.total, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.created_at > NOW() - INTERVAL '30 days'
AND o.status = 'shipped'
LIMIT 100;
-- Output interpretation:
-- Seq Scan: reads the entire table; reasonable for small tables or when most rows match
-- Index Scan: uses an index; wins when the query returns a small fraction of rows (roughly < 5-10%)
-- Index Only Scan: every needed column is in the index, so the heap is never touched (fastest)
-- Hash Join: builds a hash table from the smaller input; good for large, unsorted joins
-- Nested Loop: probes the inner side once per outer row; fast for small outer sets, slow for large result sets
-- Bad plan example: seq scan on 10M row table (fix: add index)
Seq Scan on orders  (cost=0.00..450000.00 rows=500000) (actual rows=450000 loops=1)
  Filter: ((created_at > ...) AND (status = 'shipped'))
  Buffers: shared hit=150000 read=50000
-- Good plan example: index scan with limit
Index Scan using idx_orders_created_at on orders  (cost=0.28..1200.00 rows=100) (actual rows=100 loops=1)
  Index Cond: (created_at > ...)
  Filter: (status = 'shipped')
  Buffers: shared hit=95 read=5
-- Compare estimated vs actual rows: large gaps mean the planner is working from bad statistics
-- If actual >> estimated: run ANALYZE (or raise the column statistics target)
-- If actual << estimated: the predicate is more selective than the planner thinks; ANALYZE, or consider a partial index
ANALYZE orders;
ANALYZE customers;
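The estimate-vs-actual comparison can be sketched as a small check (a hypothetical helper; the 10x cutoff is an arbitrary choice, not a PostgreSQL threshold):

```typescript
// Flag a plan node whose row estimate is badly off (hypothetical 10x cutoff).
function estimateOff(estimated: number, actual: number, factor = 10): boolean {
  const hi = Math.max(estimated, actual);
  const lo = Math.max(1, Math.min(estimated, actual)); // avoid divide-by-zero
  return hi / lo >= factor;
}

console.log(estimateOff(500_000, 450_000)); // false — the seq scan example above was well estimated
console.log(estimateOff(100, 45_000));      // true — time to run ANALYZE
```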
work_mem: Sorting and Aggregation Performance
work_mem caps the memory each sort or hash operation can use before spilling to disk. Larger values avoid spills, but the limit applies per operation, so peak usage scales with concurrent connections and with the number of sort/hash nodes in each query.
-- Current work_mem setting
SHOW work_mem;
-- Default: 4MB (way too small for modern hardware)
-- Calculation: (total_available_memory - shared_buffers) / max_connections / 2
-- Example: 32GB RAM, shared_buffers=8GB, max_connections=200
-- work_mem = (32 - 8) * 1024 / 200 / 2 = 61MB
-- Set for session (testing)
SET work_mem = '256MB';
-- Set for user (persistent across connections)
ALTER USER myapp SET work_mem = '256MB';
-- Set globally (postgresql.conf)
-- work_mem = 256MB
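The sizing rule above works out like this (a sketch; the divide-by-2 assumes roughly two concurrent sort/hash operations per connection):

```typescript
// Per-operation work_mem budget in MB:
// (total RAM - shared_buffers) / max_connections / 2
function workMemMB(totalRamGB: number, sharedBuffersGB: number, maxConnections: number): number {
  return Math.floor(((totalRamGB - sharedBuffersGB) * 1024) / maxConnections / 2);
}

console.log(workMemMB(32, 8, 200)); // 61 — matches the 61MB example above
```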
-- Query using sort (will spill to disk if work_mem too small)
EXPLAIN ANALYZE
SELECT customer_id, COUNT(*) as order_count, SUM(total) as lifetime_value
FROM orders
GROUP BY customer_id
ORDER BY lifetime_value DESC;
-- Check if sort used temp space
EXPLAIN ANALYZE
SELECT *
FROM (
SELECT id, name, RANDOM() as rand_sort
FROM users
ORDER BY rand_sort
) t
LIMIT 10;
-- Monitor actual memory usage
SELECT
query,
mean_exec_time,
calls,
max_exec_time
FROM pg_stat_statements
WHERE query ILIKE '%ORDER BY%' OR query ILIKE '%GROUP BY%'
ORDER BY mean_exec_time DESC
LIMIT 10;
Checkpoint and WAL Tuning
Checkpoints synchronize committed data to disk. Tuning checkpoint frequency affects both write latency and crash recovery time.
-- Current checkpoint settings
SHOW checkpoint_timeout; -- Default: 5min
SHOW checkpoint_completion_target; -- Default: 0.9 (90%)
SHOW max_wal_size; -- Default: 1GB
SHOW min_wal_size; -- Default: 80MB
-- Frequent checkpoints (shorter crash recovery, more total I/O from full-page writes)
-- postgresql.conf settings:
checkpoint_timeout = '5min'
max_wal_size = '2GB'
checkpoint_completion_target = 0.9
-- Infrequent checkpoints (higher throughput, longer crash recovery)
checkpoint_timeout = '15min'
max_wal_size = '10GB'
checkpoint_completion_target = 0.9  -- keep writes spread across the longer interval
-- Monitor checkpoint performance (pg_stat_bgwriter; on PostgreSQL 17+ these counters moved to pg_stat_checkpointer)
SELECT
checkpoints_timed,
checkpoints_req,
checkpoint_write_time,
checkpoint_sync_time,
checkpoint_write_time + checkpoint_sync_time AS total_checkpoint_time,
CASE WHEN checkpoints_timed + checkpoints_req = 0 THEN 0
ELSE ROUND(100.0 * checkpoints_timed / (checkpoints_timed + checkpoints_req), 2)
END AS timed_percent
FROM pg_stat_bgwriter;
-- A high timed_percent (mostly time-based checkpoints) is the healthy state
-- If checkpoints_req dominates (WAL fills before the timeout fires): increase max_wal_size
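That interpretation can be condensed into a tiny decision helper (a sketch; the 90% cutoff is an assumption, not a PostgreSQL default):

```typescript
// Suggest a checkpoint tuning action from pg_stat_bgwriter counters (sketch).
function checkpointAdvice(timed: number, requested: number): string {
  const total = timed + requested;
  if (total === 0) return "no checkpoints yet";
  const timedPercent = (100 * timed) / total;
  // Frequent requested checkpoints mean WAL fills before checkpoint_timeout fires.
  return timedPercent >= 90 ? "healthy" : "increase max_wal_size";
}

console.log(checkpointAdvice(95, 5));  // "healthy"
console.log(checkpointAdvice(20, 80)); // "increase max_wal_size"
```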
-- WAL archive settings (for PITR)
archive_mode = on
archive_command = 'test ! -f /archive/%f && cp %p /archive/%f'
archive_timeout = '300'
Autovacuum Configuration for High-Write Tables
Autovacuum removes dead rows and updates statistics. Misconfigured autovacuum causes bloat and seq scans.
-- Current autovacuum settings
SHOW autovacuum;
SHOW autovacuum_naptime; -- Default: 1min (check interval)
SHOW autovacuum_vacuum_threshold; -- Default: 50 dead rows
SHOW autovacuum_vacuum_scale_factor; -- Default: 0.2 (20%)
-- Formula: trigger when dead_rows > threshold + (table_rows * scale_factor)
-- Table with 1M rows: trigger at 50 + (1000000 * 0.2) = 200,050 dead rows
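The trigger formula is easy to verify numerically (defaults shown; rounded because dead-row counts are integers):

```typescript
// Dead-row count at which autovacuum kicks in:
// threshold + (live rows * scale factor)
function autovacuumTrigger(liveRows: number, threshold = 50, scaleFactor = 0.2): number {
  return Math.round(threshold + liveRows * scaleFactor);
}

console.log(autovacuumTrigger(1_000_000));           // 200050 with the defaults
console.log(autovacuumTrigger(1_000_000, 50, 0.05)); // 50050 with scale_factor = 0.05
```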
-- Find tables needing aggressive autovacuum
SELECT
schemaname,
relname,
n_dead_tup,
n_live_tup,
ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_percent,
last_vacuum,
last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC;
-- Aggressive autovacuum for high-write table (events log)
-- Add to postgresql.conf or use ALTER TABLE
ALTER TABLE events SET (
autovacuum_vacuum_scale_factor = 0.05, -- trigger at 5% dead rows instead of 20%
autovacuum_vacuum_cost_delay = 2, -- 2ms delay: vacuum works faster, at the cost of more I/O
autovacuum_vacuum_cost_limit = 1000
);
-- Lazy autovacuum for read-heavy archive tables
ALTER TABLE archived_orders SET (
autovacuum_vacuum_scale_factor = 0.5, -- 50%, less frequent
autovacuum_enabled = false -- manual vacuum only; anti-wraparound autovacuum still runs
);
-- Vacuum a table immediately (a manual VACUUM, not autovacuum)
VACUUM ANALYZE events;
-- Monitor autovacuum activity
SELECT
schemaname,
relname,
last_vacuum,
last_autovacuum,
vacuum_count,
autovacuum_count
FROM pg_stat_user_tables
ORDER BY autovacuum_count DESC
LIMIT 10;
pg_stat_statements: Finding Slow Queries
pg_stat_statements aggregates query statistics. Use it to identify offenders.
-- pg_stat_statements must be in shared_preload_libraries (postgresql.conf, restart required); then enable it:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Reset statistics (start fresh)
SELECT pg_stat_statements_reset();
-- Top 10 slowest queries by total time
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
stddev_exec_time,
rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
-- Top 10 slowest queries by mean (most consistently slow)
SELECT
LEFT(query, 80),
calls,
ROUND(mean_exec_time::numeric, 2) as avg_ms,
ROUND(stddev_exec_time::numeric, 2) as stddev_ms
FROM pg_stat_statements
WHERE calls > 100 -- filter out rare queries
ORDER BY mean_exec_time DESC
LIMIT 10;
-- Queries with worst I/O patterns (blk_*_time requires track_io_timing = on)
SELECT
LEFT(query, 80),
calls,
ROUND(mean_exec_time::numeric, 2) as avg_ms,
blk_read_time,
blk_write_time
FROM pg_stat_statements
WHERE blk_read_time + blk_write_time > 0
ORDER BY blk_read_time + blk_write_time DESC
LIMIT 10;
-- Query causing most memory pressure
SELECT
LEFT(query, 80),
calls,
ROUND(mean_exec_time::numeric, 2) as avg_ms
FROM pg_stat_statements
WHERE query ILIKE '%GROUP BY%' OR query ILIKE '%ORDER BY%'
ORDER BY mean_exec_time DESC
LIMIT 10;
-- Per-database statement totals (join to pg_database to resolve dbid to a name)
SELECT
userid::regrole,
datname,
query,
calls,
total_exec_time
FROM pg_stat_statements
JOIN pg_database ON pg_stat_statements.dbid = pg_database.oid
WHERE datname = 'myapp'
ORDER BY total_exec_time DESC;
Connection Overhead vs PgBouncer Pooling
Each database connection consumes 5-10MB RAM. Applications opening 200+ connections cause memory bloat and slow new connection acquisition.
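A quick back-of-the-envelope makes the problem concrete (assuming ~7.5MB per backend, the midpoint of the 5-10MB range above):

```typescript
// Estimated RAM consumed by idle PostgreSQL backends (sketch).
function connectionMemoryMB(connections: number, perConnMB = 7.5): number {
  return connections * perConnMB;
}

console.log(connectionMemoryMB(200)); // 1500 — roughly 1.5GB before a single query runs
```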
# pgbouncer.ini configuration for production
[databases]
myapp = host=prod-db port=5432 dbname=myapp user=pgbouncer
# PgBouncer does not expand env vars like $PG_PASSWORD; keep credentials in auth_file
[pgbouncer]
listen_port = 6432
listen_addr = 127.0.0.1
auth_type = md5
pool_mode = transaction # transaction pooling for web apps (most common)
max_client_conn = 1000
default_pool_size = 25 # connections per database
min_pool_size = 10
reserve_pool_size = 5
reserve_pool_timeout = 3
max_db_connections = 200 # hard limit on database
max_user_connections = 100
# Performance tuning
server_idle_timeout = 600
server_lifetime = 3600
server_connect_timeout = 15
idle_transaction_timeout = 300 # note: idle_in_transaction_session_timeout is the PostgreSQL-side parameter
# Logging
log_connections = 1
log_disconnections = 1
log_pooler_errors = 1
// Node.js client using PgBouncer instead of direct connection
import pg from 'pg';
const pool = new pg.Pool({
host: 'localhost',
port: 6432, // PgBouncer port, not direct database
database: 'myapp',
user: 'app_user',
password: process.env.DB_PASSWORD,
max: 20, // connection pool size (smaller, PgBouncer handles multiplexing)
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 5000,
});
// Reuse connections efficiently
async function fetchUser(userId: string) {
const client = await pool.connect();
try {
const result = await client.query('SELECT * FROM users WHERE id = $1', [userId]);
return result.rows[0];
} finally {
client.release(); // return to pool
}
}
// Monitor pool status
pool.on('error', (err) => {
console.error('Unexpected pool error:', err);
});
pool.on('connect', () => {
console.log('New connection acquired from PgBouncer');
});
// Health check
async function healthCheck() {
const client = await pool.connect();
try {
await client.query('SELECT 1');
return { healthy: true };
} catch (err) {
return { healthy: false, error: (err as Error).message };
} finally {
client.release();
}
}
shared_buffers and effective_cache_size Tuning
shared_buffers is PostgreSQL's cache. effective_cache_size hints to the planner about OS page cache.
-- Current settings
SHOW shared_buffers; -- Default: 128MB (too small)
SHOW effective_cache_size; -- Default: 4GB (usually too low)
-- Tuning guidelines:
-- shared_buffers = 25% of total RAM (max 40GB on modern hardware)
-- effective_cache_size = 75% of total RAM
-- Example for 32GB RAM:
-- shared_buffers = 8GB
-- effective_cache_size = 24GB
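The 25%/75% guideline can be sketched as a small helper (the 40GB cap mirrors the guideline above; both numbers are rules of thumb, not hard limits):

```typescript
// shared_buffers / effective_cache_size sizing per the guidelines above (sketch).
function memorySettings(totalRamGB: number) {
  return {
    sharedBuffersGB: Math.min(Math.round(totalRamGB * 0.25), 40), // 25% of RAM, capped at 40GB
    effectiveCacheSizeGB: Math.round(totalRamGB * 0.75),          // 75% of RAM
  };
}

console.log(memorySettings(32)); // { sharedBuffersGB: 8, effectiveCacheSizeGB: 24 }
```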
-- PostgreSQL config (restart required)
# postgresql.conf
shared_buffers = 8GB
effective_cache_size = 24GB
-- Verify buffer hit ratio (should be > 99%)
SELECT
sum(heap_blks_read) as heap_read,
sum(heap_blks_hit) as heap_hit,
sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) as ratio
FROM pg_statio_user_tables;
-- If ratio < 99%: increase shared_buffers or add indexes
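The same arithmetic the query performs, as a standalone check:

```typescript
// Buffer cache hit ratio: hits / (hits + reads).
function hitRatio(blksHit: number, blksRead: number): number {
  return blksHit / (blksHit + blksRead);
}

console.log(hitRatio(990_000, 10_000)); // 0.99 — exactly at the recommended floor
```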
-- Index buffer hit ratio
SELECT
sum(idx_blks_read) as idx_read,
sum(idx_blks_hit) as idx_hit,
sum(idx_blks_hit) / (sum(idx_blks_hit) + sum(idx_blks_read)) as ratio
FROM pg_statio_user_indexes;
-- Monitor actual buffer pool usage (requires the pg_buffercache extension)
SELECT
d.datname,
count(*) AS buffer_blocks,
pg_size_pretty(count(*) * 8192) AS cache_size -- assumes the default 8KB block size
FROM pg_buffercache b
JOIN pg_database d ON b.reldatabase = d.oid
WHERE b.usagecount > 0
GROUP BY d.datname
ORDER BY count(*) DESC;
Checklist
- EXPLAIN ANALYZE run on all slow queries
- Index scans preferred over seq scans where appropriate
- Hash joins detected and optimized
- work_mem tuned based on server capacity and workload
- Checkpoint settings optimized for write patterns
- Autovacuum configuration aggressive for high-write tables
- pg_stat_statements enabled and monitored weekly
- PgBouncer deployed for connection pooling
- shared_buffers set to 25% of RAM
- effective_cache_size set to 75% of RAM
- Buffer hit ratio > 99%
- Slow query threshold configured in logging
Conclusion
PostgreSQL performance tuning is empirical and iterative. Start with pg_stat_statements to find the real bottlenecks, use EXPLAIN ANALYZE to understand why queries are slow, and configure memory and connection pooling appropriately for your hardware. Most improvements come from adding indexes and tuning autovacuum, not from parameter tweaking. Monitor your metrics continuously and adjust as your data and access patterns evolve.