Unbounded Table Growth — When Your Database Fills the Disk at 3 AM

Author: Sanjeev Sharma (@webcoderspeed1)

Introduction
Unbounded table growth is one of those "it'll never be a problem" problems that becomes a 3 AM incident. The table exists for a legitimate reason. The rows are small. But no one planned for data retention, and the data just keeps accumulating until the disk fills up — at which point the database stops accepting writes entirely.
- Tables That Commonly Grow Without Bounds
- Fix 1: Scheduled Purge Job
- Fix 2: PostgreSQL Table Partitioning by Time
- Fix 3: TTL Column with a Purge Index
- Fix 4: Monitor Table Size
- Data Retention Policy
- Unbounded Growth Checklist
- Conclusion
Tables That Commonly Grow Without Bounds
- sessions / auth_tokens
- events / analytics
- audit_logs / activity_logs
- notifications (read and unread)
- email_queue / job_queue (processed jobs not deleted)
- cache_entries (TTL not enforced at DB level)
- webhook_deliveries
- api_request_logs
These tables share a pattern: rows are written frequently, old rows are rarely needed, and no one added a cleanup job.
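A quick back-of-envelope calculation makes the risk concrete. This sketch estimates how many days of runway a growing table leaves you; the 2x index-overhead multiplier and the example numbers are assumptions, not measurements:

```typescript
// Back-of-envelope estimate: how long until a table's growth eats the
// remaining disk? The 2x multiplier for index/TOAST overhead is an
// assumption — measure your own tables with pg_total_relation_size().
function daysUntilDiskFull(
  rowsPerDay: number,
  avgRowBytes: number,
  freeDiskBytes: number,
  indexOverhead = 2
): number {
  const bytesPerDay = rowsPerDay * avgRowBytes * indexOverhead
  return Math.floor(freeDiskBytes / bytesPerDay)
}

// e.g. 1M rows/day at ~200 bytes each, 100GB free:
// daysUntilDiskFull(1_000_000, 200, 100 * 1024 ** 3) → 268 days
```

Run this for your busiest tables — if the answer is under a year, the cleanup job belongs in this sprint.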
Fix 1: Scheduled Purge Job
```typescript
import cron from 'node-cron'
import { db } from './db' // your shared pg client/pool wrapper

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms))

// Run daily at 2 AM — delete old data in small batches
cron.schedule('0 2 * * *', async () => {
  await purgeOldSessions()
  await purgeOldEvents()
  await purgeProcessedJobs()
})

async function purgeOldSessions() {
  const cutoff = new Date(Date.now() - 30 * 24 * 60 * 60 * 1000) // 30 days ago
  let deleted = 0
  let batch: number
  do {
    // Delete in small batches to avoid long-running transactions
    const result = await db.query(
      `
      DELETE FROM sessions
      WHERE id IN (
        SELECT id FROM sessions
        WHERE expires_at < NOW() OR last_active < $1
        LIMIT 1000
      )
      `,
      [cutoff]
    )
    batch = result.rowCount ?? 0
    deleted += batch
    if (batch > 0) {
      await sleep(100) // brief pause between batches
    }
  } while (batch > 0)
  console.log(`Purged ${deleted} expired sessions`)
}
```
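The same delete-then-sleep loop recurs for every table, so it can be factored into a generic driver. A sketch — `batchedPurge` and its callback shape are our own names, not from the article's codebase:

```typescript
// Generic batched-purge driver: repeatedly invokes deleteBatch (which should
// delete up to one batch of rows and resolve with the deleted count) until a
// batch comes back empty, pausing between batches to reduce lock pressure.
async function batchedPurge(
  deleteBatch: () => Promise<number>,
  pauseMs = 100
): Promise<number> {
  let total = 0
  let batch: number
  do {
    batch = await deleteBatch()
    total += batch
    if (batch > 0) {
      await new Promise((resolve) => setTimeout(resolve, pauseMs))
    }
  } while (batch > 0)
  return total
}
```

With this in place, `purgeOldEvents` and `purgeProcessedJobs` each reduce to a one-line call that passes their own batched DELETE query.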
Fix 2: PostgreSQL Table Partitioning by Time
For very large tables, partitioning lets you drop old data instantly without a DELETE:
```sql
-- Create a partitioned table (range partition by month)
CREATE TABLE events (
  id UUID DEFAULT gen_random_uuid(),
  user_id UUID NOT NULL,
  event_type VARCHAR(100) NOT NULL,
  meta JSONB,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
) PARTITION BY RANGE (created_at);

-- Create monthly partitions
CREATE TABLE events_2026_01 PARTITION OF events
  FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02 PARTITION OF events
  FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
CREATE TABLE events_2026_03 PARTITION OF events
  FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');

-- Dropping an old partition is INSTANT (unlike DELETE):
-- no row scan — PostgreSQL just removes the partition's files
DROP TABLE events_2024_01; -- instantly frees all space for that month
```
```typescript
// Helper: year/month parts for a partition name
const partitionMonth = (d: Date) => ({
  year: d.getFullYear(),
  month: String(d.getMonth() + 1).padStart(2, '0'),
})

// Auto-create next month's partition
async function ensureNextPartition() {
  // Anchor to the 1st of the month: calling setMonth() on a Date like
  // Jan 31 overflows the day-of-month and skips into the wrong month
  const now = new Date()
  const next = partitionMonth(new Date(now.getFullYear(), now.getMonth() + 1, 1))
  const after = partitionMonth(new Date(now.getFullYear(), now.getMonth() + 2, 1))
  // Table names can't be parameterized; these values are generated above,
  // never taken from user input
  await db.query(`
    CREATE TABLE IF NOT EXISTS events_${next.year}_${next.month}
    PARTITION OF events
    FOR VALUES FROM ('${next.year}-${next.month}-01')
                 TO ('${after.year}-${after.month}-01')
  `)
}

// Run on the 1st of every month — create next month's partition in advance
cron.schedule('0 0 1 * *', ensureNextPartition)

// Drop partitions older than the retention window
async function dropOldPartitions(keepMonths = 6) {
  const now = new Date()
  const cutoff = partitionMonth(
    new Date(now.getFullYear(), now.getMonth() - keepMonths, 1)
  )
  await db.query(`DROP TABLE IF EXISTS events_${cutoff.year}_${cutoff.month}`)
}
```
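This date arithmetic is exactly the kind of thing that fails once a year (December rollover) if untested. A pure helper lets you unit-test the partition bounds without a database — a sketch, and `monthlyPartitionBounds` is a hypothetical name of our own:

```typescript
// Sketch: compute the partition name and date bounds for the month
// containing `d`. Uses UTC so the result doesn't depend on server timezone.
function monthlyPartitionBounds(d: Date): { name: string; from: string; to: string } {
  const y = d.getUTCFullYear()
  const m = d.getUTCMonth() // 0-based
  const fmt = (year: number, month0: number) =>
    `${year}-${String(month0 + 1).padStart(2, '0')}-01`
  const next = new Date(Date.UTC(y, m + 1, 1)) // Date.UTC normalizes month overflow
  return {
    name: `events_${y}_${String(m + 1).padStart(2, '0')}`,
    from: fmt(y, m),
    to: fmt(next.getUTCFullYear(), next.getUTCMonth()),
  }
}
```

A December date must roll its upper bound into January of the next year — that is the case worth asserting in CI.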
Fix 3: TTL Column with a Purge Index
When you can't partition, a purge index makes batch deletes fast:
```sql
-- Add expires_at column
ALTER TABLE sessions ADD COLUMN expires_at TIMESTAMPTZ NOT NULL
  DEFAULT NOW() + INTERVAL '30 days';

-- Index for efficient purge queries. Note: a partial index like
-- WHERE expires_at < NOW() is rejected by PostgreSQL (index predicates
-- must use immutable functions), so index the column directly
CREATE INDEX sessions_expires_at_idx ON sessions (expires_at);

-- The purge query is fast because it's an index range scan
DELETE FROM sessions WHERE expires_at < NOW();
```
Fix 4: Monitor Table Size
Catch growth before it becomes a crisis:
```sql
-- Table sizes with row counts. pg_stat_user_tables already excludes
-- system tables, and using its relid avoids an ambiguous relname join
SELECT
  relname AS table_name,
  n_live_tup AS row_count,
  pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
  pg_size_pretty(pg_relation_size(relid)) AS table_size,
  pg_size_pretty(pg_indexes_size(relid)) AS index_size
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 20;
```
```typescript
// Alert when any table exceeds a size threshold
// (`alerting` is whatever notification client you already use)
async function checkTableSizes() {
  const result = await db.query(`
    SELECT relname, pg_total_relation_size(relid) AS size_bytes
    FROM pg_stat_user_tables
  `)
  for (const row of result.rows) {
    // pg returns bigint columns as strings — convert before dividing
    const gbSize = Number(row.size_bytes) / 1024 ** 3
    if (gbSize > 10) { // alert if any table exceeds 10GB
      await alerting.warn(`Table ${row.relname} is ${gbSize.toFixed(1)}GB`)
    }
  }
}

cron.schedule('0 * * * *', checkTableSizes) // check hourly
```
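Beyond per-table sizes, overall disk usage deserves graduated alerts. This sketch isolates the threshold logic so it can be unit-tested; the 50%/80% cutoffs mirror the checklist below, and `AlertLevel`/`diskAlertLevel` are our own names:

```typescript
// Sketch: map disk usage to a graduated alert level.
type AlertLevel = 'ok' | 'warn' | 'critical'

function diskAlertLevel(usedBytes: number, capacityBytes: number): AlertLevel {
  const pct = (usedBytes / capacityBytes) * 100
  if (pct >= 80) return 'critical' // urgent: plan an emergency purge
  if (pct >= 50) return 'warn'     // early signal: review retention
  return 'ok'
}
```

Feed it `SELECT pg_database_size(current_database())` for the used side and the volume capacity from your host metrics.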
Data Retention Policy
Before writing cleanup code, define the retention policy per table:
```typescript
const RETENTION_POLICY = {
  sessions: { keepDays: 30, purgeColumn: 'expires_at' },
  events: { keepDays: 90, purgeColumn: 'created_at' },
  audit_logs: { keepDays: 365, purgeColumn: 'created_at' },
  // readOnly: purge only notifications the user has already read
  notifications: { keepDays: 60, readOnly: true, purgeColumn: 'created_at' },
  job_queue: { keepCompleted: 7, keepFailed: 30, purgeColumn: 'completed_at' },
}
```
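A policy object like this can drive the purge SQL directly instead of hand-writing one function per table. A sketch of the query builder — `buildPurgeSql` is a hypothetical helper; identifiers are interpolated only because they come from our own policy object, never from user input:

```typescript
// Sketch: build a batched purge statement from one retention-policy entry.
// Table/column names can't be parameterized, so they are interpolated —
// safe only because they originate in our own RETENTION_POLICY constant.
function buildPurgeSql(
  table: string,
  purgeColumn: string,
  keepDays: number,
  batchSize = 5000
): string {
  return (
    `DELETE FROM ${table} WHERE id IN (` +
    `SELECT id FROM ${table} ` +
    `WHERE ${purgeColumn} < NOW() - INTERVAL '${keepDays} days' ` +
    `LIMIT ${batchSize})`
  )
}

// buildPurgeSql('events', 'created_at', 90) yields a DELETE limited to
// 5,000 rows of events older than 90 days — run it in a loop until empty.
```

A nightly job can then iterate the policy entries and execute each generated statement until it affects zero rows.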
```typescript
// Job queue: keep failed jobs longer for debugging
async function purgeJobQueue() {
  // Delete completed jobs older than 7 days, capped at 5,000 per run
  // (the date filter belongs inside the subquery so the LIMIT only
  // counts rows that actually qualify for deletion)
  await db.query(`
    DELETE FROM jobs WHERE id IN (
      SELECT id FROM jobs
      WHERE status = 'complete'
        AND completed_at < NOW() - INTERVAL '7 days'
      LIMIT 5000
    )
  `)
  // Delete failed jobs older than 30 days, capped at 5,000 per run
  await db.query(`
    DELETE FROM jobs WHERE id IN (
      SELECT id FROM jobs
      WHERE status = 'failed'
        AND completed_at < NOW() - INTERVAL '30 days'
      LIMIT 5000
    )
  `)
}
```
Unbounded Growth Checklist
- ✅ Define data retention policy for every high-write table
- ✅ Add expires_at or created_at columns with indexes for purge queries
- ✅ Schedule daily purge jobs — delete in batches of 1,000-10,000
- ✅ Use table partitioning for tables exceeding 100GB
- ✅ Monitor table sizes hourly, alert at 50% and 80% disk usage
- ✅ Test purge jobs in staging with production-sized data
- ✅ Set PostgreSQL disk alerts before they reach 100%
Conclusion
Unbounded table growth is entirely preventable with a data retention policy and a scheduled cleanup job. The time to add both is when you create the table, not when the disk is 95% full. For high-volume tables, partitioning by time is the cleanest solution: dropping an old partition is instant (no row-by-row delete), and queries that filter by date automatically use partition pruning. At minimum, add a created_at index and schedule a daily purge job — that's all it takes to prevent a 3 AM disk-full incident.