Log Table Filling Disk — When Your Audit Trail Becomes a Crisis

Introduction

Audit logging is important: you need to know who changed what and when. So you create a table, write to it on every mutation, and forget about it. Eighteen months later it's 200GB, your disk is at 94%, and every query that joins against audit_logs takes 3x longer. The data is all still there; it's the rest of your infrastructure that's paying for it.

Why Log Tables Grow So Fast

100,000 users
Each user: 20 actions/day (updates, views, approvals)
= 2 million audit_log rows/day
= 60 million rows/month
= 720 million rows/year

Each row: 500 bytes (id, user_id, table, action, old_values JSONB, new_values JSONB, timestamp)
= 360GB/year

Most applications don't need all of this data forever. Compliance regulations typically require retention of 1-7 years; everything older than your retention window is pure storage cost.
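The back-of-the-envelope arithmetic above can be wrapped in a small helper so you can plug in your own numbers (a sketch; the 30-day month and 500-byte row are the same assumptions used above, and the function name is mine):

```typescript
// Estimate audit log growth from a few input assumptions.
function estimateAuditGrowth(users: number, actionsPerDay: number, bytesPerRow: number) {
  const rowsPerDay = users * actionsPerDay
  const rowsPerMonth = rowsPerDay * 30       // 30-day month, as in the figures above
  const rowsPerYear = rowsPerMonth * 12
  const gbPerYear = (rowsPerYear * bytesPerRow) / 1e9  // decimal GB
  return { rowsPerDay, rowsPerMonth, rowsPerYear, gbPerYear }
}

const growth = estimateAuditGrowth(100_000, 20, 500)
console.log(growth)
// { rowsPerDay: 2000000, rowsPerMonth: 60000000, rowsPerYear: 720000000, gbPerYear: 360 }
```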

Fix 1: Time-Based Partitioning

The cleanest solution for audit logs specifically — partition by month, drop old partitions instantly:

CREATE TABLE audit_logs (
  id          BIGSERIAL,
  user_id     UUID NOT NULL,
  resource    VARCHAR(100) NOT NULL,
  resource_id UUID NOT NULL,
  action      VARCHAR(50) NOT NULL,  -- 'created', 'updated', 'deleted'
  old_values  JSONB,
  new_values  JSONB,
  ip_address  INET,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
) PARTITION BY RANGE (created_at);

-- Current and recent months (keep 24 months = 2 years)
CREATE TABLE audit_logs_2026_01 PARTITION OF audit_logs
  FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');

CREATE TABLE audit_logs_2026_02 PARTITION OF audit_logs
  FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

-- Indexes created on the parent are automatically created on every partition
CREATE INDEX ON audit_logs (user_id, created_at DESC);
CREATE INDEX ON audit_logs (resource, resource_id, created_at DESC);

Dropping 2-year-old data:

-- Instant — no scanning, just file removal
DROP TABLE audit_logs_2024_01;
DROP TABLE audit_logs_2024_02;
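Pre-creating next month's partition (a checklist item later in this post) is easy to automate. One sketch, assuming the naming and monthly ranges shown above — the function name is mine, and you'd run the generated DDL from your scheduler of choice:

```typescript
// Build the CREATE TABLE statement for the partition covering a given month.
function monthlyPartitionSql(year: number, month: number): string {
  const pad = (m: number) => String(m).padStart(2, '0')
  // Upper bound is the first day of the following month (exclusive)
  const nextYear = month === 12 ? year + 1 : year
  const nextMonth = month === 12 ? 1 : month + 1
  return (
    `CREATE TABLE IF NOT EXISTS audit_logs_${year}_${pad(month)} ` +
    `PARTITION OF audit_logs ` +
    `FOR VALUES FROM ('${year}-${pad(month)}-01') ` +
    `TO ('${nextYear}-${pad(nextMonth)}-01');`
  )
}

console.log(monthlyPartitionSql(2026, 3))
// CREATE TABLE IF NOT EXISTS audit_logs_2026_03 PARTITION OF audit_logs FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');
```

`IF NOT EXISTS` makes the job idempotent, so running it from a monthly cron is safe even if it fires twice.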

Fix 2: Archive Cold Data to Object Storage

If you need data beyond 2 years for compliance but don't need it queryable, archive it:

import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3'
import { gzipSync } from 'node:zlib'

const s3 = new S3Client({ region: 'us-east-1' })

async function archiveOldAuditLogs(olderThanMonths = 24) {
  const cutoff = new Date()
  cutoff.setMonth(cutoff.getMonth() - olderThanMonths)

  // Determine which month partition to archive
  const archiveYear = cutoff.getFullYear()
  const archiveMonth = String(cutoff.getMonth() + 1).padStart(2, '0')
  const partitionName = `audit_logs_${archiveYear}_${archiveMonth}`

  // Export to JSONL: one JSON object per line
  // (for very large partitions, stream in batches instead of loading all rows)
  const result = await db.query(`
    SELECT * FROM ${partitionName}
    ORDER BY created_at
  `)

  const jsonl = result.rows
    .map((row) => JSON.stringify(row))
    .join('\n')

  // Upload to S3 — the body must actually be gzipped so the
  // ContentEncoding: 'gzip' header is accurate
  await s3.send(new PutObjectCommand({
    Bucket: process.env.AUDIT_ARCHIVE_BUCKET!,
    Key: `audit-logs/${archiveYear}/${archiveMonth}/audit_logs.jsonl.gz`,
    Body: gzipSync(Buffer.from(jsonl)),
    ContentEncoding: 'gzip',
    ContentType: 'application/x-ndjson',
  }))

  console.log(`Archived ${result.rows.length} rows to S3`)

  // Now safe to drop
  await db.query(`DROP TABLE IF EXISTS ${partitionName}`)
  console.log(`Dropped partition ${partitionName}`)
}

// Run monthly at 03:00 on the 1st (cron here is node-cron's scheduler)
cron.schedule('0 3 1 * *', () => archiveOldAuditLogs(24))
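Before trusting the drop step, it's worth verifying that the archive format round-trips. A minimal local check using only node:zlib, with made-up sample rows and no S3 involved:

```typescript
import { gzipSync, gunzipSync } from 'node:zlib'

// Serialize rows the same way the archiver does: one JSON object per line.
const rows = [
  { user_id: 'u1', action: 'updated', created_at: '2024-01-03T10:00:00Z' },
  { user_id: 'u2', action: 'deleted', created_at: '2024-01-04T11:30:00Z' },
]
const jsonl = rows.map((r) => JSON.stringify(r)).join('\n')
const archived = gzipSync(Buffer.from(jsonl))

// Restore: gunzip, split into lines, parse each line back into an object.
const restored = gunzipSync(archived)
  .toString('utf8')
  .split('\n')
  .map((line) => JSON.parse(line))

console.log(restored.length === rows.length)  // true
```

The same gunzip-split-parse sequence is what a compliance retrieval job would run against the S3 object.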

Fix 3: Separate Audit Database

Keep audit logs out of your primary database entirely — they have different scaling characteristics:

// Use a separate connection pool for audit logs
const auditDb = new Pool({
  connectionString: process.env.AUDIT_DATABASE_URL,  // separate RDS instance
})

const primaryDb = new Pool({
  connectionString: process.env.DATABASE_URL,
})

// Audit writes go to audit DB — don't compete with primary DB resources
async function logAuditEvent(event: AuditEvent) {
  await auditDb.query(`
    INSERT INTO audit_logs (user_id, resource, resource_id, action, old_values, new_values, ip_address)
    VALUES ($1, $2, $3, $4, $5, $6, $7)
  `, [event.userId, event.resource, event.resourceId, event.action,
      JSON.stringify(event.oldValues), JSON.stringify(event.newValues), event.ip])
}

The audit database can run on different hardware (cheaper, storage-optimized instances), with different retention policies and different backup strategies — all without affecting your primary database's performance.
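One design note worth making explicit: an outage of the audit database should not take down the request path. A small wrapper can log and swallow audit failures (the helper name `safeAudit` is hypothetical, not part of the code above):

```typescript
// Wrap an audit write so a failing audit DB never fails the caller.
async function safeAudit<T>(write: () => Promise<T>): Promise<T | undefined> {
  try {
    return await write()
  } catch (err) {
    // Surface the failure to observability, but don't propagate it
    console.error('audit write failed:', err)
    return undefined
  }
}

// Usage: await safeAudit(() => logAuditEvent(event))
```

Whether silently dropping an audit event is acceptable depends on your compliance requirements; the stricter alternative is to enqueue failed writes and retry them.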

Fix 4: Write Audit Logs to a Log Service

For truly high-volume audit trails, use a purpose-built log store rather than a SQL table:

// Push to Elasticsearch/OpenSearch — designed for write-heavy log workloads
import { Client } from '@elastic/elasticsearch'

const es = new Client({ node: process.env.ELASTICSEARCH_URL })

async function logAuditEvent(event: AuditEvent) {
  await es.index({
    index: `audit-logs-${new Date().toISOString().slice(0, 7)}`,  // audit-logs-2026-03
    document: {
      ...event,
      '@timestamp': new Date().toISOString(),
    },
  })
}

// Elasticsearch ILM (Index Lifecycle Management) handles rotation automatically:
// Hot tier: last 3 months (fast SSDs)
// Warm tier: 3-12 months (slower HDDs)
// Cold tier: 1-2 years (object storage)
// Delete: > 2 years
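The tier schedule sketched in the comments above maps onto an ILM lifecycle policy document. Here is one plausible shape — the phase timings follow the schedule above, the object structure follows Elasticsearch's ILM policy format, and the `audit-archive` snapshot repository name is hypothetical:

```typescript
// Lifecycle policy matching the tiers above:
// hot 0-3 months, warm 3-12 months, cold 1-2 years, delete after 2 years.
const auditLogPolicy = {
  policy: {
    phases: {
      hot: { actions: { rollover: { max_age: '30d', max_primary_shard_size: '50gb' } } },
      warm: { min_age: '90d', actions: { shrink: { number_of_shards: 1 } } },
      cold: { min_age: '365d', actions: { searchable_snapshot: { snapshot_repository: 'audit-archive' } } },
      delete: { min_age: '730d', actions: { delete: {} } },
    },
  },
}

// Installing it would look like:
// await es.ilm.putLifecycle({ name: 'audit-logs-policy', ...auditLogPolicy })
```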

Monitoring Audit Log Table Size

-- Check partition sizes
SELECT
  child.relname AS partition,
  pg_size_pretty(pg_relation_size(child.oid)) AS size,
  pg_stat_user_tables.n_live_tup AS rows  -- row count estimate from autovacuum stats
FROM pg_inherits
JOIN pg_class parent ON pg_inherits.inhparent = parent.oid
JOIN pg_class child ON pg_inherits.inhrelid = child.oid
JOIN pg_stat_user_tables ON pg_stat_user_tables.relname = child.relname
WHERE parent.relname = 'audit_logs'
ORDER BY child.relname;
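Pair the size query with disk-usage alerting. A sketch of the threshold logic, using the 70%/85%/95% levels from the checklist below (the function and level names are mine):

```typescript
// Classify disk usage into alert levels (thresholds: 70 / 85 / 95 percent).
type AlertLevel = 'ok' | 'warning' | 'high' | 'critical'

function diskAlertLevel(usedPct: number): AlertLevel {
  if (usedPct >= 95) return 'critical'
  if (usedPct >= 85) return 'high'
  if (usedPct >= 70) return 'warning'
  return 'ok'
}

console.log(diskAlertLevel(94))  // 'high' — the intro's 94% disk is already past the second threshold
```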

Audit Log Checklist

  • ✅ Partition audit_logs table by month from the start
  • ✅ Define retention policy (e.g., 2 years in DB, 7 years in archive)
  • ✅ Archive cold partitions to S3/GCS before dropping
  • ✅ Run monthly partition creation job to pre-create next month
  • ✅ Use a separate database or log service for very high volume
  • ✅ Monitor disk usage with alerts at 70%, 85%, 95%
  • ✅ Never JOIN audit_logs in performance-critical queries

Conclusion

Audit logs are a compliance necessity but a storage liability. The solution is to treat them like time-series data: partition by month, archive old partitions to cheap object storage after the retention window, and drop the database partition. Start with partitioning when you create the table — retrofitting partitioning onto an existing 200GB table is a much harder problem. If your audit log volume is very high (millions of rows/day), a dedicated database or Elasticsearch is better than mixing logs with your primary OLTP workload.