Read Replica Lag — Why Your Users See Stale Data After Saving

Author: Sanjeev Sharma (@webcoderspeed1)
Introduction
Adding a read replica is one of the first database scaling steps teams take — offload SELECTs from the primary, double your read throughput. But replication is asynchronous. The replica is always slightly behind the primary — usually milliseconds, sometimes seconds, occasionally minutes if the primary is under heavy write load. Your application needs to account for this lag, or users see confusing stale data.
- The Problem
- Fix 1: Route Writes and Immediate Reads to Primary
- Fix 2: Sticky Sessions — Route User to Same Source After Write
- Fix 3: Check Replication Lag Before Reading
- Fix 4: Synchronous Replication for Critical Data
- Monitoring Replication Lag
- Read Replica Lag Checklist
- Conclusion
The Problem
T=0.000  User saves profile: UPDATE users SET name = 'Sanjeev' WHERE id = '123'
         Write goes to PRIMARY ✅
T=0.001  User's browser reloads the profile:
         SELECT name FROM users WHERE id = '123'
         Read goes to REPLICA
         Replica replication lag: 1.5 seconds
         Replica still has name = 'Old Name'
         Response: { name: 'Old Name' } ❌ User sees old data

The user thinks the save didn't work, saves again, and you end up with 3x the support tickets.
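To make the trace above concrete, here is a minimal in-memory sketch (hypothetical `FakePrimary` and `FakeReplica` classes, no real database) in which the replica only applies the primary's write log when `catchUp()` runs. Reading from the replica between the write and the catch-up returns the old value:

```typescript
// A minimal in-memory model of async replication, illustrative only.
type Row = { id: string; name: string }

class FakePrimary {
  private rows = new Map<string, Row>()
  readonly log: Row[] = [] // stand-in for the WAL: ordered committed writes

  write(row: Row) {
    this.rows.set(row.id, { ...row })
    this.log.push({ ...row }) // commit returns immediately; replication is async
  }
  read(id: string): Row | undefined {
    return this.rows.get(id)
  }
}

class FakeReplica {
  private rows = new Map<string, Row>()
  private applied = 0
  constructor(private primary: FakePrimary) {}

  // Until this runs, reads here return stale data
  catchUp() {
    while (this.applied < this.primary.log.length) {
      const row = this.primary.log[this.applied++]
      this.rows.set(row.id, { ...row })
    }
  }
  read(id: string): Row | undefined {
    return this.rows.get(id)
  }
}

const primary = new FakePrimary()
const replica = new FakeReplica(primary)

primary.write({ id: '123', name: 'Old Name' })
replica.catchUp() // replica in sync

primary.write({ id: '123', name: 'Sanjeev' }) // user saves their profile
const staleRead = replica.read('123')?.name   // 'Old Name': the stale read from the trace
const freshRead = primary.read('123')?.name   // 'Sanjeev': read-after-write on the primary

replica.catchUp() // replication eventually applies the change
const eventualRead = replica.read('123')?.name // 'Sanjeev'
```

The gap between `staleRead` and `eventualRead` is exactly the replication-lag window the fixes below are managing.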
Fix 1: Route Writes and Immediate Reads to Primary
The simplest fix: after a write, read from the primary for the current request:
import { Pool, QueryResult, QueryResultRow } from 'pg'

class DatabaseRouter {
  constructor(
    private primary: Pool,
    private replica: Pool
  ) {}

  // Reads that need fresh data — use primary
  readAfterWrite<T extends QueryResultRow>(query: string, params?: any[]): Promise<QueryResult<T>> {
    return this.primary.query<T>(query, params)
  }

  // Reads that can tolerate stale data — use replica
  read<T extends QueryResultRow>(query: string, params?: any[]): Promise<QueryResult<T>> {
    return this.replica.query<T>(query, params)
  }

  // All writes go to primary
  write<T extends QueryResultRow>(query: string, params?: any[]): Promise<QueryResult<T>> {
    return this.primary.query<T>(query, params)
  }
}
// In your service
async function updateProfile(userId: string, data: UpdateProfileDto) {
  // Write to primary
  await db.write('UPDATE users SET name = $1 WHERE id = $2', [data.name, userId])

  // Read back from primary (read-after-write consistency)
  const updated = await db.readAfterWrite(
    'SELECT * FROM users WHERE id = $1',
    [userId]
  )
  return updated.rows[0]
}

// For listing endpoints where stale data is acceptable:
async function getRecentPosts() {
  const result = await db.read('SELECT * FROM posts ORDER BY created_at DESC LIMIT 20')
  return result.rows
}
Fix 2: Sticky Sessions — Route User to Same Source After Write
Track "this user just wrote something" and route them to the primary for a short window:
import { Request, Response, NextFunction } from 'express'
import Redis from 'ioredis'

const redis = new Redis() // shared client for the sticky-write flags
const WRITE_STICKY_DURATION_MS = 5000 // 5 seconds

// After a write, mark the user as "recently wrote"
function markRecentWrite(userId: string) {
  // Store in Redis with a 5-second TTL
  return redis.set(`recent_write:${userId}`, '1', 'PX', WRITE_STICKY_DURATION_MS)
}

async function shouldUsePrimary(userId: string): Promise<boolean> {
  const recentWrite = await redis.get(`recent_write:${userId}`)
  return recentWrite !== null
}

// In your profile update handler
app.put('/users/profile', async (req, res) => {
  const userId = req.user.id
  await db.primary.query('UPDATE users SET name = $1 WHERE id = $2',
    [req.body.name, userId])
  await markRecentWrite(userId) // This user needs primary reads for the next 5 seconds
  res.json({ success: true })
})

// In your profile read handler
app.get('/users/profile', async (req, res) => {
  const userId = req.user.id
  const pool = (await shouldUsePrimary(userId)) ? db.primary : db.replica
  const user = await pool.query('SELECT * FROM users WHERE id = $1', [userId])
  res.json(user.rows[0])
})
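The routing decision itself is just "did this user write within the last N milliseconds?". A hypothetical in-memory variant (fine for a single process or for unit tests, but unlike Redis it is not shared across app servers) with an injectable clock so the window is testable:

```typescript
const STICKY_MS = 5000

class StickyRouter {
  private lastWrite = new Map<string, number>()

  // The clock is injectable so the window can be tested with fake time
  constructor(private now: () => number = Date.now) {}

  markWrite(userId: string) {
    this.lastWrite.set(userId, this.now())
  }

  // true: route this user's reads to the primary
  usePrimary(userId: string): boolean {
    const ts = this.lastWrite.get(userId)
    return ts !== undefined && this.now() - ts < STICKY_MS
  }
}

// Fake clock: start at t=0, advance manually
let t = 0
const router = new StickyRouter(() => t)

router.markWrite('user-123')
const justAfterWrite = router.usePrimary('user-123') // true: inside the sticky window
t += 6000
const afterWindow = router.usePrimary('user-123')    // false: window expired, replica is fine
```

The Redis version above is the same logic, with the TTL enforced server-side so every app instance sees the same flag.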
Fix 3: Check Replication Lag Before Reading
If replicas are usually caught up, you can measure the lag and fall back to the primary only when it crosses a threshold:

async function getReplicaLagSeconds(): Promise<number> {
  // Ask the replica itself how far behind it is.
  // Caveat: on an idle primary, pg_last_xact_replay_timestamp() stops advancing,
  // so this value can grow even though the replica is fully caught up.
  const result = await replica.query(`
    SELECT EXTRACT(EPOCH FROM (NOW() - pg_last_xact_replay_timestamp())) AS lag_seconds
  `)
  return parseFloat(result.rows[0]?.lag_seconds ?? '0')
}

async function queryWithFallback<T>(
  query: string,
  params: any[],
  maxLagSeconds = 2
): Promise<T> {
  const lag = await getReplicaLagSeconds()
  if (lag > maxLagSeconds) {
    console.warn(`Replica lag ${lag}s > ${maxLagSeconds}s — routing to primary`)
    return primary.query(query, params)
  }
  return replica.query(query, params)
}
Fix 4: Synchronous Replication for Critical Data
For data where even milliseconds of lag are unacceptable, use synchronous replication:
# postgresql.conf on the primary: require standby acknowledgment before COMMIT returns
synchronous_standby_names = 'replica-1'
synchronous_commit = remote_apply

# remote_apply makes EVERY write wait until the standby has APPLIED the change.
# (The default 'on' only waits for the WAL to be flushed on the standby:
#  durable, but the replica can still serve a stale read for a moment.)
# Slower: adds a replica round trip to every write.
# But: the replica is guaranteed up to date once the write returns.
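With a single named standby, every commit stalls if that one replica is down or slow. PostgreSQL 10+ supports quorum-based synchronous replication, so the primary can instead wait for any one of several standbys (the names below are illustrative):

```
# postgresql.conf on the primary (standby names are illustrative)
synchronous_standby_names = 'ANY 1 ("replica-1", "replica-2")'
```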
Or, more targeted — only require synchronous replication for critical transactions:

async function criticalWrite(query: string, params: any[]) {
  const client = await primary.connect()
  try {
    await client.query('BEGIN')
    // SET LOCAL scopes the override to this transaction only; without LOCAL,
    // the setting would stick to the pooled connection after release
    await client.query("SET LOCAL synchronous_commit = 'remote_apply'")
    const result = await client.query(query, params)
    await client.query('COMMIT')
    return result
  } catch (err) {
    await client.query('ROLLBACK')
    throw err
  } finally {
    client.release()
  }
}
Monitoring Replication Lag
-- On the primary — check lag for all replicas
SELECT
  application_name,
  state,
  sync_state,
  write_lag,
  flush_lag,
  replay_lag
FROM pg_stat_replication;

-- On the replica — check its own lag
SELECT
  NOW() - pg_last_xact_replay_timestamp() AS replication_lag,
  pg_is_in_recovery() AS is_replica;

// Alert if replica lag exceeds a threshold (node-cron, runs every minute)
import cron from 'node-cron'

cron.schedule('*/1 * * * *', async () => {
  const lag = await getReplicaLagSeconds()
  if (lag > 10) {
    await alerting.critical(`Replica lag: ${lag}s — reads are significantly stale`)
  } else if (lag > 2) {
    await alerting.warn(`Replica lag: ${lag}s`)
  }
})
Read Replica Lag Checklist
- ✅ Always read-after-write from primary for data just written
- ✅ Use sticky session (Redis TTL) to route recent writers to primary
- ✅ Monitor pg_last_xact_replay_timestamp() on replicas
- ✅ Alert if lag exceeds 2 seconds
- ✅ Use synchronous replication for financial transactions
- ✅ Never route writes to replicas (they're read-only)
- ✅ Design UIs to handle eventual consistency (optimistic updates)
Conclusion
Read replica lag is a fundamental property of async replication — you can manage it but not eliminate it. The most practical fix is routing reads back to the primary for a short window after any write (sticky sessions with a 5-second Redis TTL). For dashboards, listing pages, and search where slight staleness is acceptable, replicas are great. For read-after-write scenarios — profile pages, balance displays, anything the user just modified — use the primary. Monitor lag constantly and have an automatic fallback to primary when lag exceeds your threshold.