Docker Best Practices — Production Checklist

Sanjeev Sharma
5 min read

Running Docker in production requires careful attention to security, performance, and reliability. This checklist ensures your containers are production-ready.

Introduction

Production Docker deployments face unique challenges: security vulnerabilities, resource constraints, and operational monitoring. This guide covers essential best practices.

Image Security

Use Minimal Base Images

# Bad: large attack surface
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y nodejs

# Good: minimal base
FROM node:18-alpine

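An Alpine-based image can shrink the image from 500MB+ to a fraction of that size, and fewer installed packages mean a smaller attack surface. Exact sizes depend on the tag and on what the application adds.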

Scan Images for Vulnerabilities

# Using Trivy
trivy image myapp:1.0

# Using Docker Scout
docker scout cves myapp:1.0

# Using Snyk
snyk container test myapp:1.0
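
A scan helps most when its result can stop a build. The sketch below assumes a report written with `trivy image --format json` and assumes the report shape `Results[].Vulnerabilities[].Severity`, as I understand Trivy's JSON output. `countBySeverity` and `shouldFailBuild` are hypothetical helper names, not part of any scanner.

```javascript
// Sketch: summarize a Trivy-style JSON report and decide whether to fail a build.
// Assumed shape: { Results: [ { Vulnerabilities: [ { Severity: 'HIGH' } ] } ] }

function countBySeverity(report) {
  const counts = {};
  for (const result of report.Results || []) {
    // Targets without findings may omit the Vulnerabilities field entirely.
    for (const vuln of result.Vulnerabilities || []) {
      counts[vuln.Severity] = (counts[vuln.Severity] || 0) + 1;
    }
  }
  return counts;
}

function shouldFailBuild(counts, blocking = ['HIGH', 'CRITICAL']) {
  // Fail when at least one finding has a blocking severity.
  return blocking.some((severity) => (counts[severity] || 0) > 0);
}

// Hand-written sample report for illustration (not real scan output).
const sampleReport = {
  Results: [
    { Target: 'myapp:1.0', Vulnerabilities: [{ Severity: 'HIGH' }, { Severity: 'LOW' }] },
    { Target: 'package-lock.json' },
  ],
};
const counts = countBySeverity(sampleReport);
console.log(counts, shouldFailBuild(counts));
```

Which severities block a release is a policy choice; the default list here is only an example.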

Don't Run as Root

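# Create a non-root user and run the application as that user
FROM node:18-alpine
RUN addgroup -g 1001 -S nodejs && adduser -S nodejs -u 1001 -G nodejs
WORKDIR /app

# Copy files with the right ownership, then drop privileges
COPY --chown=nodejs:nodejs . .
USER nodejs
CMD ["node", "server.js"]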

Keep Secrets Out of Images

# Bad: the secret is baked into the image and visible in docker history
ENV DATABASE_PASSWORD=secret123

# Good: the Dockerfile contains no secrets
FROM node:18-alpine

# Pass the secret at runtime (shell command, not a Dockerfile instruction)
docker run -e DATABASE_PASSWORD="$(cat /run/secrets/db_password)" myapp
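
On the application side, a secret can arrive either as an environment variable or as a mounted file. The following is a minimal sketch of a lookup that supports both. The `<NAME>_FILE` convention is an assumption borrowed from how several official images handle secrets, and `readSecret` is a hypothetical helper name.

```javascript
const fs = require('fs');

// Sketch: resolve a secret from a mounted file first, then from the environment.
// Example: DATABASE_PASSWORD_FILE=/run/secrets/db_password or DATABASE_PASSWORD=...
function readSecret(name, env = process.env) {
  const filePath = env[`${name}_FILE`];
  if (filePath) {
    // Secret files often end with a newline; trim it.
    return fs.readFileSync(filePath, 'utf8').trim();
  }
  if (env[name]) {
    return env[name];
  }
  // Fail fast instead of starting with a missing credential.
  throw new Error(`Secret ${name} is not configured`);
}
```

Calling `readSecret('DATABASE_PASSWORD')` at startup keeps the value out of the image and makes a missing secret an immediate, visible error.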

Image Optimization

Multi-Stage Builds

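# Build stage (same Alpine base as the runtime stage so native modules stay compatible)
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies before node_modules is copied into the runtime stage
RUN npm prune --omit=dev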

# Runtime stage
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package*.json ./

EXPOSE 3000
CMD ["node", "dist/server.js"]

Use .dockerignore

node_modules
npm-debug.log
.git
.gitignore
README.md
.env
.DS_Store
coverage
.next
dist
build
.vscode

Layer Caching

Order Dockerfile instructions from least to most frequently changed:

FROM node:18-alpine

# Stable layers first
WORKDIR /app
COPY package*.json ./
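# Install all dependencies; the build step below usually needs devDependencies
# (use a multi-stage build to keep them out of the final image)
RUN npm ci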

# Variable layers last
COPY . .
RUN npm run build

EXPOSE 3000
CMD ["node", "server.js"]

Resource Management

Set Resource Limits

# docker-compose.yml
services:
  web:
    image: myapp:1.0
    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 512M
        reservations:
          cpus: '0.5'
          memory: 256M

# Equivalent limits with docker run
docker run -m 512m --cpus 1 myapp:1.0

Health Checks

FROM node:18-alpine

WORKDIR /app
COPY . .
RUN npm ci --omit=dev

HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD node healthcheck.js

CMD ["node", "server.js"]

// healthcheck.js
const http = require('http');

const req = http.get('http://localhost:3000/health', (res) => {
  if (res.statusCode === 200) {
    process.exit(0);
  } else {
    process.exit(1);
  }
});

req.on('error', () => process.exit(1));
setTimeout(() => process.exit(1), 5000);
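
The script above only works if the application answers on `/health`. Here is a minimal sketch of that server side, assuming a plain Node.js `http` server. `createHealthHandler` and the `isReady` hook are hypothetical names introduced for illustration.

```javascript
// Sketch: request handler that answers GET /health for the health check script.
// `isReady` is a hypothetical hook; a real check might ping the database.
function createHealthHandler(isReady = () => true) {
  return (req, res) => {
    if (req.method === 'GET' && req.url === '/health') {
      const ok = isReady();
      // 200 tells the health check the app is healthy; 503 marks it unhealthy.
      res.writeHead(ok ? 200 : 503, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ status: ok ? 'ok' : 'unavailable' }));
      return;
    }
    res.writeHead(404);
    res.end();
  };
}
```

It could be wired up with `require('http').createServer(createHealthHandler()).listen(3000)`. Keeping the endpoint cheap matters because Docker calls it on every interval.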

Logging

Structured Logging

FROM node:18-alpine
WORKDIR /app
COPY . .
RUN npm ci --omit=dev

# Run in production mode; the application itself must write its logs to stdout/stderr
ENV NODE_ENV=production

CMD ["node", "server.js"]

// Avoid logging to files in containers
// Instead, log to stdout/stderr

console.log(JSON.stringify({
  timestamp: new Date().toISOString(),
  level: 'info',
  message: 'Server started',
  port: 3000
}));

// Docker/Kubernetes will handle log collection
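
Repeating `JSON.stringify` at every call site gets tedious, so a small wrapper helps. This is a minimal sketch rather than a replacement for a logging library; `formatLog` and `log` are hypothetical names, and sending warnings and errors to stderr is an assumption about what you want.

```javascript
// Sketch: a tiny structured logger that writes one JSON object per line.
function formatLog(level, message, fields = {}, now = new Date()) {
  return JSON.stringify({
    timestamp: now.toISOString(),
    level,
    message,
    ...fields, // extra context such as port or requestId
  });
}

function log(level, message, fields) {
  const line = formatLog(level, message, fields);
  // Warnings and errors go to stderr, everything else to stdout.
  const stream = level === 'error' || level === 'warn' ? process.stderr : process.stdout;
  stream.write(line + '\n');
}

log('info', 'Server started', { port: 3000 });
```

One object per line keeps the output easy to parse for whatever collects the container logs.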

Access Logs

# View container logs
docker logs container_id

# Follow logs
docker logs -f container_id

# Get specific number of lines
docker logs --tail 100 container_id

# Include timestamps
docker logs -t container_id

Networking

Expose Only Necessary Ports

# EXPOSE only documents the port; publishing happens with -p or ports:
EXPOSE 3000

# Don't expose debug ports in production (9229 is the Node.js inspector default)
# EXPOSE 9229

Network Security

# docker-compose.yml
services:
  web:
    image: myapp:1.0
    ports:
      - "3000:3000"  # Only expose needed port
    networks:
      - internal

  db:
    image: postgres:15
    networks:
      - internal
    # Don't expose port; only web connects to it

networks:
  internal:
    driver: bridge

Data Management

Use Named Volumes

services:
  db:
    image: postgres:15
    volumes:
      # Named volume so data survives container re-creation and is easy to back up
      - db_data:/var/lib/postgresql/data

volumes:
  db_data:
    driver: local

Backup Strategy

# Back up a named volume (stop the database first so the files are consistent)
docker run --rm -v db_data:/data -v "$(pwd)":/backup \
  ubuntu tar czf /backup/db_backup.tar.gz -C /data .

# Restore from the backup
docker run --rm -v db_data:/data -v "$(pwd)":/backup \
  ubuntu tar xzf /backup/db_backup.tar.gz -C /data

Production Deployment

Image Tagging

# Semantic versioning
docker build -t myapp:1.0.0 .
docker tag myapp:1.0.0 myregistry/myapp:1.0.0
docker tag myapp:1.0.0 myregistry/myapp:latest

# Push all tags of the repository
docker push --all-tags myregistry/myapp

Restart Policies

services:
  web:
    image: myapp:1.0
    restart: unless-stopped  # Restart unless explicitly stopped

Options:

  • no: Don't restart automatically
  • always: Always restart if stopped
  • unless-stopped: Always restart unless explicitly stopped
  • on-failure: Restart only on non-zero exit code
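
The difference between `always` and `unless-stopped` is easy to miss: it only shows up when the Docker daemon restarts after a container was stopped by hand. The sketch below is a simplified model of the four policies as I read the Docker documentation, not Docker's actual implementation. `shouldRestart` and the event shapes are hypothetical.

```javascript
// Simplified model of the restart policies listed above.
// Events: { type: 'exit', exitCode } when the container stops on its own,
//         { type: 'daemonRestart', manuallyStopped } when the daemon comes back up.
const RESTART_RULES = {
  'no': { exit: () => false, daemonRestart: () => false },
  'on-failure': { exit: (e) => e.exitCode !== 0, daemonRestart: () => false },
  'always': { exit: () => true, daemonRestart: () => true },
  'unless-stopped': { exit: () => true, daemonRestart: (e) => !e.manuallyStopped },
};

function shouldRestart(policy, event) {
  const rules = RESTART_RULES[policy];
  if (!rules) throw new Error(`Unknown restart policy: ${policy}`);
  const rule = rules[event.type];
  if (!rule) throw new Error(`Unknown event type: ${event.type}`);
  return rule(event);
}

// A manually stopped container comes back after a daemon restart only with "always".
console.log(shouldRestart('always', { type: 'daemonRestart', manuallyStopped: true }));
console.log(shouldRestart('unless-stopped', { type: 'daemonRestart', manuallyStopped: true }));
```

The model ignores retry limits and back-off delays, which the real daemon also applies.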

Environment Separation

# docker-compose.prod.yml
services:
  web:
    image: myregistry/myapp:1.0.0
    restart: unless-stopped
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G
    healthcheck:
      # curl is not installed in Alpine-based images by default; reuse healthcheck.js
      test: ["CMD", "node", "healthcheck.js"]
      interval: 30s
      timeout: 10s
      retries: 3

Monitoring

Container Metrics

# View container stats
docker stats

# Save stats to file
docker stats --no-stream > metrics.txt

# Monitor specific container
docker stats container_name

Log Aggregation

# docker-compose.yml with logging driver
services:
  app:
    image: myapp:1.0
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

Production Checklist

  • Use non-root user
  • Scan images for vulnerabilities
  • Set resource limits and reservations
  • Configure health checks
  • Log to stdout/stderr
  • Use named volumes for data
  • Pin base image versions
  • Set appropriate restart policies
  • Configure log rotation
  • Pass secrets at runtime (for example, as environment variables), never bake them into images
  • Document all exposed ports
  • Implement proper monitoring
  • Plan backup strategy
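
Some of these items can be checked automatically. The following is a rough sketch that audits one Compose service definition, assuming the YAML has already been parsed into a plain object. `auditService` is a hypothetical helper, and it only looks at settings visible in the Compose file, so a health check defined in the Dockerfile would be reported as missing.

```javascript
// Sketch: check a parsed Compose service against a few checklist items.
// Returns a list of findings; an empty list means nothing was flagged.
function auditService(name, service) {
  const findings = [];

  // Look for the tag after the last "/" so a registry port is not mistaken for one.
  const lastSegment = (service.image || '').split('/').pop();
  const tag = lastSegment.includes(':') ? lastSegment.split(':').pop() : '';
  if (!tag || tag === 'latest') {
    findings.push(`${name}: pin the image to a specific version`);
  }
  if (!service.restart) {
    findings.push(`${name}: set a restart policy`);
  }
  if (!service.deploy?.resources?.limits?.memory) {
    findings.push(`${name}: set a memory limit`);
  }
  if (!service.healthcheck) {
    findings.push(`${name}: configure a health check`);
  }
  if (!service.logging?.options?.['max-size']) {
    findings.push(`${name}: configure log rotation`);
  }
  return findings;
}

// Example service that still uses the latest tag and lacks limits.
const web = { image: 'myregistry/myapp:latest', restart: 'unless-stopped' };
console.log(auditService('web', web));
```

A check like this could run in CI before deployment; items such as non-root users and backups still need a manual review.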

FAQ

Q: Should I use the latest tag in production? A: No. Always use a specific version tag such as 1.0.0 so that deployments are reproducible and rollbacks are possible.

Q: How do I handle database migrations? A: Run migrations in init containers before the main application starts, or use a separate job/init script.

Q: What's the recommended approach for secrets? A: Use Docker secrets in Swarm mode, or environment variables with a secrets management system in production.

Written by Sanjeev Sharma, Full Stack Engineer · E-mopro