Docker Best Practices — Production Checklist
Running Docker in production requires careful attention to security, performance, and reliability. This checklist ensures your containers are production-ready.
Introduction
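Production Docker deployments face three recurring challenges: security vulnerabilities, resource constraints, and operational monitoring. This guide covers practices for image security, image optimization, resource management, logging, networking, data management, deployment, and monitoring.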
- Docker Best Practices — Production Checklist
- Image Security
- Use Minimal Base Images
- Scan Images for Vulnerabilities
- Don't Run as Root
- Keep Secrets Out of Images
- Image Optimization
- Multi-Stage Builds
- Use .dockerignore
- Layer Caching
- Resource Management
- Set Resource Limits
- Health Checks
- Logging
- Structured Logging
- Access Logs
- Networking
- Expose Only Necessary Ports
- Network Security
- Data Management
- Use Named Volumes
- Backup Strategy
- Production Deployment
- Image Tagging
- Restart Policies
- Environment Separation
- Monitoring
- Container Metrics
- Log Aggregation
- Production Checklist
- FAQ
Image Security
Use Minimal Base Images
# Bad: large attack surface
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y nodejs
# Good: minimal base
FROM node:18-alpine
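Alpine-based images are typically far smaller than full-distribution images (often tens of megabytes instead of several hundred) and ship fewer packages, which reduces the attack surface.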
Scan Images for Vulnerabilities
# Using Trivy
trivy image myapp:1.0
# Using Docker Scout
docker scout cves myapp:1.0
# Using Snyk
snyk container test myapp:1.0
Don't Run as Root
# Create a non-root user and group, then run the app as that user
FROM node:18-alpine
RUN addgroup -g 1001 -S nodejs && adduser -S -u 1001 -G nodejs nodejs
WORKDIR /app
COPY --chown=nodejs:nodejs . .
USER nodejs
CMD ["node", "server.js"]
Keep Secrets Out of Images
# Bad: secrets in image
ENV DATABASE_PASSWORD=secret123
# Good: pass at runtime
FROM node:18-alpine
# No hardcoded secrets
# Pass secrets at runtime
docker run -e DATABASE_PASSWORD="$(cat /run/secrets/db_password)" myapp
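On the application side, a secret can also be read from a mounted file instead of the environment. The following is a minimal sketch, assuming secrets are mounted under /run/secrets as in the command above; `readSecret` and its options are hypothetical names for illustration, not part of any library.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: read a secret from a mounted file, falling back to an
// environment variable when the file does not exist.
function readSecret(name, { dir = '/run/secrets', envVar } = {}) {
  try {
    // Secret files often end with a trailing newline, so trim it.
    return fs.readFileSync(path.join(dir, name), 'utf8').trim();
  } catch (err) {
    if (err.code !== 'ENOENT') throw err;
  }
  if (envVar && process.env[envVar] !== undefined) {
    return process.env[envVar];
  }
  throw new Error(`Secret "${name}" was not found in ${dir} or the environment`);
}
```

A service might call `readSecret('db_password', { envVar: 'DATABASE_PASSWORD' })` at startup, so the same code works with either a mounted secret file or a runtime environment variable.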
Image Optimization
Multi-Stage Builds
# Build stage
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Runtime stage
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package*.json ./
EXPOSE 3000
CMD ["node", "dist/server.js"]
Use .dockerignore
node_modules
npm-debug.log
.git
.gitignore
README.md
.env
.DS_Store
coverage
.next
dist
build
.vscode
Layer Caching
Order Dockerfile instructions from least to most frequently changed:
FROM node:18-alpine
# Stable layers first
WORKDIR /app
COPY package*.json ./
# Install all dependencies; the build step below typically needs devDependencies
RUN npm ci
# Variable layers last
COPY . .
RUN npm run build
EXPOSE 3000
CMD ["node", "server.js"]
Resource Management
Set Resource Limits
# docker-compose.yml
services:
  web:
    image: myapp:1.0
    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 512M
        reservations:
          cpus: '0.5'
          memory: 256M
# docker run
docker run -m 512m --cpus 1 myapp:1.0
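To confirm from inside a container that a memory limit is actually in effect, the process can read the limit the kernel enforces. This is a minimal sketch, assuming a cgroup v2 host where the limit is exposed at /sys/fs/cgroup/memory.max (cgroup v1 hosts use a different path); `readMemoryLimit` is a hypothetical helper name.

```javascript
const fs = require('fs');

// Assumption: cgroup v2, where the container's memory limit lives in this file.
const CGROUP_V2_MEMORY_MAX = '/sys/fs/cgroup/memory.max';

// Returns the memory limit in bytes, or null when no limit is set
// or the file is not available.
function readMemoryLimit(file = CGROUP_V2_MEMORY_MAX) {
  let raw;
  try {
    raw = fs.readFileSync(file, 'utf8').trim();
  } catch (err) {
    if (err.code === 'ENOENT') return null; // not cgroup v2, or not Linux
    throw err;
  }
  // The literal string "max" means the cgroup has no memory limit.
  if (raw === '' || raw === 'max') return null;
  const bytes = Number(raw);
  return Number.isFinite(bytes) ? bytes : null;
}
```

Under these assumptions, a container started with `-m 512m` would be expected to report 536870912 bytes (512 × 1024 × 1024).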
Health Checks
FROM node:18-alpine
WORKDIR /app
COPY . .
RUN npm ci --only=production
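# Example values that match the Compose health check in Environment Separation; tune them for your service
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
  CMD node healthcheck.js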
CMD ["node", "server.js"]
// healthcheck.js
const http = require('http');
const req = http.get('http://localhost:3000/health', (res) => {
  if (res.statusCode === 200) {
    process.exit(0);
  } else {
    process.exit(1);
  }
});
req.on('error', () => process.exit(1));
setTimeout(() => process.exit(1), 5000);
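The healthcheck.js probe above expects the application to answer on /health. The following is a minimal sketch of such an endpoint using only Node's built-in http module; `createHealthServer` and the `isReady` callback are hypothetical names, and a real service would more likely add the route to its existing server or framework.

```javascript
const http = require('http');

// Hypothetical factory: isReady is a caller-supplied function that reports
// whether the app's dependencies (database, caches) are usable.
function createHealthServer(isReady = () => true) {
  return http.createServer((req, res) => {
    if (req.url === '/health') {
      const ok = isReady();
      // 200 tells the probe the container is healthy; 503 marks it unhealthy.
      res.writeHead(ok ? 200 : 503, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ status: ok ? 'ok' : 'unavailable' }));
      return;
    }
    res.writeHead(404);
    res.end();
  });
}
```

Calling `createHealthServer().listen(3000)` would serve the endpoint on the port the probe uses. Returning 503 while dependencies are down lets Docker mark the container unhealthy without the process having to exit.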
Logging
Structured Logging
FROM node:18-alpine
WORKDIR /app
COPY . .
RUN npm ci --only=production
# Run in production mode; the app itself must write its logs to stdout/stderr
ENV NODE_ENV=production
CMD ["node", "server.js"]
// Avoid logging to files in containers
// Instead, log to stdout/stderr
console.log(JSON.stringify({
  timestamp: new Date().toISOString(),
  level: 'info',
  message: 'Server started',
  port: 3000
}));
// Docker/Kubernetes will handle log collection
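The single console.log call above can be generalized into a small helper so every log line has the same shape. This is a minimal sketch, not a replacement for a logging library; `formatLog` and `log` are hypothetical names, and sending warnings and errors to stderr is one possible convention.

```javascript
// Hypothetical helper: build one JSON log line with a fixed set of core fields.
function formatLog(level, message, fields = {}) {
  return JSON.stringify({
    ...fields,
    // Core fields come last so extra fields cannot overwrite them.
    timestamp: new Date().toISOString(),
    level,
    message,
  });
}

// Write the line to stderr for warnings and errors, stdout otherwise.
function log(level, message, fields) {
  const stream = level === 'error' || level === 'warn' ? process.stderr : process.stdout;
  stream.write(formatLog(level, message, fields) + '\n');
}
```

For example, `log('info', 'Server started', { port: 3000 })` writes one JSON object per line, which log collectors can parse without custom patterns.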
Access Logs
# View container logs
docker logs container_id
# Follow logs
docker logs -f container_id
# Get specific number of lines
docker logs --tail 100 container_id
# Include timestamps
docker logs -t container_id
Networking
Expose Only Necessary Ports
# Only expose application ports
EXPOSE 3000
# Don't expose debug ports in production
# EXPOSE 5858
Network Security
# docker-compose.yml
services:
  web:
    image: myapp:1.0
    ports:
      - "3000:3000" # Only expose needed port
    networks:
      - internal
  db:
    image: postgres:15
    networks:
      - internal
    # Don't expose port; only web connects to it
networks:
  internal:
    driver: bridge
Data Management
Use Named Volumes
services:
  db:
    image: postgres:15
    volumes:
      # Named volume so data outlives the container and can be backed up
      - db_data:/var/lib/postgresql/data
volumes:
  db_data:
    driver: local
Backup Strategy
# Back up the named volume (stop the database container first so the files are consistent)
docker run --rm -v db_data:/data -v "$(pwd)":/backup \
  ubuntu tar czf /backup/db_backup.tar.gz -C /data .
# Restore from backup
docker run --rm -v db_data:/data -v "$(pwd)":/backup \
  ubuntu tar xzf /backup/db_backup.tar.gz -C /data
Production Deployment
Image Tagging
# Semantic versioning
docker build -t myapp:1.0.0 .
docker tag myapp:1.0.0 myregistry/myapp:1.0.0
docker tag myapp:1.0.0 myregistry/myapp:latest
# Push all tags
docker push myregistry/myapp --all-tags
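A deployment script or CI step can reject image references that are not pinned to a specific version. The following is a minimal sketch, assuming tags follow a plain MAJOR.MINOR.PATCH pattern like the 1.0.0 tag above; `isPinnedTag` is a hypothetical helper, and it deliberately does not handle digest references or pre-release suffixes.

```javascript
// Assumption: a pinned tag is exactly MAJOR.MINOR.PATCH, for example 1.0.0.
const SEMVER_TAG = /^\d+\.\d+\.\d+$/;

// Hypothetical check: true only when the image reference carries a semver tag.
function isPinnedTag(imageRef) {
  // Only the last path segment can carry a tag; a registry host may itself
  // contain ':' for its port (for example localhost:5000/myapp).
  const lastSegment = imageRef.slice(imageRef.lastIndexOf('/') + 1);
  const colon = lastSegment.indexOf(':');
  if (colon === -1) return false; // no tag: Docker falls back to "latest"
  return SEMVER_TAG.test(lastSegment.slice(colon + 1));
}
```

With this sketch, `myregistry/myapp:1.0.0` passes, while `myregistry/myapp:latest` and an untagged `myapp` are rejected.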
Restart Policies
services:
  web:
    image: myapp:1.0
    restart: unless-stopped # Restart unless explicitly stopped
Options:
- no: Don't restart automatically
- always: Always restart if stopped
- unless-stopped: Always restart unless explicitly stopped
- on-failure: Restart only on non-zero exit code
Environment Separation
# docker-compose.prod.yml
services:
  web:
    image: myregistry/myapp:1.0.0
    restart: unless-stopped
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G
    healthcheck:
      # Assumes curl is installed in the image; Alpine-based images do not include it by default
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
Monitoring
Container Metrics
# View container stats
docker stats
# Save stats to file
docker stats --no-stream > metrics.txt
# Monitor specific container
docker stats container_name
Log Aggregation
# docker-compose.yml with logging driver
services:
  app:
    image: myapp:1.0
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
Production Checklist
- Use non-root user
- Scan images for vulnerabilities
- Set resource limits and reservations
- Configure health checks
- Log to stdout/stderr
- Use named volumes for data
- Pin base image versions
- Set appropriate restart policies
- Configure log rotation
- Use environment variables for secrets
- Document all exposed ports
- Implement proper monitoring
- Plan backup strategy
FAQ
Q: Should I use latest tag in production? A: No. Always use specific semantic versions like 1.0.0 to ensure consistency and enable rollbacks.
Q: How do I handle database migrations? A: Run migrations in init containers before the main application starts, or use a separate job/init script.
Q: What's the recommended approach for secrets? A: Use Docker secrets in Swarm mode, or environment variables with a secrets management system in production.