ArgoCD in Production — GitOps Deployment Pipelines That Actually Work

Author: Sanjeev Sharma (@webcoderspeed1)
Introduction
ArgoCD brings GitOps discipline to Kubernetes: a Git repository becomes your source of truth, and ArgoCD reconciles cluster state to match that repository. But GitOps at scale requires careful architecture. How do you manage multiple environments? How do you order deployments safely? What happens when ArgoCD itself fails? This post covers production patterns: App of Apps, ApplicationSet, sync waves, health checks, RBAC, Argo Rollouts for progressive delivery, and disaster recovery for ArgoCD itself.
- App of Apps Pattern
- ApplicationSet for Multi-Environment
- Sync Waves and Hooks for Ordered Deployments
- Health Checks and Custom Health
- RBAC with SSO Integration
- Progressive Delivery with Argo Rollouts
- Image Updater for Automated Image Promotion
- Disaster Recovery for ArgoCD
- Checklist
- Conclusion
App of Apps Pattern
The App of Apps pattern uses a root ArgoCD Application that manages child Applications. This scales to hundreds of applications and environments.
Repository structure:
flux-repo/
├── apps/
│   ├── api/
│   │   ├── kustomization.yaml
│   │   ├── deployment.yaml
│   │   └── service.yaml
│   ├── worker/
│   │   ├── kustomization.yaml
│   │   └── deployment.yaml
│   └── postgres/
│       ├── helm/
│       └── values.yaml
├── argocd/
│   ├── root-app.yaml
│   ├── api-app.yaml
│   ├── worker-app.yaml
│   └── postgres-app.yaml
└── environments/
    ├── dev/
    │   └── kustomization.yaml
    ├── staging/
    │   └── kustomization.yaml
    └── prod/
        └── kustomization.yaml
Root Application (root-app.yaml):
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/org/flux-repo
    targetRevision: main
    path: argocd/
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
Child Application (api-app.yaml):
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: api
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/org/flux-repo
    targetRevision: main
    # ArgoCD detects the kustomization.yaml in this path automatically;
    # no plugin configuration is needed for Kustomize
    path: apps/api
  destination:
    server: https://kubernetes.default.svc
    namespace: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
  revisionHistoryLimit: 10
The root app contains references to all child apps. When you merge to main, ArgoCD detects the change and syncs recursively.
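The examples above use `project: default`. In production it is worth scoping child apps to a dedicated AppProject that whitelists allowed repos and destinations, so a compromised or misconfigured Application cannot deploy arbitrary resources anywhere. A hedged sketch (project and team names are illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: platform
  namespace: argocd
spec:
  description: Apps managed by the platform team
  # Only this Git repo may be used as a source
  sourceRepos:
    - https://github.com/org/flux-repo
  # Only the in-cluster destination, any namespace
  destinations:
    - server: https://kubernetes.default.svc
      namespace: "*"
  # Cluster-scoped resources the apps may create
  clusterResourceWhitelist:
    - group: ""
      kind: Namespace
```

Child Applications then set `project: platform` instead of `project: default`.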
ApplicationSet for Multi-Environment
ApplicationSet generates Applications dynamically. Manage dev, staging, and prod with a single template:
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: api-environments
  namespace: argocd
spec:
  generators:
    - list:
        elements:
          - name: api
            env: dev
            cluster: dev-cluster
            namespace: default
          - name: api
            env: staging
            cluster: staging-cluster
            namespace: staging
          - name: api
            env: prod
            cluster: prod-cluster
            namespace: production
  template:
    metadata:
      name: api-{{env}}
      labels:
        app: api
        env: "{{env}}"
    spec:
      project: default
      source:
        repoURL: https://github.com/org/flux-repo
        targetRevision: main
        path: environments/{{env}}
      destination:
        server: https://{{cluster}}.example.com:443
        namespace: "{{namespace}}"
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
          - CreateNamespace=true
This creates three Applications (api-dev, api-staging, api-prod) from a single template. Updating the template updates all environments.
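Since the environments already live in separate directories under environments/, a Git directory generator can discover them instead of hand-maintaining the list. A sketch under that assumption (destination left static for brevity):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: api-environments-git
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/org/flux-repo
        revision: main
        # One Application per matching directory
        directories:
          - path: environments/*
  template:
    metadata:
      # path.basename is dev, staging, or prod
      name: api-{{path.basename}}
    spec:
      project: default
      source:
        repoURL: https://github.com/org/flux-repo
        targetRevision: main
        path: "{{path}}"
      destination:
        server: https://kubernetes.default.svc
        namespace: default
```

Adding a new environment is then just a new directory plus a merge; no ApplicationSet change is required.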
Sync Waves and Hooks for Ordered Deployments
Sync waves ensure dependencies deploy in order. A wave is a deployment phase; ArgoCD waits for one wave to succeed before starting the next.
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
  annotations:
    argocd.argoproj.io/hook: PreSync
    argocd.argoproj.io/sync-wave: "0"
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
spec:
  backoffLimit: 3
  template:
    spec:
      serviceAccountName: db-migrator
      restartPolicy: Never
      containers:
        - name: migrate
          image: myapp:v1.2.3
          command: ["/app/migrate.sh"]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  annotations:
    argocd.argoproj.io/sync-wave: "1"
spec:
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: myapp:v1.2.3
          ports:
            - containerPort: 8080
---
apiVersion: batch/v1
kind: Job
metadata:
  name: smoke-test
  annotations:
    argocd.argoproj.io/hook: PostSync
    argocd.argoproj.io/sync-wave: "2"
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
spec:
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: test
          image: myapp:v1.2.3
          command: ["/app/smoke-test.sh"]
Execution order:
- Wave 0 (PreSync): Database migration runs
- Wave 1 (Sync): Deployment created
- Wave 2 (PostSync): Smoke tests run
Only when each wave succeeds does the next begin.
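A toy way to see the ordering (an illustration of the sort, not ArgoCD internals): resources are applied in ascending sync-wave order, so sorting the wave numbers numerically reproduces the schedule above.

```shell
# Each line: <sync-wave> <resource>. Sorting numerically by the first
# field gives the order ArgoCD applies them in.
printf '%s\n' \
  '2 Job/smoke-test' \
  '0 Job/db-migrate' \
  '1 Deployment/api' \
  | sort -n -k1,1
# db-migrate comes out first, smoke-test last
```

Negative waves are also allowed, which is handy for resources like Namespaces or CRDs that everything else depends on.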
Health Checks and Custom Health
ArgoCD tracks resource health with statuses such as Healthy, Progressing, Degraded, Suspended, and Missing. Define custom health rules:
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  resource.customizations.health.cert-manager.io_Certificate: |
    hs = {}
    if obj.status ~= nil then
      if obj.status.conditions ~= nil then
        for i, condition in ipairs(obj.status.conditions) do
          if condition.type == "Ready" and condition.status == "False" then
            hs.status = "Degraded"
            hs.message = condition.message
            return hs
          end
          if condition.type == "Ready" and condition.status == "True" then
            hs.status = "Healthy"
            hs.message = condition.message
            return hs
          end
        end
      end
    end
    hs.status = "Progressing"
    hs.message = "Waiting for certificate to be ready"
    return hs
This teaches ArgoCD how to assess cert-manager Certificate health.
RBAC with SSO Integration
Restrict access per team and environment:
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
data:
  policy.csv: |
    p, role:admin, applications, get, */*, allow
    p, role:admin, applications, sync, */*, allow
    p, role:admin, repositories, get, *, allow
    p, role:admin, repositories, create, *, allow
    p, role:dev-team, applications, get, dev/*, allow
    p, role:dev-team, applications, sync, dev/*, allow
    p, role:dev-team, repositories, get, *, allow
    p, role:staging-team, applications, get, staging/*, allow
    p, role:staging-team, applications, sync, staging/*, allow
    p, role:staging-team, repositories, get, *, allow
    p, role:prod-oncall, applications, sync, prod/*, allow
    g, org:platform, role:admin
    g, org:data-eng, role:dev-team
    g, org:backend, role:staging-team
    g, oncall-team, role:prod-oncall
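Two keys in argocd-rbac-cm are easy to miss: `policy.default` decides what an unmatched user may do, and without `scopes: "[groups]"` the `g` lines are matched against the OIDC subject rather than the token's group claims. A sketch of the additional settings:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
data:
  # Unmatched users get read-only access; use role:'' to deny everything
  policy.default: role:readonly
  # Match g-lines against the "groups" claim of the OIDC token
  scopes: "[groups]"
```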
Map OIDC groups from your identity provider:
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  oidc.config: |
    name: Okta
    issuer: https://org.okta.com
    # $-references resolve against keys in the argocd-secret Secret
    clientID: $oidc.okta.clientID
    clientSecret: $oidc.okta.clientSecret
    requestedScopes:
      - openid
      - profile
      - email
      - groups
    requestedIDTokenClaims:
      groups:
        essential: true
Progressive Delivery with Argo Rollouts
Argo Rollouts replaces the Deployment with a Rollout object, supporting canary, blue-green, and traffic-weighted strategies:
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: api
spec:
  replicas: 5
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: myapp:v1.2.3
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - api
                topologyKey: kubernetes.io/hostname
  strategy:
    canary:
      steps:
        - setWeight: 10
        - pause:
            duration: 5m
        - setWeight: 25
        - pause:
            duration: 5m
        - setWeight: 50
        - pause:
            duration: 5m
        - setWeight: 75
        - pause:
            duration: 5m
      # Background analysis starts after the first setWeight step and
      # aborts the rollout if the AnalysisTemplate fails
      analysis:
        templates:
          - templateName: success-rate
        startingStep: 1
---
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  metrics:
    - name: success-rate
      interval: 30s
      failureLimit: 3
      successCondition: result[0] >= 0.95
      provider:
        prometheus:
          address: http://prometheus:9090
          query: |
            sum(rate(api_requests_total{status="200"}[5m])) / sum(rate(api_requests_total[5m]))
This canary gradually shifts traffic to the new version. If success rate falls below 95%, it aborts and rolls back.
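One practical consequence of those steps: with four 5-minute pauses, a healthy rollout needs at least 20 minutes to reach full promotion, which bounds how fast a bad release can spread to all users. The back-of-envelope arithmetic:

```shell
# Minimum wall-clock time before the canary is fully promoted,
# assuming every analysis run passes (4 pauses of 5 minutes each)
pauses=4
minutes_per_pause=5
echo "$(( pauses * minutes_per_pause )) minutes minimum"
```

Longer pauses buy the analysis more samples at each weight; tune them against how quickly your success-rate metric reacts.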
Image Updater for Automated Image Promotion
ArgoCD can watch container registries and auto-update image tags:
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-image-updater-config
  namespace: argocd
data:
  registries.conf: |
    registries:
      - name: Docker Hub
        api_url: https://registry-1.docker.io
        prefix: docker.io
        ping: yes
        credentials: secret:argocd/docker-hub#creds
      - name: ECR
        api_url: https://ecr.us-east-1.amazonaws.com
        ping: no
        credentials: secret:argocd/ecr#creds
Annotate the ArgoCD Application (Image Updater reads its annotations from Application resources, not from the Kustomization):

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: api
  namespace: argocd
  annotations:
    argocd-image-updater.argoproj.io/image-list: myapp=docker.io/org/myapp
    argocd-image-updater.argoproj.io/myapp.update-strategy: latest
    argocd-image-updater.argoproj.io/write-back-method: git:secret:argocd/image-updater-token
    argocd-image-updater.argoproj.io/git-branch: main

The Kustomization in apps/api pins the current tag, which Image Updater rewrites:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
images:
  - name: myapp
    newName: docker.io/org/myapp
    newTag: v1.2.3
When a new image is pushed, Image Updater bumps the tag and commits to Git. ArgoCD syncs the change.
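The `latest` strategy is convenient for dev but risky for prod, since any pushed tag can be promoted. A semver constraint restricts promotion to tags within a version range; a sketch assuming Image Updater's documented `semver` strategy and `<alias>=<image>:<constraint>` syntax:

```yaml
metadata:
  annotations:
    # Only promote semver tags within the 1.x range
    argocd-image-updater.argoproj.io/image-list: myapp=docker.io/org/myapp:1.x
    argocd-image-updater.argoproj.io/myapp.update-strategy: semver
```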
Disaster Recovery for ArgoCD
ArgoCD is mission-critical infrastructure. Plan for its failure:
Backup ArgoCD state:
# Backup all ArgoCD data (Applications, Projects, settings, secrets)
# (argocd-util export was renamed to argocd admin export in v2.x)
kubectl exec -n argocd deploy/argocd-server -- \
  argocd admin export > argocd-backup.yaml
# The export contains Secrets — encrypt it (e.g. with SOPS or age)
# before storing it in version control
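To make this routine rather than manual, a CronJob can run the export on a schedule. A sketch (the service account, PVC name, and schedule are assumptions; it uses the `argocd admin export` subcommand, named `argocd-util export` in pre-2.x releases):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: argocd-backup
  namespace: argocd
spec:
  schedule: "0 3 * * *"   # daily at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          # Needs read access to ArgoCD resources in the namespace
          serviceAccountName: argocd-server
          restartPolicy: OnFailure
          containers:
            - name: export
              image: quay.io/argoproj/argocd:v2.9.0
              command: ["/bin/sh", "-c"]
              args:
                - argocd admin export -n argocd > /backup/argocd-$(date +%F).yaml
              volumeMounts:
                - name: backup
                  mountPath: /backup
          volumes:
            - name: backup
              persistentVolumeClaim:
                claimName: argocd-backups   # hypothetical PVC
```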
Recovery playbook:
# 1. Reinstall ArgoCD
helm repo add argo https://argoproj.github.io/argo-helm
helm install argocd argo/argo-cd \
  -n argocd \
  --create-namespace \
  -f values.yaml
# 2. Restore state from the backup
kubectl exec -i -n argocd deploy/argocd-server -- \
  argocd admin import - < argocd-backup.yaml
# 3. Re-sync all Applications
argocd app list -o name | xargs -n1 argocd app sync
# 4. Verify health
argocd app list
High availability setup:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: argocd-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app.kubernetes.io/name: argocd-server
  template:
    metadata:
      labels:
        app.kubernetes.io/name: argocd-server
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app.kubernetes.io/name
                    operator: In
                    values:
                      - argocd-server
              topologyKey: kubernetes.io/hostname
      containers:
        - name: argocd-server
          image: quay.io/argoproj/argocd:v2.9.0
          command:
            - argocd-server
            # Only behind a TLS-terminating ingress; never disable auth
            - --insecure

argocd-server is stateless, so it scales as a Deployment behind a Service. Never ship --disable-auth to production; it turns off authentication entirely.
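The application controller is the stateful piece of an HA install: it runs as a StatefulSet and can shard clusters across replicas when ARGOCD_CONTROLLER_REPLICAS matches the replica count. A fragment (not a complete manifest) sketching the relevant fields:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: argocd-application-controller
  namespace: argocd
spec:
  replicas: 2
  template:
    spec:
      containers:
        - name: argocd-application-controller
          env:
            # Must equal .spec.replicas so each shard knows the shard count
            - name: ARGOCD_CONTROLLER_REPLICAS
              value: "2"
```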
Checklist
- App of Apps pattern deployed with root-app managing all child Applications
- ApplicationSet configured for multi-environment deployments (dev, staging, prod)
- Sync waves defined for ordered resource deployment (migrations → apps → tests)
- Health checks configured for custom resource types (CRDs, cert-manager, etc.)
- RBAC policies restrict access by team and environment
- SSO/OIDC integrated for centralized identity management
- Argo Rollouts deployed for canary/blue-green progressive delivery
- Automated image promotions via Image Updater
- ArgoCD HA deployment with multiple replicas
- Backup and disaster recovery procedures documented and tested
- Git repository is single source of truth; no manual kubectl apply
- Notifications configured for sync failures and drift detection
Conclusion
ArgoCD transforms deployment from imperative scripting to declarative GitOps. The App of Apps pattern scales to hundreds of applications. ApplicationSet eliminates environment duplication. Sync waves ensure safe, ordered rollouts. Progressive delivery with Argo Rollouts catches regressions before affecting all users. Finally, treat ArgoCD as critical infrastructure: back it up, run it with redundancy, and practice disaster recovery regularly. With this foundation, your deployments become auditable, reproducible, and automated.