Zero Trust Architecture for Backend Systems — Never Trust, Always Verify
- Author: Sanjeev Sharma (@webcoderspeed1)
Introduction
Zero trust is the security model to build on in 2026. The old perimeter-based approach assumed the internal network was safe. Modern systems are distributed, multi-cloud, and exposed to the internet, so that assumption no longer holds. Zero trust treats every request as untrusted: every service-to-service call must be authenticated and authorised.
This post covers implementing zero trust for microservices: mutual TLS (mTLS), service identities, OPA policies, and short-lived credentials.
- Zero Trust Principles for Microservices
- mTLS Between Services with cert-manager
- SPIFFE/SPIRE for Service Identity
- OPA (Open Policy Agent) for Fine-Grained Authorisation
- Network Policies in Kubernetes
- Short-Lived Credentials with AWS IAM Roles for Service Accounts
- Secret Rotation Without Downtime
- Audit Logging Every Service-to-Service Call
- Zero Trust for AI Agents
- Checklist
- Conclusion
Zero Trust Principles for Microservices
- Never trust, always verify: Every request requires authentication, even on internal networks
- Least privilege: Services get only the permissions they need
- Inspect and log: Every interaction is logged and monitored
- Assume breach: Design for compromise. If one service is hacked, limit lateral movement
Traditional (perimeter-based):
User -> [Firewall] -> Internal Network -> Service A -> Service B
(Assume everything inside firewall is safe)
Zero trust:
User -[verify]-> Service A -[mTLS+policy]-> Service B
(Verify every hop, every request)
mTLS Between Services with cert-manager
Mutual TLS means both client and server verify each other with certificates. cert-manager automates certificate issuance and renewal.
Install cert-manager:
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml
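Create a self-signed CA. A Certificate needs an issuerRef, so the CA is bootstrapped from a selfSigned ClusterIssuer:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned-bootstrap # name chosen for this example
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: internal-ca
  namespace: cert-manager
spec:
  isCA: true
  commonName: internal-ca
  secretName: internal-ca-key-pair
  issuerRef:
    name: selfsigned-bootstrap
    kind: ClusterIssuer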
Create issuer:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: internal-ca
spec:
  ca:
    secretName: internal-ca-key-pair
Issue service certificates:
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: api-service-cert
  namespace: default
spec:
  secretName: api-service-tls
  issuerRef:
    name: internal-ca
    kind: ClusterIssuer
  dnsNames:
    - api.default.svc.cluster.local
Use mTLS in service pods:
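// api-service/src/server.ts
import fs from 'fs';
import https from 'https';
import express from 'express';

const app = express();

// Key, certificate, and CA bundle come from the cert-manager Secret mounted at /etc/tls/private
const key = fs.readFileSync('/etc/tls/private/tls.key');
const cert = fs.readFileSync('/etc/tls/private/tls.crt');
const ca = fs.readFileSync('/etc/tls/private/ca.crt');

const options = {
  key,
  cert,
  ca, // trust only the internal CA when validating client certificates
  requestCert: true, // ask every client for a certificate
  rejectUnauthorized: true, // reject connections without a valid one
};

https.createServer(options, app).listen(3000);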
Pod manifest:
apiVersion: v1
kind: Pod
metadata:
  name: api-service
spec:
  containers:
    - name: api
      image: api:latest
      volumeMounts:
        - name: tls
          mountPath: /etc/tls/private
          readOnly: true
  volumes:
    - name: tls
      secret:
        secretName: api-service-tls
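cert-manager renews certificates automatically and updates the mounted Secret, so no manual renewal is needed. The server above reads the files once at startup, so it must reload them (or be restarted) to pick up a renewed certificate.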
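One way to pick up renewed files without a restart is to re-read the mounted directory periodically and swap the TLS context when the certificate changes. The sketch below is a minimal illustration, not cert-manager's own mechanism. The helper names (`loadTlsMaterial`, `reloadIfChanged`) and the injected `read` function are assumptions made for this example. `setSecureContext` is the method Node's `tls.Server` and `https.Server` expose for replacing the context of a running server.

```typescript
// Sketch: reload TLS material when the mounted certificate changes.
// File names match the keys of the cert-manager Secret; helper names are illustrative.

interface TlsMaterial {
  key: string;
  cert: string;
  ca: string;
}

// Minimal view of the server: Node's tls.Server / https.Server provide setSecureContext().
interface ReloadableServer {
  setSecureContext(options: TlsMaterial): void;
}

// Injected so the logic can be exercised without touching the filesystem.
type ReadFile = (path: string) => string;

function loadTlsMaterial(dir: string, read: ReadFile): TlsMaterial {
  return {
    key: read(`${dir}/tls.key`),
    cert: read(`${dir}/tls.crt`),
    ca: read(`${dir}/ca.crt`),
  };
}

// Re-reads the files and swaps the context only if the certificate differs
// from the one last applied. Returns the certificate now in effect.
function reloadIfChanged(
  server: ReloadableServer,
  dir: string,
  lastCert: string | null,
  read: ReadFile,
): string {
  const material = loadTlsMaterial(dir, read);
  if (material.cert !== lastCert) {
    server.setSecureContext(material);
  }
  return material.cert;
}
```

In the server, `read` could be `(p) => fs.readFileSync(p, 'utf8')`, with `reloadIfChanged` called from a `setInterval` timer. Existing connections are not interrupted; only new handshakes use the new certificate.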
SPIFFE/SPIRE for Service Identity
SPIFFE is a standard for issuing and managing service identities. SPIRE is a production-ready implementation of that standard.
Install SPIRE:
helm install spire spiffe/spire -n spire --create-namespace
Register the workload (SPIRE issues this identity only to pods that match the selectors):
apiVersion: spire.spiffe.io/v1alpha1
kind: ClusterStaticEntry
metadata:
  name: api-service
spec:
  spiffeID: spiffe://example.com/ns/default/sa/api
  parentID: spiffe://example.com/spire/agent
  selectors:
    - k8s:ns:default
    - k8s:sa:api
  federatesWith: []
Client requests SPIFFE credential:
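// `spire` stands in for a SPIFFE Workload API client; requestSVID is illustrative, not a published API
const credential = await spire.requestSVID({
  spiffeID: 'spiffe://example.com/ns/default/sa/api',
});
const response = await fetch('https://worker.default.svc.cluster.local/process', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${credential.jwt}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify(data),
});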
Server verifies SPIFFE JWT:
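// `verifyJWT` and the 'spiffe' module are illustrative; substitute the JWT-SVID validation your SPIFFE library provides
import { verifyJWT } from 'spiffe';
app.use(async (req, res, next) => {
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) return res.status(401).json({});
  try {
    // publicKey must come from the trust domain's bundle, fetched via the Workload API
    const claims = await verifyJWT(token, publicKey);
    req.serviceID = claims.sub; // spiffe://example.com/ns/default/sa/api
    next();
  } catch {
    res.status(401).json({});
  }
});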
SPIFFE decouples identity from IP address. Services are identified by name, not by network location.
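Because the identity is a structured name, a service can parse it before making decisions. The sketch below assumes the `ns/<namespace>/sa/<service-account>` path layout used by the IDs in this post. That layout is a common convention rather than something SPIFFE itself mandates, and `parseWorkloadId` is a hypothetical helper, not part of any SPIFFE library.

```typescript
// Sketch: parse a SPIFFE ID of the form spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>.
// The ns/sa path layout is an assumption taken from the example IDs in this post.

interface WorkloadIdentity {
  trustDomain: string;
  namespace: string;
  serviceAccount: string;
}

const WORKLOAD_ID_PATTERN =
  /^spiffe:\/\/([a-z0-9._-]+)\/ns\/([A-Za-z0-9._-]+)\/sa\/([A-Za-z0-9._-]+)$/;

// Returns the parsed identity, or null when the string does not match the expected shape.
function parseWorkloadId(id: string): WorkloadIdentity | null {
  const match = WORKLOAD_ID_PATTERN.exec(id);
  if (match === null) return null;
  return {
    trustDomain: match[1] ?? "",
    namespace: match[2] ?? "",
    serviceAccount: match[3] ?? "",
  };
}
```

Rejecting anything that does not parse keeps malformed or foreign-scheme identifiers from ever reaching the authorisation step.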
OPA (Open Policy Agent) for Fine-Grained Authorisation
OPA evaluates policies written in Rego, a declarative language.
Install OPA:
kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/opa/main/docs/deployments/kubernetes/kube-mgmt/kube-mgmt-deployment.yaml
Write a policy:
# policies/api-service.rego
package api.authz

import rego.v1

default allow := false

# Only the worker service can call POST /process
allow if {
    input.method == "POST"
    input.path == "/process"
    input.caller == "spiffe://example.com/ns/default/sa/worker"
}

# Only the analytics service can call /metrics
allow if {
    input.path == "/metrics"
    input.caller == "spiffe://example.com/ns/default/sa/analytics"
}

# Admin can call anything
allow if {
    input.caller_role == "admin"
}
Evaluate policy in service:
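// Querying the rule path directly returns { "result": true } or { "result": false }
const opaURL = 'http://localhost:8181/v1/data/api/authz/allow';
app.use(async (req, res, next) => {
  try {
    const decision = await fetch(opaURL, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      // OPA reads request attributes from the top-level "input" key
      body: JSON.stringify({
        input: {
          method: req.method,
          path: req.path,
          caller: req.serviceID,
          caller_role: req.role,
        },
      }),
    }).then((r) => r.json());
    if (decision.result !== true) {
      return res.status(403).json({ error: 'Forbidden' });
    }
    next();
  } catch {
    // Fail closed if OPA is unreachable
    res.status(503).json({ error: 'Policy engine unavailable' });
  }
});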
Policies are version controlled, audited, and updated without redeploying services.
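Two details of OPA's Data API are easy to get wrong when calling it from a service: the request attributes must be wrapped in a top-level `input` key, and a query against the rule path `/v1/data/api/authz/allow` returns the boolean directly under `result`. A minimal sketch of both, with hypothetical helper names (`buildOpaBody`, `isAllowed`):

```typescript
// Sketch: build an OPA Data API request body and interpret the response strictly.
// Helper names are illustrative; the field names mirror the policy input in this post.

interface AuthzInput {
  method: string;
  path: string;
  caller: string;
  caller_role?: string;
}

// OPA expects the document to evaluate under the "input" key.
function buildOpaBody(input: AuthzInput): string {
  return JSON.stringify({ input });
}

// Allow only when OPA returned exactly `true`. A missing result (undefined rule),
// a nested object, or a string all count as deny.
function isAllowed(response: unknown): boolean {
  return (
    typeof response === "object" &&
    response !== null &&
    (response as { result?: unknown }).result === true
  );
}
```

Treating anything other than a literal `true` as a denial keeps the service failing closed if the policy path or response shape ever changes.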
Network Policies in Kubernetes
NetworkPolicy restricts traffic between pods. Default deny, allow only what's needed.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-service
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: api-service
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: gateway
      ports:
        - protocol: TCP
          port: 3000
    - from:
        - podSelector:
            matchLabels:
              app: worker
      ports:
        - protocol: TCP
          port: 3000
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: TCP
          port: 53 # DNS
        - protocol: UDP
          port: 53
Only gateway and worker can call api-service. api-service can only call postgres and DNS. All other traffic is blocked.
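The intent of the ingress rules can be captured in a small model and checked in unit tests, which helps catch a rule that was widened by accident. The sketch below is a deliberately simplified model. It only understands `matchLabels` pod selectors and numeric TCP ports, and ignores namespaces, `ipBlock`, and named ports. Actual enforcement is done by the cluster's network plugin, not by this code. All type and function names are hypothetical.

```typescript
// Sketch: a simplified model of NetworkPolicy ingress matching for tests.
// Only matchLabels selectors and numeric TCP ports are modelled.

type PodLabels = Record<string, string>;

interface IngressRuleModel {
  fromPodLabels: PodLabels; // stands in for from[].podSelector.matchLabels
  tcpPorts: number[];
}

// A selector matches when every label it names has the same value on the pod.
function matchesLabels(selector: PodLabels, pod: PodLabels): boolean {
  return Object.entries(selector).every(([key, value]) => pod[key] === value);
}

// Traffic is allowed if at least one rule matches both the source pod and the port.
function ingressAllowed(rules: IngressRuleModel[], sourcePod: PodLabels, port: number): boolean {
  return rules.some(
    (rule) => matchesLabels(rule.fromPodLabels, sourcePod) && rule.tcpPorts.includes(port),
  );
}

// The two ingress rules from the api-service policy.
const apiServiceIngress: IngressRuleModel[] = [
  { fromPodLabels: { app: "gateway" }, tcpPorts: [3000] },
  { fromPodLabels: { app: "worker" }, tcpPorts: [3000] },
];
```

For example, `ingressAllowed(apiServiceIngress, { app: "analytics" }, 3000)` evaluates to `false`, matching the policy's intent that only gateway and worker reach api-service.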
Short-Lived Credentials with AWS IAM Roles for Service Accounts
In AWS EKS, use IRSA (IAM Roles for Service Accounts) instead of long-lived keys.
Set up IRSA:
eksctl utils associate-iam-oidc-provider --cluster=my-cluster --approve
Create IAM role:
aws iam create-role --role-name api-service-role \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
          "Federated": "arn:aws:iam::ACCOUNT:oidc-provider/OIDC_ID"
        },
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {
          "StringEquals": {
            "OIDC_ID:sub": "system:serviceaccount:default:api-service",
            "OIDC_ID:aud": "sts.amazonaws.com"
          }
        }
      }
    ]
  }'
Attach policy:
aws iam put-role-policy --role-name api-service-role \
  --policy-name s3-access \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my-bucket/*"
      }
    ]
  }'
Link to service account:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: api-service
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT:role/api-service-role
---
apiVersion: v1
kind: Pod
metadata:
  name: api-service
spec:
  serviceAccountName: api-service
  containers:
    - name: api
      image: api:latest
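The pod gets temporary AWS credentials automatically: EKS injects a projected service account token, and the AWS SDK exchanges it for short-lived role credentials through sts:AssumeRoleWithWebIdentity. No long-lived keys need to be stored.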
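The trust policy is the part of this setup most prone to typos, because the OIDC provider string appears in both the principal ARN and the condition keys. One option is to generate the document from parameters. The sketch below is an illustration under that assumption. `buildIrsaTrustPolicy` and its parameter names are hypothetical, and the `aud` condition follows the pattern AWS documents for IRSA roles.

```typescript
// Sketch: generate an IRSA trust policy so the OIDC provider string stays consistent.
// Function and parameter names are illustrative.

interface IrsaTrustParams {
  accountId: string;
  oidcProvider: string; // provider host and path, without the https:// prefix
  namespace: string;
  serviceAccount: string;
}

function buildIrsaTrustPolicy(params: IrsaTrustParams) {
  const { accountId, oidcProvider, namespace, serviceAccount } = params;
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Principal: {
          Federated: `arn:aws:iam::${accountId}:oidc-provider/${oidcProvider}`,
        },
        Action: "sts:AssumeRoleWithWebIdentity",
        Condition: {
          StringEquals: {
            // Restrict the role to exactly one service account in one namespace.
            [`${oidcProvider}:sub`]: `system:serviceaccount:${namespace}:${serviceAccount}`,
            [`${oidcProvider}:aud`]: "sts.amazonaws.com",
          },
        },
      },
    ],
  };
}
```

The output of `JSON.stringify(buildIrsaTrustPolicy(...))` can be written to a file and passed to `aws iam create-role` with `--assume-role-policy-document file://...`.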
Secret Rotation Without Downtime
Use External Secrets Operator to sync secrets from a vault and rotate without pod restart.
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: vault-store
  namespace: default
spec:
  provider:
    vault:
      server: https://vault.example.com
      path: secret
      auth:
        kubernetes:
          mountPath: kubernetes
          role: api-service
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-secret
  namespace: default
spec:
  refreshInterval: 1h # Re-sync from Vault every hour; omitting a version fetches the latest
  secretStoreRef:
    name: vault-store
    kind: SecretStore
  target:
    name: db-secret
    creationPolicy: Owner
  data:
    - secretKey: password
      remoteRef:
        key: database
        property: password # field inside the Vault secret (assumed name)
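External Secrets Operator polls the vault at the refresh interval and updates the Kubernetes Secret when the value changes. Pods that mount the Secret as a volume see the new files once the kubelet syncs them and can reload without restarting; values injected as environment variables only change when the pod restarts.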
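On the application side, reloading without a restart means not caching the secret forever. A minimal sketch, assuming the Secret is mounted as a file and that re-reading it at a bounded interval is acceptable: `createSecretCache` is a hypothetical helper, and the `read` and `now` functions are injected so the behaviour can be tested without a filesystem or real clock.

```typescript
// Sketch: cache a mounted secret and re-read it once the cached copy is too old.
// Helper name and parameters are illustrative.

type ReadSecret = () => string;

function createSecretCache(
  read: ReadSecret,
  maxAgeMs: number,
  now: () => number = Date.now,
) {
  let value = read();
  let loadedAt = now();
  return {
    // Returns the cached value, refreshing it first if it is older than maxAgeMs.
    get(): string {
      if (now() - loadedAt >= maxAgeMs) {
        value = read();
        loadedAt = now();
      }
      return value;
    },
  };
}
```

In a service, `read` could be `() => fs.readFileSync('/etc/secrets/password', 'utf8').trim()` (the mount path is an assumption), and the database client would call `get()` whenever it opens a new connection.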
Audit Logging Every Service-to-Service Call
Log all service interactions for compliance and investigation.
app.use((req, res, next) => {
  const startTime = Date.now();
  res.on('finish', () => {
    const duration = Date.now() - startTime;
    console.log(JSON.stringify({
      timestamp: new Date().toISOString(),
      method: req.method,
      path: req.path,
      caller: req.serviceID,
      status: res.statusCode,
      duration,
      requestSize: req.get('content-length'),
      responseSize: res.get('content-length'),
    }));
  });
  next();
});
Forward logs to a central sink (CloudWatch, DataDog, Splunk) for analysis.
Zero Trust for AI Agents
AI agents calling your APIs need scoped permissions. Issue short-lived credentials with minimal scope.
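import jwt from 'jsonwebtoken';

// Issue temporary token for AI agent
const token = jwt.sign(
  {
    agent_id: 'agent-123',
    permissions: ['search', 'summarize'], // Only these actions
  },
  secret,
  { algorithm: 'HS256', expiresIn: '1h' } // sets the standard exp claim, which verify() enforces
);

// AI agent uses token
app.post('/api/search', (req, res) => {
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) return res.status(401).json({ error: 'Missing token' });
  let claims;
  try {
    // Pin the algorithm; verify() throws on a bad signature or an expired token
    claims = jwt.verify(token, secret, { algorithms: ['HS256'] });
  } catch {
    return res.status(401).json({ error: 'Invalid or expired token' });
  }
  if (!claims.permissions?.includes('search')) {
    return res.status(403).json({ error: 'Permission denied' });
  }
  // Proceed...
});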
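The scope check itself can be pulled into a pure function so that every agent-facing route applies the same rule. The sketch below assumes the token's signature has already been verified and that the claims carry a standard `exp` (seconds since the epoch) alongside the `permissions` array. `authorizeAgent` and the `AgentDecision` type are hypothetical names for this example.

```typescript
// Sketch: decide whether already-verified agent claims permit an action.
// Assumes signature verification happened earlier; names are illustrative.

interface AgentClaims {
  agent_id: string;
  permissions: string[];
  exp: number; // expiry, in seconds since the Unix epoch
}

type AgentDecision =
  | { ok: true }
  | { ok: false; status: 401 | 403; reason: string };

function authorizeAgent(
  claims: AgentClaims,
  requiredPermission: string,
  nowSeconds: number,
): AgentDecision {
  // An expired token is an authentication failure, not a permission failure.
  if (claims.exp <= nowSeconds) {
    return { ok: false, status: 401, reason: "token expired" };
  }
  if (!claims.permissions.includes(requiredPermission)) {
    return { ok: false, status: 403, reason: `missing permission: ${requiredPermission}` };
  }
  return { ok: true };
}
```

A route handler would call `authorizeAgent(claims, 'search', Math.floor(Date.now() / 1000))` and return the decision's status code when `ok` is false.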
Checklist
- Deploy cert-manager for mTLS
- Issue certificates for all services
- Implement SPIFFE for service identity
- Deploy OPA and write authorization policies
- Implement NetworkPolicy for all pods
- Migrate to IAM Roles for Service Accounts (if on AWS)
- Set up secret rotation with External Secrets Operator
- Implement comprehensive audit logging
- Test by simulating compromised service
- Document zero trust architecture
Conclusion
Zero trust is not optional in 2026. Assume every request is untrusted. Verify identity with mTLS and SPIFFE, authorise with OPA, and log everything. Short-lived credentials and automatic rotation limit how long a leaked secret stays useful. Build zero trust into your infrastructure from the start, because retrofitting is painful.