Zero Trust Architecture for Backend Systems — Never Trust, Always Verify

Introduction

Zero trust is the security model for 2026. The old perimeter-based approach assumed that anything inside the network was safe. Modern systems are distributed, multi-cloud, and exposed to the internet, so that assumption no longer holds. Zero trust treats every request as untrusted: every service-to-service call must be authenticated and authorised.

This post covers implementing zero trust for microservices: mutual TLS (mTLS), service identities, OPA policies, and short-lived credentials.

Zero Trust Principles for Microservices

  1. Never trust, always verify: Every request requires authentication, even on internal networks
  2. Least privilege: Services get only the permissions they need
  3. Inspect and log: Every interaction is logged and monitored
  4. Assume breach: Design for compromise. If one service is hacked, limit lateral movement

Traditional (perimeter-based):

User -> [Firewall] -> Internal Network -> Service A -> Service B
(Assume everything inside firewall is safe)

Zero trust:

User -[verify]-> Service A -[mTLS+policy]-> Service B
(Verify every hop, every request)
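
The "verify every hop" flow can be sketched as one function that each service runs for every incoming request: authenticate, authorise, then log. This is a minimal illustration, not a framework; `HopRequest`, `verifyHop`, and the allow-list contents are hypothetical names made up for this example.

```typescript
// Hypothetical shape of a request as seen by the receiving service.
type HopRequest = {
  callerId?: string; // identity proven by mTLS or a signed token, if any
  method: string;
  path: string;
};

type Decision = { allowed: boolean; status: 200 | 401 | 403; reason: string };

// Allow-list keyed by "METHOD path". Anything not listed is denied (least privilege).
const allowList: Record<string, string[]> = {
  'POST /process': ['worker'],
  'GET /metrics': ['analytics'],
};

function verifyHop(req: HopRequest, log: (entry: object) => void = console.log): Decision {
  let decision: Decision;
  if (!req.callerId) {
    // Never trust: no proven identity means no access, even on the internal network.
    decision = { allowed: false, status: 401, reason: 'unauthenticated' };
  } else if (!(allowList[`${req.method} ${req.path}`] ?? []).includes(req.callerId)) {
    // Least privilege: a known caller still needs an explicit grant.
    decision = { allowed: false, status: 403, reason: 'not permitted' };
  } else {
    decision = { allowed: true, status: 200, reason: 'ok' };
  }
  // Inspect and log: record every decision, including denials.
  log({ ...req, ...decision });
  return decision;
}
```

The rest of this post replaces each step with real infrastructure: mTLS and SPIFFE for identity, OPA for the allow-list, and structured audit logs.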

mTLS Between Services with cert-manager

Mutual TLS means both client and server verify each other with certificates. cert-manager automates certificate issuance and renewal.

Install cert-manager:

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml

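Create a self-signed CA. The CA certificate itself needs an issuer, so bootstrap it with a SelfSigned ClusterIssuer:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned-bootstrap
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: internal-ca
  namespace: cert-manager
spec:
  secretName: internal-ca-key-pair
  commonName: internal-ca
  isCA: true
  issuerRef:
    name: selfsigned-bootstrap
    kind: ClusterIssuer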

Create issuer:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: internal-ca
spec:
  ca:
    secretName: internal-ca-key-pair

Issue service certificates:

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: api-service-cert
  namespace: default
spec:
  secretName: api-service-tls
  issuerRef:
    name: internal-ca
    kind: ClusterIssuer
  dnsNames:
    - api.default.svc.cluster.local

Use mTLS in service pods:

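// api-service/src/server.ts
import fs from 'fs';
import https from 'https';
import express from 'express';

const app = express();

// Files come from the api-service-tls secret mounted into the pod
const key = fs.readFileSync('/etc/tls/private/tls.key');
const cert = fs.readFileSync('/etc/tls/private/tls.crt');
const ca = fs.readFileSync('/etc/tls/private/ca.crt');

const options = {
  key,
  cert,
  ca, // trust only certificates signed by the internal CA
  requestCert: true, // ask every client for a certificate
  rejectUnauthorized: true, // reject connections without a valid one
};

https.createServer(options, app).listen(3000);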
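
A valid client certificate proves the caller was issued one by the internal CA, but the server still has to decide which caller it is. A minimal sketch of that step follows. It assumes the certificate shape Node.js returns from `getPeerCertificate()` (a `subjectaltname` string such as `DNS:a, DNS:b`); the `allowedCallers` set and function names are hypothetical.

```typescript
// Minimal subset of the object returned by tls.TLSSocket#getPeerCertificate().
type PeerCert = { subject?: { CN?: string }; subjectaltname?: string };

// Extract DNS names from the subjectAltName string, e.g. "DNS:a.example, DNS:b.example".
// A plain comma split is a simplification that assumes names contain no commas.
function dnsNamesFromCert(cert: PeerCert): string[] {
  return (cert.subjectaltname ?? '')
    .split(',')
    .map((entry) => entry.trim())
    .filter((entry) => entry.startsWith('DNS:'))
    .map((entry) => entry.slice('DNS:'.length));
}

// Hypothetical list of services allowed to call this one.
const allowedCallers = new Set([
  'gateway.default.svc.cluster.local',
  'worker.default.svc.cluster.local',
]);

// Returns the first recognised caller name, or null if the certificate names none.
function callerFromCert(cert: PeerCert): string | null {
  return dnsNamesFromCert(cert).find((name) => allowedCallers.has(name)) ?? null;
}
```

In an Express handler the certificate would come from `req.socket.getPeerCertificate()`, with a `null` result treated as a 403.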

Pod manifest:

apiVersion: v1
kind: Pod
metadata:
  name: api-service
spec:
  containers:
    - name: api
      image: api:latest
      volumeMounts:
        - name: tls
          mountPath: /etc/tls/private
          readOnly: true
  volumes:
    - name: tls
      secret:
        secretName: api-service-tls

cert-manager renews certificates automatically before they expire and writes the new certificate to the same secret. No manual renewal.
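
One caveat: a Node.js server that reads its key and certificate once at startup keeps serving the old certificate until it reloads the files. A minimal sketch of a reload step is below, assuming the files are mounted as in the pod manifest above. `reloadIfRotated` and the `Reloadable` type are names invented here; `setSecureContext()` is the real method on Node's TLS and HTTPS servers.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Anything with setSecureContext(), e.g. the server returned by https.createServer().
type Reloadable = {
  setSecureContext(options: { key: Buffer; cert: Buffer; ca: Buffer }): void;
};

// Re-reads the mounted files and swaps them in only when the certificate changed.
// Returns the certificate now loaded so the caller can pass it to the next check.
function reloadIfRotated(server: Reloadable, dir: string, loadedCert: string): string {
  const cert = fs.readFileSync(path.join(dir, 'tls.crt'));
  if (cert.toString() === loadedCert) return loadedCert;
  server.setSecureContext({
    key: fs.readFileSync(path.join(dir, 'tls.key')),
    cert,
    ca: fs.readFileSync(path.join(dir, 'ca.crt')),
  });
  return cert.toString();
}
```

Calling this from a timer (say once a minute, with `/etc/tls/private` as `dir`) is usually enough, since certificates are renewed well before expiry. New connections use the new certificate; existing connections are unaffected.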

SPIFFE/SPIRE for Service Identity

SPIFFE is a standard for issuing and verifying service identities. SPIRE is a production-ready implementation of it.

Install SPIRE:

helm install spire spiffe/spire -n spire --create-namespace

Service attestation:

apiVersion: spire.spiffe.io/v1alpha1
kind: ClusterStaticEntry
metadata:
  name: api-service
spec:
  spiffeID: spiffe://example.com/ns/default/sa/api
  parentID: spiffe://example.com/spire/agent
  selectors:
    - k8s:ns:default
    - k8s:sa:api
  federatesWith: []

Client requests SPIFFE credential:

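// `spire` stands in for whichever SPIFFE Workload API client you use;
// requestSVID is illustrative, not a specific library call.
const credential = await spire.requestSVID({
  spiffeID: 'spiffe://example.com/ns/default/sa/api',
});

const response = await fetch('https://worker.default.svc.cluster.local/process', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${credential.jwt}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify(data),
});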

Server verifies SPIFFE JWT:

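// `verifyJWT` and the 'spiffe' module are placeholders for your JWT-SVID library.
// `publicKey` comes from the trust bundle that SPIRE distributes.
import { verifyJWT } from 'spiffe';

app.use(async (req, res, next) => {
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) return res.status(401).json({ error: 'Missing token' });

  try {
    const claims = await verifyJWT(token, publicKey);
    req.serviceID = claims.sub; // spiffe://example.com/ns/default/sa/api
    next();
  } catch {
    res.status(401).json({ error: 'Invalid token' });
  }
});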

SPIFFE decouples identity from IP address. Services identify by name, not by network location.
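
Because the identity is a structured name, a service can parse it before using it in an authorisation decision. The sketch below assumes the `/ns/<namespace>/sa/<service-account>` path layout used in this post's examples; that layout is a common convention, not something SPIFFE itself mandates, and `parseSpiffeId` is a hypothetical helper.

```typescript
type WorkloadIdentity = { trustDomain: string; namespace: string; serviceAccount: string };

// Parses IDs of the form spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>.
// Returns null for any other shape or for an unexpected trust domain.
function parseSpiffeId(id: string, expectedTrustDomain: string): WorkloadIdentity | null {
  const match = /^spiffe:\/\/([^/]+)\/ns\/([^/]+)\/sa\/([^/]+)$/.exec(id);
  if (!match) return null;
  const [, trustDomain, namespace, serviceAccount] = match;
  // An ID from another trust domain was issued by a different authority.
  if (trustDomain !== expectedTrustDomain) return null;
  return { trustDomain, namespace, serviceAccount };
}
```

For example, `parseSpiffeId('spiffe://example.com/ns/default/sa/api', 'example.com')` yields the namespace `default` and service account `api`, while an ID from any other trust domain is rejected.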

OPA (Open Policy Agent) for Fine-Grained Authorisation

OPA evaluates policies written in Rego, a declarative language.

Install OPA:

kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/opa/main/docs/deployments/kubernetes/kube-mgmt/kube-mgmt-deployment.yaml

Write a policy:

# policies/api-service.rego
# Rego v1 syntax (the default since OPA 1.0); older releases need `import rego.v1`.
package api.authz

default allow := false

# Only worker service can call /process
allow if {
  input.method == "POST"
  input.path == "/process"
  input.caller == "spiffe://example.com/ns/default/sa/worker"
}

# Only analytics service can call /metrics
allow if {
  input.path == "/metrics"
  input.caller == "spiffe://example.com/ns/default/sa/analytics"
}

# Admin can call anything
allow if {
  input.caller_role == "admin"
}

Evaluate policy in service:

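const opaURL = 'http://localhost:8181/v1/data/api/authz/allow';

app.use(async (req, res, next) => {
  let allowed = false;

  try {
    const decision = await fetch(opaURL, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      // OPA's Data API expects the request document under an "input" key
      body: JSON.stringify({
        input: {
          method: req.method,
          path: req.path,
          caller: req.serviceID,
          caller_role: req.role,
        },
      }),
    }).then((r) => r.json());

    // Querying the allow rule directly returns { "result": true | false }
    allowed = decision.result === true;
  } catch {
    // Fail closed: if OPA is unreachable, deny the request
  }

  if (!allowed) {
    return res.status(403).json({ error: 'Forbidden' });
  }

  next();
});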

Policies are version controlled, audited, and updated without redeploying services.
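
Since every request now depends on OPA, the client side should fail closed: a timeout, an error, or an unexpected response body all mean "deny". A sketch of that behaviour follows, with the HTTP call injected so it can be tested without a running OPA. `isAllowed`, `interpretDecision`, and the 500 ms default are assumptions for illustration; the `{ input: ... }` request envelope and `{ result: ... }` response shape are OPA's Data API format.

```typescript
type OpaInput = { method: string; path: string; caller?: string; caller_role?: string };

// Minimal subset of fetch() used here, so a fake can be injected in tests.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Only an explicit {"result": true} counts as allow. An undefined rule yields {}.
function interpretDecision(body: unknown): boolean {
  return typeof body === 'object' && body !== null && (body as { result?: unknown }).result === true;
}

async function isAllowed(
  fetchFn: FetchLike,
  url: string,
  input: OpaInput,
  timeoutMs = 500,
): Promise<boolean> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error('OPA timed out')), timeoutMs);
  });
  try {
    const res = await Promise.race([
      fetchFn(url, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ input }),
      }),
      timeout,
    ]);
    return res.ok && interpretDecision(await res.json());
  } catch {
    return false; // fail closed: no decision means no access
  } finally {
    clearTimeout(timer);
  }
}
```

In a service, the global `fetch` can be passed as `fetchFn`. The trade-off is availability: an OPA outage blocks traffic, which is why OPA is typically run as a sidecar next to each service.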

Network Policies in Kubernetes

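NetworkPolicy restricts traffic between pods: deny by default, then allow only what's needed. Enforcement depends on the cluster's network plugin (CNI), which must support NetworkPolicy.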

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-service
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: api-service
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: gateway
      ports:
        - protocol: TCP
          port: 3000
    - from:
        - podSelector:
            matchLabels:
              app: worker
      ports:
        - protocol: TCP
          port: 3000
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: TCP
          port: 53 # DNS
        - protocol: UDP
          port: 53

Only gateway and worker can call api-service. api-service can only call postgres and DNS. All other traffic to and from api-service pods is blocked.
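
It is worth checking that the policy is actually enforced, because a cluster whose network plugin ignores NetworkPolicy will accept the manifest and block nothing. One rough way is a TCP probe run from inside a pod. The sketch below uses Node's `net` module; `canConnect` is a hypothetical helper and the hostnames in the usage note are examples.

```typescript
import * as net from 'node:net';

// Resolves true if a TCP connection opens within timeoutMs, false on error or timeout.
function canConnect(host: string, port: number, timeoutMs = 2000): Promise<boolean> {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (reachable: boolean) => {
      socket.destroy(); // release the socket whatever the outcome
      resolve(reachable);
    };
    socket.setTimeout(timeoutMs);
    socket.once('connect', () => finish(true));
    socket.once('timeout', () => finish(false)); // dropped packets usually show up as a timeout
    socket.once('error', () => finish(false));
  });
}
```

Run from an api-service pod, `canConnect('postgres.default.svc.cluster.local', 5432)` should resolve to `true`, while a probe to any service not listed in the egress rules should resolve to `false`.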

Short-Lived Credentials with AWS IAM Roles for Service Accounts

In AWS EKS, use IRSA (IAM Roles for Service Accounts) instead of long-lived keys.

Set up IRSA:

eksctl utils associate-iam-oidc-provider --cluster=my-cluster --approve

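Create IAM role (replace ACCOUNT with your AWS account ID and OIDC_PROVIDER with the cluster's OIDC issuer without the https:// prefix, for example oidc.eks.REGION.amazonaws.com/id/CLUSTER_ID):

aws iam create-role --role-name api-service-role \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
          "Federated": "arn:aws:iam::ACCOUNT:oidc-provider/OIDC_PROVIDER"
        },
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {
          "StringEquals": {
            "OIDC_PROVIDER:sub": "system:serviceaccount:default:api-service",
            "OIDC_PROVIDER:aud": "sts.amazonaws.com"
          }
        }
      }
    ]
  }'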

Attach policy:

aws iam put-role-policy --role-name api-service-role \
  --policy-name s3-access \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my-bucket/*"
      }
    ]
  }'

Link to service account:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: api-service
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT:role/api-service-role
---
apiVersion: v1
kind: Pod
metadata:
  name: api-service
spec:
  serviceAccountName: api-service
  containers:
    - name: api
      image: api:latest

Pod gets temporary AWS credentials automatically. No secret storage needed.
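
Under the hood, EKS injects two environment variables into the pod, `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE`, and the AWS SDK exchanges the projected token for temporary credentials. A startup check like the one below can catch a misconfigured service account early. `irsaConfigFromEnv` is a hypothetical helper, and the check for static keys reflects the assumption that the SDK's default credential chain prefers them over the web identity token.

```typescript
type IrsaConfig = { roleArn: string; tokenFile: string };

// Validates that the pod was set up for IRSA. Throws with a descriptive message otherwise.
function irsaConfigFromEnv(env: Record<string, string | undefined>): IrsaConfig {
  const roleArn = env.AWS_ROLE_ARN;
  const tokenFile = env.AWS_WEB_IDENTITY_TOKEN_FILE;
  if (!roleArn || !tokenFile) {
    throw new Error('IRSA not configured: check the service account annotation and serviceAccountName');
  }
  if (env.AWS_ACCESS_KEY_ID) {
    // Long-lived keys in the environment would shadow the short-lived IRSA credentials.
    throw new Error('Static AWS keys found in the environment; remove them to use IRSA');
  }
  return { roleArn, tokenFile };
}
```

Calling `irsaConfigFromEnv(process.env)` at startup makes the service fail fast instead of discovering missing credentials on its first S3 call.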

Secret Rotation Without Downtime

Use External Secrets Operator to sync secrets from a vault into Kubernetes, so rotated values reach pods without a restart.

apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: vault-store
  namespace: default
spec:
  provider:
    vault:
      server: https://vault.example.com
      path: secret
      auth:
        kubernetes:
          mountPath: kubernetes
          role: api-service
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-secret
  namespace: default
spec:
  secretStoreRef:
    name: vault-store
  target:
    name: db-secret
    creationPolicy: Owner
  data:
    - secretKey: password
      remoteRef:
        key: database
        property: password # field inside the Vault secret (assumed name)
  refreshInterval: 1h # Re-sync from Vault every hour

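External Secrets Operator re-reads the vault on every refreshInterval and updates the Kubernetes secret when the value has changed. Pods that mount the secret as a volume see the new files after a short delay and can reload them without restarting; values injected as environment variables only change on restart.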
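
On the application side, one simple way to pick up a rotated value is to re-read the mounted file at most once per interval instead of caching it forever. A minimal sketch, where `cachedSecret` is a hypothetical helper and the reader function and clock are injected so the logic can be tested:

```typescript
// Returns a getter that re-reads the secret at most once per ttlMs.
// `read` is any function that loads the current value, e.g. from a mounted file.
function cachedSecret(
  read: () => string,
  ttlMs: number,
  now: () => number = Date.now,
): () => string {
  let value = read();
  let loadedAt = now();
  return () => {
    if (now() - loadedAt >= ttlMs) {
      value = read(); // pick up whatever the operator last synced
      loadedAt = now();
    }
    return value;
  };
}
```

With a reader such as `() => fs.readFileSync('/etc/secrets/db/password', 'utf8').trim()` (the mount path is an example), the service calls the getter whenever it opens a new database connection. Existing connections keep working until the old password is revoked, so the TTL should be shorter than the overlap window the database allows.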

Audit Logging Every Service-to-Service Call

Log all service interactions for compliance and investigation.

app.use((req, res, next) => {
  const startTime = Date.now();

  res.on('finish', () => {
    const duration = Date.now() - startTime;

    console.log(JSON.stringify({
      timestamp: new Date().toISOString(),
      method: req.method,
      path: req.path,
      caller: req.serviceID,
      status: res.statusCode,
      duration,
      requestSize: req.get('content-length'),
      responseSize: res.get('content-length'),
    }));
  });

  next();
});

Forward logs to a central sink (CloudWatch, DataDog, Splunk) for analysis.

Zero Trust for AI Agents

AI agents calling your APIs need scoped permissions. Issue short-lived credentials with minimal scope.

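import jwt from 'jsonwebtoken';

// Issue temporary token for AI agent
const token = jwt.sign(
  {
    agent_id: 'agent-123',
    permissions: ['search', 'summarize'], // Only these actions
  },
  secret,
  // expiresIn sets the standard exp claim, which verify() enforces;
  // a custom field such as expires_at would never be checked
  { algorithm: 'HS256', expiresIn: '1h' }
);

// AI agent uses token
app.post('/api/search', (req, res) => {
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) return res.status(401).json({ error: 'Missing token' });

  let claims;
  try {
    // Pin the algorithm; verify() throws on a bad signature or an expired token
    claims = jwt.verify(token, secret, { algorithms: ['HS256'] });
  } catch {
    return res.status(401).json({ error: 'Invalid or expired token' });
  }

  if (!claims.permissions?.includes('search')) {
    return res.status(403).json({ error: 'Permission denied' });
  }

  // Proceed...
});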
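
To make clear what a scoped, expiring token actually checks, here is the same issue-and-verify flow sketched with only Node's built-in `crypto` module. This is for illustration, not a replacement for a maintained JWT library; `issueToken`, `verifyToken`, and `AgentClaims` are hypothetical names.

```typescript
import * as crypto from 'node:crypto';

type AgentClaims = { agent_id: string; permissions: string[]; exp: number };

// HMAC-SHA256 over the signing input, encoded as base64url (the HS256 scheme).
function sign(data: string, secret: string): string {
  return crypto.createHmac('sha256', secret).update(data).digest('base64url');
}

function issueToken(
  agentId: string,
  permissions: string[],
  ttlSeconds: number,
  secret: string,
  nowMs: number = Date.now(),
): string {
  const header = Buffer.from(JSON.stringify({ alg: 'HS256', typ: 'JWT' })).toString('base64url');
  const claims: AgentClaims = {
    agent_id: agentId,
    permissions,
    exp: Math.floor(nowMs / 1000) + ttlSeconds, // standard expiry claim, in seconds
  };
  const payload = Buffer.from(JSON.stringify(claims)).toString('base64url');
  return `${header}.${payload}.${sign(`${header}.${payload}`, secret)}`;
}

// Returns the claims if the signature is valid and the token has not expired, else null.
function verifyToken(token: string, secret: string, nowMs: number = Date.now()): AgentClaims | null {
  const parts = token.split('.');
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;
  // Always recompute with HS256, so the token's own alg header is never trusted.
  const expected = Buffer.from(sign(`${header}.${payload}`, secret));
  const actual = Buffer.from(signature);
  if (expected.length !== actual.length || !crypto.timingSafeEqual(expected, actual)) return null;
  const claims = JSON.parse(Buffer.from(payload, 'base64url').toString()) as AgentClaims;
  if (typeof claims.exp !== 'number' || claims.exp <= Math.floor(nowMs / 1000)) return null;
  return claims;
}
```

A route handler then needs one more line, `claims.permissions.includes('search')`, to enforce scope. Keeping the lifetime short limits how long a token leaked from an agent's context remains useful.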

Checklist

  • Deploy cert-manager for mTLS
  • Issue certificates for all services
  • Implement SPIFFE for service identity
  • Deploy OPA and write authorization policies
  • Implement NetworkPolicy for all pods
  • Migrate to IAM Roles for Service Accounts (if on AWS)
  • Set up secret rotation with External Secrets Operator
  • Implement comprehensive audit logging
  • Test by simulating compromised service
  • Document zero trust architecture

Conclusion

Zero trust is not optional in 2026. Assume every request is untrusted. Verify identity with mTLS and SPIFFE, authorise with OPA, and log everything. Short-lived credentials and automatic rotation limit the damage a leaked secret can do. Build zero trust into your infrastructure from the start; retrofitting is painful.