eBPF for Backend Engineers — Zero-Instrumentation Observability

By Sanjeev Sharma (@webcoderspeed1)
Introduction
eBPF (extended Berkeley Packet Filter) runs sandboxed programs in the Linux kernel. For observability, eBPF intercepts syscalls, network packets, and kernel events without instrumenting your code. This post covers eBPF concepts, Cilium for networking, Hubble for service flows, and continuous profiling.
- What eBPF Is (Kernel Programs Without Kernel Modules)
- Cilium for Kubernetes Network Observability
- Hubble for Service-to-Service Flow Visibility
- Continuous Profiling with Parca/Pyroscope
- bpftrace for Ad-Hoc Investigation
- TCP Retransmit Tracing
- Latency Profiling at Syscall Level
- eBPF vs Sidecar Overhead Comparison
- Checklist
- Conclusion
What eBPF Is (Kernel Programs Without Kernel Modules)
eBPF programs:
- Run in kernel (privileged context)
- Are sandboxed (can't crash the kernel)
- Hook into kernel events (syscalls, network packets, function calls)
- Are verified before loading (safe to run)
Unlike kernel modules, eBPF programs don't require:
- Recompiling the kernel
- Rebooting the system
- Kernel version matching
Example: attach a compiled eBPF program to a network interface and inspect traffic without touching application code.
# Attach the compiled eBPF object to eth0 ingress
# No code instrumentation, no app restart
tc qdisc add dev eth0 ingress
tc filter add dev eth0 ingress bpf direct-action object-file tcptrack.o section trace_connect
Cilium for Kubernetes Network Observability
Cilium uses eBPF to replace iptables and observe all network flows in Kubernetes. It provides:
- Network policy enforcement (no iptables needed)
- Service load balancing (faster than kube-proxy)
- Network observability (every packet is visible)
# Deploy Cilium with Hubble enabled via Helm
# (Cilium's datapath is eBPF by default; no extra flag needed)
helm repo add cilium https://helm.cilium.io
helm install cilium cilium/cilium \
  --namespace kube-system \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true
Cilium monitors:
- Pod-to-pod traffic
- Pod-to-external traffic
- DNS queries
- HTTP requests (L7 visibility)
- Policy verdicts and dropped packets (useful for spotting port scans)
Hubble for Service-to-Service Flow Visibility
Hubble (built on Cilium) visualizes service flows. Export to observability backends (Prometheus, Elasticsearch).
# View traffic in real time with the Hubble CLI
# (Hubble Relay was enabled via the Cilium Helm values above;
#  expose it locally with `cilium hubble port-forward` if needed)
hubble observe --label app=api-server --output json
# Example output (simplified; real flow records carry more fields):
# {
#   "time": "2026-03-15T10:30:00Z",
#   "source": {"namespace": "default", "pod_name": "api-server-1"},
#   "destination": {"namespace": "default", "pod_name": "db-postgres-1"},
#   "l4": {"TCP": {"destination_port": 5432}},
#   "verdict": "FORWARDED"
# }
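JSON output is easy to post-process. A minimal sketch that counts flows per source→destination edge, assuming newline-delimited records shaped like the simplified output above (adapt the field paths to your actual Hubble schema):

```python
import json
from collections import Counter

def flows_per_edge(lines):
    """Count flows per (source pod -> destination pod) edge."""
    counts = Counter()
    for line in lines:
        flow = json.loads(line)
        edge = (flow["source"]["pod_name"], flow["destination"]["pod_name"])
        counts[edge] += 1
    return counts

# Three hypothetical flow records
sample = [
    '{"source": {"pod_name": "api-server-1"}, "destination": {"pod_name": "db-postgres-1"}}',
    '{"source": {"pod_name": "api-server-1"}, "destination": {"pod_name": "db-postgres-1"}}',
    '{"source": {"pod_name": "api-server-2"}, "destination": {"pod_name": "redis-1"}}',
]
print(flows_per_edge(sample).most_common(1))
# [(('api-server-1', 'db-postgres-1'), 2)]
```

The same pattern works for any aggregation (per-namespace, per-verdict) once flows are exported as JSON.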
Export to Prometheus:
# hubble-prometheus.yaml - scrape Hubble metrics
# (the Cilium agent serves Hubble metrics on port 9965 when hubble.metrics is enabled)
apiVersion: v1
kind: ConfigMap
metadata:
  name: hubble-prometheus-config
data:
  prometheus.yml: |
    scrape_configs:
      - job_name: 'hubble-metrics'
        static_configs:
          - targets: ['localhost:9965']
        relabel_configs:
          - source_labels: [__address__]
            target_label: instance
Continuous Profiling with Parca/Pyroscope
Always-on profiling captures CPU usage at function granularity. eBPF enables zero-instrumentation profiling.
# parca-deployment.yaml - deploy the Parca agent as a DaemonSet
# (flag names vary between parca-agent releases; check your version's --help)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: parca-agent
spec:
  selector:
    matchLabels:
      app: parca-agent
  template:
    metadata:
      labels:
        app: parca-agent
    spec:
      hostNetwork: true
      hostPID: true
      containers:
        - name: parca-agent
          image: ghcr.io/parca-dev/parca-agent:latest
          args:
            - "--node=$(NODE_NAME)"
            - "--remote-store-address=parca:7070"
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          securityContext:
            privileged: true
Query Parca for bottlenecks through its web UI: pick a profile type (for example, CPU samples), select a time range to surface the top consumers, and use the compare view to diff profiles from before and after an optimization. Parca also exposes a gRPC API if you want to automate queries.
bpftrace for Ad-Hoc Investigation
bpftrace is a high-level tracing language. Write one-liners to investigate system behavior.
# Trace file opens by nginx (openat covers most modern opens)
bpftrace -e 'tracepoint:syscalls:sys_enter_openat /comm == "nginx"/ { printf("%s %s\n", comm, str(args->filename)); }'
# Count slow syscalls (>10ms) per process
bpftrace -e '
tracepoint:raw_syscalls:sys_enter { @start[tid] = nsecs; }
tracepoint:raw_syscalls:sys_exit /@start[tid]/ {
  if (nsecs - @start[tid] > 10000000) { @slow[comm] = count(); }
  delete(@start[tid]);
}
'
# Histogram memory allocation sizes
# (kprobes have no named args; use the kmem tracepoint instead)
bpftrace -e '
tracepoint:kmem:kmalloc { @bytes = hist(args->bytes_req); }
END { print(@bytes); }
'
# Trace TCP socket teardown (kprobes expose raw args as arg0, arg1, ...)
bpftrace -e '
kprobe:tcp_close { printf("tcp_close on socket %p\n", arg0); }
'
TCP Retransmit Tracing
High TCP retransmits indicate network problems. Use eBPF to detect and locate them.
# Monitor TCP retransmits in real time
# (the tcp:* tracepoints require kernel 4.16+)
bpftrace -e '
tracepoint:tcp:tcp_retransmit_skb {
  printf("retransmit %s:%d -> %s:%d\n",
    ntop(args->saddr), args->sport,
    ntop(args->daddr), args->dport);
}
'
# Export to Prometheus
# (bpftrace output → custom exporter → Prometheus)
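The "custom exporter" step can be sketched in a few lines. Assuming the tracer emits one line per event in a hypothetical `retransmit <src> -> <dst>` format, aggregate counts per destination before exposing them as a metric:

```python
from collections import Counter

def retransmits_by_dest(lines):
    """Count retransmit events per destination address:port."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        # Expected shape: ["retransmit", "<src>", "->", "<dst>"]
        if len(parts) == 4 and parts[0] == "retransmit":
            counts[parts[3]] += 1
    return counts

# Hypothetical tracer output
sample = [
    "retransmit 10.0.0.5:43122 -> 10.0.0.9:5432",
    "retransmit 10.0.0.6:51000 -> 10.0.0.9:5432",
    "retransmit 10.0.0.5:43123 -> 10.0.0.7:6379",
]
print(retransmits_by_dest(sample).most_common(1))
# [('10.0.0.9:5432', 2)]
```

A destination that dominates this count is a good first place to look for a lossy path or an overloaded peer.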
Latency Profiling at Syscall Level
Identify bottlenecks by measuring time spent in syscalls.
# Syscall latency histograms, keyed by syscall number
bpftrace -e '
tracepoint:raw_syscalls:sys_enter {
  @syscall_start[tid] = nsecs;
}
tracepoint:raw_syscalls:sys_exit /@syscall_start[tid]/ {
  @latency[args->id] = hist(nsecs - @syscall_start[tid]);
  delete(@syscall_start[tid]);
}
'
# Map syscall numbers to names with `ausyscall --dump` (audit package)
# Example output (232 = epoll_wait on x86_64; values in nanoseconds):
# @latency[232]:
# [256, 512)      10234 |@@@@@@@@@@@@@@@@@@@@@@@@@@|
# [512, 1K)        5432 |@@@@@@@@@@@@@@            |
# [1K, 2K)          123 |                          |
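bpftrace's hist() buckets values into power-of-two ranges; the same bucketing is easy to reproduce offline when you capture raw latency samples instead. A minimal sketch:

```python
from collections import Counter

def log2_hist(samples):
    """Bucket positive integer samples into power-of-two ranges,
    like bpftrace's hist(): bucket k holds values in [2^(k-1), 2^k)."""
    buckets = Counter(v.bit_length() for v in samples)
    return dict(sorted(buckets.items()))

# Four latency samples in nanoseconds
print(log2_hist([300, 900, 1500, 70_000]))
# {9: 1, 10: 1, 11: 1, 17: 1}
```

Log-scale buckets are what make in-kernel histograms cheap: a fixed handful of counters instead of one slot per distinct latency value.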
eBPF vs Sidecar Overhead Comparison
| Approach | CPU Overhead | Memory | Latency Impact | Deployment |
|---|---|---|---|---|
| eBPF | 2-5% | ~100MB per node | < 1µs | In-kernel programs + node agent |
| Sidecar (Envoy) | 10-20% | 1GB+ (per pod) | 5-10µs | Container per pod |
| Code instrumentation | 5-15% | ~200MB | 2-5µs | Recompile and redeploy app |
| eBPF + Sidecar | 15-25% | 1.1GB+ | 10µs+ | Both |
These figures are rough orders of magnitude; measure in your own cluster.
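To see why per-node beats per-pod at scale, a back-of-envelope calculation using the (illustrative) memory figures from the table, for a hypothetical 20-node cluster running 500 pods:

```python
def observability_memory_gb(pods, nodes, per_pod_mb, per_node_mb):
    """Total memory consumed by the observability layer, in GB."""
    return (pods * per_pod_mb + nodes * per_node_mb) / 1024

# Sidecars cost memory per pod; eBPF agents cost memory per node
sidecar_gb = observability_memory_gb(pods=500, nodes=20, per_pod_mb=1024, per_node_mb=0)
ebpf_gb = observability_memory_gb(pods=500, nodes=20, per_pod_mb=0, per_node_mb=100)
print(sidecar_gb, round(ebpf_gb, 2))
# 500.0 1.95
```

The gap widens as pod density grows, because eBPF's footprint scales with nodes, not pods.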
eBPF wins on efficiency; sidecars win on control.
Checklist
- Deploy Cilium for network observability in Kubernetes
- Use Hubble to visualize service flows
- Set up Parca for always-on CPU/memory profiling
- Write bpftrace one-liners for quick investigations
- Monitor TCP retransmits as a network health indicator
- Profile syscall latency to find bottlenecks
- Compare eBPF vs instrumentation (eBPF usually cheaper)
- Test eBPF programs in staging before production
- Monitor Cilium CPU overhead
- Export Hubble/Parca data to long-term storage
Conclusion
eBPF provides observability without instrumenting code. Cilium and Hubble visualize network flows. Parca profiles CPU/memory continuously. For diagnosing production issues, bpftrace is unmatched. Start with Cilium + Hubble for network visibility; add Parca for continuous profiling; use bpftrace for investigations.