eBPF lets you observe and change network behavior right inside the Linux kernel — with no sidecars in every pod and no kernel rebuild. For Kubernetes that means a different model: instead of thousands of iptables rules and proxy containers next to the app, all the networking and observability logic lives in a single layer on the node. Cilium is the most mature implementation of this approach. Let’s cover what eBPF is in plain terms, why you’d swap your CNI, and what you need to try it all in a test cluster.
What eBPF is in plain terms
To intervene in networking or syscalls at the kernel level you used to write a kernel module — dangerous (a bug takes down the whole node) and inconvenient (rebuild, reboot). eBPF changes the rules: you load a tiny program into the kernel and it runs in response to events — a packet arrives, a socket opens, a syscall fires.
The key piece is the verifier. Before loading, the kernel checks the program: that it always terminates (any loop must be provably bounded), never touches memory it shouldn’t, and can’t crash the system. Only after passing does the program attach to a hook point. The result is a safe “live kernel extension”: JIT-compiled to native code, without the risks of a module and without recompiling the kernel.
For networking that means routing, load balancing, filtering, and metrics collection can happen at the hottest spot, where the packet enters the node, instead of walking it through long netfilter rule chains or detouring through a user-space proxy and back.
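To make “a tiny program that runs when a packet arrives” concrete, here is a minimal sketch of an XDP program, the earliest network hook on a node. It is illustrative only and not taken from Cilium: the file name xdp_pass.c and the function name xdp_pass_all are made up for this example, and the build and attach commands in the comments assume clang, the libbpf headers, and iproute2 are installed.

```shell
# Write a minimal XDP program to disk (hypothetical file name).
cat > xdp_pass.c <<'EOF'
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Runs once for every packet received on the interface it is attached to. */
SEC("xdp")
int xdp_pass_all(struct xdp_md *ctx)
{
    /* The return value is the verdict: XDP_PASS hands the packet to the
       normal network stack, XDP_DROP would discard it right here. */
    return XDP_PASS;
}

/* Many BPF helpers are only available to GPL-compatible programs. */
char LICENSE[] SEC("license") = "GPL";
EOF

# Assumed build and attach steps (run as root on a test machine):
#   clang -O2 -g -target bpf -c xdp_pass.c -o xdp_pass.o
#   ip link set dev lo xdpgeneric obj xdp_pass.o sec xdp   # the verifier runs at load time
#   ip link set dev lo xdpgeneric off                      # detach again
```

The program does nothing useful, which is the point: the whole “kernel extension” is one function plus a verdict, and the kernel rejects it at load time if the verifier cannot prove it safe.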
The problem eBPF solves
In classic Kubernetes two things handle networking, and both scale poorly.
The first is kube-proxy on iptables. Each Service turns into a set of iptables rules, and the kernel walks them linearly. With dozens of services it’s invisible. With thousands the rule chain gets long and updating it on every endpoint change is expensive. Latency and control-plane load grow with cluster size.
The second is observability via sidecars. A classic service mesh puts a proxy container next to every app: it intercepts traffic and collects metrics. It works, but the price is high — an extra container per pod, additional CPU and memory, added latency per hop, and overall complexity. For hundreds of pods the “sidecar tax” becomes a noticeable line item.
eBPF removes both at once: instead of linear rules, a hash table in the kernel with constant-time lookup; instead of a proxy in every pod, a single dataplane layer per node.
What Cilium changes
Cilium is a CNI (Kubernetes network plugin) built on eBPF. Installing it instead of flannel or calico gets you several things:
- kube-proxy replacement. Cilium can drop kube-proxy entirely: service load balancing is done via eBPF maps. Lower latency, better scaling on large clusters.
- Identity-based policies. Traditional NetworkPolicy implementations enforce rules by pod IP. In a dynamic cluster pod IPs change constantly, so every reshuffle means rewriting IP-based rules on each node. Cilium derives a stable numeric identity for each workload from its labels and filters on that, so a “frontend may talk to backend” policy survives any pod reshuffle.
- L7 policies. You can filter not only by port but by HTTP methods and paths, gRPC methods, Kafka topics — without a full sidecar mesh.
- Hubble. A built-in observability layer: who talks to whom, what’s blocked and why — with no proxy in any pod.
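As a sketch of how the identity-based and L7 points combine, the manifest below lets pods labeled app: frontend call only GET requests under /api/ on pods labeled app: backend. The labels follow the “frontend may talk to backend” example above; the policy name, port 8080, the path pattern, and the file name are assumptions chosen for illustration.

```shell
# Save a hypothetical L7 policy; apply it later with:
#   kubectl apply -f l7-policy.yaml
cat > l7-policy.yaml <<'EOF'
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: backend-allow-frontend-get   # hypothetical name
  namespace: default
spec:
  endpointSelector:          # the policy protects these pods...
    matchLabels:
      app: backend
  ingress:
  - fromEndpoints:           # ...and admits callers by label, not by IP
    - matchLabels:
        app: frontend
    toPorts:
    - ports:
      - port: "8080"         # assumed application port
        protocol: TCP
      rules:
        http:                # L7 part: only matching requests pass
        - method: "GET"
          path: "/api/.*"
EOF
```

With a rule like this, a POST from frontend or any request from another workload should be rejected, while pod restarts and IP changes on either side need no policy update.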
Hubble: observability without a proxy
Hubble is Cilium’s eyes. Since all traffic already passes through the eBPF layer, Hubble simply reads those events and surfaces them as flows. You see a service graph, individual connections, and — most useful when debugging — drop reasons.
The typical scenario: a service “can’t reach” another and it’s unclear why. Instead of tcpdump across pods and reading iptables, you run one command and immediately see packets dropped with verdict DROPPED and reason Policy denied. The network didn’t break — your own NetworkPolicy is cutting the traffic. That debugging takes seconds instead of hours.
The important part is observability without instrumenting the application. You don’t embed an SDK, add a proxy, or change the service’s code — eBPF sees the traffic at the kernel level as it actually flows. So Hubble shows your own Go service, a closed-source binary, and a legacy app no one has touched in years equally well. Flow metrics (request counts, drop ratios, per-connection latency) are exported to Prometheus, so you can build ordinary dashboards and alerts on top — without a sidecar under every pod.
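One way to switch on the Prometheus export is a small Helm values file. This is a sketch, not a recommended production set: the metric list is an arbitrary selection, the file name is made up, and the upgrade command in the comment assumes the release is named cilium in kube-system.

```shell
# Hypothetical values file enabling a handful of Hubble flow metrics.
cat > hubble-metrics-values.yaml <<'EOF'
hubble:
  enabled: true
  metrics:
    enabled:      # each entry turns on one metrics handler
    - dns
    - drop
    - tcp
    - flow
    - http
EOF

# Assumed rollout step against an existing install:
#   helm upgrade cilium cilium/cilium --namespace kube-system \
#     --reuse-values -f hubble-metrics-values.yaml
```

After the rollout, Prometheus still has to be pointed at the metrics endpoint the agents expose; how you do that depends on your monitoring setup.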
eBPF is not only about networking
While in the Cilium context eBPF is usually discussed as a network dataplane, the approach is broader. The same in-kernel programs can observe syscalls, process execution, file opens, and network connections at the per-process level. Tetragon — a component of the Cilium ecosystem for security observability and runtime enforcement — is built on this.
In practice this gives you a runtime “black box”: you see an unexpected process start inside a container, an app reading a file it shouldn’t touch, or a connection to a suspicious address — all with no agent inside the container. And you can not only observe but block: a kernel-level rule kills a process that violates policy before it can do harm. For a team that means one technology covers both network observability and basic runtime security — without a zoo of separate agents.
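For a feel of what such a rule looks like, here is a sketch of a Tetragon TracingPolicy modeled on the file-monitoring examples in the Tetragon documentation. Treat the field details as assumptions to check against your Tetragon version; the policy name, the watched path /etc/shadow, and the file name are made up for illustration.

```shell
# Hypothetical policy: kill any process that touches /etc/shadow.
# Apply on a cluster with Tetragon installed:
#   kubectl apply -f block-shadow-access.yaml
cat > block-shadow-access.yaml <<'EOF'
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: block-shadow-access        # hypothetical name
spec:
  kprobes:
  - call: "security_file_permission"   # kernel hook on file access checks
    syscall: false
    args:
    - index: 0
      type: "file"                 # struct file *, used to resolve the path
    - index: 1
      type: "int"                  # access mask (read/write)
    selectors:
    - matchArgs:
      - index: 0
        operator: "Prefix"
        values:
        - "/etc/shadow"            # assumed sensitive path
      matchActions:
      - action: Sigkill            # enforcement: terminate the offending process
EOF
```

Dropping the matchActions section turns the same policy into pure observation, which is a safer first step on a shared cluster.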
Comparison at a glance
| | kube-proxy + sidecars | Cilium (eBPF) |
|---|---|---|
| Service routing | iptables, O(N) | eBPF map, ~O(1) |
| Observability | proxy in every pod | Hubble, per-node layer |
| Policy basis | IP addresses | label identity |
| Filtering level | L3/L4 | L3/L4 and L7 |
| Per-pod tax | +container, CPU/RAM | no sidecar |
| Debugging drops | tcpdump + iptables | hubble observe |
What you need to stand up Cilium
The easiest way to try it is a local kind cluster. You’ll need kind, helm, and the cilium CLI. Create a cluster without the default CNI so you can install your own:
kind create cluster --config - <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  disableDefaultCNI: true   # disable kindnet, install Cilium
  kubeProxyMode: "none"     # Cilium will replace kube-proxy
nodes:
- role: control-plane
- role: worker
EOF
Install Cilium with Hubble enabled:
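helm repo add cilium https://helm.cilium.io
# Without kube-proxy the agent cannot reach the API server through the
# "kubernetes" Service IP, so point it at the control-plane node directly.
# "kind-control-plane" assumes the default kind cluster name.
helm install cilium cilium/cilium --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=kind-control-plane \
  --set k8sServicePort=6443 \
  --set hubble.enabled=true \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true
cilium status --wait   # wait until ready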
Policy: deny egress except one host
Now lock down a pod’s egress, leaving access only to a needed external API. Cilium supports L7 rules and DNS-name filtering:
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: egress-only-api
  namespace: default
spec:
  endpointSelector:
    matchLabels:
      app: worker
  egress:
  - toFQDNs:
    - matchName: "api.example.com"
    toPorts:
    - ports:
      - port: "443"
        protocol: TCP
  - toEndpoints:              # allow DNS, or resolution fails
    - matchLabels:
        "k8s:io.kubernetes.pod.namespace": kube-system
        k8s-app: kube-dns
    toPorts:
    - ports:
      - port: "53"
        protocol: UDP
      rules:
        dns:
        - matchPattern: "*"   # route lookups through Cilium's DNS proxy so toFQDNs can learn the IPs
A pod labeled app: worker can now reach only api.example.com:443 (and DNS). Any other egress is dropped — and it shows up immediately in Hubble.
How to verify it works
First confirm kube-proxy is really replaced:
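# Query the agent inside a Cilium pod (the binary is cilium-dbg on Cilium 1.15+, cilium on older releases)
kubectl -n kube-system exec ds/cilium -- cilium-dbg status | grep -i kubeproxy
# KubeProxyReplacement: True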
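Now watch live flows and look for drops from our policy. If the hubble CLI runs on your workstation, first run cilium hubble port-forward in the background so it can reach Hubble Relay: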
hubble observe --namespace default --verdict DROPPED
# ... worker -> 1.2.3.4:443 DROPPED (Policy denied)
Try reaching any address other than api.example.com from the worker pod and a line with verdict DROPPED and a reason appears — that’s “packet tracing without sidecars.” And hubble observe --verdict FORWARDED shows the allowed traffic. For a visual picture there’s the Hubble UI with a service graph (cilium hubble ui).
Pitfalls worth knowing up front
Cilium is powerful but not “install and forget.” A few places where newcomers trip up.
- Replacing kube-proxy needs a compatible kernel. eBPF features depend on the node’s kernel version. On older distros some features (full kube-proxy replacement, certain L7 capabilities) may be unavailable — check kernel requirements before prod.
- Host networking and host processes. Pods with hostNetwork: true live in the node’s network namespace and are covered by Cilium policies differently. A common source of “the policy exists but traffic still flows.”
- DNS in egress policies. If you lock down egress, don’t forget to explicitly allow DNS to kube-dns, or the app can’t even resolve a name and hits a confusing timeout instead of a clear Policy denied.
- toFQDNs is not magic. Domain-name filtering works by intercepting DNS responses. If an app talks to a raw IP bypassing DNS, a name-based rule won’t catch it.
- Debugging “it died in eBPF.” When something fails at the dataplane level, start with hubble observe and cilium monitor: they show kernel verdicts. You almost never need to dig into the bytecode itself.
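The kernel-version pitfall is easy to check before installing anything. The helper below is a small sketch: kernel_at_least is a made-up function name, and MIN_KERNEL is a placeholder whose real value should come from the system requirements of the Cilium release you plan to run. It relies on sort -V, which GNU coreutils provides.

```shell
# kernel_at_least RUNNING REQUIRED
# Succeeds when RUNNING is the same as or newer than REQUIRED.
kernel_at_least() {
    running=${1%%-*}    # strip distro suffix: 5.15.0-91-generic -> 5.15.0
    required=$2
    # sort -V orders version strings; we pass if REQUIRED sorts first or ties
    lowest=$(printf '%s\n%s\n' "$required" "$running" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}

# Placeholder minimum; replace with the documented requirement.
MIN_KERNEL=${MIN_KERNEL:-5.10}

if kernel_at_least "$(uname -r)" "$MIN_KERNEL"; then
    echo "kernel $(uname -r) is at least $MIN_KERNEL"
else
    echo "kernel $(uname -r) is older than $MIN_KERNEL"
fi
```

Run it on a node rather than on your laptop; in a kind cluster the nodes share the host’s kernel, so the host result applies to them as well.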
Bottom line
eBPF removes the two big taxes of classic Kubernetes networking: linear iptables rules and a proxy container in every pod. Cilium packages this into a ready CNI — with kube-proxy replacement, identity policies, L7 filtering, and Hubble observability, where packet tracing and the drop reason are one command away. The cost of entry is swapping the CNI and paying attention to kernel version and host networking. If your cluster already has more than five to ten services and network debugging regularly eats hours, the switch almost always pays off: you gain eyes on the network that, in the sidecar world, would have cost a noticeable overhead.