Cilium Service Mesh: Sidecar-Free with eBPF
Traditional service meshes inject a sidecar proxy into every pod. Cilium takes a different approach: eBPF programs in the kernel handle mTLS, load balancing, and observability with zero sidecars.
This guide covers deploying Cilium service mesh and configuring traffic management, security policies, and observability.
TL;DR
- Cilium mesh = eBPF-powered service mesh, no sidecars
- Per-node Envoy for L7 processing (not per-pod)
- Native mTLS with SPIFFE identities
- Hubble for observability
- Markedly lower resource overhead than per-pod sidecars (no proxy container in every pod)
Why Sidecar-Free?
SIDECAR MESH (Istio/Linkerd)        CILIUM MESH
============================        ===========
Pod 1: App + Envoy (150MB)          Pod 1: App only
Pod 2: App + Envoy (150MB)          Pod 2: App only
Pod 3: App + Envoy (150MB)          Pod 3: App only
                                    Node: Cilium Agent + Envoy
Memory: 450MB in sidecars           Memory: ~200MB per node
Latency: 2 extra proxy hops         Latency: in-kernel datapath
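A quick way to confirm the sidecar-free claim on a running cluster is to list container names per pod; with Cilium mesh there should be only the application container. The namespace and kubectl access are assumptions for this sketch.

```shell
# List container names per pod; with Cilium mesh there is no injected proxy container
kubectl get pods -n production \
  -o jsonpath='{range .items[*]}{.metadata.name}{": "}{range .spec.containers[*]}{.name}{" "}{end}{"\n"}{end}'
```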
Install Cilium with Service Mesh
# Install the Cilium CLI
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
curl -L --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-amd64.tar.gz
sudo tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin

# Install with service mesh features
cilium install \
  --version 1.15.0 \
  --set kubeProxyReplacement=true \
  --set ingressController.enabled=true \
  --set ingressController.loadbalancerMode=shared

# Enable Hubble
cilium hubble enable --ui

# Verify
cilium status
Helm Installation
# cilium-values.yaml
kubeProxyReplacement: true

# Ingress controller
ingressController:
  enabled: true
  loadbalancerMode: shared

# L7 proxy (Envoy per node)
envoy:
  enabled: true

# Transparent encryption
encryption:
  enabled: true
  type: wireguard  # or ipsec

# Hubble observability
hubble:
  enabled: true
  relay:
    enabled: true
  ui:
    enabled: true
  metrics:
    enabled:
      - dns
      - drop
      - tcp
      - flow
      - port-distribution
      - httpV2:exemplars=true;labelsContext=source_ip,source_namespace,destination_ip,destination_namespace

# Gateway API (future of ingress)
gatewayAPI:
  enabled: true
helm repo add cilium https://helm.cilium.io
helm upgrade --install cilium cilium/cilium \
  --namespace kube-system \
  -f cilium-values.yaml
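Once the Helm release is up, a quick sanity check (assumes the cilium CLI is on PATH and pointed at the cluster):

```shell
# Wait for all Cilium components to report ready
cilium status --wait

# Run the built-in end-to-end connectivity test suite
cilium connectivity test
```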
mTLS Encryption
Cilium provides transparent encryption of pod-to-pod traffic via WireGuard or IPsec; mutual authentication (mTLS with SPIFFE identities) layers on top of it:
# Enable WireGuard encryption (recommended) at install/upgrade time
cilium install --set encryption.enabled=true --set encryption.type=wireguard

# Or IPsec (requires the cilium-ipsec-keys secret in kube-system first)
cilium install --set encryption.enabled=true --set encryption.type=ipsec
Verify encryption:
# Check encryption status
cilium encryption status

# Confirm directly on an agent
kubectl -n kube-system exec ds/cilium -- cilium status | grep Encryption
Traffic Management
L7 Traffic Policies
# Retry and timeout policies
apiVersion: cilium.io/v2
kind: CiliumEnvoyConfig
metadata:
name: api-server-config
namespace: production
spec:
services:
- name: api-server
namespace: production
resources:
- "@type": type.googleapis.com/envoy.config.route.v3.RouteConfiguration
name: api-server-route
virtual_hosts:
- name: api-server
domains: ["*"]
routes:
- match:
prefix: "/"
route:
cluster: "production/api-server"
timeout: 30s
retry_policy:
retry_on: "5xx,reset,connect-failure"
num_retries: 3
per_try_timeout: 10s
Canary Deployments
apiVersion: cilium.io/v2
kind: CiliumEnvoyConfig
metadata:
  name: canary-routing
  namespace: production
spec:
  services:
    - name: api-server
      namespace: production
    - name: api-server-canary
      namespace: production
  resources:
    - "@type": type.googleapis.com/envoy.config.route.v3.RouteConfiguration
      name: canary-route
      virtual_hosts:
        - name: api
          domains: ["*"]
          routes:
            - match:
                prefix: "/"
                headers:
                  - name: "x-canary"
                    exact_match: "true"
              route:
                cluster: "production/api-server-canary"
            - match:
                prefix: "/"
              route:
                weighted_clusters:
                  clusters:
                    - name: "production/api-server"
                      weight: 90
                    - name: "production/api-server-canary"
                      weight: 10
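To exercise the canary route, force a request onto the canary with the x-canary header and compare with untagged traffic. The hostname and /healthz path here are placeholders, not part of the config above.

```shell
# Pinned to the canary cluster via the header match
curl -s -H "x-canary: true" http://api.company.com/healthz

# Untagged traffic is split roughly 90/10 by weighted_clusters
for i in $(seq 1 20); do curl -s http://api.company.com/healthz; done
```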
Rate Limiting
apiVersion: cilium.io/v2
kind: CiliumEnvoyConfig
metadata:
  name: rate-limit
  namespace: production
spec:
  services:
    - name: api-server
      namespace: production
  resources:
    - "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
      stat_prefix: http_local_rate_limiter
      token_bucket:
        max_tokens: 100
        tokens_per_fill: 100
        fill_interval: 1s
      filter_enabled:
        runtime_key: local_rate_limit_enabled
        default_value:
          numerator: 100
          denominator: HUNDRED
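A crude smoke test for the limiter: burst past 100 requests within one fill interval and tally status codes; requests over the bucket should come back as 429s (the hostname is a placeholder).

```shell
# Burst 120 requests and count responses by HTTP status code
for i in $(seq 1 120); do
  curl -s -o /dev/null -w "%{http_code}\n" http://api.company.com/
done | sort | uniq -c
```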
Ingress with Cilium
Cilium can replace your ingress controller:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  annotations:
    ingress.cilium.io/loadbalancer-mode: shared
    ingress.cilium.io/tls-passthrough: "false"
spec:
  ingressClassName: cilium
  tls:
    - hosts:
        - api.company.com
      secretName: api-tls
  rules:
    - host: api.company.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-server
                port:
                  number: 8080
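To verify the Ingress end to end, pull the load balancer address from its status and curl through it (assumes the load balancer has already been provisioned):

```shell
# Grab the shared load balancer IP assigned to the Ingress
LB_IP=$(kubectl get ingress api-ingress -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# Hit the TLS endpoint without touching DNS
curl --resolve api.company.com:443:${LB_IP} https://api.company.com/
```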
Gateway API (Recommended)
# Gateway using the "cilium" GatewayClass that Cilium installs
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: production-gateway
  namespace: production
spec:
  gatewayClassName: cilium
  listeners:
    - name: https
      port: 443
      protocol: HTTPS
      hostname: "*.company.com"
      tls:
        mode: Terminate
        certificateRefs:
          - name: wildcard-tls
    - name: http
      port: 80
      protocol: HTTP
      hostname: "*.company.com"
      allowedRoutes:
        kinds:
          - kind: HTTPRoute
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: api-route
  namespace: production
spec:
  parentRefs:
    - name: production-gateway
  hostnames:
    - "api.company.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /v1
      backendRefs:
        - name: api-v1
          port: 8080
          weight: 100
    - matches:
        - path:
            type: PathPrefix
            value: /v2
      backendRefs:
        - name: api-v2
          port: 8080
          weight: 90
        - name: api-v2-canary
          port: 8080
          weight: 10
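Checking that the Gateway received an address and the route attached, using the standard Gateway API status fields:

```shell
# Address assigned to the Gateway
kubectl get gateway production-gateway -n production \
  -o jsonpath='{.status.addresses[0].value}'

# Route acceptance conditions
kubectl get httproute api-route -n production \
  -o jsonpath='{.status.parents[0].conditions}'
```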
Observability with Hubble
Hubble provides deep network observability:
# Enable Hubble UI
cilium hubble enable --ui
kubectl port-forward -n kube-system svc/hubble-ui 12000:80
# CLI observability
hubble observe --namespace production
# Filter by verdict
hubble observe --verdict DROPPED
# Filter by HTTP
hubble observe --protocol http --http-status 500
# Export to JSON
hubble observe --output json > flows.json
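The JSON export is one flow object per line, so plain shell is enough for quick tallies; for example, counting flows by verdict:

```shell
# Count flows by verdict in an exported flows.json
grep -o '"verdict":"[A-Z_]*"' flows.json | sort | uniq -c | sort -rn
```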
Prometheus Metrics
# ServiceMonitor for Cilium
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cilium
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: cilium-agent
  namespaceSelector:
    matchNames:
      - kube-system
  endpoints:
    - port: prometheus
      interval: 15s
Key Metrics
METRIC                                  DESCRIPTION
======                                  ===========
cilium_forward_count_total              Packets forwarded
cilium_drop_count_total                 Packets dropped (with reason)
hubble_flows_processed_total            L7 flows observed
cilium_policy_verdict_total             Policy decisions
hubble_http_request_duration_seconds    HTTP latency (from the httpV2 metrics)
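A couple of illustrative PromQL queries over these counters; the label names (reason, direction) are the usual ones on Cilium's drop/forward metrics, but verify them against your deployment:

```promql
# Packet drop rate broken out by drop reason
sum(rate(cilium_drop_count_total[5m])) by (reason)

# Forwarded packet rate by direction (INGRESS/EGRESS)
sum(rate(cilium_forward_count_total[5m])) by (direction)
```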
Network Policies (L7)
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: api-l7-policy
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api-server
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: GET
                path: "/api/v1/public/.*"
              - method: POST
                path: "/api/v1/public/.*"
                headers:
                  - "Content-Type: application/json"
    - fromEndpoints:
        - matchLabels:
            app: admin
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: ".*"
                path: "/api/.*"
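Probing from an allowed client shows the L7 enforcement: public paths succeed, anything outside the rules is rejected by the proxy. The frontend deployment name here is an assumption based on the labels above.

```shell
# Allowed by the first rule
kubectl exec -n production deploy/frontend -- \
  curl -s -o /dev/null -w "%{http_code}\n" http://api-server:8080/api/v1/public/items

# Not matched by any rule for app=frontend, so rejected at L7
kubectl exec -n production deploy/frontend -- \
  curl -s -o /dev/null -w "%{http_code}\n" http://api-server:8080/api/v1/internal

# Watch the verdicts live
hubble observe --namespace production --protocol http
```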
Migration from Istio
- Install Cilium alongside Istio
- Migrate namespace by namespace
- Remove Istio sidecars
- Remove Istio control plane
# Label the namespace so Istio stops injecting sidecars
kubectl label namespace production istio-injection=disabled --overwrite
# Restart pods to remove sidecars
kubectl rollout restart deployment -n production
# Verify Cilium is handling traffic
hubble observe --namespace production
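After the restarts, it's worth scanning the whole cluster for leftover istio-proxy containers before tearing down the Istio control plane:

```shell
# Any line printed here is a pod still running an Istio sidecar
kubectl get pods -A \
  -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{": "}{range .spec.containers[*]}{.name}{" "}{end}{"\n"}{end}' \
  | grep istio-proxy
```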
Troubleshooting
Pods can’t communicate:
cilium connectivity test
cilium status --verbose
L7 policies not working:
# Check the per-node Envoy DaemonSet is running
kubectl get pods -n kube-system -l k8s-app=cilium-envoy

# Inspect realized policy from a Cilium agent pod
kubectl -n kube-system exec ds/cilium -- cilium policy get
High latency:
# Check for drops
hubble observe --verdict DROPPED
# Check Envoy metrics via its admin interface (port depends on your deployment)
curl -s localhost:9901/stats | grep latency
References
- Cilium Docs: https://docs.cilium.io
- Service Mesh: https://docs.cilium.io/en/stable/network/servicemesh/
- Hubble: https://docs.cilium.io/en/stable/observability/hubble/
- Gateway API: https://docs.cilium.io/en/stable/network/servicemesh/gateway-api/