Cilium Service Mesh: Sidecar-Free with eBPF
Traditional service meshes inject a sidecar proxy into every pod. Cilium takes a different approach: eBPF programs in the kernel handle mTLS, load balancing, and observability with zero sidecars.
This guide covers deploying Cilium service mesh and configuring traffic management, security policies, and observability.
TL;DR
- Cilium mesh = eBPF-powered service mesh, no sidecars
- Per-node Envoy for L7 processing (not per-pod)
- Native mTLS with SPIFFE identities
- Hubble for observability
- Markedly lower resource overhead than per-pod sidecars (no proxy container in every pod)
Why Sidecar-Free?
SIDECAR MESH (Istio/Linkerd)        CILIUM MESH
============================        ===========
Pod 1: App + Envoy (150MB)          Pod 1: App only
Pod 2: App + Envoy (150MB)          Pod 2: App only
Pod 3: App + Envoy (150MB)          Pod 3: App only
                                    Node: Cilium Agent + Envoy
Memory: 450MB in sidecars           Memory: ~200MB per node
Latency: 2 extra proxy hops         Latency: in-kernel datapath
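A quick way to confirm the sidecar-free claim on a running cluster is to list container names per pod; with Cilium mesh there should be only the application container. The namespace and kubectl access are assumptions for this sketch.

```shell
# List container names per pod; with Cilium mesh there is no injected proxy container
kubectl get pods -n production \
  -o jsonpath='{range .items[*]}{.metadata.name}{": "}{range .spec.containers[*]}{.name}{" "}{end}{"\n"}{end}'
```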
Install Cilium with Service Mesh
# Install the Cilium CLI
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
curl -L --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-amd64.tar.gz
sudo tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin

# Install with service mesh features
cilium install \
  --version 1.15.0 \
  --set kubeProxyReplacement=true \
  --set ingressController.enabled=true \
  --set ingressController.loadbalancerMode=shared

# Enable Hubble
cilium hubble enable --ui

# Verify
cilium status
Helm Installation
# cilium-values.yaml
kubeProxyReplacement: true

# Ingress controller
ingressController:
  enabled: true
  loadbalancerMode: shared

# L7 proxy (Envoy per node)
envoy:
  enabled: true

# Transparent encryption
encryption:
  enabled: true
  type: wireguard  # or ipsec

# Hubble observability
hubble:
  enabled: true
  relay:
    enabled: true
  ui:
    enabled: true
  metrics:
    enabled:
      - dns
      - drop
      - tcp
      - flow
      - port-distribution
      - httpV2:exemplars=true;labelsContext=source_ip,source_namespace,destination_ip,destination_namespace

# Gateway API (future of ingress)
gatewayAPI:
  enabled: true
helm repo add cilium https://helm.cilium.io
helm upgrade --install cilium cilium/cilium \
  --namespace kube-system \
  -f cilium-values.yaml
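Once the Helm release is up, a quick sanity check (assumes the cilium CLI is on PATH and pointed at the cluster):

```shell
# Wait for all Cilium components to report ready
cilium status --wait

# Run the built-in end-to-end connectivity test suite
cilium connectivity test
```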
mTLS Encryption
Cilium provides transparent encryption of pod-to-pod traffic via WireGuard or IPsec; mutual authentication (mTLS with SPIFFE identities) layers on top of it:
# Enable WireGuard encryption (recommended) at install/upgrade time
cilium install --set encryption.enabled=true --set encryption.type=wireguard

# Or IPsec (requires the cilium-ipsec-keys secret in kube-system first)
cilium install --set encryption.enabled=true --set encryption.type=ipsec
Verify encryption:
# Check encryption status
cilium encryption status

# Confirm directly on an agent
kubectl -n kube-system exec ds/cilium -- cilium status | grep Encryption
Traffic Management
L7 Traffic Policies
# Retry and timeout policies
apiVersion: cilium.io/v2
kind: CiliumEnvoyConfig
metadata:
name: api-server-config
namespace: production
spec:
services:
- name: api-server
namespace: production
resources:
- "@type": type.googleapis.com/envoy.config.route.v3.RouteConfiguration
name: api-server-route
virtual_hosts:
- name: api-server
domains: ["*"]
routes:
- match:
prefix: "/"
route:
cluster: "production/api-server"
timeout: 30s
retry_policy:
retry_on: "5xx,reset,connect-failure"
num_retries: 3
per_try_timeout: 10s
Canary Deployments
apiVersion: cilium.io/v2
kind: CiliumEnvoyConfig
metadata:
  name: canary-routing
  namespace: production
spec:
  services:
    - name: api-server
      namespace: production
    - name: api-server-canary
      namespace: production
  resources:
    - "@type": type.googleapis.com/envoy.config.route.v3.RouteConfiguration
      name: canary-route
      virtual_hosts:
        - name: api
          domains: ["*"]
          routes:
            - match:
                prefix: "/"
                headers:
                  - name: "x-canary"
                    exact_match: "true"
              route:
                cluster: "production/api-server-canary"
            - match:
                prefix: "/"
              route:
                weighted_clusters:
                  clusters:
                    - name: "production/api-server"
                      weight: 90
                    - name: "production/api-server-canary"
                      weight: 10
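To exercise the canary route, force a request onto the canary with the x-canary header and compare with untagged traffic. The hostname and /healthz path here are placeholders, not part of the config above.

```shell
# Pinned to the canary cluster via the header match
curl -s -H "x-canary: true" http://api.company.com/healthz

# Untagged traffic is split roughly 90/10 by weighted_clusters
for i in $(seq 1 20); do curl -s http://api.company.com/healthz; done
```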
Rate Limiting
apiVersion: cilium.io/v2
kind: CiliumEnvoyConfig
metadata:
  name: rate-limit
  namespace: production
spec:
  services:
    - name: api-server
      namespace: production
  resources:
    - "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
      stat_prefix: http_local_rate_limiter
      token_bucket:
        max_tokens: 100
        tokens_per_fill: 100
        fill_interval: 1s
      filter_enabled:
        runtime_key: local_rate_limit_enabled
        default_value:
          numerator: 100
          denominator: HUNDRED
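A crude smoke test for the limiter: burst past 100 requests within one fill interval and tally status codes; requests over the bucket should come back as 429s (the hostname is a placeholder).

```shell
# Burst 120 requests and count responses by HTTP status code
for i in $(seq 1 120); do
  curl -s -o /dev/null -w "%{http_code}\n" http://api.company.com/
done | sort | uniq -c
```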
Ingress with Cilium
Cilium can replace your ingress controller:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  annotations:
    ingress.cilium.io/loadbalancer-mode: shared
    ingress.cilium.io/tls-passthrough: "false"
spec:
  ingressClassName: cilium
  tls:
    - hosts:
        - api.company.com
      secretName: api-tls
  rules:
    - host: api.company.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-server
                port:
                  number: 8080
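To verify the Ingress end to end, pull the load balancer address from its status and curl through it (assumes the load balancer has already been provisioned):

```shell
# Grab the shared load balancer IP assigned to the Ingress
LB_IP=$(kubectl get ingress api-ingress -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# Hit the TLS endpoint without touching DNS
curl --resolve api.company.com:443:${LB_IP} https://api.company.com/
```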
Gateway API (Recommended)
# Gateway using the "cilium" GatewayClass that Cilium installs
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: production-gateway
  namespace: production
spec:
  gatewayClassName: cilium
  listeners:
    - name: https
      port: 443
      protocol: HTTPS
      hostname: "*.company.com"
      tls:
        mode: Terminate
        certificateRefs:
          - name: wildcard-tls
    - name: http
      port: 80
      protocol: HTTP
      hostname: "*.company.com"
      allowedRoutes:
        kinds:
          - kind: HTTPRoute
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: api-route
  namespace: production
spec:
  parentRefs:
    - name: production-gateway
  hostnames:
    - "api.company.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /v1
      backendRefs:
        - name: api-v1
          port: 8080
          weight: 100
    - matches:
        - path:
            type: PathPrefix
            value: /v2
      backendRefs:
        - name: api-v2
          port: 8080
          weight: 90
        - name: api-v2-canary
          port: 8080
          weight: 10
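Checking that the Gateway received an address and the route attached, using the standard Gateway API status fields:

```shell
# Address assigned to the Gateway
kubectl get gateway production-gateway -n production \
  -o jsonpath='{.status.addresses[0].value}'

# Route acceptance conditions
kubectl get httproute api-route -n production \
  -o jsonpath='{.status.parents[0].conditions}'
```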
Observability with Hubble
Hubble provides deep network observability:
# Enable Hubble UI
cilium hubble enable --ui
kubectl port-forward -n kube-system svc/hubble-ui 12000:80
# CLI observability
hubble observe --namespace production
# Filter by verdict
hubble observe --verdict DROPPED
# Filter by HTTP
hubble observe --protocol http --http-status 500
# Export to JSON
hubble observe --output json > flows.json
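The JSON export is one flow object per line, so plain shell is enough for quick tallies; for example, counting flows by verdict:

```shell
# Count flows by verdict in an exported flows.json
grep -o '"verdict":"[A-Z_]*"' flows.json | sort | uniq -c | sort -rn
```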
Prometheus Metrics
# ServiceMonitor for Cilium
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cilium
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: cilium-agent
  namespaceSelector:
    matchNames:
      - kube-system
  endpoints:
    - port: prometheus
      interval: 15s
Key Metrics
METRIC                                  DESCRIPTION
======                                  ===========
cilium_forward_count_total              Packets forwarded
cilium_drop_count_total                 Packets dropped (with reason)
hubble_flows_processed_total            L7 flows observed
cilium_policy_verdict_total             Policy decisions
hubble_http_request_duration_seconds    HTTP latency (from the httpV2 metrics)
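A couple of illustrative PromQL queries over these counters; the label names (reason, direction) are the usual ones on Cilium's drop/forward metrics, but verify them against your deployment:

```promql
# Packet drop rate broken out by drop reason
sum(rate(cilium_drop_count_total[5m])) by (reason)

# Forwarded packet rate by direction (INGRESS/EGRESS)
sum(rate(cilium_forward_count_total[5m])) by (direction)
```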
Network Policies (L7)
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: api-l7-policy
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api-server
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: GET
                path: "/api/v1/public/.*"
              - method: POST
                path: "/api/v1/public/.*"
                headers:
                  - "Content-Type: application/json"
    - fromEndpoints:
        - matchLabels:
            app: admin
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: ".*"
                path: "/api/.*"
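Probing from an allowed client shows the L7 enforcement: public paths succeed, anything outside the rules is rejected by the proxy. The frontend deployment name here is an assumption based on the labels above.

```shell
# Allowed by the first rule
kubectl exec -n production deploy/frontend -- \
  curl -s -o /dev/null -w "%{http_code}\n" http://api-server:8080/api/v1/public/items

# Not matched by any rule for app=frontend, so rejected at L7
kubectl exec -n production deploy/frontend -- \
  curl -s -o /dev/null -w "%{http_code}\n" http://api-server:8080/api/v1/internal

# Watch the verdicts live
hubble observe --namespace production --protocol http
```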
Migration from Istio
- Install Cilium alongside Istio
- Migrate namespace by namespace
- Remove Istio sidecars
- Remove Istio control plane
# Label the namespace so Istio stops injecting sidecars
kubectl label namespace production istio-injection=disabled --overwrite
# Restart pods to remove sidecars
kubectl rollout restart deployment -n production
# Verify Cilium is handling traffic
hubble observe --namespace production
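After the restarts, it's worth scanning the whole cluster for leftover istio-proxy containers before tearing down the Istio control plane:

```shell
# Any line printed here is a pod still running an Istio sidecar
kubectl get pods -A \
  -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{": "}{range .spec.containers[*]}{.name}{" "}{end}{"\n"}{end}' \
  | grep istio-proxy
```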
Troubleshooting
Pods can’t communicate:
cilium connectivity test
cilium status --verbose
L7 policies not working:
# Check the per-node Envoy DaemonSet is running
kubectl get pods -n kube-system -l k8s-app=cilium-envoy

# Inspect realized policy from a Cilium agent pod
kubectl -n kube-system exec ds/cilium -- cilium policy get
High latency:
# Check for drops
hubble observe --verdict DROPPED
# Check Envoy metrics via its admin interface (port depends on your deployment)
curl -s localhost:9901/stats | grep latency
References
- Cilium Docs: https://docs.cilium.io
- Service Mesh: https://docs.cilium.io/en/stable/network/servicemesh/
- Hubble: https://docs.cilium.io/en/stable/observability/hubble/
- Gateway API: https://docs.cilium.io/en/stable/network/servicemesh/gateway-api/