Your app is slow. Not CPU slow. Not memory slow. DNS slow.
You’ve deployed to Kubernetes, everything works, but external API calls that should take 50ms are taking 5-15 seconds. The culprit? A tiny setting called ndots:5 that’s been silently multiplying your DNS queries.
The Problem
By default, Kubernetes sets ndots:5 in every pod’s /etc/resolv.conf. This innocent-looking setting has massive performance implications.
Here’s what it looks like inside a pod:
$ cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
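If you want an application to detect this at startup, the setting is easy to read programmatically. A minimal sketch (assumes glibc-style resolv.conf syntax; glibc defaults ndots to 1 when the option is absent):

```python
# Sketch: read the effective ndots from a resolv.conf (glibc-style syntax).
def parse_ndots(resolv_conf_text, default=1):
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "options":
            for opt in fields[1:]:
                if opt.startswith("ndots:"):
                    return int(opt.split(":", 1)[1])
    return default  # glibc's default when ndots is not set

conf = """\
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
"""
print(parse_ndots(conf))  # → 5
```

In a pod you would read the real file with `open("/etc/resolv.conf").read()` instead of the inline string.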
What ndots Actually Does
The ndots setting tells the resolver: “If a hostname has fewer than N dots, try appending the search domains first.”
With ndots:5, when your app tries to resolve api.stripe.com (which has 2 dots), the resolver thinks it might be a relative name. So it tries:
- api.stripe.com.default.svc.cluster.local → NXDOMAIN
- api.stripe.com.svc.cluster.local → NXDOMAIN
- api.stripe.com.cluster.local → NXDOMAIN
- api.stripe.com → SUCCESS ✓
That’s 4 DNS queries instead of 1. Each query might take 1-5ms locally, but factor in:
- UDP packet loss and retries
- CoreDNS under load
- Upstream DNS latency
- TCP fallback for truncated responses
Suddenly you’re looking at 100ms-15s of DNS overhead per external hostname.
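The expansion logic itself is simple to model. Here's a sketch of the search-list behaviour (simplified: a real resolver also issues AAAA queries, retries on timeouts, and honours more options than ndots):

```python
# Simplified model of glibc's search-list expansion.
def candidate_names(hostname, ndots, search_domains):
    # A trailing dot marks the name as fully qualified: no search list.
    if hostname.endswith("."):
        return [hostname.rstrip(".")]
    expanded = [f"{hostname}.{d}" for d in search_domains]
    if hostname.count(".") >= ndots:
        # Enough dots: try the name as-is first, search domains as fallback.
        return [hostname] + expanded
    # Too few dots: search domains first, absolute name last.
    return expanded + [hostname]

search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

# ndots:5 -- api.stripe.com has only 2 dots, so the search list is tried first.
print(candidate_names("api.stripe.com", 5, search))
```

Running it prints the four names in exactly the order the resolver tries them; with `ndots=2` the absolute name moves to the front.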
Seeing It In Action
You can watch this happen with tcpdump:
# In one terminal, start capture
kubectl exec -it debug-pod -- tcpdump -n -i eth0 port 53
# In another, make a request
kubectl exec -it debug-pod -- curl https://api.stripe.com/v1/charges
You’ll see something like:
10:23:01.001 IP 10.1.2.3.45678 > 10.96.0.10.53: A? api.stripe.com.default.svc.cluster.local
10:23:01.003 IP 10.96.0.10.53 > 10.1.2.3.45678: NXDOMAIN
10:23:01.004 IP 10.1.2.3.45678 > 10.96.0.10.53: A? api.stripe.com.svc.cluster.local
10:23:01.006 IP 10.96.0.10.53 > 10.1.2.3.45678: NXDOMAIN
10:23:01.007 IP 10.1.2.3.45678 > 10.96.0.10.53: A? api.stripe.com.cluster.local
10:23:01.009 IP 10.96.0.10.53 > 10.1.2.3.45678: NXDOMAIN
10:23:01.010 IP 10.1.2.3.45678 > 10.96.0.10.53: A? api.stripe.com
10:23:01.015 IP 10.96.0.10.53 > 10.1.2.3.45678: A 54.187.174.169
Four queries for one hostname. Now multiply that by every external service your app calls.
The Fixes
Option 1: Use FQDNs (Quick Fix)
Add a trailing dot to force absolute lookups:
# In your app config
API_ENDPOINT: "api.stripe.com." # Note the trailing dot
The trailing dot tells the resolver “this is a fully qualified domain name – don’t append search domains.”
Pros: works immediately, no cluster changes.
Cons: you have to update every external hostname in your config, and some HTTP clients and TLS certificate checks handle trailing dots badly.
Option 2: Override ndots Per Pod (Recommended)
Set ndots:2 in your pod spec:
apiVersion: v1
kind: Pod
metadata:
name: my-app
spec:
dnsConfig:
options:
- name: ndots
value: "2"
containers:
- name: app
image: my-app:latest
With ndots:2, hostnames with 2+ dots (like api.stripe.com) are resolved directly. Internal service names (my-service.default) still work because they have fewer than 2 dots.
For Deployments:
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-app
spec:
template:
spec:
dnsConfig:
options:
- name: ndots
value: "2"
containers:
- name: app
image: my-app:latest
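The decision is just a dot count. A tiny sketch of which names skip the search list under ndots:2:

```python
# Under ndots:N, a name with >= N dots is tried as an absolute name first;
# anything shorter goes through the search domains. (Simplified sketch.)
def resolved_directly(hostname, ndots):
    return hostname.count(".") >= ndots

print(resolved_directly("api.stripe.com", 2))     # 2 dots → True
print(resolved_directly("my-service.default", 2)) # 1 dot → False
print(resolved_directly("my-service", 2))         # 0 dots → False
```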
Option 3: Change the dnsPolicy
For pods that talk almost exclusively to external services, you can bypass cluster DNS entirely (note that cluster service names will then stop resolving):
spec:
dnsPolicy: "Default" # Use node's DNS, not cluster DNS
Or keep cluster DNS but optimise:
spec:
dnsPolicy: "ClusterFirst"
dnsConfig:
options:
- name: ndots
value: "1"
- name: single-request-reopen
value: ""
The single-request-reopen option works around a known glibc race: the resolver sends the A and AAAA queries in parallel from the same socket, and if one reply is lost (a classic conntrack problem), the lookup stalls until it times out. With this option set, glibc closes the socket and opens a new one before retrying.
Option 4: NodeLocal DNSCache (Cluster-Wide Fix)
For a cluster-wide improvement, deploy NodeLocal DNSCache. Note that the upstream manifest contains __PILLAR__ placeholders (kube-dns service IP, cluster domain) that must be substituted before applying; managed Kubernetes platforms often ship it as a built-in addon:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml
This runs a DNS cache on every node, dramatically reducing:
- CoreDNS load
- Cross-node DNS traffic
- Lookup latency
Queries hit the local cache first, and NXDOMAIN responses for search domain attempts are cached, making subsequent lookups fast.
The Nuclear Option: Reduce Search Domains
You can override the entire DNS config:
spec:
dnsPolicy: "None"
dnsConfig:
nameservers:
- 10.96.0.10 # CoreDNS
searches:
- default.svc.cluster.local
options:
- name: ndots
value: "2"
This removes svc.cluster.local and cluster.local from the search path. Only do this if you understand the implications – some internal lookups might break.
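To see what breaks, consider the common `service.namespace` shorthand for reaching another namespace. A sketch of the resolver's search-list logic (simplified glibc model) shows the shorthand no longer expands to the right name:

```python
# Sketch: with the trimmed search list, "service.namespace" no longer
# expands to service.namespace.svc.cluster.local.
def candidate_names(hostname, ndots, search_domains):
    if hostname.endswith("."):
        return [hostname.rstrip(".")]
    expanded = [f"{hostname}.{d}" for d in search_domains]
    if hostname.count(".") >= ndots:
        return [hostname] + expanded
    return expanded + [hostname]

trimmed = ["default.svc.cluster.local"]
print(candidate_names("my-service.other-namespace", 2, trimmed))
# Neither candidate is my-service.other-namespace.svc.cluster.local,
# so the cross-namespace lookup fails.
```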
Debugging DNS Issues
Check Current Settings
kubectl exec -it <pod> -- cat /etc/resolv.conf
Measure DNS Latency
kubectl exec -it <pod> -- sh -c 'time nslookup api.stripe.com'
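If you'd rather measure from inside the application, the stdlib resolver goes through the same /etc/resolv.conf path, so timing a getaddrinfo call includes any search-domain attempts. A sketch (`localhost` stands in for whatever hostname your app actually resolves):

```python
# Sketch: time a single resolver call from inside the application.
import socket
import time

def time_lookup(hostname):
    start = time.monotonic()
    results = socket.getaddrinfo(hostname, None)
    return time.monotonic() - start, results

elapsed, results = time_lookup("localhost")
print(f"resolved in {elapsed * 1000:.1f} ms, {len(results)} records")
```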
Watch DNS Queries
kubectl exec -it <pod> -- tcpdump -n port 53
Check CoreDNS Logs
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=100
CoreDNS Metrics
If you have Prometheus, check:
- coredns_dns_requests_total – total queries
- coredns_dns_responses_total{rcode="NXDOMAIN"} – failed lookups (the search-domain noise)
- coredns_dns_request_duration_seconds – latency histogram
A high ratio of NXDOMAIN responses to successful responses indicates the ndots problem.
Why ndots:5?
You might wonder why Kubernetes chose 5 as the default.
Internal names that must go through the search path include SRV-style lookups such as:
_https._tcp.my-service.my-namespace.svc
That name has 4 dots. For the resolver to expand it with the search list (reaching _https._tcp.my-service.my-namespace.svc.cluster.local), ndots has to be greater than 4, so Kubernetes picked 5.
The assumption is that most lookups are internal. For many workloads, that’s wrong.
Our Standard Configuration
After dealing with this across multiple clusters, here’s our go-to configuration:
# deployment.yaml
spec:
template:
spec:
dnsConfig:
options:
- name: ndots
value: "2"
- name: single-request-reopen
value: ""
containers:
- name: app
# ...
Combined with NodeLocal DNSCache on every cluster.
This gives us:
- Fast external lookups (direct resolution)
- Working internal lookups (search domains for short names)
- Cached NXDOMAIN responses (fast subsequent lookups)
- Reduced CoreDNS load
Summary
| Setting | External Queries | Internal Works | Effort |
|---|---|---|---|
| Default (ndots:5) | 4 per hostname | ✓ | None |
| Trailing dot | 1 per hostname | ✓ | Config changes |
| ndots:2 | 1 per hostname | ✓ | Pod spec change |
| NodeLocal DNS | 1 (cached) | ✓ | Cluster addon |
The fix is simple. The debugging isn’t. If your app is slow and you’ve ruled out the usual suspects, check your DNS. That ndots:5 might be silently killing your latency budget.
Further reading: The Kubernetes DNS specification and CoreDNS documentation cover more edge cases.