Kubernetes 1.18 was recently released, and with it came a slew of super useful features! One feature that hit GA is node local caching. This allows each node in your cluster to cache DNS queries, reducing load on your primary in-cluster CoreDNS servers. Now that this feature is GA, I wanted to take it for a spin. If you’ve looked at the query logs on an active CoreDNS pod, or dealt with AWS DNS query limits, I’m sure you will appreciate the value this feature brings.
To get this set up, I first downloaded the node local DNS deployment manifest:
$ curl -o nodelocaldns.yaml -L https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml
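If you want a quick look at what the manifest defines before going any further, grepping for the object kinds should do the trick:
$ grep "^kind:" nodelocaldns.yaml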
The manifest contains a service account, a service, a daemonset, and a config map. The container that is spun up on each node runs CoreDNS, but in caching mode. The caching feature is enabled with the following configuration block, which is part of the config map in the manifest we just downloaded:
cache {
        success 9984 30
        denial 9984 5
        prefetch 500 5
}
The cache block tells CoreDNS how many responses to cache and how long to keep them: with the settings above it will hold up to 9984 successful responses for a maximum of 30 seconds, and up to 9984 negative (denial) responses for a maximum of 5 seconds. You can also configure CoreDNS to prefetch frequently queried names prior to them expiring! Next, we need to replace three PILLAR variables in the manifest:
$ export localdns="169.254.20.10"
$ export domain="cluster.local"
$ export kubedns="10.96.0.10"
$ sed "s/__PILLAR__LOCAL__DNS__/$localdns/g; s/__PILLAR__DNS__DOMAIN__/$domain/g; s/__PILLAR__DNS__SERVER__/$kubedns/g" nodelocaldns.yaml > nodedns.yml
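As a quick sanity check, you can make sure the placeholders actually got swapped out before applying anything (this just greps for the localdns value exported above):
$ grep "$localdns" nodedns.yml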
The localdns variable contains the IP address you want your caching CoreDNS instance to listen on for queries. The documentation uses a link-local address, but you can use anything you want as long as it doesn't overlap with existing IPs. The domain variable contains the Kubernetes cluster domain you set "clusterDomain" to. And finally, kubedns is the service IP that sits in front of your primary CoreDNS pods. Once the manifest is applied:
$ kubectl apply -f nodedns.yml
You will see a new daemonset, and one caching DNS pod per host:
$ kubectl get ds -n kube-system node-local-dns
NAME             DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
node-local-dns   4         4         4       4            4           <none>          3h43m
$ kubectl get po -o wide -n kube-system -l k8s-app=node-local-dns
NAME                   READY   STATUS    RESTARTS   AGE     IP           NODE                 NOMINATED NODE   READINESS GATES
node-local-dns-24knq   1/1     Running   0          3h40m   172.18.0.4   test-worker          <none>           <none>
node-local-dns-fl2zf   1/1     Running   0          3h40m   172.18.0.3   test-worker2         <none>           <none>
node-local-dns-gvqrv   1/1     Running   0          3h40m   172.18.0.5   test-control-plane   <none>           <none>
node-local-dns-v9hlv   1/1     Running   0          3h40m   172.18.0.2   test-worker3         <none>           <none>
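The cache block shown earlier is only a fragment of the Corefile the caching pods run with. If you want to see the whole thing, you can dump the config map, which in the upstream manifest should be named node-local-dns in the kube-system namespace:
$ kubectl get configmap node-local-dns -n kube-system -o yaml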
One thing I found interesting is how DNS queries get routed to the caching DNS pods. Given a pod with a ClusterFirst policy, the nameserver value in /etc/resolv.conf will get populated with the service IP that sits in front of your in-cluster CoreDNS pods:
$ kubectl get svc kube-dns -o wide -n kube-system
NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE     SELECTOR
kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   5d18h   k8s-app=kube-dns
$ kubectl exec -it nginx-f89759699-2c8zw -- cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.96.0.10
options ndots:5
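You can also verify the pod is using the ClusterFirst DNS policy (the default), since nothing in the pod spec points at the node-local cache directly; here is the pod from my test deployment:
$ kubectl get po nginx-f89759699-2c8zw -o jsonpath='{.spec.dnsPolicy}'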
But under the covers, node-local-dns sets up iptables rules so DNS requests destined for your CoreDNS cluster service IP are intercepted and answered by the local cache listening on the localdns address. We can view part of that plumbing in the OUTPUT chain with the iptables command (run on one of the nodes):
$ iptables -L OUTPUT
Chain OUTPUT (policy ACCEPT)
target         prot opt source         destination
ACCEPT         udp  --  10.96.0.10     anywhere      udp spt:53
ACCEPT         tcp  --  10.96.0.10     anywhere      tcp spt:53
ACCEPT         udp  --  169.254.20.10  anywhere      udp spt:53
ACCEPT         tcp  --  169.254.20.10  anywhere      tcp spt:53
KUBE-SERVICES  all  --  anywhere       anywhere      ctstate NEW /* kubernetes service portals */
KUBE-FIREWALL  all  --  anywhere       anywhere
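If you poke around the node a bit more, you should also find NOTRACK rules in the raw table and a dummy interface holding both listen addresses, which is what lets the cache answer for the kube-dns service IP locally. The exact rules and interface name may vary with the version you deploy, but something along these lines should show them:
$ iptables -t raw -L PREROUTING -n | grep -E "10.96.0.10|169.254.20.10"
$ ip addr show nodelocaldns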
Pretty neat! Now to test this out. If we exec into a pod:
$ kubectl exec -it nginx-f89759699-59ss2 -- sh
And query the local caching instance:
$ dig +short @169.254.20.10 prefetch.net
67.205.141.207
$ dig +short @169.254.20.10 prefetch.net
67.205.141.207
We get the same results. But if you check the logs, the first request hits the local caching server and is then forwarded on to your primary CoreDNS service. When the second query comes in, the cached entry is returned to the requester, reducing load on your primary CoreDNS servers. And since your pods are configured to point at the upstream CoreDNS service IP, iptables ensures those queries hit the local DNS cache. Pretty sweet! And this all happens through the magic of CoreDNS, iptables, and some awesome developers! This feature rocks!
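If you want to watch the caching behavior for yourself, you can tail the caching pods (per-query logging requires the log plugin to be enabled in the Corefile) and scrape the Prometheus metrics the cache exposes. The :9253 metrics port below is my assumption based on the upstream manifest, so adjust it if your Corefile uses something else, and run the curl from one of the nodes:
$ kubectl logs -n kube-system -l k8s-app=node-local-dns
$ curl -s http://169.254.20.10:9253/metrics | grep coredns_cache_hits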