Using node local caching on your Kubernetes nodes to reduce CoreDNS traffic


Kubernetes 1.18 was recently released, and with it came a slew of super useful features! One feature that hit GA is NodeLocal DNSCache. This allows each node in your cluster to cache DNS queries, reducing load on your primary in-cluster CoreDNS servers. Now that this feature is GA, I wanted to take it for a spin. If you’ve looked at the query logs on an active CoreDNS pod, or dealt with AWS DNS query limits, I’m sure you will appreciate the value this feature brings.

To get this set up, I first downloaded the node local DNS deployment manifest:

$ curl -L -o nodelocaldns.yaml https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml

The manifest contains a service account, service, daemonset and config map. The container that is spun up on each node runs CoreDNS, but in caching mode. The caching feature is enabled with the following configuration block, which is part of the config map that was installed above:

cache {
        success  9984 30
        denial   9984 5
        prefetch 500  5
}

The cache block tells CoreDNS how many queries to cache, as well as how long to keep them (TTL). You can also configure CoreDNS to prefetch frequently queried items prior to them expiring!
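To make the numbers less mysterious, here is the same block again with comments reflecting my reading of the CoreDNS cache plugin syntax (each directive takes a capacity followed by a TTL in seconds; treat the annotations as a sketch and double-check them against your CoreDNS version):

cache {
        # keep up to 9984 successful responses, each for at most 30 seconds
        success  9984 30
        # keep up to 9984 negative (NXDOMAIN/NODATA) responses, each for at most 5 seconds
        denial   9984 5
        # start refreshing popular names (roughly 500+ recent lookups) shortly before they expire
        prefetch 500  5
}

Next, we need to replace three PILLAR variables in the manifest: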

$ export localdns="169.254.20.10"

$ export domain="cluster.local"

$ export kubedns="10.96.0.10"

$ sed -i "s/__PILLAR__LOCAL__DNS__/$localdns/g; s/__PILLAR__DNS__DOMAIN__/$domain/g; s/__PILLAR__DNS__SERVER__/$kubedns/g" nodelocaldns.yaml > nodedns.yml

The localdns variable contains the IP address you want your caching CoreDNS instance to listen on for queries. The documentation uses a link-local address, but you can use anything you want; it just can’t overlap with existing IPs. Domain contains the Kubernetes domain you set “clusterDomain” to. And finally, kubedns is the service IP that sits in front of your primary CoreDNS pods.
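Before going further, a quick sanity check doesn’t hurt (this is just my habit, not part of the official instructions): grep the rendered manifest to confirm the placeholders you targeted were actually replaced with the addresses you exported. Don’t be surprised if a couple of other PILLAR placeholders remain; as far as I can tell, the node-local-dns container fills those in at runtime.

$ grep -E "169.254.20.10|10.96.0.10" nodedns.yml

Once the manifest is applied: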

$ kubectl apply -f nodedns.yml

You will see a new daemonset, and one caching DNS pod per host:

$ kubectl get ds -n kube-system node-local-dns

NAME             DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
node-local-dns   4         4         4       4            4           <none>          3h43m

$ kubectl get po -o wide -n kube-system -l k8s-app=node-local-dns

NAME                   READY   STATUS    RESTARTS   AGE     IP           NODE                 NOMINATED NODE   READINESS GATES
node-local-dns-24knq   1/1     Running   0          3h40m   172.18.0.4   test-worker          <none>           <none>
node-local-dns-fl2zf   1/1     Running   0          3h40m   172.18.0.3   test-worker2         <none>           <none>
node-local-dns-gvqrv   1/1     Running   0          3h40m   172.18.0.5   test-control-plane   <none>           <none>
node-local-dns-v9hlv   1/1     Running   0          3h40m   172.18.0.2   test-worker3         <none>           <none>

One thing I found interesting is how DNS queries get routed to the caching DNS pods. Given a pod with a ClusterFirst policy, the nameserver value in /etc/resolv.conf will get populated with the service IP that sits in front of your in-cluster CoreDNS pods:

$ kubectl get svc kube-dns -o wide -n kube-system

NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE     SELECTOR
kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   5d18h   k8s-app=kube-dns

$ kubectl exec -it nginx-f89759699-2c8zw -- cat /etc/resolv.conf

search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.96.0.10
options ndots:5
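As an aside, that nameserver value is stamped into resolv.conf by the kubelet, which copies it from its clusterDNS setting. On a kubeadm-style cluster the relevant KubeletConfiguration fields look roughly like the snippet below (the file path is distro-specific, so treat it as an example rather than a given):

# /var/lib/kubelet/config.yaml (path varies by how the cluster was built)
clusterDNS:
  - 10.96.0.10
clusterDomain: cluster.local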

But under the covers, node-local-dns makes sure DNS requests destined for your CoreDNS cluster service IP are answered by the caching instance on the node: it binds both that service IP and the IP assigned to the localdns variable on a local dummy interface, and installs iptables rules so the DNS traffic isn’t dropped or NATed away by kube-proxy. Some of those rules are visible with the iptables command:

$ iptables -L OUTPUT

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     udp  --  10.96.0.10           anywhere             udp spt:53
ACCEPT     tcp  --  10.96.0.10           anywhere             tcp spt:53
ACCEPT     udp  --  169.254.20.10        anywhere             udp spt:53
ACCEPT     tcp  --  169.254.20.10        anywhere             tcp spt:53
KUBE-SERVICES  all  --  anywhere         anywhere             ctstate NEW /* kubernetes service portals */
KUBE-FIREWALL  all  --  anywhere         anywhere
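Those ACCEPT rules make sure the DNS traffic isn’t dropped, but the piece that actually keeps queries for the kube-dns service IP from being NATed off to a random upstream CoreDNS pod is a set of NOTRACK rules in the raw table. I haven’t pasted my exact output, but the rules node-local-dns manages amount to something like this sketch (UDP shown; TCP gets the same treatment):

# illustrative only -- node-local-dns adds and maintains these itself
iptables -t raw -A PREROUTING -d 10.96.0.10/32    -p udp --dport 53 -j NOTRACK
iptables -t raw -A PREROUTING -d 169.254.20.10/32 -p udp --dport 53 -j NOTRACK
iptables -t raw -A OUTPUT     -d 10.96.0.10/32    -p udp --dport 53 -j NOTRACK
iptables -t raw -A OUTPUT     -d 169.254.20.10/32 -p udp --dport 53 -j NOTRACK

With connection tracking skipped, kube-proxy’s NAT rules never rewrite the destination, so the packet is delivered to the dummy interface on the node and answered by the local caching instance.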

Pretty neat! Now to test this out. If we exec into a pod:

$ kubectl exec -it nginx-f89759699-59ss2 -- sh

And query the local caching instance:

$ dig +short @169.254.20.10 prefetch.net

67.205.141.207

$ dig +short @169.254.20.10 prefetch.net

67.205.141.207

We get the same results. But if you check the logs, the first request hits the local caching server and is forwarded on to your primary CoreDNS servers. When the second query comes in, the cached entry is returned straight to the requester, reducing load on your primary CoreDNS servers. And since your pods are configured to point at the kube-dns service IP, iptables (and that dummy interface) ensure the query hits the local DNS cache first. Pretty sweet! And this all happens through the magic of CoreDNS, iptables and some awesome developers! This feature rocks!
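If you’d rather not trust the logs, the caching instance also exposes CoreDNS’s Prometheus metrics (the stock config map enables the prometheus plugin, on port 9253 if I’m reading it correctly), and the cache plugin publishes hit and miss counters. Something along these lines, run from the node or anywhere that can reach the listen address, should show the hit counter climbing as queries are answered from cache (verify the port and address against your own config map):

$ curl -s http://169.254.20.10:9253/metrics | grep coredns_cache_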

This article was posted on 2020-05-15 01:00:00 -0500.