Blog O' Matty


Kubernetes command auto completion

This article was posted by Matty on 2018-02-04 12:46:40 -0500

The kubectl command has a large number of subcommands, each with its own set of options. I recently learned that kubectl has a completion command which can be used alongside the [bash|zsh]-completion packages to autocomplete commands for you! To get this working you will first need to install bash-completion:

$ yum -y install bash-completion

Once this package is installed you can run kubectl with the completion command to generate shell completion code:

$ kubectl completion bash >> $HOME/.bashrc

Once you source your .bashrc you can type kubectl and press tab to autocomplete the rest of your command line! Nifty!
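One caveat: appending the generated script to .bashrc stores a static copy that can go stale when kubectl is upgraded. A small sketch of an alternative (the guard line is my own convention, not from the kubectl docs): add a one-liner that regenerates the completion code at each login. zsh users can do the same with kubectl completion zsh.

```shell
# Append a loader line to .bashrc only once; it regenerates the
# completion code from the installed kubectl at every login.
grep -q 'kubectl completion bash' "$HOME/.bashrc" 2>/dev/null || \
  echo 'command -v kubectl >/dev/null && source <(kubectl completion bash)' >> "$HOME/.bashrc"
```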

How the kubectl port-forward command works

This article was posted by Matty on 2018-02-03 16:49:40 -0500

This afternoon I was digging into the kube-dns service using the kuard demo application from Kubernetes: Up and Running. Kuard has a “Server side DNS query” menu which you can use to resolve a name in a pod and display the results in your browser. To create a kuard instance for testing I ran my trusty old friend kubectl:

$ kubectl run --image=gcr.io/kuar-demo/kuard-amd64:1 kuard

This spun up one pod:

$ kubectl get pods -o wide

NAME                     READY     STATUS    RESTARTS   AGE       IP         NODE
kuard-59f4bf4795-m9bzb   1/1       Running   0          30s       10.1.4.8   kubworker4.prefetch.net

To gain access to kuard I created a port-forward from localhost:8080 to port 8080 in the pod:

$ kubectl port-forward kuard-59f4bf4795-m9bzb 8080:8080
Forwarding from 127.0.0.1:8080 -> 8080

When I tried to access localhost:8080 in Chrome I received the following error:

Handling connection for 8080
E0203 14:12:54.651409   20574 portforward.go:331] an error occurred forwarding 8080 -> 8080: error forwarding port 8080 to pod febaeb6b4747d87036534845214f391db41cda998a592e541d4b6be7ff615ef4, uid : unable to do port forwarding: socat not found.

The error was pretty self-explanatory: the worker didn’t have the socat binary installed. I didn’t recall seeing a prerequisite for the socat utility, so I decided to start digging through the kubelet code to see how port-forward works. After poking around the kubelet source code I came across docker_streaming.go. This file contains a portForward(…) function which implements the port-forward logic. The code is actually pretty straightforward and uses socat and nsenter to accomplish its job. First, the function checks to see if socat and nsenter exist:

        containerPid := container.State.Pid
        socatPath, lookupErr := exec.LookPath("socat")
        if lookupErr != nil {
                return fmt.Errorf("unable to do port forwarding: socat not found.")
        }

        args := []string{"-t", fmt.Sprintf("%d", containerPid), "-n", socatPath, "-", fmt.Sprintf("TCP4:localhost:%d", port)}

        nsenterPath, lookupErr := exec.LookPath("nsenter")
        if lookupErr != nil {
                return fmt.Errorf("unable to do port forwarding: nsenter not found.")
        }

If both checks pass it will exec() nsenter, passing the target process id (the PID of the pause container) to “-t” and using “-n” to enter that process’s network namespace; the remaining arguments are the socat command to run:

        commandString := fmt.Sprintf("%s %s", nsenterPath, strings.Join(args, " "))
        glog.V(4).Infof("executing port forwarding command: %s", commandString)

        command := exec.Command(nsenterPath, args...)
        command.Stdout = stream

This can be verified with Brendan Gregg’s execsnoop utility:

$ execsnoop -n nsenter

PCOMM            PID    PPID   RET ARGS
nsenter          25898  976      0 /usr/bin/nsenter -t 4947 -n /usr/bin/socat - TCP4:localhost:8080

I love reading code and now have a much better understanding of how port-forward works. If you utilize this feature to access pods on your workers, you will need to make sure nsenter and socat are installed on them.
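A quick way to check a worker ahead of time is to look for both binaries with command -v, the same lookup exec.LookPath performs. A minimal sketch (the function name and the package hint in the message are my own; adjust the install command for your distro):

```shell
# Verify the binaries kubelet's port-forward path execs are on this worker.
check_portforward_prereqs() {
    for bin in socat nsenter; do
        if command -v "$bin" >/dev/null 2>&1; then
            echo "$bin: found at $(command -v "$bin")"
        else
            echo "$bin: MISSING (e.g. yum -y install socat util-linux)"
        fi
    done
}
check_portforward_prereqs
```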

Tailing logs from multiple Kubernetes pods with kubetail

This article was posted by Matty on 2018-02-03 16:46:05 -0500

Yesterday I spent the afternoon learning about Kubernetes deployments. These are amazingly cool for managing the lifecycle of a set of containers. Deployments allow a set of containers to be scaled up, scaled down, updated to new versions, and rolled back to known working versions. All of these actions are performed in a staged fashion to ensure that a given service continues to function while the underlying infrastructure changes. To observe this live, I wanted to be able to watch the logs from all of the pods in a deployment.

After a bit of searching I came across Johan Haleby’s kubetail shell script. Kubetail allows you to tail the logs from a set of pods based on a label selector, context or container id. It also provides colored output with timestamps, and you can utilize the “--since” option to control how far back to retrieve logs.

To show how useful this is let’s spin up 5 pods using the kuard image discussed in Kubernetes: Up and Running (great book!):

$ kubectl run --replicas=5 --image=gcr.io/kuar-demo/kuard-amd64:1 kuard

deployment "kuard" created

$ kubectl get pods

NAME                     READY     STATUS    RESTARTS   AGE
kuard-59f4bf4795-2nljr   1/1       Running   0          0s
kuard-59f4bf4795-8w99t   1/1       Running   0          0s
kuard-59f4bf4795-c8bnz   1/1       Running   0          0s
kuard-59f4bf4795-mgmcl   1/1       Running   0          0s
kuard-59f4bf4795-q9w9v   1/1       Running   0          0s

If I run kubetail with the argument “kuard” it will start tailing the logs from each pod:

$ kubetail kuard

Will tail 5 logs...
kuard-59f4bf4795-2nljr
kuard-59f4bf4795-8w99t
kuard-59f4bf4795-c8bnz
kuard-59f4bf4795-mgmcl
kuard-59f4bf4795-q9w9v
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:17 127.0.0.1:45940 GET /ready/api 
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:17 127.0.0.1:45940 GET /ready/api 
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:18 127.0.0.1:45940 GET /ready/api 
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:19 127.0.0.1:45940 GET /ready/api 
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:20 127.0.0.1:45940 GET /ready/api 
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:21 127.0.0.1:45944 GET / 
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:21 Loading template for index.html 
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:21 127.0.0.1:45944 GET /static/css/bootstrap.min.css 
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:21 127.0.0.1:45946 GET /built/bundle.js 

This is a useful script, but it has one downside: it doesn’t pick up new pods as they are created. For that functionality you will need a tool like kail or stern.
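Under the hood, kubetail is essentially running one kubectl logs -f per pod and multiplexing the output. A rough hand-rolled sketch of the same idea, assuming the run=kuard label that kubectl run applied to the deployment above:

```shell
# Tail every pod matching the label, prefixing each line with its pod name.
# New pods started after this loop runs are not picked up (kubetail's
# limitation as well); kail and stern watch for them dynamically.
for pod in $(kubectl get pods -l run=kuard -o name); do
    kubectl logs --since=10m -f "$pod" | sed "s|^|[$pod] |" &
done
wait    # keep tailing until interrupted
```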

Working around vSphere VAPI request rejected errors

This article was posted by Matty on 2018-02-03 11:03:32 -0500

This morning I bumped into a weird issue while attempting to provision some new VMs with the terraform vsphere provider. After reviewing my plan I tried to apply the changes and was greeted with the following error:

$ terraform apply kubernetes-worker-additions-plan

Error: Error refreshing state: 1 error(s) occurred:

* provider.vsphere: Error connecting to CIS REST endpoint: Login failed: body: {"type":"com.vmware.vapi.std.errors.service_unavailable","value":{"messages":[{"args":[],"default_message":"Request rejected due to high request rate. Try again later.","id":"com.vmware.vapi.endpoint.highRequestRate"}]}}, status: 503 Service Unavailable

I’ve performed this operation hundreds of times in the past, and this is the first time I’ve encountered this error. To see what was going on I SSHed into my vCenter appliance and poked around the VAPI endpoint logs in /var/log/vmware/vapi/endpoint. The logs contained dozens of GC allocation failures:

2018-02-03T15:28:17.547+0000: 16.738: [GC (Allocation Failure) 2018-02-03T15:28:17.555+0000: 16.746: [SoftReference, 0 refs, 0.0000253 secs]2018-02-03T15:28:17.555+0000: 16.746: [WeakReference, 158 refs, 0.0000118 secs]2018-02-03T15:28:17.555+0000: 16.746: [FinalReference, 132 refs, 0.0008342 secs]2018-02-03T15:28:17.556+0000: 16.747: [PhantomReference, 0 refs, 18 refs, 0.0000073 secs]2018-02-03T15:28:17.556+0000: 16.747: [JNI Weak Reference, 0.0000054 secs][PSYoungGen: 42179K->3811K(46080K)] 96734K->60118K(109056K), 0.0094421 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]

As well as a number of Java stack traces. After collecting a support log bundle in case this turned out to be a known issue, I bounced the VAPI service:

$ service-control --stop vmware-vapi-endpoint

Perform stop operation. vmon_profile=None, svc_names=['vmware-vapi-endpoint'], include_coreossvcs=False, include_leafossvcs=False
Successfully stopped service vapi-endpoint

$ service-control --start vmware-vapi-endpoint

Perform start operation. vmon_profile=None, svc_names=['vmware-vapi-endpoint'], include_coreossvcs=False, include_leafossvcs=False
2018-02-03T15:33:42.253Z   Service vapi-endpoint state STOPPED
Successfully started service vapi-endpoint

$ service-control --status vmware-vapi-endpoint

Running:
 vmware-vapi-endpoint

Once the service restarted I was able to re-run my plan and apply my changes. Now back to our regular programming.
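Before re-running terraform, you can confirm the endpoint is accepting logins again by hitting the CIS session API directly. A hedged sketch with curl (the hostname and credentials below are placeholders for your own environment):

```shell
# POST to the CIS session endpoint the terraform provider logs in to.
# 200 means logins work, 401 means the endpoint is up but the credentials
# are wrong, and 503 means it is still rejecting requests.
code=$(curl -k -s -o /dev/null -w '%{http_code}' -X POST \
    -u 'administrator@vsphere.local:changeme' \
    'https://vcenter.example.com/rest/com/vmware/cis/session' || true)
echo "VAPI endpoint returned HTTP ${code:-000}"
```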

Notes from episode 1 of TGIK: A Quick Tour

This article was posted by Matty on 2018-02-02 13:43:54 -0500

Over the past few months I’ve been trying to learn everything there is to know about Kubernetes. Kubernetes is an amazing technology for deploying and scaling containers, though it comes with a cost. It’s an incredibly complex piece of software and there are a ton of bells and whistles to become familiar with. One way I’ve found to come up to speed is Joe Beda’s weekly TGIK live broadcast. This occurs each Friday at 4PM EST and is CHOCK full of fantastic information. In episode one Joe goes into the basics of Kubernetes. You can watch it here:

Here are some of my takeaways from the episode:

Things I need to learn more about: