Blog O' Matty


Kubernetes command auto completion

This article was posted by Matty on 2018-02-04 12:46:40 -0500

The kubectl command has a large number of subcommands, each with its own set of options. I recently learned that kubectl has a completion command which can be used alongside the [bash|zsh]-completion packages to autocomplete commands for you! To get this working you will first need to install bash-completion:

$ yum -y install bash-completion

Once this package is installed you can run kubectl with the completion command to generate shell completion code:

$ kubectl completion bash >> $HOME/.bashrc

Once you source your .bashrc you can type kubectl and press tab to autocomplete the rest of your command line! Nifty!
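One caveat: appending the generated script to .bashrc stores a static copy that can go stale when kubectl is upgraded. A small sketch of an alternative (the guard line is my own convention, not from the kubectl docs): add a one-liner that regenerates the completion code at each login. zsh users can do the same with kubectl completion zsh.

```shell
# Append a loader line to .bashrc only once; it regenerates the
# completion code from the installed kubectl at every login.
grep -q 'kubectl completion bash' "$HOME/.bashrc" 2>/dev/null || \
  echo 'command -v kubectl >/dev/null && source <(kubectl completion bash)' >> "$HOME/.bashrc"
```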

How the kubectl port-forward command works

This article was posted by Matty on 2018-02-03 16:49:40 -0500

This afternoon I was digging into the kube-dns service using the kuard demo application from Kubernetes: Up and Running. Kuard has a “Server side DNS query” menu which you can use to resolve a name in a pod and display the results in your browser. To create a kuard instance for testing I ran my trusty old friend kubectl:

$ kubectl run --image=gcr.io/kuar-demo/kuard-amd64:1 kuard

This spun up one pod:

$ kubectl get pods -o wide

NAME                     READY     STATUS    RESTARTS   AGE       IP         NODE
kuard-59f4bf4795-m9bzb   1/1       Running   0          30s       10.1.4.8   kubworker4.prefetch.net

To gain access to kuard I created a port-forward from localhost:8080 to port 8080 in the pod:

$ kubectl port-forward kuard-59f4bf4795-m9bzb 8080:8080
Forwarding from 127.0.0.1:8080 -> 8080

When I tried to access localhost:8080 in Chrome I received the following error:

Handling connection for 8080
E0203 14:12:54.651409   20574 portforward.go:331] an error occurred forwarding 8080 -> 8080: error forwarding port 8080 to pod febaeb6b4747d87036534845214f391db41cda998a592e541d4b6be7ff615ef4, uid : unable to do port forwarding: socat not found.

The error was pretty self-explanatory: the worker didn’t have the socat binary installed. I didn’t recall seeing a prerequisite for the socat utility, so I decided to start digging through the kubelet code to see how port-forward works. After poking around the kubelet source code I came across docker_streaming.go. This file contains a portForward(…) function which implements the port-forward logic. The code is actually pretty straightforward and uses socat and nsenter to accomplish its job. First, the function checks to see if socat and nsenter exist:

        containerPid := container.State.Pid
        socatPath, lookupErr := exec.LookPath("socat")
        if lookupErr != nil {
                return fmt.Errorf("unable to do port forwarding: socat not found.")
        }

        args := []string{"-t", fmt.Sprintf("%d", containerPid), "-n", socatPath, "-", fmt.Sprintf("TCP4:localhost:%d", port)}

        nsenterPath, lookupErr := exec.LookPath("nsenter")
        if lookupErr != nil {
                return fmt.Errorf("unable to do port forwarding: nsenter not found.")
        }

If both checks pass it will exec() nsenter, passing the target process id (the PID of the pause container) to “-t” and using “-n” to enter that process’s network namespace; the remaining arguments are the socat command to run:

        commandString := fmt.Sprintf("%s %s", nsenterPath, strings.Join(args, " "))
        glog.V(4).Infof("executing port forwarding command: %s", commandString)

        command := exec.Command(nsenterPath, args...)
        command.Stdout = stream

This can be verified with Brendan Gregg’s execsnoop utility:

$ execsnoop -n nsenter

PCOMM            PID    PPID   RET ARGS
nsenter          25898  976      0 /usr/bin/nsenter -t 4947 -n /usr/bin/socat - TCP4:localhost:8080

I love reading code and now have a much better understanding of how port-forward works. If you utilize this feature to access pods on your workers, you will need to make sure nsenter and socat are installed on them.
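A quick way to check a worker ahead of time is to look for both binaries with command -v, the same lookup exec.LookPath performs. A minimal sketch (the function name and the package hint in the message are my own; adjust the install command for your distro):

```shell
# Verify the binaries kubelet's port-forward path execs are on this worker.
check_portforward_prereqs() {
    for bin in socat nsenter; do
        if command -v "$bin" >/dev/null 2>&1; then
            echo "$bin: found at $(command -v "$bin")"
        else
            echo "$bin: MISSING (e.g. yum -y install socat util-linux)"
        fi
    done
}
check_portforward_prereqs
```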

Tailing logs from multiple Kubernetes pods with kubetail

This article was posted by Matty on 2018-02-03 16:46:05 -0500

Yesterday I spent the afternoon learning about Kubernetes deployments. These are amazingly cool for managing the lifecycle of a set of containers. Deployments allow a set of containers to be scaled up, scaled down, updated to new versions, and rolled back to known working versions. All of these actions are performed in a staged fashion to ensure that a given service continues to function while the underlying infrastructure changes. To observe this live, I wanted to be able to watch the logs from all of the pods in a deployment.

After a bit of searching I came across Johan Haleby’s kubetail shell script. Kubetail allows you to tail the logs from a set of pods based on a label selector, context or container id. It also provides colored output with timestamps, and you can utilize the “--since” option to control how far back to retrieve logs.

To show how useful this is let’s spin up 5 pods using the kuard image discussed in Kubernetes: Up and Running (great book!):

$ kubectl run --replicas=5 --image=gcr.io/kuar-demo/kuard-amd64:1 kuard

deployment "kuard" created

$ kubectl get pods

NAME                     READY     STATUS    RESTARTS   AGE
kuard-59f4bf4795-2nljr   1/1       Running   0          0s
kuard-59f4bf4795-8w99t   1/1       Running   0          0s
kuard-59f4bf4795-c8bnz   1/1       Running   0          0s
kuard-59f4bf4795-mgmcl   1/1       Running   0          0s
kuard-59f4bf4795-q9w9v   1/1       Running   0          0s

If I run kubetail with the argument “kuard” it will start tailing the logs from each pod:

$ kubetail kuard

Will tail 5 logs...
kuard-59f4bf4795-2nljr
kuard-59f4bf4795-8w99t
kuard-59f4bf4795-c8bnz
kuard-59f4bf4795-mgmcl
kuard-59f4bf4795-q9w9v
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:17 127.0.0.1:45940 GET /ready/api 
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:17 127.0.0.1:45940 GET /ready/api 
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:18 127.0.0.1:45940 GET /ready/api 
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:19 127.0.0.1:45940 GET /ready/api 
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:20 127.0.0.1:45940 GET /ready/api 
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:21 127.0.0.1:45944 GET / 
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:21 Loading template for index.html 
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:21 127.0.0.1:45944 GET /static/css/bootstrap.min.css 
[kuard-59f4bf4795-2nljr] 2018/02/04 14:35:21 127.0.0.1:45946 GET /built/bundle.js 

This is a useful script, but it has one downside: it doesn’t pick up new pods as they are created. For that functionality you will need a tool like kail or stern.
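Under the hood, kubetail is essentially running one kubectl logs -f per pod and multiplexing the output. A rough hand-rolled sketch of the same idea, assuming the run=kuard label that kubectl run applied to the deployment above:

```shell
# Tail every pod matching the label, prefixing each line with its pod name.
# New pods started after this loop runs are not picked up (kubetail's
# limitation as well); kail and stern watch for them dynamically.
for pod in $(kubectl get pods -l run=kuard -o name); do
    kubectl logs --since=10m -f "$pod" | sed "s|^|[$pod] |" &
done
wait    # keep tailing until interrupted
```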

Working around vSphere VAPI request rejected errors

This article was posted by Matty on 2018-02-03 11:03:32 -0500

This morning I bumped into a weird issue while attempting to provision some new VMs with the terraform vsphere provider. After reviewing my plan I tried to apply the changes and was greeted with the following error:

$ terraform apply kubernetes-worker-additions-plan

Error: Error refreshing state: 1 error(s) occurred:

* provider.vsphere: Error connecting to CIS REST endpoint: Login failed: body: {"type":"com.vmware.vapi.std.errors.service_unavailable","value":{"messages":[{"args":[],"default_message":"Request rejected due to high request rate. Try again later.","id":"com.vmware.vapi.endpoint.highRequestRate"}]}}, status: 503 Service Unavailable

I’ve performed this operation hundreds of times in the past, and this is the first time I’ve encountered this error. To see what was going on I SSHed into my vCenter appliance and poked around the VAPI endpoint logs in /var/log/vmware/vapi/endpoint. The logs contained dozens of GC allocation failures:

2018-02-03T15:28:17.547+0000: 16.738: [GC (Allocation Failure) 2018-02-03T15:28:17.555+0000: 16.746: [SoftReference, 0 refs, 0.0000253 secs]2018-02-03T15:28:17.555+0000: 16.746: [WeakReference, 158 refs, 0.0000118 secs]2018-02-03T15:28:17.555+0000: 16.746: [FinalReference, 132 refs, 0.0008342 secs]2018-02-03T15:28:17.556+0000: 16.747: [PhantomReference, 0 refs, 18 refs, 0.0000073 secs]2018-02-03T15:28:17.556+0000: 16.747: [JNI Weak Reference, 0.0000054 secs][PSYoungGen: 42179K->3811K(46080K)] 96734K->60118K(109056K), 0.0094421 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]

As well as a number of Java stack traces. After collecting a support log bundle in case this turned out to be a known issue, I bounced the VAPI service:

$ service-control --stop vmware-vapi-endpoint

Perform stop operation. vmon_profile=None, svc_names=['vmware-vapi-endpoint'], include_coreossvcs=False, include_leafossvcs=False
Successfully stopped service vapi-endpoint

$ service-control --start vmware-vapi-endpoint

Perform start operation. vmon_profile=None, svc_names=['vmware-vapi-endpoint'], include_coreossvcs=False, include_leafossvcs=False
2018-02-03T15:33:42.253Z   Service vapi-endpoint state STOPPED
Successfully started service vapi-endpoint

$ service-control --status vmware-vapi-endpoint

Running:
 vmware-vapi-endpoint

Once the service restarted I was able to re-run my plan and apply my changes. Now back to our regular programming.
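Before re-running terraform, you can confirm the endpoint is accepting logins again by hitting the CIS session API directly. A hedged sketch with curl (the hostname and credentials below are placeholders for your own environment):

```shell
# POST to the CIS session endpoint the terraform provider logs in to.
# 200 means logins work, 401 means the endpoint is up but the credentials
# are wrong, and 503 means it is still rejecting requests.
code=$(curl -k -s -o /dev/null -w '%{http_code}' -X POST \
    -u 'administrator@vsphere.local:changeme' \
    'https://vcenter.example.com/rest/com/vmware/cis/session' || true)
echo "VAPI endpoint returned HTTP ${code:-000}"
```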

Notes from episode 1 of TGIK: A Quick Tour

This article was posted by Matty on 2018-02-02 13:43:54 -0500

Over the past few months I’ve been trying to learn everything there is to know about Kubernetes. Kubernetes is an amazing technology for deploying and scaling containers, though it comes with a cost. It’s an incredibly complex piece of software and there are a ton of bells and whistles to become familiar with. One way I’ve found to come up to speed is Joe Beda’s weekly TGIK live broadcast. This occurs each Friday at 4PM EST and is CHOCK full of fantastic information. In episode one Joe goes into the basics of Kubernetes. You can watch it here:

Here are some of my takeaways from the episode:

Things I need to learn more about: