Timing out and killing Linux commands if they don't complete in a certain number of seconds

This article was posted by on 2019-08-06 18:08:57 -0500 -0500

This past weekend I was working on a software project and needed to kill a job if it didn’t complete in a certain amount of time. The coreutils package ships with the timeout utility which is ideal for doing this. To use timeout you pass it a duration to wait and the command to run:

$ timeout 5 /bin/sleep 60 || echo "Failure"

Failure

In the example above timeout will kill /bin/sleep if it doesn’t complete in 5 seconds. Super handy utility!

atexit() stage right. Or how my Python program leaked file descriptors.

This article was posted by Matty on 2018-03-21 06:00:00 -0500 -0500

A year and a half ago I started using prometheus and grafana to graph metric data. This combination is incredibly powerful and I have been absolutely blown away by the amount of actionable intelligence I’ve been able to convey in our dashboards. Prometheus has a number of exporters which can be used to retrieve metric data from remote endpoints and stash it in its highly optimized time series database. There are exporters for MongoDB, Postgres, vSphere, Nginx, HAProxy, JMX as well as numerous other applications. If it’s a popular application there is most likely an exporter for it. If an exporter isn’t available the Prometheus developers have made it crazy easy to develop new ones with Python and Go.

I recently started playing with pyvmomi which is a Python SDK for the VMware vSphere API. The SDK allows you to view and manipulate all aspects of vSphere programmatically. This allows you to automate things like collecting and graphing host and VM performance data, adding disks, putting hosts into maintenance mode or generating common reports. It’s powerful stuff! Being super curious I decided to write a Prometheus VM performance metrics exporter so I could overlay business metrics on top of HTTP responses which could then be laid on top of VM and host performance metrics. After a couple hours of hacking my Flask-based exporter was spitting out VM performance metrics. Or at least that’s what I thought. Periodically the exporter would die and the following stack trace would be written to the console:

1.2.3.4 - - [18/Mar/2018 01:10:48] "GET /metrics HTTP/1.1" 200 -
Traceback (most recent call last):
  File "/usr/lib64/python2.7/SocketServer.py", line 295, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/lib64/python2.7/SocketServer.py", line 321, in process_request
    self.finish_request(request, client_address)
  File "/usr/lib64/python2.7/SocketServer.py", line 334, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib64/python2.7/SocketServer.py", line 651, in __init__
    self.finish()
  File "/usr/lib64/python2.7/SocketServer.py", line 710, in finish
    self.wfile.close()
  File "/usr/lib64/python2.7/socket.py", line 279, in close
    self.flush()
  File "/usr/lib64/python2.7/socket.py", line 303, in flush
    self._sock.sendall(view[write_offset:write_offset+buffer_size])

Leading up to the exception everything looked good. There were 1020 GET /metrics HTTP requests with status code 200s in the access logs and my metrics were flowing into my Prometheus development environment. But right around 1020 requests my exporter fell on its face. Splat! Based on previous dealings with file descriptor exhaustion I started to wonder if my code was leaking file descriptors. To verify my hypothesis I fired up the program and changed into the processes fd (file descriptor) directory:

$ cd /proc/5799/fd

To get a baseline I ran ls:

$ ls -la

lrwx------. 1 matty matty 64 Mar 20 09:33 0 -> /dev/pts/2
lrwx------. 1 matty matty 64 Mar 20 09:33 1 -> /dev/pts/2
lr-x------. 1 matty matty 64 Mar 20 09:33 10 -> /dev/urandom
lrwx------. 1 matty matty 64 Mar 20 09:33 2 -> /dev/pts/2
lrwx------. 1 matty matty 64 Mar 20 09:33 3 -> socket:[99227]
lrwx------. 1 matty matty 64 Mar 20 09:33 4 -> socket:[99228]

Next I ran a simple shell loop to request the /metrics URI 10 times:

for i in $(seq 1 10); do
   echo "GET /metrics HTTP/1.0" | nc localhost 8000
done

Then I ran ls again to compare the current file descriptor count to the one I generated previously:

$ ls -l

lrwx------. 1 matty matty 64 Mar 20 09:33 0 -> /dev/pts/2
lrwx------. 1 matty matty 64 Mar 20 09:33 1 -> /dev/pts/2
lr-x------. 1 matty matty 64 Mar 20 09:33 10 -> /dev/urandom
lrwx------. 1 matty matty 64 Mar 20 09:35 11 -> socket:[102747]
lrwx------. 1 matty matty 64 Mar 20 09:35 12 -> socket:[102765]
lrwx------. 1 matty matty 64 Mar 20 09:35 13 -> socket:[102773]
lrwx------. 1 matty matty 64 Mar 20 09:36 14 -> socket:[102778]
lrwx------. 1 matty matty 64 Mar 20 09:36 15 -> socket:[102785]
lrwx------. 1 matty matty 64 Mar 20 09:36 16 -> socket:[102792]
lrwx------. 1 matty matty 64 Mar 20 09:33 2 -> /dev/pts/2
lrwx------. 1 matty matty 64 Mar 20 09:33 3 -> socket:[99227]
lrwx------. 1 matty matty 64 Mar 20 09:33 4 -> socket:[99228]
lrwx------. 1 matty matty 64 Mar 20 09:35 6 -> socket:[103556]
lrwx------. 1 matty matty 64 Mar 20 09:35 7 -> socket:[102671]
lrwx------. 1 matty matty 64 Mar 20 09:35 8 -> socket:[103572]
lrwx------. 1 matty matty 64 Mar 20 09:35 9 -> socket:[102722]

Sure enough, my code was leaking file descriptors. But where exactly was this occurring? The code created a connection to vCenter when the /metrics end point was scraped and closed it when it was complete (the final code uses persistent connections to avoid the set up and tear down costs). I also verified this by running strace against the server process:

$ strace -e trace=open,socket,connect,close stats.py

socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 6
connect(6, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("1.2.3.4")}, 16) = 0
*** where is close(6)?!?! ***

The socket was set up, the connect completed but a close() wasn’t issued to the file desriptor returned from socket(). When I ran the code under pdb I saw it call the Disconnect() method to close the connection. This method is defined in connect.py:

def Disconnect(si):
   """
   Disconnect (logout) service instance
   @param si: Service instance (returned from Connect)
   """
   # Logout
   __Logout(si)
   SetSi(None)

def SetSi(si):
   """ Set the saved service instance. """

   global _si
   _si = si

So the serviceInstance object should get set to None causing the file descriptor to be closed once the last reference to the object is gone. But for some reason this wasn’t happening. Reviewing the code paths my program was taking revealed this gem:

atexit.register(Disconnect, serviceInstance)

This set off a light in my head. The atexit() method takes an object and a function to call on the object when the program exits. We can see this first hand in atexit.py:

def register(func, *targs, **kargs):
    """register a function to be executed upon normal program termination

    func - function to be called at exit
    targs - optional arguments to pass to func
    kargs - optional keyword arguments to pass to func

    func is returned to facilitate usage as a decorator.
    """
    _exithandlers.append((func, targs, kargs))
    return func

I was calling atexit() each time I created a connection to vCenter which caused the current serviceinstance object to get added to the _exithandlers list. When I called Disconnect it set the object to None but it wasn’t deleted (and the file descriptor wasn’t close()‘ed) since there was a still a reference to it in the _exithandlers list). To verify this I fired up pdb, set a breakpoint on atexit.register() and then dumped _exithandlers after running a couple of tests:

$ python -m pdb ./stats.py

(Pdb) break atexit.register
Breakpoint 2 at /usr/lib64/python2.7/atexit.py:37

(Pdb) c
> /usr/lib64/python2.7/atexit.py(46)register()
-> _exithandlers.append((func, targs, kargs))

(Pdb) s
> /usr/lib64/python2.7/atexit.py(47)register()
-> return func

(Pdb) print _exithandlers
[(<function shutdown at 0x7f50c0e69b90>, (), {}), (<function Disconnect at 0x7f50c03fe7d0>, ('vim.ServiceInstance:ServiceInstance',), {}), (<function Disconnect at 0x7f50c03fe7d0>, ('vim.ServiceInstance:ServiceInstance',), {}), (<function Disconnect at 0x7f50c03fe7d0>, ('vim.ServiceInstance:ServiceInstance',), {}), (<function Disconnect at 0x7f50c03fe7d0>, ('vim.ServiceInstance:ServiceInstance',), {})]

Bingo! My code was leaking file descriptors because atexit() was leaving around references to the object that encapsulated the file descriptor associated with the socket. Commenting out atexit() fixed my issue and the code is now working splendidly. These types of issues pop up during development and it’s incredibly fun debugging them! I learned a ton about the Python debugger during this debugging session and became much more familiar with the code for various Python modules. I also got to read the vast majority of the PyVim and pyVmomi source code which made developing this a snap! atexit(), stage right.

Notes from episode 14 of TGIK: Serverless with OpenFaaS

This article was posted by Matty on 2018-03-12 09:00:00 -0500 -0500

Over the past few months I’ve been trying to learn everything there is to know about Kubernetes. Kubernetes is an amazing technology for deploying and scaling containers though it comes with a cost. It’s an incredibly complex piece of software and there are a ton of bells and whistles to become familiar with. One way that I’ve found for coming up to speed is Joe Beda’s weekly TGIK live broadcast. This occurs each Friday at 4PM EST and is CHOCK full of fantastic information. In episode fourteen Joe discusses running serverless applications with OpenFAAS. You can watch it here:

Here are some of my takeways from the episode:

Brigade is a Microsoft project to provide event driver scripting for Kubernetes. Uses Javascript to express your workflows.
Draft can be used to build and deploy applications to Kubernetes. This is still in beta and allows developers to test their code in Kubernetes before they commit to version control.
Brad Geesaman has a great talk on Hacking and Hardening Kubernetes Clusters By Example.
OpenFAAS integrates with Kubernetes but can also be used outside of it.
Contains both a synchronous and asynchronous execution stack. Provides this through two endpoints.
Helm is the preferred way to install OpenFAAS.
Namespaces aren’t defined if you attempt to install OpenFAAS by hand.
Uses NodePorts by default to minimize exposure.
Installs prometheus for monitoring. This also consumes a nodeport.
OpenFAAS manual deployments need 3 files which are provided by faas-netes:
- faas
- monitoring
- rbac
Interact with FAAS through the fas-cli command line utility or through the API.
Kubectl port-forward only works with pods. It doesn’t work with services and node ports.
API Gateway and UI are hosted on the same NodePort.
OpenFAAS creates a docker image to run your serverless code.
Some cloud providers allow you to add annotations to restrict access to services.
Providers provide access to the underlying compute platform. Functions contain the code you want to run.
The faas-cli update flag can be used to perform rolling updates vs. deleting and re-creating.
If GitHub stars mean anything to you this is a super popular choice (over 9k stars to date).
The borglet at Google was run in a container and given additional priority to ensure that other workloads couldn’t impact it.
OpenFAAS supports function chaining which is crazy useful.
Uses deployments and services for data.
Pushes a docker image that contains your code. The fass-cli build option is used to create these containers.
A docker native approach to serverless. Build and push images.
Zero to one autoscaling is in the proof of concept stage.
OpenFAAS implements their own autoscaling solution.

Things I need to learn more about:

Need to better understand how to properly structure serverless workflows.

Notes from episode 13 of TGIK: Serverless with Fission

This article was posted by Matty on 2018-03-11 08:00:00 -0500 -0500

Over the past few months I’ve been trying to learn everything there is to know about Kubernetes. Kubernetes is an amazing technology for deploying and scaling containers though it comes with a cost. It’s an incredibly complex piece of software and there are a ton of bells and whistles to become familiar with. One way that I’ve found for coming up to speed is Joe Beda’s weekly TGIK live broadcast. This occurs each Friday at 4PM EST and is CHOCK full of fantastic information. In episode thirteen Joe discusses running serverless applications with fission. You can watch it here:

Here are some of my takeways from the episode:

Fission is a fast serverless platform to run functions on top of Kubernetes.
The grafeas project was created by a number of large organizations to provide a uniform way to audit and govern your software supply chain.
Grafeas keeps track of where your artifacts came from and allows you to track what is running when and where.
Moby is the community project that was recently spawned to govern docker development.
Linuxkit is a toolkit for building secure, portable and lean operating systems for containers.
Infrakit is a toolkit for creating and managing declaractive self-healing infrastructure.
Libnetwork provides a native Go implementation for connecting containers.
Notary’s are used to sign and verify container images.
Helm is the suggested way to deploy fission.
Fission provides two versions:
- Minimal
- Full (contains NATs streaming and InfluxDB and FluentD for logs)
The default fission installation doesn’t utilize RBAC.
Helm uses go templates to speclialize the YAML that is created and sent to Kubernetes.
Fission makes liberal use of ClusterRoles.
Fission provisions a number of services and resources:
- NAT streaming
- Timer service
- InfluxDB
- Fluentd
- Router
- Builder manager
- Controller
- Kubernetes watcher
- Pool manager
- Logger daemon
The logger daemon pulls logs from the host it’s running on.
NATs is a simple high performance messaging system.
NATs streaming buids on top of NATs and adds data streaming and persistent and durable storage
Streaming uses a single replica so its possible to lose messages.
You interact with fission through the fission command line client.
The Kubernetes jsonpath option can be used to retrieve arbitrary elements from a JSON document:
$ kubectl get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}'
Authentication and multi tenancy support will be added to Fission in the future.
Fission uses 3rd party resoources which have been deprecated in favor of CRDs.
Fission advertises a 100ms cold start time.
Fission keeps a pool of containers available to run your code.
Couple of interesting notes from Joe:
- Things are more centralized with Fission.
- Runs in the Kubernetes security context which gives it wide access.
- Somewhat difficult to control and segment resources.
Initializers allow you to set up the runtime environment (e.g., add an audting sidecar) prior to the container starting up.
Fission builders are used to build the code that gets executed.
Fission Workflows provide a way to control the execution flow of your functions.
Fission doesn’t use horizontal pod autoscaling. It implements it’s own approach.
The Kubernetes metacontroller is a serverless-like approach to creating controllers.

Things I need to learn more about:

Read up on the Kubernetes metacontroller.
See what options are available to perform blue-green deployments.

Notes from episode 12 of TGIK: Exploring serverless with Kubeless

This article was posted by Matty on 2018-03-10 08:00:00 -0500 -0500

Over the past few months I’ve been trying to learn everything there is to know about Kubernetes. Kubernetes is an amazing technology for deploying and scaling containers though it comes with a cost. It’s an incredibly complex piece of software and there are a ton of bells and whistles to become familiar with. One way that I’ve found for coming up to speed is Joe Beda’s weekly TGIK live broadcast. This occurs each Friday at 4PM EST and is CHOCK full of fantastic information. In episode twelve Joe discusses running serverless applications with kubeless. You can watch it here:

Here are some of my takeways from the episode:

Kubeless is a native serverless framework that lets you run small chunks of code (e.g., a function to perform a map reduce function on a data set and returning the result).
Kubernetes office hours allow you to interact with the gurus in the Kubernetes community. I attended my first session this week and and it was a great learning experience!
Heptio labs developed the eventrouter which allows you to watch for events and push them to user specified locations (called sinks).
Events are stored in etcd.
KubeVirt is a virtual machine management add-on for Kubernetes.
Kubeless uses custom resource definitions (CRDs).
The kubelet command and Kubernetes API server can be different versions because the APIs are versioned and kubectl supports a variety of versions.
You need to define a storage class before provisioning kubeless.
Kubeless utilizes Kafka and Zookeeper under the covers.
Joe mentioned that multiple services can be used to implement some super cool things:
- Blue-Green deployments
- Canaries
- Speciailized request routing
You can show persistent volume claims with the pvc command:
$ kubectl get pvc
Storage classes allow you describe the types of storage available to a cluster. Common examples are “fast” for SSD storage and “slow” for spinning media.
You can view the storage classes with the sc command:
$ kubectl get sc
The storageClassName field defines the storage class to use.
You can define a default storage class to satisfy requests that don’t define a storage class.
Storage classes have cluster-wide scope.
Dynamic storage provisioning allows you to create persistent volumes on the fly.
The etcd service limits config maps to 1MB in size.
Kubeless and kubectl use different formats:
$ kubeless <noun> <verb>
$ kubectl <verb> <noun>
There are several other projects providing Kubernetes serverless solutions:
- Fission
- OpenFAAS
Kubeless has an ingress command to create the ingress controller used to interact with kubeless.
Kubeless runs code in your namespace and context.
You can show Ingress definitions with kubectl get ingresses command:
$ kubectl get ingresses <name> -o yaml
Horizontal pod autoscaling allows Kubernetes to scale your pods based on the overall system state.
Scaling based on CPU and memory is problematic.

Things I need to learn more about:

What are the pitfalls of scaling based on CPU and memory.
How can you utilize Ingress LoadBalancers outside of a cloud provider.