This past weekend I got to debug a super fun issue! One of my Kubernetes clusters was seeing a slew of ErrImagePull errors. When I logged into one of the Kubernetes workers, the dockerd debug logs showed it had an issue pulling an image, but not WHY it couldn’t pull it. Fortunately I use a private container registry, so I figured I could print the registry communications with ssldump. When I ran ssldump with the registry’s private key, I saw numerous application_data lines, but no data.
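The ssldump invocation looked something like the following (the interface name and key path here are illustrative):

$ sudo ssldump -d -k /path/to/registry-key.pem -i eth0 port 443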
It turns out newer versions of TLS use forward secrecy. With forward-secret cipher suites, the server’s private key is only used to authenticate the handshake; the session key that protects the remainder of the TLS session is negotiated through an ephemeral key exchange, so having the private key alone isn’t enough to decrypt the captured traffic. Ssldump doesn’t currently support decrypting communications with the session key, so I needed to come up with a plan B to get the HTTP headers. It turns out the sslsplit utility is an ideal solution for this! If you haven’t used it, SSLsplit is a MITM proxy which can be configured to decrypt TLS communications from one or more clients.
SSLsplit is easy to use, but needs a few things in place before it can start decoding TLS record layer messages. First, we need to generate a RootCA certificate and the associated private key. Mkcert makes this super easy:
$ mkcert -install
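If you ever need to locate the CA certificate and key that mkcert generated, it can print the directory it stores them in:

$ mkcert -CAROOT
/home/vagrant/.local/share/mkcert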
The new RootCA is used to mint the certificate that sslsplit will present to the client (dockerd in this case). Next, we need to tell the OS (CentOS 7 in this case) to trust the new CA certificate:
$ cd /etc/pki/ca-trust/source/anchors
$ cp /home/vagrant/.local/share/mkcert/rootCA.pem .
$ update-ca-trust
If you skip this step, docker will complain about an unknown certificate authority:
$ docker pull nginx
Error response from daemon: Get https://harbor/v2/: x509: certificate signed by unknown authority
Next we need to restart dockerd to pick up the new RootCA certificate:
$ systemctl restart docker
Now that the foundation is built, we can fire up sslsplit:
$ sudo /usr/local/bin/sslsplit -k /home/vagrant/.local/share/mkcert/rootCA-key.pem -c /home/vagrant/.local/share/mkcert/rootCA.pem -P ssl 0.0.0.0 443 10.10.10.250 443 -D -X /home/vagrant/ssl.pkt -L /home/vagrant/ssl.log
In the example above, I started sslsplit with the “-k” (CA certificate private key), “-c” (CA certificate), “-D” (don’t detach from the terminal and turn on debugging), “-X” (log packets to a file), and “-L” (log content to a text file) options, along with “-P” and the proxy specification, which takes the following form:
PROTOCOL LISTEN_ADDRESS LISTEN_PORT HOST_TO_FORWARD_TO PORT_TO_FORWARD_TO
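With the values from the sslsplit command above, the specification expands to:

ssl 0.0.0.0 443 10.10.10.250 443

That is, accept TLS connections on all local interfaces on port 443 and forward them to the container registry at 10.10.10.250:443.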
And that’s it! Now if you add an entry to /etc/hosts with your container registry name and the local IP, all communications will flow through SSLsplit (see the example entry below). Once you gather the information you need, you can remove the hosts file entry and stop sslsplit. Then you can review the log or feed the packet capture to your favorite decoder to see what’s going on. My issue turned out to be a Harbor configuration issue, which was easily fixed once I reviewed the HTTP requests and responses.
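For reference, the /etc/hosts entry mentioned above might look something like this on the Docker host (the hostname comes from the error message earlier; adjust it to match your registry):

127.0.0.1   harbor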
As an SRE, I’m always on the lookout for tooling that can help me do my job better. The Kubernetes ecosystem is filled with amazing tools, especially ones that can validate that your clusters and container images are configured in a reliable and secure fashion. One such tool is dockle. If you haven’t heard of it, dockle is a container image scanning tool that can be used to verify that your containers are adhering to best practices.
To get started with dockle, you can pass the name of a repository and an optional tag as an argument:
$ dockle kindest/node
WARN - CIS-DI-0001: Create a user for the container
* Last user should not be root
WARN - DKL-DI-0006: Avoid latest tag
* Avoid 'latest' tag
INFO - CIS-DI-0005: Enable Content trust for Docker
* export DOCKER_CONTENT_TRUST=1 before docker pull/build
INFO - CIS-DI-0006: Add HEALTHCHECK instruction to the container image
* not found HEALTHCHECK statement
INFO - CIS-DI-0008: Confirm safety of setuid/setgid files
* setgid file: usr/bin/expiry grwxr-xr-x
* setuid file: usr/bin/su urwxr-xr-x
* setuid file: usr/bin/newgrp urwxr-xr-x
* setuid file: usr/bin/chfn urwxr-xr-x
* setuid file: usr/bin/passwd urwxr-xr-x
* setuid file: usr/bin/chsh urwxr-xr-x
* setuid file: usr/bin/mount urwxr-xr-x
* setuid file: usr/bin/umount urwxr-xr-x
* setuid file: usr/bin/gpasswd urwxr-xr-x
* setgid file: usr/bin/chage grwxr-xr-x
* setgid file: usr/sbin/pam_extrausers_chkpwd grwxr-xr-x
* setgid file: usr/sbin/unix_chkpwd grwxr-xr-x
* setgid file: usr/bin/wall grwxr-xr-x
Dockle will then inspect the container image and provide feedback on STDOUT. The output contains the checkpoint that triggered the output, as well as a description of what it found. I really dig the concise output, as well as the ability to ignore warnings and control the exit codes that are produced. It’s easy to add this to your CI/CD pipeline, and it’s a nice complement to container scanning tools such as Clair and Trivy. Super cool project!
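As a quick illustration of the ignore and exit code behavior mentioned above, a CI job could run something along these lines (the image name is made up, and you should double check the flag names against the dockle documentation for the version you are running):

$ dockle --exit-code 1 --exit-level warn --ignore CIS-DI-0001 myorg/myimage:1.0.0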
Over the past few months I’ve been spending some of my spare time trying to understand OAuth2 and OIDC. At the core of OAuth2 is the concept of a bearer token. The most common form of bearer token is the JWT (JSON Web Token), which is a string with three base64url-encoded components separated by periods (e.g., XXXXXX.YYYYYYYY.ZZZZZZZZ).
There are plenty of online tools available to decode JWTs, but being a command line warrior I wanted something I could use from a bash prompt. While looking into command line JWT decoders, I came across the following gist describing how to do this with jq. After a couple of slight modifications I was super stoked with the following jq incantation (huge thanks Lukas!):
$ jq -R 'split(".") | .[0],.[1] | @base64d | fromjson' <<< $(cat "${JWT}")
{
  "typ": "JWT",
  "alg": "RS256",
  "kid": "XYZ"
}
{
  "iss": "prefetch-token-issuer",
  "sub": "",
  "aud": "prefetch-registry",
  "exp": XYZ,
  "nbf": XYZ,
  "iat": XYZ,
  "jti": "XYZ",
  "access": [
    {
      "type": "repository",
      "name": "foo/container",
      "actions": [
        "pull"
      ]
    }
  ]
}
To make this readily available, I created a Bash shell function which passes argument 1 (the JWT) to jq:
jwtd() {
  if [[ -x $(command -v jq) ]]; then
    jq -R 'split(".") | .[0],.[1] | @base64d | fromjson' <<< "${1}"
    echo "Signature: $(echo "${1}" | awk -F'.' '{print $3}')"
  fi
}
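To make the function available in new shells, drop it into your ~/.bashrc (or wherever you keep your shell functions) and re-source the file:

$ source ~/.bashrc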
Once active, you can decode JWTs from the Linux command line with relative ease:
$ jwtd XXXXXX.YYYYYYY.ZZZZZZZ
Huge thanks to Lukas Lihotzki for the AMAZING Gist comment. Incredible work!
Over the past year I’ve rolled out numerous Prometheus exporters to provide visibility into the infrastructure I manage. Exporters are server processes that interface with an application (HAProxy, MySQL, Redis, etc.), and make their operational metrics available through an HTTP endpoint. The nginx_exporter is an exporter for Nginx, and allows you to gather the stub_status metrics in a super easy way.
To use this exporter, you will first need to download the nginx_exporter binary from the project’s GitHub releases page. Once downloaded and extracted, you should create a user to run the process as (this isn’t required, but helps to enhance security):
$ useradd -s /bin/false -c 'nginx_exporter service account' -r prometheus
Next, you will need to create a systemd unit file to start and stop the nginx_exporter process:
$ cat /etc/systemd/system/nginx_exporter.service
[Unit]
Description=Prometheus Nginx Exporter
After=network.target
[Service]
ExecStart=/usr/local/bin/nginx-prometheus-exporter -nginx.scrape-uri https://example.com/metrics -web.listen-address=127.0.0.1:9113
Restart=always
User=prometheus
Group=prometheus
[Install]
WantedBy=multi-user.target
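Since this is a brand new unit file, it’s also a good idea to reload systemd so it picks up the unit before enabling it:

$ systemctl daemon-reload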
$ systemctl enable nginx_exporter && systemctl start nginx_exporter
The example above assumes that you are running the nginx_exporter process on the same server Nginx is running on. To allow the nginx_exporter to scrape the stub_status metrics, you will need to add a location stanza similar to the following to your Nginx configuration:
location /metrics {
    stub_status on;
    access_log off;
    allow 127.0.0.1;
    deny all;
}
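After adding the stanza, verify the configuration and reload Nginx so the change takes effect:

$ nginx -t && systemctl reload nginx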
That’s it. You can now test the metrics endpoint with your favorite web utility:
$ curl -s localhost:9113/metrics | grep ^nginx
nginx_connections_accepted 77
nginx_connections_active 6
nginx_connections_handled 77
nginx_connections_reading 0
nginx_connections_waiting 5
nginx_connections_writing 1
nginx_http_requests_total 1513
nginx_up 1
If you are using NGINX Plus, there are a TON more metrics exposed. But for basic website monitoring, these can prove useful.
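To have Prometheus actually collect these metrics, you can add a scrape job along the following lines to prometheus.yml (the job name is arbitrary, and the target assumes Prometheus runs on the same host, since the exporter above is bound to 127.0.0.1):

scrape_configs:
  - job_name: 'nginx'
    static_configs:
      - targets: ['127.0.0.1:9113']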
I recently came across icdiff. This little gem allows you to see the difference between two files, but what makes it special is its ability to highlight the differences (sdiff, which was my go-to diff tool, doesn’t have this feature):
$ icdiff --cols 80 -U 1 -N node_groups.tf node_groups_new.tf
node_groups.tf node_groups_new.tf
10 # `depends_on` causes a re 10 # `depends_on` causes a re
fresh on every run so is usele fresh on every run so is usele
ss here. ss here.
11 # [Re]creating or removing 11 # [Re]creating or removing
these resources will trigger these resources will trigger
recreation of Node Group resou recreation of Node Group resou
rces rces ***something***
12 aws_auth = coalesc 12 aws_auth = coalesc
elist(kubernetes_config_map.aw elist(kubernetes_config_map.aw
s_auth[*].id, [""])[0] s_auth[*].id, [""])[0]
In the example above, icdiff highlighted the keyword “something” on line 11 in column 2. I really dig the highlighting, as well as its ability to print a configurable number of context lines before and after each change. You can also define the output column width, which is helpful when you are working on the command line.
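If you want to give icdiff a spin, it can be installed with pip:

$ pip install icdiff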