Last month I started a course that teaches you how to write your own Operating System. Working at the intersection of hardware and software (X86 Assembly and C) has been incredibly rewarding. I’ve learned a TON! One interesting thing I came across in the Linux kernel’s bootloader code is the use of “asm volatile”. Here is a snippet from $SRC_DIR/linux-5.16/arch/x86/boot/boot.h:
#define cpu_relax() asm volatile("rep; nop")
The history behind this is super interesting, and the volatile qualifier is used to keep the compiler’s optimizer from removing or moving the inline assembly, so the code is emitted AS IS. Being somewhat curious, I wrote a simple C program to see this in action:
#define cpu_relax() asm volatile("rep; nop")
int main(int argc, char **argv) {
cpu_relax();
}
When I compiled it, I was a bit surprised that the instructions above didn’t show up verbatim in the objdump output, but they did when the source was compiled with gcc’s -S (generate assembly) option:
$ gcc -o test test.c
$ objdump -d -j .text test
0000000000001129 <main>:
1129: f3 0f 1e fa endbr64
112d: 55 push %rbp
112e: 48 89 e5 mov %rsp,%rbp
1131: 89 7d fc mov %edi,-0x4(%rbp)
1134: 48 89 75 f0 mov %rsi,-0x10(%rbp)
1138: f3 90 pause
113a: b8 00 00 00 00 mov $0x0,%eax
113f: 5d pop %rbp
1140: c3 retq
1141: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
1148: 00 00 00
114b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
$ gcc -S test.c
main:
.LFB0:
.cfi_startproc
endbr64
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
#APP
# 4 "hi.c" 1
rep; nop
This one had me stumped, but luckily a super awesome friend of mine had a great theory: objdump was decoding the raw instruction bytes, and “rep; nop” assembles to the byte sequence f3 90, which is the exact encoding of the pause instruction, so objdump prints pause. Since objdump is interpreting the ELF binary, and gcc -S is generating assembly from source code, this totally makes sense. One of these days I need to study how gcc and company optimize code. This is a fascinating topic!
Container images are one of the items that make up a “container.” In most cases container images use a base image (e.g., Alpine, Ubuntu, etc.), and then one or more application-specific layers are added on top of that. There are numerous documented best practices for optimizing container images, and these best practices result in smaller images, less network traffic, and a reduction in container creation time.
Unfortunately, in practice I’ve seen numerous cases where these best practices weren’t followed. I’ve come across Dockerfiles that used dozens of RUN commands, didn’t take advantage of multi-stage builds, didn’t optimize for image layer re-use, etc. When I’ve encountered these types of issues, I’ve always taken it upon myself to work with the author to refactor the build instructions, and to educate them on best practices. The hadolint project maintains an excellent set of Docker best practice rules, which I highly suggest reviewing if you haven’t already.
When situations pop up where I need to dig into a container image, I always turn to my good buddy dive. This amazing little utility allows you to analyze a container image in a console TUI. Exploring a container image with dive is super easy: type dive into your terminal, pass the container image you want to explore as an argument, and the TUI will be displayed:
$ dive nginx:latest
One of the most useful screens in the TUI is the image details pane:
Image name: nginx
Total Image size: 142 MB
Potential wasted space: 3.8 MB
Image efficiency score: 98 %
This shows the image size, how much space is wasted, and the container’s efficiency score. The efficiency score is super helpful for understanding where the image lies on the efficiency spectrum, and if further analysis would be beneficial. Dive was developed with CI in mind. You can run dive with the “--highestUserWastedPercent” and “--highestWastedBytes” arguments as part of a CI pipeline. If the image that is created doesn’t pass muster, you can fail the build until someone reviews the build instructions.
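If memory serves, dive can also read these thresholds from a .dive-ci file in the repository, which keeps the CI policy in source control. The values below are made-up example thresholds, not recommendations:

```yaml
rules:
  # Fail the build if the image wastes more than 20 MB of space...
  highestWastedBytes: 20MB
  # ...or if more than 10% of user-layer bytes are duplicated/wasted...
  highestUserWastedPercent: 0.10
  # ...or if the overall efficiency score drops below 95%.
  lowestEfficiency: 0.95
```

You would then run dive in CI mode (dive --ci IMAGE), and a non-zero exit code fails the pipeline stage.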
If you come across an image with a poor efficiency score while debugging an issue, you can review the container’s build statements (Dockerfiles, JIB XML stanzas, etc.) in source control, or generate these on the fly with the docker history command:
$ docker history --no-trunc nginx:latest
I love debugging problems, especially ones where the end result is added efficiency. Giddy up!
One of my friends recently reached out with a fun problem. His monitoring system was periodically not firing when file systems grew past the thresholds he defined. When we hopped on one of his EC2 instances to debug the issue, I noticed that we were getting a permission denied (EACCES) errno when running df as their monitoring user:
$ df -h /vault/data
df: ‘/vault/data’: Permission denied
When we ran the same command as trusty UID 0, everything worked as expected:
$ sudo df -h /vault/data
Filesystem Size Used Avail Use% Mounted on
/dev/nvme1n1 20G 1G 20G 1% /vault/data
A quick check with strace verified this as well:
$ strace -e trace=statfs df -h 2>&1 | grep vault
statfs("/vault/data", 0x7ffe00af8aa0) = -1 EACCES (Permission denied)
If you aren’t familiar with statfs(2), it returns information about a mounted file system in a statfs structure. Here is a blurb from the manual page describing which information is returned:
The function statfs() returns information about a mounted file system. path is the pathname of any file within the mounted file system. buf is a pointer to a statfs structure defined approximately as follows:
struct statfs {
__SWORD_TYPE f_type; /* type of file system (see below) */
__SWORD_TYPE f_bsize; /* optimal transfer block size */
fsblkcnt_t f_blocks; /* total data blocks in file system */
fsblkcnt_t f_bfree; /* free blocks in fs */
fsblkcnt_t f_bavail; /* free blocks available to
unprivileged user */
fsfilcnt_t f_files; /* total file nodes in file system */
fsfilcnt_t f_ffree; /* free file nodes in fs */
fsid_t f_fsid; /* file system id */
__SWORD_TYPE f_namelen; /* maximum length of filenames */
__SWORD_TYPE f_frsize; /* fragment size (since Linux 2.6) */
__SWORD_TYPE f_spare[5];
};
I thought that df was setuid root like the mount utility, so when I initially saw the permission denied error I assumed it was something unrelated to permissions. But lo and behold, it was indeed due to df not being setuid root:
$ ls -la /usr/bin/df
-rwxr-xr-x 1 root root 100856 Jan 23 2020 /usr/bin/df
So when df tried to statfs() this file system as an unprivileged user, it got the permission denied error. We found a simple workaround to get things working, and I learned something new in the process. Neato!
If you’ve worked with Kubernetes for any length of time, you are probably intimately familiar with deployment manifests. If this concept is new to you, deployment manifests are used to add resources to a cluster in a declarative manner. Some of the larger projects (cert-manager, Istio, CNI plug-ins, etc.) in the Kubernetes ecosystem provide manifests to deploy the resources that make their application work. These manifests can often be thousands of lines long, and if you are security conscious you don’t want to deploy anything to a cluster without validating what’s in it.
The K14S project took this issue to heart when they released the kapp utility. This super useful utility can show you the changes that would be made to a cluster, without actually making any of them. To show how useful this is, let’s say you wanted to see which resources Istio would deploy. You can see this with kapp deploy:
$ kapp deploy -a istio -f <(kustomize build)
Target cluster 'https://127.0.0.1:33783' (nodes: test-control-plane, 3+)
Changes
Namespace Name Kind Conds. Age Op Op st. Wait to Rs Ri
(cluster) istio-operator ClusterRole - - create - reconcile - -
^ istio-operator ClusterRoleBinding - - create - reconcile - -
^ istio-operator Namespace - - create - reconcile - -
^ istio-system Namespace - - create - reconcile - -
^ istiooperators.install.istio.io CustomResourceDefinition - - create - reconcile - -
istio-operator istio-operator Deployment - - create - reconcile - -
^ istio-operator ServiceAccount - - create - reconcile - -
^ istio-operator-metrics Service - - create - reconcile - -
Op: 8 create, 0 delete, 0 update, 0 noop
Wait to: 8 reconcile, 0 delete, 0 noop
Continue? [yN]: N
The output contains the resource type and the operation that will take place. In the example above we are going to create 8 resources, and assign the application name “istio” (a label) to each resource. Kapp deploy can also be fed the “--diff-changes” option to display a diff between the manifests and the current cluster state, “--allow-ns” to specify the namespaces that the app is allowed to go into, and “--into-ns” to map the namespaces in the manifests to one of your choosing. Kapp will assign a label to the resources it deploys, which is used by “list” to show resources that are managed by kapp:
$ kapp list
Target cluster 'https://127.0.0.1:33783' (nodes: test-control-plane, 3+)
Apps in namespace 'default'
Name Namespaces Lcs Lca
istio (cluster),istio-operator true 4d
nginx - - -
Lcs: Last Change Successful
Lca: Last Change Age
2 apps
Succeeded
Another super useful feature of kapp is its ability to inspect an application that was previously deployed:
$ kapp inspect -a istio --tree
Target cluster 'https://127.0.0.1:33783' (nodes: test-control-plane, 3+)
Resources in app 'istio'
Namespace Name Kind Owner Conds. Rs Ri Age
(cluster) istio-operator ClusterRole kapp - ok - 4d
istio-operator istio-operator ServiceAccount kapp - ok - 4d
(cluster) istiooperators.install.istio.io CustomResourceDefinition kapp 2/2 t ok - 4d
istio-operator istio-operator-metrics Service kapp - ok - 4d
istio-operator L istio-operator-metrics Endpoints cluster - ok - 4d
(cluster) istio-operator ClusterRoleBinding kapp - ok - 4d
(cluster) istio-system Namespace kapp - ok - 4d
(cluster) istio-operator Namespace kapp - ok - 4d
istio-operator istio-operator Deployment kapp 2/2 t ok - 4d
istio-operator L istio-operator-77d57c5c57 ReplicaSet cluster - ok - 4d
istio-operator L.. istio-operator-77d57c5c57-dkl8b Pod cluster 4/4 t ok - 4d
Rs: Reconcile state
Ri: Reconcile information
11 resources
Succeeded
In the output above you can see the resource relationships in tree form, the object type, the owner, and the state of the resource. This is a crazy useful utility, and one I’ve started to use almost daily. It’s super useful for observing the state of a cluster, and for debugging problems. Thanks K14S for this amazing piece of software!
This past week I got to spend some time upgrading my CI/CD systems. The GitLab upgrade process requires stepping through specific versions when you upgrade between major releases, which can be a problem if the latest version isn’t supported by the upgrade scripts. In these types of situations, you can tell yum to upgrade to a specific version. To list the versions of a package that are available, you can use the search command’s “--showduplicates” option:
$ yum search --showduplicates gitlab-ee | grep 13.0
gitlab-ee-13.0.0-ee.0.el7.x86_64 : GitLab Enterprise Edition (including NGINX,
gitlab-ee-13.0.1-ee.0.el7.x86_64 : GitLab Enterprise Edition (including NGINX,
gitlab-ee-13.0.3-ee.0.el7.x86_64 : GitLab Enterprise Edition (including NGINX,
gitlab-ee-13.0.4-ee.0.el7.x86_64 : GitLab Enterprise Edition (including NGINX,
gitlab-ee-13.0.5-ee.0.el7.x86_64 : GitLab Enterprise Edition (including NGINX,
gitlab-ee-13.0.6-ee.0.el7.x86_64 : GitLab Enterprise Edition (including NGINX,
gitlab-ee-13.0.7-ee.0.el7.x86_64 : GitLab Enterprise Edition (including NGINX,
gitlab-ee-13.0.8-ee.0.el7.x86_64 : GitLab Enterprise Edition (including NGINX,
gitlab-ee-13.0.9-ee.0.el7.x86_64 : GitLab Enterprise Edition (including NGINX,
gitlab-ee-13.0.10-ee.0.el7.x86_64 : GitLab Enterprise Edition (including NGINX,
gitlab-ee-13.0.12-ee.0.el7.x86_64 : GitLab Enterprise Edition (including NGINX,
Once you eye the version you want, you can pass it to yum install:
$ yum install gitlab-ee-13.0.12-ee.0.el7.x86_64
This can also be useful if you want to stick to a minor version vs. upgrading to a new major release.