The importance of the C asm volatile statement


Last month I started a course that teaches you how to write your own operating system. Working at the intersection of hardware and software (x86 assembly and C) has been incredibly rewarding. I’ve learned a TON! One interesting thing I came across in the Linux kernel’s x86 boot code is the use of “asm volatile”. Here is a snippet from $SRC_DIR/linux-5.16/arch/x86/boot/boot.h:

#define cpu_relax() asm volatile("rep; nop")
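
For context, a macro like this is usually called from inside a busy-wait loop. Here is a minimal sketch of that pattern. The helper spin_until_ready() and its parameters are hypothetical names made up for illustration (they are not from the kernel), and the non-x86 fallback is an assumption added only so the sketch builds on other targets:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same instructions as the boot.h macro. __asm__ __volatile__ is the
   spelling of asm volatile that also compiles in strict ISO C modes. */
#if defined(__x86_64__) || defined(__i386__)
#define cpu_relax() __asm__ __volatile__("rep; nop")
#else
/* Assumption: on other targets, fall back to an empty asm statement. */
#define cpu_relax() __asm__ __volatile__("" ::: "memory")
#endif

/* Hypothetical helper: poll *ready at most max_spins times.
   Returns how many times cpu_relax() ran before *ready was seen as true
   or the spin budget ran out. */
static unsigned spin_until_ready(const volatile bool *ready, unsigned max_spins)
{
    unsigned spins = 0;

    assert(ready != NULL);
    while (!*ready && spins < max_spins) {
        cpu_relax();    /* tell the CPU we are only spinning */
        spins++;
    }
    return spins;
}
```

If the flag is already set, the helper returns 0 without ever executing cpu_relax(). If nothing sets the flag, it gives up after max_spins iterations.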

The history behind this macro is super interesting. The volatile qualifier tells the compiler’s optimizer not to delete the asm statement or hoist it out of a loop, so the instructions are emitted AS IS. Being somewhat curious, I wrote a simple C program to see this in action:

#define cpu_relax() asm volatile("rep; nop")

int main(int argc, char **argv) {
    cpu_relax();
}

When I compiled it, I was a bit surprised that the instructions above didn’t show up verbatim in the objdump output, but they did in the assembly listing that gcc generates with the -S option:

$ gcc -o test test.c

$ objdump -d -j .text test

0000000000001129 <main>:
    1129: f3 0f 1e fa           endbr64
    112d: 55                    push   %rbp
    112e: 48 89 e5              mov    %rsp,%rbp
    1131: 89 7d fc              mov    %edi,-0x4(%rbp)
    1134: 48 89 75 f0           mov    %rsi,-0x10(%rbp)
    1138: f3 90                 pause
    113a: b8 00 00 00 00        mov    $0x0,%eax
    113f: 5d                    pop    %rbp
    1140: c3                    retq
    1141: 66 2e 0f 1f 84 00 00  nopw   %cs:0x0(%rax,%rax,1)
    1148: 00 00 00
    114b: 0f 1f 44 00 00        nopl   0x0(%rax,%rax,1)

$ gcc -S test.c

main:
.LFB0:
  .cfi_startproc
  endbr64
  pushq %rbp
  .cfi_def_cfa_offset 16
  .cfi_offset 6, -16
  movq  %rsp, %rbp
  .cfi_def_cfa_register 6
  movl  %edi, -4(%rbp)
  movq  %rsi, -16(%rbp)
#APP
# 4 "hi.c" 1
  rep; nop

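This one had me stumped, but luckily a super awesome friend of mine had a great theory: objdump was most likely showing a different mnemonic (pause) for the rep and nop. That theory holds up. The rep prefix (f3) followed by nop (90) is exactly the byte sequence that encodes the pause instruction, which is why the objdump listing shows f3 90 next to pause. Since objdump decodes the bytes in the ELF binary, while gcc -S prints the assembly text as it was written in the source, this totally makes sense. No optimization is involved here; the two tools just give the same two bytes different names. One of these days I need to study how gcc and company optimize code. This is a fascinating topic!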
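
One way to check the encoding claim without objdump is to have the assembler emit both spellings and compare the bytes from C. This is only a sketch: it assumes an x86 target, an ELF toolchain, and a GNU-compatible assembler, and the label names and same_encoding() are hypothetical names made up for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assemble each spelling into read-only data, bracketed by labels, so
   the resulting bytes can be read back like ordinary arrays.
   Assumes x86, ELF, and GNU as syntax. */
__asm__(
    ".pushsection .rodata\n"
    "rep_nop_start: rep; nop\n"
    "rep_nop_end:\n"
    "pause_start:   pause\n"
    "pause_end:\n"
    ".popsection\n");

extern const unsigned char rep_nop_start[], rep_nop_end[];
extern const unsigned char pause_start[], pause_end[];

/* Returns 1 if "rep; nop" and "pause" assembled to identical bytes. */
static int same_encoding(void)
{
    size_t rep_len = (size_t)(rep_nop_end - rep_nop_start);
    size_t pause_len = (size_t)(pause_end - pause_start);

    /* Each pair of labels should bracket at least one byte. */
    assert(rep_len > 0 && pause_len > 0);

    return rep_len == pause_len &&
           memcmp(rep_nop_start, pause_start, rep_len) == 0;
}
```

On an x86 machine, same_encoding() should return 1, and both byte sequences should be the f3 90 pair seen in the objdump listing above.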

This article was posted on 2022-04-07 00:00:00 -0500