Last month I started a course that teaches you how to write your own operating system. Working at the intersection of hardware and software (x86 assembly and C) has been incredibly rewarding. I’ve learned a TON! One interesting thing I came across in the Linux kernel’s boot code is the use of “asm volatile”. Here is a snippet from $SRC_DIR/linux-5.16/arch/x86/boot/boot.h:
#define cpu_relax() asm volatile("rep; nop")
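The history behind this is super interesting. The “volatile” qualifier tells the compiler’s optimizer to leave the statement alone: it may not delete the asm or hoist it out of a loop, and the text inside the string is handed to the assembler AS IS. Being somewhat curious, I wrote a simple C program (test.c) to see this in action: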
#define cpu_relax() asm volatile("rep; nop")
int main(int argc, char **argv) {
    cpu_relax();
}
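For context, the pause instruction this macro produces is meant as a hint for busy-wait loops. Here is a minimal sketch of that kind of use, not the kernel's actual code: spin_until_set is a hypothetical helper name, the loop is bounded so it always terminates, and the non-x86 fallback is an assumption added so the snippet builds elsewhere.

```c
#include <stdatomic.h>

#if defined(__x86_64__) || defined(__i386__)
/* Same instruction pair as boot.h; __asm__ also works under -std=c11. */
#define cpu_relax() __asm__ volatile("rep; nop")
#else
/* Assumption: off x86, use an empty asm statement the compiler must keep. */
#define cpu_relax() __asm__ volatile("")
#endif

/*
 * Hypothetical helper: spin until *flag becomes nonzero or max_spins
 * iterations have passed. Returns how many times cpu_relax() ran.
 */
unsigned spin_until_set(atomic_int *flag, unsigned max_spins)
{
    unsigned spins = 0;

    while (atomic_load(flag) == 0 && spins < max_spins) {
        cpu_relax();   /* volatile: not deleted, not hoisted out of the loop */
        spins++;
    }
    return spins;
}
```

Calling spin_until_set on a flag that is already set returns 0 right away, and calling it on a flag that is never set returns max_spins.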
When I compiled test.c, I was a bit surprised that the instructions above didn’t show up verbatim in the objdump output, but they did when I had gcc emit assembly with the -S option instead of building a binary:
$ gcc -o test test.c
$ objdump -d -j .text test
0000000000001129 <main>:
1129: f3 0f 1e fa endbr64
112d: 55 push %rbp
112e: 48 89 e5 mov %rsp,%rbp
1131: 89 7d fc mov %edi,-0x4(%rbp)
1134: 48 89 75 f0 mov %rsi,-0x10(%rbp)
1138: f3 90 pause
113a: b8 00 00 00 00 mov $0x0,%eax
113f: 5d pop %rbp
1140: c3 retq
1141: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
1148: 00 00 00
114b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
$ gcc -S test.c
main:
.LFB0:
.cfi_startproc
endbr64
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
#APP
# 4 "hi.c" 1
# rep; nop
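This one had me stumped, but luckily a super awesome friend of mine had a great theory: objdump was most likely printing a different mnemonic (pause) for the rep and nop. That matches the byte column in the listing. The assembler encodes rep; nop as the two bytes f3 90, and f3 90 is also the encoding of the x86 pause instruction, so the disassembler prints the more specific name. Since objdump is decoding the bytes in the ELF binary, while gcc -S just passes along the assembly text from the source, this totally makes sense. No optimization is involved here, but one of these days I need to study how gcc and company optimize code. This is a fascinating topic!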
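One way to check this without relying on either tool's mnemonics is to have the program read back the bytes the assembler produced. This is a rough sketch that assumes gcc or clang on x86 Linux (ELF). The labels relax_start and relax_end and the two helper functions are hypothetical names made up for the example, and on any other platform the helpers just report zero bytes.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_RELAX_BYTES 1
/* Assemble "rep; nop" between two labels so its bytes can be located. */
__asm__(
    ".pushsection .text\n"
    ".globl relax_start\n"
    "relax_start:\n"
    "    rep; nop\n"
    ".globl relax_end\n"
    "relax_end:\n"
    ".popsection\n");
extern const unsigned char relax_start[];
extern const unsigned char relax_end[];
#else
/* Assumption: elsewhere there is nothing to inspect. */
#define HAVE_RELAX_BYTES 0
#endif

/* Number of bytes the assembler emitted for "rep; nop". */
size_t relax_len(void)
{
#if HAVE_RELAX_BYTES
    return (size_t)((uintptr_t)relax_end - (uintptr_t)relax_start);
#else
    return 0;
#endif
}

/* Byte i of the encoded instruction, or 0 if i is out of range. */
unsigned char relax_byte(size_t i)
{
#if HAVE_RELAX_BYTES
    return i < relax_len() ? relax_start[i] : 0;
#else
    (void)i;
    return 0;
#endif
}
```

Printing relax_byte(i) for each i below relax_len() with printf("%02x ") should give f3 90 on a setup like the one above, the same two bytes objdump labels as pause.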