What's odd is that I had never noticed a stray nop in -O0 asm output before. (Probably because I don't spend my time looking at unoptimized compiler output.)
Normally, nops inside functions are there to align branch targets, and nops between functions align function entry points, as in the question Brian linked. (Also see -falign-loops in the gcc docs, which is on by default at -O2 and -O3 but not at -Os.)
In this case, the nop is part of the compiler noise for an empty function:
void go(void) { }
See this code in the Godbolt compiler explorer so you can check the asm for other compiler versions and compile options.
(Strictly speaking, not all of it is noise: -O0 implies -fno-omit-frame-pointer, so at -O0 even empty functions set up and tear down a stack frame.)
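To reproduce this locally rather than on Godbolt, a minimal sketch along these lines should work. The file name empty.c and the output file names are placeholders I made up, and the commands assume an x86-64 gcc; the exact instructions you get will differ between gcc versions.

```c
/* empty.c: hypothetical file name, used only for this sketch.
 *
 * Emit asm instead of an object file and compare optimization levels
 * (-masm=intel selects Intel syntax on x86 targets):
 *   gcc -O0 -S -masm=intel empty.c -o empty_O0.s
 *   gcc -O2 -S -masm=intel empty.c -o empty_O2.s
 *
 * At -O0 on x86-64, expect frame setup and teardown around the empty
 * body (push rbp / mov rbp, rsp ... pop rbp / ret). Whether a nop shows
 * up in between depends on the gcc version. With optimization enabled,
 * the frame setup should disappear.
 */
void go(void) { }
```

Diffing the two .s files is a quick way to see which instructions exist only because of -O0.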
Of course, the nop is absent at any non-zero optimization level. In the code in the question, this nop has no debugging or performance benefit. (See the performance guide links in the x86 tag wiki, especially Agner Fog's microarchitecture guide, to learn what makes code fast on current processors.)
My guess is that this is purely an artifact of gcc internals. The nop appears as a literal nop in the gcc -S asm output, not as a .p2align directive. gcc itself doesn't count bytes of machine code; it just emits alignment directives at certain points to align important branch targets. Only the assembler knows how much nop padding is actually needed to reach the requested alignment.
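To see the difference between a literal nop and an alignment request, you can search the compiler's asm output for both. This is only a sketch: align_demo.c and sum_array are names I made up, and which directives appear (and where) depends on the gcc version, target, and options.

```c
/* align_demo.c: hypothetical file name for this sketch.
 *
 * Print the asm to stdout and look for alignment directives and nops:
 *   gcc -O2 -S align_demo.c -o - | grep -n -E 'p2align|nop'
 *
 * Alignment requests typically show up as .p2align lines in the -S
 * output. The padding bytes themselves only exist after assembling:
 *   gcc -O2 -c align_demo.c && objdump -d align_demo.o
 */
#include <stddef.h>

/* A simple loop, so there is a function entry point and (depending on
 * what the optimizer does) a loop head that gcc may want aligned. */
long sum_array(const int *a, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}
```

Comparing the -S output with the objdump disassembly shows the split in responsibility: the compiler only asks for an alignment, and the assembler chooses the padding.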
The default, -O0, tells gcc that you want it to compile quickly, not to generate good code. This means the asm output tells you more about gcc internals than it does at other -O levels, and very little about how to optimize or anything else.
If you are trying to learn asm, it is more interesting to look at code compiled with -Og, for example (optimize for debugging).
If you are trying to see how well gcc or clang does at generating code, look at -O3 -march=native (or -O2 -mtune=intel, or whatever settings you build your project with). Puzzling out the optimizations made at -O3 is a good way to learn some asm tricks, though. -fno-tree-vectorize is handy if you want to see a non-vectorized version of something that is otherwise fully optimized.
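As a rough illustration of comparing those settings, here is a small loop that an optimizer may be able to auto-vectorize. The file name scale.c, the output file names, and the function itself are my own placeholders, not something from the question; whether the loop actually gets vectorized depends on your gcc version and target.

```c
/* scale.c: hypothetical file name for this sketch.
 *
 * Generate asm at the settings discussed above and compare the files:
 *   gcc -Og -S scale.c -o scale_Og.s
 *   gcc -O3 -march=native -S scale.c -o scale_O3.s
 *   gcc -O3 -march=native -fno-tree-vectorize -S scale.c -o scale_novec.s
 */
#include <stddef.h>

/* Multiply each element of src by k and store the result in dst. */
void scale(float *dst, const float *src, float k, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}
```

The -Og output tends to be the easiest to follow line by line, while the difference between the last two files shows what the vectorizer contributes on its own.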