When I compile this with a reasonable set of parameters (specifically -O3), this is what I get:
For f():

        .type   _Z1fi, @function
    _Z1fi:
    .LFB0:
        .cfi_startproc
        .cfi_personality 0x3,__gxx_personality_v0
        cmpl    $1, %edi
        sbbl    %eax, %eax
        andb    $58, %al
        addl    $99, %eax
        ret
        .cfi_endproc
For g():

        .type   _Z1gb, @function
    _Z1gb:
    .LFB1:
        .cfi_startproc
        .cfi_personality 0x3,__gxx_personality_v0
        cmpb    $1, %dil
        sbbl    %eax, %eax
        andb    $58, %al
        addl    $99, %eax
        ret
        .cfi_endproc
They still use different instructions for the comparison (cmpb for the boolean vs. cmpl for the int), but the rest of the body is identical. A quick look at the Intel manuals tells me: ... not much. There is no such thing as cmpb or cmpl in the manuals; they are all cmp, and I cannot find the timing tables at the moment. I assume, however, that there is no clock difference between comparing against a byte and comparing against a long, so for all practical purposes the code is identical.
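For the curious, the arithmetic in both bodies decodes to a branchless select. Here is a minimal C++ sketch of what the optimized sequence computes, assuming (as the constants $58 and $99 suggest) that the original functions returned 99 for a nonzero/true argument and -99 otherwise; the function name is mine:

    // Hedged reconstruction of what the optimized assembly computes:
    // behaves like the compiled f(), i.e. i ? 99 : -99.
    int f_equiv(int i) {
        // cmpl $1, %edi ; sbbl %eax, %eax  =>  eax = (i == 0) ? -1 : 0
        int mask = (i == 0) ? -1 : 0;
        // andb $58, %al touches only the low byte: -1 becomes 0xFFFFFF3A (-198)
        int t = (mask & ~0xFF) | (mask & 58);
        // addl $99, %eax: -198 + 99 == -99, 0 + 99 == 99
        return t + 99;
    }

So the compiler has already replaced the conditional with straight-line arithmetic, and the only remaining difference between the two functions really is the width of the initial compare.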
Edited to add the following based on your addition:
The reason the code differs in the non-optimized case is precisely that it is not optimized. (Yes, it’s circular, I know.) When the compiler walks the AST and generates code directly, it doesn’t “know” anything beyond the immediate vicinity of the AST node it is visiting. At that point it lacks the contextual information needed to know that, at this particular spot, it can treat the declared bool type as an int. A boolean is by default treated as a byte, and when manipulating bytes in the Intel world you have to do things like sign- or zero-extend them to bring them to a certain width, widen them to push them on the stack, and so on. (You cannot push bytes.)
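To make the widening point concrete, here is roughly the sequence of conversions an unoptimized compiler emits around a bool, written out as explicit C++; the function name and the 99/-99 bodies are my assumptions for illustration:

    int g_naive(bool b) {
        // A bool lives in a single byte, so it is loaded as one...
        unsigned char byte = b;
        // ...then zero-extended to a full register width (movzbl on x86)...
        int widened = static_cast<int>(byte);
        // ...and only then compared as an ordinary integer.
        if (widened != 0)
            return 99;
        return -99;
    }

Each of those steps costs an instruction when nothing is optimizing them away.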
When the optimizer runs over the AST and does its magic, it looks at the surrounding context and “knows” when it can replace code with something more efficient without changing the semantics. It therefore “knows” that it can treat the parameter as an integer and drop the unnecessary conversions and widening.
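As an illustration of the kind of rewrite that becomes legal once the optimizer has proven the widened parameter is always exactly 0 or 1, here is one hypothetical branchless transformation (GCC actually chose the sbb/and/add sequence shown above, but the idea is the same):

    // Hypothetical strength reduction: valid only because the optimizer
    // knows widened_b is exactly 0 or 1.
    int h(int widened_b) {
        return widened_b * 198 - 99;   // 0 -> -99, 1 -> 99, no branch needed
    }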