Thus, every time a function is called, the processor must discard instructions in the pipeline.
No, everything that is already past the decode stage is still good. The CPU knows not to keep decoding after an unconditional branch (e.g. jmp, call, or ret). Only instructions that have been fetched but not yet decoded are ones that should not run. Until the target address is decoded from the instruction, there is nothing useful for the start of the pipeline to do, so you get bubbles in the pipeline until the target address is known. Decoding branch instructions as early as possible therefore minimizes the penalty for taken branches.
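As a rough illustration of why early decoding helps, here is a toy model of the bubble count in a simple in-order pipeline with no branch prediction. It is only a sketch under assumptions: the function name taken_branch_bubbles and the stage numbering (fetch = 1, decode = 2, ...) are made up for this example and do not describe any particular CPU.

```c
#include <assert.h>

/* Toy model, not a description of any real CPU.
 * Pipeline stages are numbered from 1 (fetch). target_known_stage is the
 * stage at whose end the branch target becomes available to the fetch unit.
 * Assumes one instruction fetched per cycle and no branch prediction. */
static int taken_branch_bubbles(int target_known_stage)
{
    assert(target_known_stage >= 1);
    /* The branch is fetched in cycle 1. Without the branch, the next useful
     * fetch would happen in cycle 2. With it, the first useful fetch happens
     * in cycle target_known_stage + 1, so the cycles in between are bubbles. */
    return target_known_stage - 1;
}
```

In this model a target that is known at the end of decode (stage 2) costs one bubble, and a target that is only known at the end of execute (stage 3) costs two, which is why resolving the target as early as possible reduces the penalty.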
RISC IF ID EX MEM WB (, , , mem, ( )). , , , IF, , ID ( ).
"" - , . . (, , .)
I- I1, , IF- . I-cache . .
, , . , .
, "Branch Target Buffer" . , , , . BTB , ( ).
ret : , . . x86 , /ret. . call label/label: pop ebx 32- , EIP EBX. 15 ret .
, , , x86.
Agner Fog pdf, , x86 ( . x86 tag wiki) , RISC.
( , / ) . , .
, , ( I-cache).
, , , , . ( x86-64 SystemV, float/vector 8 .) , , . , / , .
, , , , , - , . link-time. , , .
, , ?
static, .
int foo(void) { return 1; }
    mov     eax, 1
    ret
int bar(int x) { return foo() + x; }
    lea     eax, [rdi+1]    # foo() was inlined: the result is just x + 1
    ret
@harold, inlining , , .
Intel SnB- , uop-, . 1536 uops IIRC, 6 uops . uop 19 15 , IIRC (- , , , - uarch). , . , .