Quick summary: in x86-64 mode, are far jumps as slow as in x86-32 mode?
On x86 processors, jumps are divided into three types:
- short, with a PC offset of +/- 127 bytes (2 byte instruction)
- near, with an offset of +/- 32k that "wraps around" the current segment (3 byte instruction)
- far, which can jump anywhere (5 byte instruction)
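To make the three cases concrete, here is a rough sketch of how I understand the choice between them, assuming the 16-bit encodings and instruction sizes listed above. The function name `jump_kind` and the `same_segment` flag are hypothetical, made up for illustration only; a real assembler does more than this.

```python
def jump_kind(displacement, same_segment=True):
    """Return (kind, instruction size in bytes) for a jump.

    Assumes the 16-bit encodings from the list above; this is an
    illustration, not an assembler.
    """
    if not same_segment:
        # The target lives in a different code segment, so CS must be reloaded.
        return ("far", 5)
    # Within one segment the offset wraps around at 64K, so reduce the
    # displacement to a signed 16-bit value first.
    wrapped = ((displacement + 0x8000) & 0xFFFF) - 0x8000
    if -128 <= wrapped <= 127:
        # Fits in a signed 8-bit displacement (roughly the +/- 127 above).
        return ("short", 2)
    return ("near", 3)
```

Only the far case touches CS, which is the part the rest of the question is about.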
Short and near jumps take 1-2 clock cycles, while far jumps take 50-80 clock cycles, depending on the processor. From my reading of the documentation, this is because they "go beyond CS, the current code segment."
In x86-64 mode, code segments are not used; the segment is effectively always 0..infinity. Ergo, there should be no penalty for going outside a segment.
Thus, the question is: does the number of clock cycles for a far jump change if the processor is in x86-64 mode?
Related bonus question: most *nix-like operating systems running in 32-bit protected mode explicitly set the segment sizes to 0..infinity and manage the linear -> physical translation entirely through the page tables. Do they get a benefit from this in terms of time for far calls (fewer clock cycles), or is the penalty really an internal CPU legacy from the size the segment registers have had since the 8086?