This can happen for programs that use a lot of pointer-heavy data structures, since a pointer is 8 bytes on 64-bit but only 4 bytes on 32-bit. The bottleneck in pointer-chasing code is cache misses. In the limit where 100% of your code is pointer chasing, the same cache holds half as many pointers, so you will suffer about twice as many cache misses in the 64-bit version as in the 32-bit version, hence a slowdown of up to 2x.
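To make the arithmetic concrete, here is a rough sketch that estimates how many linked-list nodes fit in one cache line for each pointer size. The helper names (`node_size`, `nodes_per_cache_line`), the 64-byte cache line, and the 4-byte payload are assumptions chosen for illustration, not measurements from any particular program or CPU.

```python
import struct


def node_size(pointer_size, payload_size):
    """Estimated size of a node holding one 'next' pointer plus a payload.

    Assumes natural alignment with the pointer as the most strictly
    aligned member, so the total is rounded up to a multiple of the
    pointer size (as a C compiler would typically pad the struct).
    """
    raw = pointer_size + payload_size
    # Ceiling division, then scale back up to a multiple of the alignment.
    return -(-raw // pointer_size) * pointer_size


def nodes_per_cache_line(pointer_size, payload_size, cache_line=64):
    """How many such nodes fit in one cache line (64 bytes is an assumption)."""
    return cache_line // node_size(pointer_size, payload_size)


if __name__ == "__main__":
    # Pointer size of the interpreter running this script, for reference.
    print("this process uses", struct.calcsize("P"), "byte pointers")

    for label, ptr in (("32-bit", 4), ("64-bit", 8)):
        size = node_size(ptr, payload_size=4)
        count = nodes_per_cache_line(ptr, payload_size=4)
        print(label, "node:", size, "bytes,", count, "nodes per cache line")
```

With a 4-byte payload the node grows from 8 to 16 bytes under these assumptions, so each cache line holds half as many nodes. Real programs rarely consist of nothing but pointers, so the measured slowdown is usually well below the 2x limit.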
For other types of programs, the 64-bit version may well be faster than the 32-bit version, at least on x86 / x64. x64 has twice as many general-purpose registers as 32-bit x86 (16 instead of 8); newer instructions such as SSE / SSE2 are guaranteed to be available on x64 but not on 32-bit x86; and with a lot of address space you can make different space/speed trade-offs, such as saving values instead of recomputing them, or memory-mapping large files.