Is there a performance penalty for an array of 32-bit integers on x86-64?

Sorry if the question sounds silly. I only vaguely understand the problem of data alignment and have never written any 64-bit programs. I am currently working on 32-bit x86 code. It often accesses an int array. Sometimes a single 32-bit integer is read. Sometimes two or more are read. At some point, I would like to make the code 64-bit. I'm not sure whether I should declare this array as int or long int. I would rather keep the width of the integer the same, so I don't need to worry about the differences. I'm a little worried that reading from or writing to an address that isn't aligned to the natural word size may be slow.

+7
4 answers

Misalignment penalties arise only when a load or store crosses an alignment boundary. The boundary is usually the smaller of:

  • The natural word size of the hardware. (32-bit or 64-bit *)
  • The size of the data type.

If you load a 4-byte word on a 64-bit (8-byte) architecture, it does not need to be 8-byte aligned. It only needs to be 4-byte aligned.

Similarly, if you load a 1-byte char on any machine, it does not need to be aligned at all.

* Please note that SIMD vectors can imply a larger natural word size. For example, a 16-byte SSE access still requires 16-byte alignment on both x86 and x64 (barring the explicit unaligned load/store instructions).
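The alignment rule for 4-byte and 1-byte loads described above can be checked with a small sketch. The helper names (`is_aligned`, `all_elements_aligned`, and so on) are made up for this illustration, and the code only reports what the compiler in use requires; it is not a statement about any particular CPU.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p is a multiple of `alignment`,
   which is assumed to be a power of two. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Alignment the compiler requires for a 32-bit integer
   (4 on the common x86 and x86-64 ABIs). */
size_t int32_alignment(void)
{
    return alignof(int32_t);
}

/* Alignment required for a char; always 1 by definition. */
size_t char_alignment(void)
{
    return alignof(char);
}

/* Nonzero if every element of an int32_t array sits on its natural
   boundary. For an array the compiler laid out, this always holds. */
int all_elements_aligned(const int32_t *a, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!is_aligned(&a[i], alignof(int32_t)))
            return 0;
    }
    return 1;
}
```

Calling `all_elements_aligned` on an ordinary `int32_t` array returns nonzero in a 64-bit build just as in a 32-bit one, because each element only needs its own 4-byte alignment.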


In short, you do not need to worry about data alignment. The language and the compiler try very hard to keep you from having to worry about it.

So just stick with whatever data type makes the most sense for you.
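If keeping the element width identical across the 32-bit and 64-bit builds is the goal, one option (a suggestion added here, not something stated in the answer) is a fixed-width type from `<stdint.h>`. On the common ABIs `int` stays 32 bits on x86-64, while `long int` becomes 64 bits on Linux and stays 32 bits on 64-bit Windows. The function name `sum_pair` is hypothetical.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* int32_t is exactly 32 bits wide wherever it is defined, so the array
   layout does not change between a 32-bit and a 64-bit build. */
_Static_assert(sizeof(int32_t) * CHAR_BIT == 32, "int32_t is 32 bits");

/* Hypothetical example of the access pattern from the question:
   read two adjacent 32-bit elements and combine them.
   The caller must make sure that i + 1 is a valid index. */
int64_t sum_pair(const int32_t *table, size_t i)
{
    return (int64_t)table[i] + (int64_t)table[i + 1];
}
```

The widening to `int64_t` happens in registers after the two 4-byte loads, so the array itself stays compact.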

+7

64-bit x86 processors are still heavily optimized to work efficiently with 32-bit values. Even on 64-bit operating systems, accessing 32-bit values is at least as fast as accessing 64-bit values. In practice, it is often faster, since 32-bit values consume less cache space and memory bandwidth.
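The cache-space point can be made concrete with a little arithmetic. This is only a sketch: the 64-byte line size is an assumption (typical for current x86-64 parts, but not guaranteed), and the function names are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, typical for current x86-64 CPUs. */
#define CACHE_LINE_BYTES 64u

/* Number of 32-bit elements that fit in one cache line. */
size_t elems_per_line_32(void)
{
    return CACHE_LINE_BYTES / sizeof(int32_t);
}

/* Number of 64-bit elements that fit in one cache line. */
size_t elems_per_line_64(void)
{
    return CACHE_LINE_BYTES / sizeof(int64_t);
}

/* Cache lines touched by `count` contiguous elements of `elem_size`
   bytes, assuming the array starts on a line boundary. */
size_t lines_needed(size_t count, size_t elem_size)
{
    return (count * elem_size + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}
```

Under that assumption, 1000 elements occupy 63 lines as `int32_t` and 125 lines as `int64_t`, so a full scan of the wider array pulls roughly twice as much data through the cache.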

+3

There is a lot of good information here: 32-bit performance and 64-bit arithmetic

There is more information at https://superuser.com/questions/56540/32-bit-vs-64-bit-systems , which claims a worst-case slowdown of 5% (measured at the application level, not for individual operations).

The short answer is no, you won’t get a performance hit.

+1

Whenever you access any memory location, the entire cache line containing it is read into the L1 cache, and any subsequent access to anything on that line is as fast as it can be. As long as your 32-bit access does not cross a cache line boundary (and it cannot if it is 32-bit aligned), it will be as fast as a 64-bit access.
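The claim that a 32-bit-aligned access never straddles a cache line can be checked with a short sketch. The 64-byte line size is an assumption, and `crosses_cache_line` is a hypothetical helper written for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, typical for current x86-64 CPUs. */
#define CACHE_LINE_BYTES 64u

/* Nonzero if an access of `size` bytes (size >= 1) starting at `addr`
   touches more than one cache line, i.e. its first and last bytes
   fall on different lines. */
int crosses_cache_line(uintptr_t addr, size_t size)
{
    uintptr_t first_line = addr / CACHE_LINE_BYTES;
    uintptr_t last_line  = (addr + size - 1) / CACHE_LINE_BYTES;
    return first_line != last_line;
}
```

A 4-byte access at address 60 stays inside one line, while the same access at address 62 (not 4-byte aligned) spills into the next one. Because 4 divides 64, no 4-byte-aligned address can produce a split.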

+1
