Microoptimization: using intptr_t for flag / bool types

From what I understand, the definition of intptr_t varies by architecture: it is guaranteed to be wide enough to hold a pointer that can address the whole of a process's flat address space.

Nginx (a popular open-source web server) defines a type that is used as a flag (boolean), and this type is a typedef for intptr_t. Taking x86-64 as an example, an architecture with instructions covering operands of every size, why define a flag as intptr_t? Surely the traditional 32-bit bool type would fit the bill just as well?

I went through the 32-bit vs. 8-bit bools argument myself when I was a new developer, and the conclusion was that 32-bit bools perform better in the common case because of the intricacies of processor design. Why, then, would one need to move to 64-bit bools?

2 answers

The only people who really know why nginx uses intptr_t for its boolean type are the nginx developers.

As you say, 32-bit bools often perform better than 8-bit bools in the common case. I have not tested it myself, but it does not seem unreasonable to me that in certain situations on x86-64 a 64-bit bool outperforms a 32-bit one. For example, in the nginx source I noticed that most ngx_flag_t members sit in structures next to other members whose types are typedefs of (u)intptr_t. A 32-bit bool might not save any space there because of alignment padding.
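The padding point can be checked with a small sketch. The two structs below are hypothetical and are not taken from the nginx source; they only mimic a flag sandwiched between pointer-sized members. On a typical ABI where pointers must be aligned to their own size, the narrower flag is followed by padding and both structs end up the same size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts for illustration only (not from the nginx source). */
struct conf_wide_flag {
    intptr_t  timeout;   /* pointer-sized integer */
    intptr_t  enabled;   /* flag stored as a pointer-sized integer */
    void     *data;
};

struct conf_narrow_flag {
    intptr_t  timeout;
    int32_t   enabled;   /* 32-bit flag; padding follows if pointers are wider */
    void     *data;      /* must start on a pointer-aligned offset */
};

/* Bytes saved by the 32-bit flag; expected to be 0 on typical ABIs,
 * because the compiler pads after `enabled` to align `data`. */
size_t flag_bytes_saved(void)
{
    return sizeof(struct conf_wide_flag) - sizeof(struct conf_narrow_flag);
}
```

Calling flag_bytes_saved() on an ordinary x86-64 build should return 0; the saving only appears when several narrow members are placed next to each other.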

I do find the choice of intptr_t a bit odd, as it is an optional C99 type intended for converting to and from void *. As far as I can see, though, it is never used that way. Perhaps it was chosen as the best approximation of a "native" word-sized type?
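For reference, this is the kind of use intptr_t was designed for: a round trip through void *. The function name below is made up for this sketch.

```c
#include <stdint.h>

/* Converts a pointer to intptr_t and back, and reports whether the
 * result compares equal to the original. C99 guarantees this for any
 * valid pointer to void on implementations that provide intptr_t. */
int roundtrip_ok(void *p)
{
    intptr_t as_int = (intptr_t)p;   /* pointer -> integer */
    void *back = (void *)as_int;     /* integer -> pointer */
    return back == p;
}
```

A flag never needs this guarantee, which is why the type looks like an unusual pick for a boolean.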


A 64-bit bool sounds like a terrible idea for x86-64. My guess is that whoever wrote it was thinking of 32-bit machines with 32-bit pointers at the time.

Modern x86 has very good support for unaligned loads and stores, and for zero- or sign-extending a byte to fill a register on the fly. If x86 is the main target, you should use 8-bit booleans, especially where that saves bytes and so reduces cache footprint. In the rare case where cache footprint is not an issue, 32 bits is the natural size and may save instructions in some cases where a boolean is added to or multiplied by an int directly, since the boolean can then be used as a memory operand instead of first being loaded with movzx.
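A rough sketch of the space argument, assuming a C99 compiler where bool occupies one byte (true on mainstream x86-64 toolchains, but not required by the standard). The struct and function names are invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Four flags stored as pointer-sized integers: 32 bytes on x86-64. */
struct flags_wide {
    intptr_t a, b, c, d;
};

/* The same four flags as bool: typically 4 bytes, so many more of
 * them fit in a single cache line. */
struct flags_byte {
    bool a, b, c, d;
};

/* Adding a flag directly to an int. With a byte-sized flag in memory
 * the compiler usually needs a zero-extending load (movzx) first;
 * with a 32-bit flag it could use the value as a memory operand. */
int add_flag(int total, bool flag)
{
    return total + flag;   /* flag promotes to 0 or 1 */
}
```

Comparing sizeof(struct flags_wide) with sizeof(struct flags_byte) shows the footprint difference; whether the extra movzx matters is something only a benchmark on the target CPU can settle.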

For the usual case of testing a boolean and branching on it, Intel and AMD CPUs show literally zero performance difference between 8-bit and 32-bit operands, whether they are in memory or in a register.
