Understanding the magic number 0x07EFEFEFF used to optimize strlen

I came across this answer about the magic number 0x07EFEFEFF used to optimize strlen. Here is what the top answer says:

Look at the magic bits. Bits 16, 24 and 31 are 1. The 8th bit is 0.

  • The 8th bit represents the first byte. If the first byte is not equal to zero, the 8th bit becomes 1 at this point. Otherwise it is 0.
  • The 16th bit represents the second byte. The same logic.
  • The 24th bit represents the third byte.
  • The 31st bit represents the fourth byte.

However, if I compute result = ((a + magic) ^ ~a) & ~magic with a = 0x100, I get result = 0x81010100, which, according to the top answer, means that the second byte of a is 0. That is obviously wrong.
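For reference, the computation above can be reproduced with a minimal C sketch. The helper name `hole_bits` and the choice of `uint32_t` are assumptions made for illustration; they do not come from the linked answer.

```c
#include <stdint.h>

/* Magic constant from the question: bits 8, 16, 24 and 31 are 0, all others are 1. */
#define MAGIC UINT32_C(0x7EFEFEFF)

/* Hypothetical helper: evaluates ((a + magic) ^ ~a) & ~magic on a 32-bit word.
   The addition wraps modulo 2^32, as unsigned arithmetic does in C. */
static uint32_t hole_bits(uint32_t a)
{
    return ((a + MAGIC) ^ ~a) & ~MAGIC;
}
```

Calling `hole_bits(0x100)` gives 0x81010100, the value quoted in the question: bits 8, 16, 24 and 31 are all set.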

What am I missing?

Thanks!

1 answer

A bit reliably indicates that its byte is zero only if all lower bytes are nonzero, so the result can identify only the FIRST zero byte, not any bytes after the first zero.

  • bit8 = 1 means that the first byte is zero; the other bytes are unknown.
  • bit8 = 0 means that the first byte is nonzero.
  • bit8 = 0 and bit16 = 1 means that the second byte is zero; the higher bytes are unknown.
  • bit8 = 0 and bit16 = 0 means that the first two bytes are nonzero.
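The rules in this list amount to reading the flag bits from lowest to highest and stopping at the first one that is set. Here is a minimal C sketch of that reading, assuming a 32-bit word in which the "first byte" is the least significant one; the function name `first_flagged_byte` and its return convention are hypothetical.

```c
#include <stdint.h>

#define MAGIC UINT32_C(0x7EFEFEFF)

/* Hypothetical decoder: returns the index (0 = least significant byte) of the
   lowest flag bit set in ((a + magic) ^ ~a) & ~magic, or -1 if none is set.
   Indices 0..2 mean that byte is zero. Index 3 only means that the low
   7 bits of the top byte are zero. Flags above the first set one carry no
   reliable information, so they are never inspected. */
static int first_flagged_byte(uint32_t a)
{
    uint32_t r = ((a + MAGIC) ^ ~a) & ~MAGIC;

    if (r & (UINT32_C(1) << 8))  return 0;
    if (r & (UINT32_C(1) << 16)) return 1;
    if (r & (UINT32_C(1) << 24)) return 2;
    if (r & (UINT32_C(1) << 31)) return 3;
    return -1;
}
```

For a = 0x100 this returns 0: the first byte is zero, and the set bit16 is simply ignored, which resolves the apparent contradiction in the question.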

In addition, the last bit (bit31) reflects only the low 7 bits of the last byte (and only if the first 3 bytes are nonzero): if it is the only bit set, then the last byte is 0 or 128 (and the other bytes are nonzero).
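Because of this ambiguity, a nonzero result is best treated as "this word may contain a zero byte" and then confirmed byte by byte. The following C sketch shows one way to do that; the function names are hypothetical and this is not taken from any particular strlen implementation.

```c
#include <stdint.h>

#define MAGIC UINT32_C(0x7EFEFEFF)

/* Fast filter: a nonzero result means the word MAY contain a zero byte.
   It can fire falsely when the top byte is 0x80 and the rest are nonzero. */
static int may_have_zero_byte(uint32_t a)
{
    return (((a + MAGIC) ^ ~a) & ~MAGIC) != 0;
}

/* Exact check: run the filter first, then confirm by testing each byte. */
static int has_zero_byte(uint32_t a)
{
    if (!may_have_zero_byte(a))
        return 0;
    for (int i = 0; i < 4; i++) {
        if (((a >> (8 * i)) & 0xFF) == 0)
            return 1;
    }
    return 0;
}
```

With a = 0x80223344 only bit31 is set in the result, so `may_have_zero_byte` returns 1 even though no byte is zero, and `has_zero_byte` correctly returns 0. With a = 0x00223344 the result is the same 0x80000000, and `has_zero_byte` returns 1.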
