To summarize what was said in the comments: 0x38e38e39 - 954437177 in decimal value, which is equal to (2^33 + 1) / 9 . So, the build code works this way (I replaced (5 * fahr - 160) with X for replacement):
mov $0x38e38e39,%edx /* edx is now 0x38e38e39 == (2^33 + 1) / 9 */ mov %ecx,%eax /* eax is now X */ imul %edx /* edx:eax is now eax * edx == X * ((2^33 + 1) / 9) */
What the fun part begins there. edx:eax denotes the 1-operand of imul , which first fills its operand ( edx in this case) with 32 bits, and then puts the remaining least significant bits in eax .
Effectively we get a 64-bit result through two registers! It looks something like this:
edx - 32 least significant bits (X * ((2^33 + 1) / 9)) >> 32 .
eax is (X * ((2^33 + 1) / 9)) % 2^32 (but it will be dropped soon)
Then we get this substance in the form:
sar %edx /* edx is now edx >> 1 == (X * ((2^33 + 1) / 9)) >> 33 */ mov %ecx,%eax /* eax is now X again */ sar $0x1f,%eax /* eax is now X >> 0x1f == X >> 31 */ mov %edx,%ecx /* ecx is now (X * ((2^33 + 1) / 9)) >> 33 */
So now ecx is the 32 least significant bits (X * ((2^33 + 1) / 9)) >> 33 and eax is X >> 31 , that is, the 32-digit bit "-s X (which is 32-bit integer) which are 0 if X non-negative and -1 if X negative.
EDIT: detailed study of the special case of negative X
Now a little about what happens to negative numbers. An important part of ecx is that it is actually the 32 most significant bits of X * ((2^33 + 1) / 9) .
As I hope you remember, in binary terms, negating a number means inverting all its bits and adding 1 to it. And when we add 1 , we invert lsb to 1 if it was 0 , otherwise we invert it and all the bits after it until we find the first 0 , and then invert it.
So, what happens when we try to deny (X * ((2^33 + 1) / 9)) (or, what is the same thing, what we get if we perform calculations with -X , considering X positive for this example )? Of course, first we invert all its bits, then add 1 to it. But for the latter (adding 1 to it), in order to affect the 32 most signifying bits of the number, the 32 least significant bits would have to be 0xFFFFFFFF . And (believe me, on this) there is no 32-bit integer which when multiplied by 0x38e38e39 gives such a result.
So efficient, while (-X * ((2^33 + 1) / 9)) == -(X * ((2^33 + 1) / 9)) it differs from the 32 most significant bits: ((-X * ((2^33 + 1) / 9)) >> 33) & 0xFFFFFFFF != -(((X * ((2^33 + 1) / 9)) >> 33) & 0xFFFFFFFF) .
Instead, the 32 most significant bits (-X * ((2^33 + 1) / 9)) are the bitwise negation of the 32 most significant bits (X * ((2^33 + 1) / 9)) : ((-X * ((2^33 + 1) / 9)) >> 33) & 0xFFFFFFFF != ~(((X * ((2^33 + 1) / 9)) >> 33) & 0xFFFFFFFF) .
Tl; dr for negative case X : the ecx value for -X will be equal to the bitwise negation of the ecx value for X We do not want this. So, to get the correct results for negative X values, we need to add 1 to ecx (or, which is the same thing, subtract -1 ):
sub %eax,%ecx
Then comes the last part:
mov %ecx,%eax mov %eax,-0x10(%rbp)
I'm sorry if I messed up something, it's hard for me to keep my head around asm written in GAS syntax, but I hope you understand this idea.
Tl; dr: the trick here is multiplied by the inverse, multiplied by a large number, discarding the large number with an arithmetic shift, and then rounding the result to zero if it is negative.
Why is everyone worried?
As a result, we endowed the division into 10 cycles (given that imul only takes one cycle). Given that idiv can take almost twice as many cycles (11 to 18, as Hans Passant mentioned in this , answering a similar question), this approach can make a huge performance advantage.