Modulo operation in C#, C and OCaml

I wanted to confirm that the modulo operation is an expensive operation, so I tested this piece of code, which checks whether a given number is even:

bool is_even(int n) { return (n & 1) == 0; } 

then this one:

 bool is_even_bis(int n) { return (n % 2) == 0; } 

At first I used C#, and indeed the code using the bitwise & was faster than the other, sometimes even three times faster. Using ILSpy, I saw that no optimization is applied when compiling to MSIL: the MSIL matches the source code exactly.

However, a friend of mine who codes in C noted that, with gcc -O3, the code compiles to:

    is_even:
        mov     eax, DWORD PTR [esp+4]  # tmp63, n
        and     eax, 1                  # tmp63,
        xor     eax, 1                  # tmp63,
        ret

and

    is_even_bis:
        mov     eax, DWORD PTR [esp+4]  # tmp63, n
        and     eax, 1                  # tmp63,
        xor     eax, 1                  # tmp63,
        ret

So basically the exact same thing. Even with -O0, the modulo operation does not appear:

    is_even:
        push    ebp                     #
        mov     ebp, esp                #,
        mov     eax, DWORD PTR [ebp+8]  # tmp63, n
        and     eax, 1                  # D.1837,
        test    eax, eax                # D.1837
        sete    al                      #, D.1838
        movzx   eax, al                 # D.1836, D.1838
        pop     ebp                     #
        ret

Needless to say, the compiled code is identical for is_even and is_even_bis at -O0 as well.

Even funnier, if I may say so, another friend of mine tried it in OCaml:

    let is_even x = ((x land 1) == 0)

    let _ =
      let i = ref 100000000 in
      while !i > 0 do
        ignore (is_even !i);
        decr i
      done

and

    let is_even_bis x = ((x mod 2) == 0)

    let _ =
      let i = ref 100000000 in
      while !i > 0 do
        ignore (is_even_bis !i);
        decr i
      done

And it looks like the modulo version is faster in bytecode but slower in native code! Can someone explain this mystery?

Then I started wondering why C# does not behave the same way (there is an obvious performance gap between the two functions) and why the JIT compiler does not apply the same optimization as gcc. I do not know whether there is a way to inspect the output of the JIT compiler; maybe that would help to understand?

Bonus question: I believe that modulo is based on division, and since division is performed in O(n²) time (n being the number of digits), can we say that modulo has quadratic time complexity?

1 answer

Firstly, there is no notion of speed for these operations in a portable sense. Your observations may hold for your system, but they are not valid for all systems. For this reason, speculating about micro-optimizations is entirely pointless. You will find far more significant optimizations by writing a program that solves a meaningful problem, profiling it to find the parts of the code that take the most execution time, and introducing faster algorithms for those parts. By faster algorithms, I mean better data structures (or fewer operations), as opposed to different operators. Stop focusing on micro-optimizations!

Your C version of is_even is not well defined. It can produce negative zeros or trap representations, particularly for negative numbers. Using a trap representation is undefined behavior.

It seems that the difference you are seeing could be caused by the signed integer representation on your system. Consider what happens if -1 is represented in ones' complement as 11111111...11111110. You would expect -1 % 2 to result in -1, not 0, wouldn't you? (edit: ... but what would you expect -1 & 1 to result in if -1 is represented as 11111111...11111110?) There must be some overhead to handle this on implementations that use ones' complement as their signed integer representation.
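To make the ones' complement point concrete, here is a minimal sketch that emulates the bit pattern such a machine would use, on ordinary hardware. The helper names (ones_complement_bits, ones_complement_low_bit, remainder_by_two) are hypothetical and only for illustration; the sketch assumes a C99 or later compiler, where % truncates toward zero.

```c
/* Hypothetical helper: the bit pattern a ones' complement machine
   would use for v, emulated with unsigned arithmetic.
   Meaningful for -INT_MAX <= v <= INT_MAX only. */
unsigned ones_complement_bits(int v)
{
    if (v >= 0)
        return (unsigned)v;
    /* Compute |v| in unsigned arithmetic to avoid signed overflow. */
    unsigned magnitude = 0u - (unsigned)v;
    /* Ones' complement negation flips every bit of the magnitude. */
    return ~magnitude;
}

/* Low bit of that pattern: what `v & 1` would inspect on such a machine. */
unsigned ones_complement_low_bit(int v)
{
    return ones_complement_bits(v) & 1u;
}

/* Remainder as C99 and later define it (the quotient truncates toward zero). */
int remainder_by_two(int v)
{
    return v % 2;
}
```

Under this emulation the low bit of -1 is 0 while -1 % 2 is -1, so for negative values the two tests disagree, and a compiler targeting such a representation could not simply replace one expression with the other.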

Your C compiler may have noticed that the % expression and the & expression you used are equivalent on your system and performed this optimization as a result, whereas C# and OCaml did not perform it, for whatever reason.
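The equivalence the compiler relies on can be checked empirically. The following is a rough sketch, assuming a two's complement int (which is what the x86 output in the question implies); agree_on_range is a hypothetical helper, not something from the question.

```c
#include <stdbool.h>

/* The two functions from the question. */
bool is_even(int n)     { return (n & 1) == 0; }
bool is_even_bis(int n) { return (n % 2) == 0; }

/* Hypothetical helper: true if both versions agree on every value
   in [lo, hi]. The counter is a long long so that the loop still
   terminates when hi is INT_MAX. */
bool agree_on_range(int lo, int hi)
{
    for (long long n = lo; n <= hi; ++n) {
        if (is_even((int)n) != is_even_bis((int)n))
            return false;
    }
    return true;
}
```

Calling agree_on_range(-1000, 1000) from a test program should return true on a two's complement system, negative values included, which is consistent with gcc emitting the same instructions for both functions.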

Bonus question: I believe that modulo is based on division, and since division is performed in O(n²) time (n being the number of digits), can we say that modulo has quadratic time complexity?

It makes no sense to consider the time complexity of these two basic operations, since it will differ from system to system. I covered this in the first paragraph.
