Gcc division optimization

Here is some code (the full program follows later in the question):

template <typename T> T fizzbuzz(T n) { T count(0); #if CONST const T div(3); #else T div(3); #endif for (T i(0); i <= n; ++i) { if (i % div == T(0)) count += i; } return count; } 

Now, if I call this template function using int , then I get a 6 times difference coefficient depending on whether I define CONST or not:

 $ gcc --version gcc (GCC) 3.4.4 (cygming special, gdc 0.12, using dmd 0.125) $ make -B wrappedint CPPFLAGS="-O3 -Wall -Werror -DWRAP=0 -DCONST=0" && time ./wrappedint g++ -O3 -Wall -Werror -DWRAP=0 -DCONST=0 wrappedint.cpp -o wrappedi nt 484573652 real 0m2.543s user 0m2.059s sys 0m0.046s $ make -B wrappedint CPPFLAGS="-O3 -Wall -Werror -DWRAP=0 -DCONST=1" && time ./wrappedint g++ -O3 -Wall -Werror -DWRAP=0 -DCONST=1 wrappedint.cpp -o wrappedi nt 484573652 real 0m0.655s user 0m0.327s sys 0m0.046s 

Studying the disassembly shows that in the fast (const) case, the module was converted into a thing like multiplication and shift, while in the slow (non-constant) case it uses idivl .

Worse, if I try to wrap an integer in a class, then optimization will not happen if I use const or not. The code always uses idivl and runs slowly:

 #include <iostream> struct WrappedInt { int v; explicit WrappedInt(const int &val) : v(val) {} bool operator<=(const WrappedInt &rhs) const { return v <= rhs.v; } bool operator==(const WrappedInt &rhs) const { return v == rhs.v; } WrappedInt &operator++() { ++v; return *this; } WrappedInt &operator+=(const WrappedInt &rhs) { v += rhs.v; return *this; } WrappedInt operator%(const WrappedInt &rhs) const { return WrappedInt(v%rhs.v); } }; std::ostream &operator<<(std::ostream &s, WrappedInt w) { return s << wv; } template <typename T> T fizzbuzz(T n) { T count(0); #if CONST const T div(3); #else T div(3); #endif for (T i(0); i <= n; ++i) { if (i % div == T(0)) count += i; } return count; } int main() { #if WRAP WrappedInt w(123456789); std::cout << fizzbuzz(w) << "\n"; #else std::cout << fizzbuzz<int>(123456789) << "\n"; #endif } 

My questions:

1) Is there a simple principle of C ++ itself or gcc-optimization that explains why this happens, or is it just a case of "executing various heuristics, is this the code you get"?

2) Is there a way to make the compiler understand that my locally declared and never referencing const WrappedInt can be considered as a const value for compilation? I want this thing to be a direct replacement for int in templates.

3) Is there a known method of wrapping int so that the compiler can refuse packaging during optimization? The goal is that WrappedInt will be a political template. But if the policy "does nothing" leads to essentially arbitrary penalties for the 6-fold value compared to int, I better get rid of this situation and use int directly.

+9
c ++ optimization gcc
Jul 13 '09 at 20:16
source share
4 answers

I guess this is just a very old version of GCC that you work on. The oldest compiler that I have on my machine, gcc-4.1.2, executes a quick path with both non-constant and wrappers (and it only does -O1).

+6
Jul 13 '09 at 21:08
source share

Try combining const int v in your WrappedInt class with const T in your fizzbuzz function and see if the compiler can optimize this.

By declaring const int , you created a special case - the compile-time constant. The compiler knows what a value is, and can optimize it more strongly than a value that can change during program startup.

+1
Jul 13 '09 at 20:30
source share

Is there a known way to wrap ints in such a way that the compiler can refuse packaging during optimization?

Try passing WrappedInt by value. Then WrappedInt can be passed to registers. The pass-by-const parameter sometimes causes gcc to return to the stack to pass arguments.

On slowing down int vs const int , I can only assume that GCC is trying to play safely in the face of aliasing. Basically, if he cannot prove that the div not an alias for another, more accessible variable, he cannot turn it into a constant. If you declare it const, GCC assumes that it will not be smoothed and will be converted to a constant. Besides idivl , you should also see a memory sample once upon entering the function, and not the immediate values ​​used for the const case.

0
Jul 13 '09 at 20:41
source share

The difference in speed is caused by the compiler, not knowing if the value will change the value of "div". When it is not const, it treats it as a passed variable. It can be anything, and therefore the compiler will use an instruction that shares two variables - idivl. When you say that it is const, the compiler can handle it exactly as if you typed:

 if (i% 3 == 0)

I am surprised that he did not use bitwise AND (&).

WrappedInt is not optimized because, well, it is not int. His class.

Something you could do is enable fizzbuzz in WrappedInt.

0
Jul 14 '09 at 3:10
source share



All Articles