Why does an assigned function pointer perform worse than a branch?

Question


I have a class with an enum member variable. One of its member functions bases its behavior on this enum, so as a "possible" optimization I split the two behaviors into two separate functions and gave the class a function pointer that is set during construction. I modeled the situation as follows:

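    #include <stdint.h>

    // funcA() and funcB() are defined elsewhere; their bodies are not shown.
    uint64_t funcA();
    uint64_t funcB();

    enum catMode {MODE_A, MODE_B};

    // Version 1: branch on the mode at every call.
    struct cat
    {
        cat(catMode mode) : stamp_(0), mode_(mode) {}

        void update()
        {
            stamp_ = (mode_ == MODE_A) ? funcA() : funcB();
        }

        uint64_t stamp_;
        catMode mode_;
    };

    // Version 2: choose the function once, in the constructor.
    struct cat2
    {
        cat2(catMode mode) : stamp_(0), mode_(mode)
        {
            if (mode_ == MODE_A)
                func_ = funcA;
            else
                func_ = funcB;
        }

        void update()
        {
            stamp_ = func_();
        }

        uint64_t stamp_;
        catMode mode_;
        uint64_t (*func_)(void);
    };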

Then I create a cat object and an array of length 32. I walk over the array to bring it into cache, then I call the cat's update method 32 times and store the latency of each call, measured with rdtsc, in the array.

Then I call a function that loops several hundred times using rand(), usleep() and some arbitrary strcmp() calls, come back, and run the 32 calls again.
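The timing loop described above might look roughly like the following. This is only a sketch of the procedure, not the code actually used for the numbers below: `read_ticks`, `time_updates` and `kSamples` are hypothetical names, `__rdtsc()` is the GCC/Clang intrinsic from `<x86intrin.h>`, and the steady-clock fallback exists only so the sketch compiles on non-x86 targets. Note that rdtsc is not a serializing instruction, so individual samples can be noisy.

```cpp
#include <array>
#include <chrono>
#include <cstddef>
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>  // __rdtsc() on GCC/Clang
#endif

// Hypothetical helper: read the time-stamp counter where available,
// otherwise fall back to a steady-clock tick count.
inline std::uint64_t read_ticks() {
#if defined(__x86_64__) || defined(__i386__)
    return __rdtsc();
#else
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}

constexpr std::size_t kSamples = 32;  // the question takes 32 samples

// Calls obj.update() kSamples times and stores the tick delta of each call.
// The array is supplied by the caller, who can walk it beforehand to warm
// the cache as the question describes.
template <typename Cat>
void time_updates(Cat& obj, std::array<std::uint64_t, kSamples>& delays) {
    for (std::size_t i = 0; i < kSamples; ++i) {
        const std::uint64_t start = read_ticks();
        obj.update();
        delays[i] = read_ticks() - start;
    }
}
```

Calling `time_updates(c, delays)` once for a `cat` and once for a `cat2` gives the two sets of samples being compared.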

The result is that the version with the branch always takes around 44 +/- 10 cycles, while the version with the function pointer tends to take around 130. I am curious why this would be the case.

If anything, I would have expected similar performance. Also, templating is hardly an option, because fully specializing the real cat class for this one function would be overkill.

+4

c++ performance optimization function-pointers

Palace chan Jul 25 '12 at 14:28


1 answer

Mysticial · Accepted Answer · 2012-07-25T15:31:28+0000

Without a full SSCCE, I can't approach this the way I usually do with questions like this.
So the best I can do is speculate:

The core difference between your two cases is that one has a branch and the other has a function pointer. The fact that you see a difference at all strongly hints that funcA() and funcB() are very small functions.

Possibility #1:

In the branch version, funcA() and funcB() are being inlined by the compiler. Not only does that skip the function-call overhead, but if the functions are trivial enough, the branch itself could also be optimized out completely.

Calls through function pointers, on the other hand, cannot be inlined unless the compiler can resolve them at compile time.
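To make the inlining argument concrete, here is a minimal sketch of the two call shapes. The bodies of `funcA()` and `funcB()` are hypothetical stand-ins (the question does not show the real ones), and `update_branch` and `update_pointer` are illustrative names. What a compiler actually emits depends on the compiler and optimization level, so the generated assembly is the only reliable check.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the question's funcA()/funcB().
inline std::uint64_t funcA() { return 1; }
inline std::uint64_t funcB() { return 2; }

enum catMode { MODE_A, MODE_B };

// Branch version: both callees are visible at the call site, so the
// compiler is free to inline them. With bodies this trivial it may even
// replace the branch with a conditional select.
inline std::uint64_t update_branch(catMode mode) {
    return (mode == MODE_A) ? funcA() : funcB();
}

// Pointer version: when the pointer is only known at run time (for example
// loaded from an object), this typically becomes an indirect call that
// cannot be inlined.
inline std::uint64_t update_pointer(std::uint64_t (*func)()) {
    return func();
}
```

Both versions compute the same value; the difference is only in how much the compiler can see at the call site.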

Possibility #2:

By comparing a branch against a function pointer, you are pitting the branch predictor against the branch target predictor.

Branch target prediction is not the same as branch prediction. In the branch case, the processor needs to predict which way the branch goes. In the function pointer case, it needs to predict where the call will jump to.

It is very likely that your processor's branch predictor is much more accurate than its branch target predictor. But then again, this is all guesswork...
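One way to tell the two possibilities apart, offered as a sketch rather than a tested recipe, is to forbid inlining of the two functions and measure again. If the gap shrinks, inlining was doing most of the work; what remains is a conditional branch plus a direct call versus an indirect call. The `NOINLINE` macro and all function names below are hypothetical, and the function bodies are placeholders. A compiler may still optimize such calls in other ways, so it is worth checking the generated assembly.

```cpp
#include <cstdint>

// Hypothetical portability macro around the compiler-specific
// "do not inline" attribute.
#if defined(_MSC_VER)
#define NOINLINE __declspec(noinline)
#else
#define NOINLINE __attribute__((noinline))
#endif

// Placeholder bodies; the real funcA()/funcB() are not shown in the question.
NOINLINE std::uint64_t funcA_noinline() { return 1; }
NOINLINE std::uint64_t funcB_noinline() { return 2; }

// Conditional branch followed by a direct call.
std::uint64_t update_branch_direct(bool mode_a) {
    return mode_a ? funcA_noinline() : funcB_noinline();
}

// Indirect call through a pointer.
std::uint64_t update_indirect(std::uint64_t (*func)()) {
    return func();
}
```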
