Are 2^n exponentiation calculations really less efficient than bit shifts?

For example:

    int x = 4;
    pow(2, x);

Is this really much less efficient than just:

 1 << 4 

?


Yes. An easy way to show this is to compile the following two functions that do the same thing, and then look at the disassembly.

    #include <stdint.h>
    #include <math.h>

    uint32_t foo1(uint32_t shftAmt) {
        return pow(2, shftAmt);
    }

    uint32_t foo2(uint32_t shftAmt) {
        return (1 << shftAmt);
    }

    cc -arch armv7 -O3 -S -o - shift.c

(I find ARM assembly easier to read; if you want x86, just drop the arch flag.)

    _foo1:
    @ BB#0:
        push         {r7, lr}
        vmov         s0, r0
        mov          r7, sp
        vcvt.f64.u32 d16, s0
        vmov         r0, r1, d16
        blx          _exp2
        vmov         d16, r0, r1
        vcvt.u32.f64 s0, d16
        vmov         r0, s0
        pop          {r7, pc}

    _foo2:
    @ BB#0:
        movs         r1, #1
        lsl.w        r0, r1, r0
        bx           lr

You can see that foo2 takes only two instructions (plus the return), whereas foo1 takes many more. It has to move the data into the FP hardware registers (vmov), convert the integer to floating point (vcvt.f64.u32), call the exp2 function, then convert the result back to an unsigned int (vcvt.u32.f64) and move it from the FP hardware registers back to the GP registers.


Yes. By how much, though, I can't say offhand. The easiest way to find out is to benchmark it.

The pow function operates on doubles, at least if it conforms to the C standard. Even if the implementation used a bit shift when it sees base 2, there would still be a test and a branch to reach that path, and by then your simple bit shift would already be done. And that's before we even consider the overhead of the function call.

For the comparison to be fair, I assume you meant 1 << x rather than 1 << 4.

Perhaps the compiler could optimize both, but it is much less likely to optimize away the pow call. If you need the fastest way to compute a power of 2, do it with a shift.

Update: since I mentioned that this is easy to benchmark, I decided to do just that. I happen to have Windows and Visual C++ on hand, so I used those; your results will vary. My program:

    #include <Windows.h>
    #include <cstdio>
    #include <cmath>
    #include <ctime>

    LARGE_INTEGER liFreq, liStart, liStop;

    inline void StartTimer()
    {
        QueryPerformanceCounter(&liStart);
    }

    inline double ReportTimer()
    {
        QueryPerformanceCounter(&liStop);
        double milli = 1000.0 * double(liStop.QuadPart - liStart.QuadPart) / double(liFreq.QuadPart);
        printf("%.3f ms\n", milli);
        return milli;
    }

    int main()
    {
        QueryPerformanceFrequency(&liFreq);

        const size_t nTests = 10000000;
        int sumPow = 0;
        int sumShift = 0;
        double powTime, shiftTime;

        // Make an array of random exponents to use in the tests.
        const size_t nExp = 10000;
        int e[nExp];
        srand((unsigned int)time(NULL));
        for (int i = 0; i < nExp; i++)
            e[i] = rand() % 31;

        // Test pow.
        StartTimer();
        for (size_t i = 0; i < nTests; i++)
        {
            int y = (int)pow(2, (double)e[i % nExp]);
            sumPow += y;
        }
        powTime = ReportTimer();

        // Test shifting.
        StartTimer();
        for (size_t i = 0; i < nTests; i++)
        {
            int y = 1 << e[i % nExp];
            sumShift += y;
        }
        shiftTime = ReportTimer();

        // The compiler shouldn't optimize out our loops if we need to display a result.
        printf("Sum power: %d\n", sumPow);
        printf("Sum shift: %d\n", sumShift);
        printf("Time ratio of pow versus shift: %.2f\n", powTime / shiftTime);

        system("pause");
        return 0;
    }

My output:

    379.466 ms
    15.862 ms
    Sum power: 157650768
    Sum shift: 157650768
    Time ratio of pow versus shift: 23.92

In general, yes, since a bit shift is a very cheap operation for the processor.

On the other hand, many compilers optimize this, so a power-of-two computation may actually be compiled down to a bit shift anyway.


It depends on the compiler, but in general (unless the compiler is completely braindead) yes: the shift is a single processor instruction, while the other is a function call, which involves saving the current state and setting up a stack frame, requiring many more instructions.

