Is using double faster than float?

Double values offer higher precision and are double the size of a float, but are Intel processors optimized for float operations?

That is, are double operations just as fast as, or faster than, float operations for +, -, * and /?

Does the answer change for 64-bit architectures?

+50
c++ intel osx-snow-leopard
Aug 06 '10 at 17:23
7 answers

There is more than one "Intel processor", and they differ in which operations are optimized relative to the others! But on most of them, at the CPU level (specifically, in the FPU), the answer to your question:

- double operations are just as fast or faster than float operations for +, -, * and /?

is "yes" inside the CPU. However, doubling the memory footprint of each number clearly implies a heavier load on the cache and more memory bandwidth to fill and spill those cache lines from/to RAM; and the time you care about floating-point performance is when you are performing many such operations, so memory and cache considerations are crucial.

@Richard's answer points out that there are also other ways to perform FP operations (the SSE instructions), beyond the classic FPU ones.

In the end, you need to benchmark, but my prediction is that for reasonable (i.e., large ;-) benchmarks you will find an advantage in sticking to single precision (assuming, of course, that you don't need the extra bits of accuracy!).

+59
Aug 6 '10 at 17:33

If all floating-point calculations are performed inside the FPU, then no, there is no difference between double and float calculations, since floating-point operations are actually performed with 80 bits of precision on the FPU stack. Entries on the FPU stack are rounded as needed when converting the 80-bit floating-point format to double or float. Moving sizeof(double) bytes to/from RAM versus sizeof(float) bytes is the only speed difference.

If, however, you have a vectorizable calculation, you can use the SSE extensions to perform four float calculations in the same time as two double calculations. Therefore, clever use of SSE instructions and the XMM registers can allow higher throughput in calculations that use only floats.

+21
Aug 6 '10 at 18:00

I just want to add to the already wonderful answers that the __m256? family of single-instruction-multiple-data (SIMD) C++ intrinsic functions operates on either 4 doubles in parallel (e.g. _mm256_add_pd) or 8 floats in parallel (e.g. _mm256_add_ps).

I'm not sure if this translates into an actual speed-up, but it seems you can process 2x as many floats per instruction when using SIMD.

+9
Oct 14 '12 at 1:35

In an experiment adding 3.3 two billion (2,000,000,000) times, the results were:

 Summation time in s: 2.82      summed value: 6.71089e+07  // float
 Summation time in s: 2.78585   summed value: 6.6e+09      // double
 Summation time in s: 2.76812   summed value: 6.6e+09      // long double

So double is faster and is the default in C and C++. It is more portable and the default across all C and C++ library functions. Also, double has significantly higher precision than float: note that the float sum above stopped at 6.71089e+07 instead of reaching 6.6e+09.

Even Stroustrup recommends double over float:

"The exact meaning of single-, double-, and extended-precision is implementation-defined. Choosing the right precision for a problem where the choice matters requires significant understanding of floating-point computation. If you don't have that understanding, get advice, take the time to learn, or use double and hope for the best."

Perhaps the only case where you should use float instead of double is on 64-bit hardware with a modern gcc, simply because float is smaller: double is 8 bytes and float is 4 bytes.

+8
Mar 18 '12 at 18:20

Another point to consider is the use of a graphics processing unit (GPU). I work on a project that is numerically intensive, but we do not need the precision that double offers. We use GPU cards to speed up the processing. CUDA GPUs need a special package to support double, and the amount of local RAM on a GPU is quite fast but rather scarce. As a result, using float also doubles the amount of data we can store on the GPU.

Another point is memory. Floats take up half as much RAM as doubles. If you are dealing with VERY large data sets, this can be a very important factor. If using double means you have to cache to disk rather than work in pure RAM, the difference will be huge.

So, for the application I'm working with, the difference is very important.

+7

The only really useful answer is: only you can tell. You need to benchmark your own scenarios. Small changes in instruction and memory-access patterns can have a significant impact.

It will also make a difference whether you use FPU or SSE hardware (the former does all its work at 80-bit extended precision, so double will be closer, while the latter is natively 32-bit, i.e. float).

Update: s / MMX / SSE / as indicated in another answer.

+5
Aug 6 '10 at 17:27

Floating point is usually an extension to a general-purpose CPU, so the speed will depend on the hardware platform used. If the platform supports floating point, I would be surprised if there were any difference.

+2
Aug 6 '10 at 17:33
