On some hardware, arithmetic on double values may take longer than on single values, but many FPUs use one internal format (e.g. the 80-bit extended-precision type on x86) for calculations regardless of the type of the data in memory. So "the FPU computes faster with single precision" is usually not the reason to prefer single precision on most modern hardware today.
However, in addition to the "uses less memory" reasons described in other answers, there is a very practical reason when it comes to vector SIMD instruction sets such as SSE and AltiVec: single precision can be twice as fast as double precision, because the instructions operate on fixed-size vectors, so twice as many single-precision values fit into one vector, and the processing time per vector usually stays the same.
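To make the "twice as many values per vector" point concrete, here is a minimal sketch counting how many IEEE 754 values fit in one 128-bit register (the width of an SSE XMM or AltiVec register); the 128-bit figure matches the instruction sets named above.

```python
# How many lanes fit in one 128-bit SIMD register,
# assuming IEEE 754 32-bit singles and 64-bit doubles.
VECTOR_BITS = 128

lanes_single = VECTOR_BITS // 32  # single-precision lanes
lanes_double = VECTOR_BITS // 64  # double-precision lanes

print(lanes_single, lanes_double)  # -> 4 2
```

The same instruction processes all lanes at once, which is where the factor-of-two throughput difference comes from.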
For example, with a 128-bit vector unit that can complete one vector multiplication every 2 clock cycles, you get a throughput of two single-precision multiplications per cycle versus one double-precision, since a vector holds 4 singles but only 2 doubles.
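The throughput arithmetic above can be written out as a short calculation; the one-vector-multiply-per-2-cycles rate is the illustrative figure from the example, not a property of any particular CPU.

```python
# Throughput sketch: multiplications per cycle on a 128-bit
# vector unit that retires one vector multiply every 2 cycles
# (assumed figure from the example above).
VECTOR_BITS = 128
CYCLES_PER_VECTOR_MUL = 2

def mults_per_cycle(element_bits):
    lanes = VECTOR_BITS // element_bits
    return lanes / CYCLES_PER_VECTOR_MUL

print(mults_per_cycle(32))  # single precision -> 2.0 per cycle
print(mults_per_cycle(64))  # double precision -> 1.0 per cycle
```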
A similar effect applies to memory bandwidth and is not specific to vector processing: large arrays of doubles not only take up twice as much space, they can also take twice as long to process when your algorithm is bandwidth-bound (which is increasingly likely given the growing size and decreasing latency of vector processing units).
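A back-of-the-envelope model shows why a bandwidth-bound loop over doubles takes roughly twice as long: the time to stream an array scales with its byte size. The 20 GB/s bandwidth and 100-million-element array below are illustrative assumptions, not measurements.

```python
# Time to stream an array through memory at a fixed bandwidth.
# Bandwidth and array size are assumed, illustrative values.
def stream_seconds(n_elements, bytes_per_element, bandwidth_bytes_per_s):
    return n_elements * bytes_per_element / bandwidth_bytes_per_s

N = 100_000_000   # assumed array length
BW = 20e9         # assumed 20 GB/s memory bandwidth

print(stream_seconds(N, 4, BW))  # 32-bit floats  -> 0.02 s
print(stream_seconds(N, 8, BW))  # 64-bit doubles -> 0.04 s
```

Whatever the actual bandwidth, the ratio between the two times is exactly the ratio of element sizes, i.e. 2x.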
BeeOnRope