Is the x87 FP stack still relevant?

I notice that compilers generate code that targets SIMD registers whenever double arithmetic is used. This applies to non-optimized as well as optimized code. Does this mean that the x87 FP unit can be considered obsolete and present only for backward compatibility?

I also notice that other "popular" platforms rely on their respective SIMD implementations rather than on an FP unit designed as a stack.

Also, SIMD implementations are at least 128 bits wide. Does this mean that the (internal) precision of operations is higher than for the x87 FP unit?

I also wonder about performance, throughput, and latency: SIMD was designed with vectors in mind, so how well does it handle scalars?

+7
assembly stack floating-point obsolete x87
1 answer

Also, SIMD implementations are at least 128 bits wide. Does this mean that the (internal) precision of operations is higher than for the x87 FP unit?

The width of a SIMD register is not the width of a single component of the vector it holds. The widely available SIMD instruction sets offer at most IEEE 754 binary64 (64-bit) components. This is not as good as the historical 80-bit extended format, for either precision or range.
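As a rough illustration of the difference between register width and component width, here is a minimal C sketch. The helper names are hypothetical, and the values reported by `<float.h>` depend on the compiler and target: on typical x86 toolchains where long double maps to the x87 extended format, the long double significand has 64 bits, while some compilers make long double the same as double.

```c
#include <float.h>

/* Hypothetical helpers, for illustration only. */

/* Number of binary64 components that fit in a SIMD register of the given width. */
int binary64_lanes(int register_bits) {
    return register_bits / 64;
}

/* Significand width of double, in bits (53 for IEEE 754 binary64). */
int double_significand_bits(void) {
    return DBL_MANT_DIG;
}

/* Significand width of long double, in bits (64 where it is the x87 80-bit format). */
int long_double_significand_bits(void) {
    return LDBL_MANT_DIG;
}
```

A 128-bit register therefore holds two binary64 values, each with the same 53-bit significand as a scalar double; the extra register width adds lanes, not precision.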

Many C compilers make the 80-bit format available as the long double type. I use it often. It is useful for most intermediate computations: using it helps make the final result more accurate, even when that result is returned as a binary64 double. One example is the function in this question, for which a mathematically intuitive property of the final result holds if the intermediate computations are done in long double, but not if they are done in the same double type as the inputs and outputs.

Similarly, among the many constraints that had to be balanced when choosing the parameters of the 80-bit extended format, one consideration is that the format is well suited to computing a binary64 pow() function by composing the 80-bit expl() and logl(). The extra precision is needed to get good accuracy in the final result.

However, I should note that when the “intermediate” computation is a single basic operation, it is better not to go through extended precision. In other words, when x and y are of type double, the result of (double)(x * (long double)y) is very slightly less accurate than that of x * y. These two expressions almost always produce the same result, and in the rare cases where they differ, x * y is very slightly more accurate. This phenomenon is called double rounding.
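The two expressions can be compared with a sketch like the one below. The function names are hypothetical. It assumes that the compiler evaluates x * y directly in binary64 (as with SSE2 code generation) and that long double is wider than double; if double expressions are themselves evaluated in extended precision, the two functions behave identically.

```c
#include <stddef.h>

/* One rounding: the exact product is rounded directly to binary64. */
double mul_direct(double x, double y) {
    return x * y;
}

/* Two roundings: the exact product is first rounded to long double,
   then that value is rounded again to binary64. */
double mul_via_extended(double x, double y) {
    return (double)(x * (long double)y);
}

/* Count how many input pairs give different results for the two versions. */
size_t count_mismatches(const double *xs, const double *ys, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (mul_direct(xs[i], ys[i]) != mul_via_extended(xs[i], ys[i])) {
            count++;
        }
    }
    return count;
}
```

Products that are exactly representable in binary64 never differ; mismatches can only occur when the first rounding lands exactly halfway between two binary64 values, which is why they are rare.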

+11