Inconsistent results with exAllArithmeticExceptions in Win32 and Win64

My colleague noticed a mismatch between the Win32 and Win64 code that Delphi generates for NaN handling. Take the following code as an example. When compiled for 32-bit, neither message is printed, but when compiled for 64-bit, both comparisons evaluate to true and both messages are printed.

 program TestNaNs;

 {$APPTYPE CONSOLE}

 {$R *.res}

 uses
   System.SysUtils,
   System.Math;

 var
   nanDouble: Double;
   zeroDouble: Double;
   nanSingle: Single;
   zeroSingle: Single;
 begin
   SetExceptionMask(exAllArithmeticExceptions);
   nanSingle := NaN;
   zeroSingle := 0.0;
   if nanSingle <> zeroSingle then
     WriteLn('nanSingle <> zeroSingle');

   nanDouble := NaN;
   zeroDouble := 0.0;
   if nanDouble <> zeroDouble then
     WriteLn('nanDouble <> zeroDouble');
   ReadLn;
 end.

My understanding of the IEEE standard is that, when an operand is NaN, <> should return true and all other comparisons should return false. So in this case, it looks like the 64-bit version is correct and the 32-bit version is wrong. The generated code is very different between the two, with the 64-bit version using SSE instructions.

For 32 bits:

 TestNaNs.dpr.21: if nanSingle <> zeroSingle then
 0041A552 D905E01E4200 fld dword ptr [$00421ee0]
 0041A558 D81DE41E4200 fcomp dword ptr [$00421ee4]
 0041A55E 9B wait
 0041A55F DFE0 fstsw ax
 0041A561 9E sahf
 0041A562 7419 jz $0041a57d

and for 64 bits:

 TestNaNs.dpr.21: if nanSingle <> zeroSingle then
 000000000042764E F3480F5A05C9ED0000 cvtss2sd xmm0,qword ptr [rel $0000edc9]
 0000000000427657 F3480F5A0DC4ED0000 cvtss2sd xmm1,qword ptr [rel $0000edc4]
 0000000000427660 660F2EC1 ucomisd xmm0,xmm1
 0000000000427664 7A02 jp Project63 + $68
 0000000000427666 7420 jz Project63 + $88

My question is this: is this a problem with the Delphi compiler, or a caveat of Intel processors?

delphi delphi-10-seattle
1 answer

The IEEE 754 standard defines arithmetic formats, operations, rounding rules, exceptions and so on for floating-point computation. The Delphi compiler implements floating-point arithmetic on top of the available hardware units. For the 32-bit Windows compiler that is the x87 unit, and for the 64-bit Windows compiler it is the SSE unit. Both of these units comply with the IEEE 754 standard.

The difference that you observe arises at the level of language implementation. Let's look at the two versions in more detail.

32-bit Windows compiler

The comparison operator is compiled as follows:

 TestNaNs.dpr.19: if nanDouble <> zeroDouble then
 0041C4C8 DD05C03E4200 fld qword ptr [$00423ec0]
 0041C4CE DC1DC83E4200 fcomp qword ptr [$00423ec8]
 0041C4D4 9B wait
 0041C4D5 DFE0 fstsw ax
 0041C4D7 9E sahf
 0041C4D8 7419 jz $0041c4f3

The Intel Software Developer's Manual says that an unordered comparison result is indicated by the flags C3, C2 and C0 all being set to 1. The full table is here:

 Condition        C3 C2 C0
 ST(0) > Source   0  0  0
 ST(0) < Source   0  0  1
 ST(0) = Source   1  0  0
 Unordered        1  1  1

When you inspect the FPU under the debugger, you can see that this is the case here. The flags are then tested and the jump performed as follows:

 0041C4D5 DFE0 fstsw ax
 0041C4D7 9E sahf
 0041C4D8 7419 jz $0041c4f3

This transfers various bits from the FPU status register to the CPU flags; see the manual for the exact details of which flag goes where. The branch is taken if ZF is set. The value of ZF comes from the C3 FPU flag, which, as the table above shows, is set in the unordered case.

In fact, the entire branching code can be expressed in pseudocode as:

 jump if C3 = 1

So, looking at the table above, it is clear that if either operand is NaN, any floating-point equality comparison evaluates as equal.
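The 32-bit branch logic described above can be modelled in a few lines of Delphi. This is only a sketch of the reasoning, not the compiler's actual implementation: the type `TFpuCompare` and the functions `C3Flag` and `NotEqual32` are names invented here for illustration, and the flag values are taken from the table above.

```delphi
program X87BranchModel;

{$APPTYPE CONSOLE}

type
  // Illustrative names (not from the RTL): the four outcomes of fcomp.
  TFpuCompare = (fcGreater, fcLess, fcEqual, fcUnordered);

// C3 as listed in the table above: set for "equal" and for "unordered".
function C3Flag(Outcome: TFpuCompare): Boolean;
begin
  Result := Outcome in [fcEqual, fcUnordered];
end;

// Model of "if a <> b then": jz skips the body when ZF (copied from C3)
// is set, so the body runs only when C3 is clear.
function NotEqual32(Outcome: TFpuCompare): Boolean;
begin
  Result := not C3Flag(Outcome);
end;

begin
  WriteLn('greater:   ', NotEqual32(fcGreater));
  WriteLn('less:      ', NotEqual32(fcLess));
  WriteLn('equal:     ', NotEqual32(fcEqual));
  WriteLn('unordered: ', NotEqual32(fcUnordered));
end.
```

In this model the unordered case gives the same answer as the equal case, which matches the 32-bit behaviour reported in the question.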

64-bit Windows compiler

The comparison operator is compiled as follows:

 TestNaNs.dpr.19: if nanDouble <> zeroDouble then
 0000000000428EB8 F20F100548E50000 movsd xmm0, qword ptr [rel $0000e548]
 0000000000428EC0 660F2E0548E50000 ucomisd xmm0, qword ptr [rel $0000e548]
 0000000000428EC8 7A02 jp TestNaNs + $5C
 0000000000428ECA 7420 jz TestNaNs + $7C

The comparison is performed by the ucomisd instruction. The manual gives this pseudocode:

 RESULT ← UnorderedCompare(SRC1[63:0] <> SRC2[63:0]) {
 (* Set EFLAGS *)
 CASE (RESULT) OF
   GREATER_THAN: ZF, PF, CF ← 000;
   LESS_THAN: ZF, PF, CF ← 001;
   EQUAL: ZF, PF, CF ← 100;
   UNORDERED: ZF, PF, CF ← 111;
 ESAC;
 OF, AF, SF ← 0; }

Note that for this instruction the flags ZF, PF and CF are set in exactly the same pattern as the flags C3, C2 and C0 on the x87 unit.

The branch is handled by this code:

 0000000000428EC8 7A02 jp TestNaNs + $5C
 0000000000428ECA 7420 jz TestNaNs + $7C

Note that the parity flag PF is checked first (the jp instruction), and only then the zero flag ZF (the jz instruction). In other words, the compiler emits code to handle the unordered case (i.e., one of the operands is NaN), and jp deals with it first. Once that case is out of the way, the compiler checks ZF, which (since NaNs have already been dealt with) is set if and only if both operands are equal.
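The same kind of model can be written for the 64-bit sequence. Again, this is a hedged sketch rather than the compiler's implementation: `TSseCompare`, `ZeroFlag`, `ParityFlag` and `NotEqual64` are hypothetical names, the flag values come from the ucomisd pseudocode above, and it assumes (as the encoding `7A02` suggests) that jp simply jumps over the two-byte jz into the body of the if statement.

```delphi
program SseBranchModel;

{$APPTYPE CONSOLE}

type
  // Illustrative names (not from the RTL): the four outcomes of ucomisd.
  TSseCompare = (scGreater, scLess, scEqual, scUnordered);

// ZF per the ucomisd pseudocode: set for EQUAL and for UNORDERED.
function ZeroFlag(Outcome: TSseCompare): Boolean;
begin
  Result := Outcome in [scEqual, scUnordered];
end;

// PF per the ucomisd pseudocode: set only for UNORDERED.
function ParityFlag(Outcome: TSseCompare): Boolean;
begin
  Result := Outcome = scUnordered;
end;

// Model of "if a <> b then" as compiled for 64-bit.
function NotEqual64(Outcome: TSseCompare): Boolean;
begin
  // jp: unordered operands go straight into the body (assumed jump target).
  if ParityFlag(Outcome) then
    Exit(True);
  // jz: otherwise the body is skipped only when the operands are equal.
  Result := not ZeroFlag(Outcome);
end;

begin
  WriteLn('greater:   ', NotEqual64(scGreater));
  WriteLn('less:      ', NotEqual64(scLess));
  WriteLn('equal:     ', NotEqual64(scEqual));
  WriteLn('unordered: ', NotEqual64(scUnordered));
end.
```

Here the extra parity check is what separates the unordered case from the equal case, even though both set ZF.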

Conclusion

The difference in behaviour boils down to the two compilers making different decisions about how to implement the comparison operators. In both cases the hardware is IEEE 754 compliant and is perfectly capable of comparing NaNs as the standard specifies.

My best guess is that the decisions for the 32-bit compiler were made a very long time ago. Some of these decisions are dubious. In my opinion, an equality comparison with a NaN operand should evaluate as not equal, regardless of the other operand. The weight of history, felt through the desire to maintain backward compatibility, means that these dubious decisions have never been revisited.

When the 64-bit compiler was created, much more recently, the Embarcadero engineers decided to fix some of these mistakes. They presumably felt that the break to a new architecture allowed them to do so.

In an ideal world, the 32-bit compiler could be configured to behave the same way as the 64-bit compiler by means of a compiler switch.
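In the absence of such a switch, one possible workaround is to test for NaN explicitly so that the result does not depend on how the compiler translates <>. The sketch below is a suggestion rather than anything the compiler vendor prescribes: `NotEqualIEEE` is a hypothetical helper name, and it relies only on `IsNan` from System.Math.

```delphi
program PortableNaNCompare;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.Math;

// Hypothetical helper: "not equal" that treats any NaN operand as
// unequal to everything, on both the 32-bit and 64-bit compilers.
function NotEqualIEEE(const A, B: Double): Boolean;
begin
  Result := IsNan(A) or IsNan(B) or (A <> B);
end;

var
  nanDouble: Double;
  zeroDouble: Double;
begin
  SetExceptionMask(exAllArithmeticExceptions);
  nanDouble := NaN;
  zeroDouble := 0.0;
  // The explicit IsNan test decides the NaN case before <> is reached.
  if NotEqualIEEE(nanDouble, zeroDouble) then
    WriteLn('nanDouble <> zeroDouble');
end.
```

The helper costs an extra check per comparison, so it is probably best reserved for code paths where NaN operands are actually expected.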
