I am developing a new microprocessor instruction set ( www.forwardcom.info ), and I want to use NAN propagation for tracing errors. However, there are a few oddities in the IEEE 754 floating point standard that prevent this.
First, the reason why I want to use NAN propagation rather than error trapping is that I have variable-length vector registers. If, for example, I have a float vector with 8 elements, and I have 1/0 in the first element and 0/0 in the sixth element, then I get only one trap. But if I run the same program on a computer with half the vector length, then I get two traps: one for infinity and one for NAN. I want the result to be independent of the vector length, so I need to rely on the propagation of NAN and INF rather than on trapping. The NAN and INF values will propagate through the calculations so that they can be checked in the final result. The NAN representation contains some bits, called the payload, that can be used to carry information about the source of the error.
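To make the vector-length dependence concrete, here is a minimal sketch. It assumes a simplified trap model in which each vector instruction raises at most one trap no matter how many of its elements fault; the function name `count_traps` and the list-of-booleans representation are hypothetical and only serve the illustration.

```python
def count_traps(faults, vector_length):
    """Count traps when every vector instruction traps at most once.

    `faults[i]` is True if element i raises a floating point exception.
    The data is processed in chunks of `vector_length` elements, one
    chunk per vector instruction (an assumed, simplified model).
    """
    traps = 0
    for start in range(0, len(faults), vector_length):
        if any(faults[start:start + vector_length]):
            traps += 1
    return traps


# 1/0 in the first element, 0/0 in the sixth element.
faults = [True, False, False, False, False, True, False, False]

print(count_traps(faults, 8))  # 1
print(count_traps(faults, 4))  # 2
```

The same data gives a different number of traps for the two vector lengths, whereas the NAN and INF values left in the result vector are identical in both cases.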
However, there are two problems with the IEEE 754 floating point standard that prevent reliable propagation of NAN values.
The first problem is that the combination of two NANs with different payloads is just one of the two values. For example, NAN1 + NAN2 gives NAN1. This violates the fundamental principle that a + b = b + a. The compiler can swap the operands, so you may get different results with different compilers or with different optimization options. I would prefer to get the bitwise OR combination of the two payloads. This works well if you have one bit for each error condition, but of course not if the payload contains more complex information (for example, NAN boxing in languages with dynamic types). The standards committee actually discussed the OR'ing solution (see http://grouper.ieee.org/groups/754/email/msg01094.html ). I do not know why they rejected this proposal.
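A minimal sketch of the OR'ing rule on single precision bit patterns follows. It assumes the usual binary32 layout with the top fraction bit as the quiet bit and the remaining 22 fraction bits as payload, and it ignores the sign bit. The helper names (`make_nan`, `combine_nans`, and so on) are hypothetical and not part of any existing specification.

```python
EXP_MASK = 0x7F800000      # all exponent bits set: INF or NAN
FRAC_MASK = 0x007FFFFF     # the 23 fraction bits
QUIET_BIT = 0x00400000     # top fraction bit, marks a quiet NAN
PAYLOAD_MASK = 0x003FFFFF  # remaining 22 fraction bits


def make_nan(payload_bits):
    """Bit pattern of a quiet binary32 NAN carrying the given payload."""
    return EXP_MASK | QUIET_BIT | (payload_bits & PAYLOAD_MASK)


def is_nan(bits):
    """True if the exponent is all ones and the fraction is nonzero."""
    return (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0


def payload(bits):
    return bits & PAYLOAD_MASK


def combine_nans(a_bits, b_bits):
    """Result of an operation on two NANs: OR the payloads together."""
    if not (is_nan(a_bits) and is_nan(b_bits)):
        raise ValueError("both operands must be NAN")
    return make_nan(payload(a_bits) | payload(b_bits))


nan1 = make_nan(0b01)
nan2 = make_nan(0b10)
print(hex(combine_nans(nan1, nan2)))  # 0x7fc00003
print(combine_nans(nan1, nan2) == combine_nans(nan2, nan1))  # True
```

Because bitwise OR is commutative, the result no longer depends on the operand order the compiler happens to choose.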
The second problem is that the min and max functions do not propagate a NAN if only one of the inputs is a NAN. In other words, min(1, NAN) = 1. Reliable NAN propagation would of course require that min(1, NAN) = NAN. I have no idea why the standard specifies this behavior.
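The two behaviors can be compared with a small model. This is a sketch in plain Python floats, not hardware code: `min_num` imitates the rule described above, where a single quiet NAN input is dropped, and `min_propagating` is a hypothetical name for the behavior wanted here.

```python
import math


def min_num(a, b):
    """Model of the standard's rule: a lone NAN input is ignored."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a < b else b


def min_propagating(a, b):
    """Model of the desired rule: any NAN input yields a NAN result."""
    if math.isnan(a):
        return a
    if math.isnan(b):
        return b
    return a if a < b else b


nan = float("nan")
print(min_num(1.0, nan))          # 1.0, the error is silently lost
print(min_propagating(1.0, nan))  # nan, the error survives
```

With the first rule, a NAN produced early in a calculation can disappear before the final result is checked.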
In the new microprocessor system, named ForwardCom, I want to avoid these unfortunate quirks and specify that NAN1 + NAN2 = NAN1 | NAN2 and min(1, NAN) = NAN.
And now to my questions: First, do I need an option switch to change between strict IEEE compliance and reliable NAN propagation? Quoting the standard:
Quiet NaNs should, by means left to the implementer's discretion, afford retrospective diagnostic information inherited from invalid or unavailable data and results. To facilitate propagation of diagnostic information contained in NaNs, as much of that information as possible should be preserved in NaN results of operations.
Note that the standard says "should" here, where it says "shall" elsewhere. Does this mean that I am allowed to deviate from the recommendation?
And the second question: I cannot find any examples where NAN propagation is actually used for tracing errors. Perhaps this is because of the weaknesses in the standard. I want to define different payload bits for different error conditions, for example:
0/0, 0 * ∞, ∞ / ∞, modulo(1, 0), modulo(∞, 1), ∞ - ∞ and other errors involving infinity and division by zero.
sqrt(-1), log(-1), pow(-1, 0.1) and other errors arising from logarithms and powers.
asin(2) and other mathematical functions.
Explicit assignment. This can be useful when a variable is initialized to NAN.
There are many free bits for custom error codes.
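One way such a payload layout could look is sketched below. The bit positions and the names in `NanCause` are hypothetical, chosen only for illustration and not taken from ForwardCom or from any standard; the sketch again works on single precision bit patterns with the quiet bit set.

```python
from enum import IntFlag

QUIET_NAN = 0x7FC00000  # binary32 quiet NAN with an empty payload


class NanCause(IntFlag):
    """Hypothetical one-bit-per-cause payload layout."""
    DIV_OR_INF = 1 << 0  # 0/0, 0 * inf, inf / inf, inf - inf, ...
    LOG_OR_POW = 1 << 1  # sqrt(-1), log(-1), pow(-1, 0.1), ...
    MATH_FUNC = 1 << 2   # asin(2) and other mathematical functions
    ASSIGNED = 1 << 3    # NAN assigned explicitly, e.g. at initialization


ALL_CAUSES = 0xF  # mask covering the four bits defined above


def nan_bits(cause):
    """Bit pattern of a quiet NAN tagged with the given cause(s)."""
    return QUIET_NAN | int(cause)


def causes(bits):
    """Decode the cause flags stored in a NAN bit pattern."""
    return NanCause(bits & ALL_CAUSES)


# OR'ing two NAN patterns keeps both causes visible in the result.
merged = nan_bits(NanCause.DIV_OR_INF) | nan_bits(NanCause.LOG_OR_POW)
print(NanCause.DIV_OR_INF in causes(merged))  # True
print(NanCause.MATH_FUNC in causes(merged))   # False
```

Payload bits above the four used here would remain free for user-defined error codes.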
Has this been done before, or do I need to invent all of this from scratch? Are there any issues that I have to consider (other than NAN boxing in some languages)?