Maximum and minimum exponents in double-precision floating point

According to IEEE Std 754-2008, the exponent field of the binary64 (double-precision) format is 11 bits wide and is biased by 1023. The standard also states that the maximum exponent is 1023 and the minimum is -1022. Why is the maximum exponent not:

2^10 + 2^9 + 2^8 + 2^7 + 2^6 + 2^5 + 2^4 + 2^3 + 2^2 + 2^1 + 2^0 - 1023 = 1024 

And why is the minimum exponent not:

 0 - 1023 = -1023 

Thanks!

1 answer

The exponent field has two reserved values: one for encoding 0 and subnormal numbers, and one for encoding ∞ and NaN. As a result, the range for normal numbers is two smaller than you would otherwise expect. See section 3.4 of the IEEE 754 standard (w, the number of bits in the exponent field, is 11 for binary64):

The range of the encoding's biased exponent E shall include:

- every integer from 1 to 2^w - 2, inclusive, for encoding normal numbers

- reserved value 0 for encoding ± 0 and subnormal numbers

- reserved value 2^w - 1 for encoding ± ∞ and NaNs.
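
To make those ranges concrete: with w = 11 the bias is 1023, so normal numbers use E = 1 through 2046, which gives unbiased exponents from 1 - 1023 = -1022 to 2046 - 1023 = 1023. Here is a minimal Python sketch that checks this. It assumes Python's float is binary64 (true for CPython on common platforms), and biased_exponent is a helper name I made up for illustration, not something from the standard or a library.

```python
import struct
import sys

W = 11                    # exponent field width of binary64, in bits
BIAS = 2 ** (W - 1) - 1   # 1023

def biased_exponent(x: float) -> int:
    """Return the raw 11-bit exponent field E of a float (hypothetical helper)."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    # Drop the 52 fraction bits, then mask off the sign bit.
    return (bits >> 52) & (2 ** W - 1)

# Normal numbers use E = 1 .. 2^w - 2, so the unbiased range is emin .. emax.
emin = 1 - BIAS               # -1022
emax = (2 ** W - 2) - BIAS    # 1023

print(emin, emax)
print(biased_exponent(sys.float_info.min))  # smallest positive normal: E = 1
print(biased_exponent(sys.float_info.max))  # largest finite value: E = 2046
print(biased_exponent(0.0))                 # reserved value 0 (zero)
print(biased_exponent(5e-324))              # reserved value 0 (subnormal)
print(biased_exponent(float("inf")))        # reserved value 2047 (infinity)
print(biased_exponent(float("nan")))        # reserved value 2047 (NaN)
```

So the two values from the question, E = 0 (which would give -1023) and E = 2047 (which would give 1024), are exactly the two the standard sets aside.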
