Exponent and Fraction Sizes in float256

Question


The table below shows what I am after:

╔══════════╦══════╦══════════╦══════════╗
║ name     ║ sign ║ exponent ║ fraction ║
╠══════════╬══════╬══════════╬══════════╣
║ float16  ║ 1    ║ 5        ║ 10       ║
╠══════════╬══════╬══════════╬══════════╣
║ float32  ║ 1    ║ 8        ║ 23       ║
╠══════════╬══════╬══════════╬══════════╣
║ float64  ║ 1    ║ 11       ║ 52       ║
╠══════════╬══════╬══════════╬══════════╣
║ float128 ║ 1    ║ 15       ║ 112      ║
╠══════════╬══════╬══════════╬══════════╣
║ float256 ║ 1    ║ ????     ║ ????     ║
╠══════════╬══════╬══════════╬══════════╣
║ float512 ║ 1    ║ ????     ║ ????     ║
╚══════════╩══════╩══════════╩══════════╝

My question is how to calculate the number of exponent and fraction bits from the total number of bits, e.g. 256, 512, or 1024.

+4

floating point

Muhammad aladdin Aug 14 '11 at 20:59


3 answers

There is no 256-bit floating-point format in IEEE 754-2008.

The number of bits in each format is not calculated; the sizes were chosen arbitrarily to give a certain precision and range. If you want to create your own 256-bit floating-point format, you can simply choose the sizes that give you the precision and range you want.

0

Guffa Aug 14 '11 at 21:09


The values in the table are specified in the IEEE 754-2008 standard, which goes up to 128 bits. If you have hardware or software that implements floating point with more bits than that, you need to consult its documentation.

0

Antti Aug 14 '11 at 21:13


Stephen Canon · Accepted Answer · 2011-08-14T21:28:53+0000

Early drafts of the IEEE-754 (2008) revision included guidelines for what the exponent and significand field widths of floating-point formats of arbitrary width should be. This was not a strict requirement, just recommended practice. It was judged too cumbersome for too little benefit, so it was dropped from the standard and replaced by:

Language standards should define mechanisms supporting extendable precision for each supported radix. Language standards supporting extendable precision shall permit users to specify p and emax. Language standards shall also allow the specification of an extendable precision by specifying p alone; in this case emax shall be defined by the language standard to be at least 1000 × p when p is ≥ 237 bits in a binary format or p is ≥ 51 digits in a decimal format.

(3.7 Extended and extendable precisions, p14).

However, the standard still defines (unnecessarily) "interchange formats" for every width that is a multiple of 32 bits greater than 128, in the tables in section 3.6 (p13). In particular, a binary format of width k has round(4*log2(k)) - 13 exponent bits. For the specific case k=256 this gives:

 exponent: round(4*log2(256)) - 13 = 32 - 13 = 19
 significand: 256 - 1 - 19 = 236

For a 384-bit format, the same formula gives an exponent width of:

 round(4*log2(384)) - 13 = round(34.339850002884624) - 13 = 21 bits
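
The same arithmetic can be sketched in a few lines of Python for other widths. This is only an illustration of the formula quoted above, not code from the standard: the function name `interchange_widths` is made up for this example, and the range check (widths of at least 128 bits that are multiples of 32) is an assumption based on where the answer says the formula applies.

```python
import math


def interchange_widths(k):
    """Return (exponent_bits, fraction_bits) for a binary format of width k.

    Hypothetical helper: applies round(4*log2(k)) - 13 for the exponent and
    gives the remaining bits, minus the sign bit, to the fraction field.
    """
    if k < 128 or k % 32 != 0:
        raise ValueError("formula is meant for widths >= 128 that are multiples of 32")
    exponent_bits = round(4 * math.log2(k)) - 13
    fraction_bits = k - 1 - exponent_bits  # one bit is reserved for the sign
    return exponent_bits, fraction_bits


if __name__ == "__main__":
    for k in (128, 256, 384, 512, 1024):
        e, f = interchange_widths(k)
        print(f"float{k}: sign=1 exponent={e} fraction={f}")
```

For k=128 this reproduces the float128 row of the question's table (15 and 112), and for k=256 it matches the worked example (19 and 236). It does not reproduce the float16 and float32 rows, which is why the sketch rejects widths below 128.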

Keep in mind that there are many arbitrary-precision floating-point packages that do not follow these guidelines. This is just the definition of the "binary256" interchange format, not what any particular implementation necessarily uses.
