Why are the sizes of all data types powers of 2?

Why do data types always have sizes that are powers of 2?

Let's take two examples:

short int: 16 bits

char: 8 bits

Why aren't they like the following?

short int: 12 bits
+7
9 answers

This is an implementation detail, and it is not always the case. Some exotic architectures have non-power-of-two data types. 36-bit words, for example, were common at one stage.

The reason powers of two are nearly universal these days is that they usually simplify the internal hardware implementation. As a hypothetical example (I don't do hardware, so I must admit this is mostly guesswork), the part of an opcode that indicates how large one of its arguments is could be stored as the power-of-two exponent of the argument's size in bytes; thus two bits are enough to express whether the argument is 8, 16, 32, or 64 bits, and the circuitry required to convert that into the corresponding latch signals would be quite simple.
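
To make that guess concrete, here is a small sketch in C (the encoding and the field name are hypothetical, purely for illustration): a 2-bit size field expands into an 8/16/32/64-bit operand width with a single shift.

    #include <stdio.h>

    /* Hypothetical 2-bit operand-size field, as in the guess above:
       0 -> 8 bits, 1 -> 16 bits, 2 -> 32 bits, 3 -> 64 bits. */
    static unsigned operand_bits(unsigned size_field)
    {
        return 8u << size_field; /* 8 * 2^size_field */
    }

    int main(void)
    {
        for (unsigned f = 0; f < 4; f++)
            printf("field %u -> %u-bit operand\n", f, operand_bits(f));
        return 0;
    }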

+11

The reason the built-in types have these sizes is simply that these are the sizes the processor supports natively, i.e. the fastest and easiest ones. No other reason.

As for structs, you can have variables with (almost) any number of bits, but you will usually want to stick with the built-in types unless there is a really compelling reason not to.

You will also usually want to group members of the same size together and start a structure with its largest members (usually pointers).
This avoids unnecessary padding, and it ensures you don't hit the access penalty some CPUs impose on misaligned fields (some CPUs may even throw an exception on unaligned access, but in that case the compiler would add padding to avoid it).
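
Here is a minimal sketch of that effect, assuming a typical ABI with 8-byte pointers and 4-byte ints (the field names are made up; exact padding is implementation-defined):

    #include <stdio.h>

    /* Small members first: the compiler inserts padding before each
       larger, more strictly aligned field. */
    struct badly_ordered {
        char  flag;  /* 1 byte + 7 bytes padding */
        void *ptr;   /* 8 bytes                  */
        char  tag;   /* 1 byte + 3 bytes padding */
        int   count; /* 4 bytes                  */
    };

    /* Largest members first: same fields, less padding. */
    struct well_ordered {
        void *ptr;   /* 8 bytes                   */
        int   count; /* 4 bytes                   */
        char  flag;  /* 1 byte                    */
        char  tag;   /* 1 byte + 2 bytes tail pad */
    };

    int main(void)
    {
        printf("badly_ordered: %zu bytes\n", sizeof(struct badly_ordered)); /* typically 24 */
        printf("well_ordered:  %zu bytes\n", sizeof(struct well_ordered));  /* typically 16 */
        return 0;
    }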

+3

The sizes of char, short, int, long, etc. differ depending on the platform. 32-bit architectures typically have char = 8, short = 16, int = 32, long = 32; 64-bit architectures typically have char = 8, short = 16, int = 32, long = 64.

Many DSPs do not have power-of-2 types, though. For example, the Motorola DSP56k (a bit dated now) has 24-bit words. A compiler for that architecture (from Tasking) has char = 8, short = 16, int = 24, long = 48. To make things confusing, they made the alignment of char = 24, short = 24, int = 24, long = 48. This is because the chip has no byte addressing: the smallest addressable unit is 24 bits. This has the exciting (annoying) property of requiring lots of divide/modulo-3 operations whenever you really do need to access an 8-bit byte in an array of packed data.
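
As a hedged illustration of that divide/modulo-3 arithmetic, here is a sketch in portable C; the 24-bit words are simulated with uint32_t (on the real DSP, int itself would be 24 bits), and big-endian packing is assumed:

    #include <stdint.h>
    #include <stdio.h>

    /* Fetch the i-th 8-bit byte from data packed three bytes per
       24-bit word. */
    static uint8_t get_packed_byte(const uint32_t *words, size_t i)
    {
        uint32_t word  = words[i / 3];    /* which 24-bit word     */
        unsigned shift = (2 - i % 3) * 8; /* which byte within it  */
        return (word >> shift) & 0xFF;
    }

    int main(void)
    {
        uint32_t packed[] = { 0x414243, 0x444546 }; /* "ABC", "DEF" */
        for (size_t i = 0; i < 6; i++)
            putchar(get_packed_byte(packed, i));
        putchar('\n'); /* prints ABCDEF */
        return 0;
    }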

You will only find non-power-of-2 sizes in special-purpose cores, where the word size is tailored to a specific usage pattern for a performance and/or power benefit. In the case of the 56k, this was because there was a unit that could load two 24-bit quantities and accumulate them into a 48-bit result in a single cycle, using three buses simultaneously. The entire platform was designed around it.

The main reason most general-purpose architectures use powers of 2 is that they have standardized on the octet (the 8-bit byte) as the minimum-size type (aside from flags). There is no reason it couldn't have been 9 bits, and as noted elsewhere, 24- and 36-bit words were once common. This would permeate the rest of the design: if x86 had 9-bit bytes, we would have 36-octet cache lines, 4608-octet pages, and 569 KB would be enough for everyone :) We probably wouldn't have "nibbles", though, since you can't split 9 bits in half.

Doing this now would be all but impossible, though. It's all very well for a system designed that way from the start, but interoperating with data produced by 8-bit-byte systems would be a nightmare. It is already hard enough to parse 8-bit data on a 24-bit DSP.

+3

Well, they are powers of 2 because they are multiples of 8, and this is a slight simplification, because the atomic unit of memory allocation is usually the byte, which (edit: often, but not always) is implemented as 8 bits.

Larger data sizes are built from several bytes at a time, so you could have sizes of 8, 16, 24, 32, ...

Then, for memory access speed, only powers of 2 are used as multipliers of the minimum size (8), which gives data sizes along these lines:

     8 => 8 * 2^0 bits => char
    16 => 8 * 2^1 bits => short int
    32 => 8 * 2^2 bits => int
    64 => 8 * 2^3 bits => long long int
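
You can check these sizes directly with sizeof; the results in the comments assume a typical platform such as x86-64, since the C standard only guarantees minimums:

    #include <stdio.h>

    int main(void)
    {
        printf("char:          %zu bits\n", sizeof(char) * 8);          /* 8  */
        printf("short int:     %zu bits\n", sizeof(short int) * 8);     /* 16 */
        printf("int:           %zu bits\n", sizeof(int) * 8);           /* 32 */
        printf("long long int: %zu bits\n", sizeof(long long int) * 8); /* 64 */
        return 0;
    }
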
+2

8 bits is the most common size for a byte (but not the only one; examples of 9-bit bytes and other byte sizes are not hard to find). Larger data types are almost always a multiple of the byte size, so they will usually be 16, 32, 64, or 128 bits on systems with 8-bit bytes, but not always a power of 2: e.g. 24 bits is common on DSPs, and there are 80-bit and 96-bit floating-point types.

+1

The sizes of the standard integral types are defined as multiples of 8 bits because a byte is 8 bits (with a few extremely rare exceptions) and the processor's data bus is usually a multiple of 8 bits wide.

If you really need 12-bit integers, you can use bit fields in structures (or unions) like this:

    struct mystruct {
        short int twelveBitInt : 12;
        short int threeBitInt  : 3;
        short int bitFlag      : 1;
    };

This can be handy in embedded/low-level environments, but keep in mind that the structure as a whole will still be padded out to a full-sized unit.
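
For instance, this sketch (assuming a compiler that accepts short bit-fields, which is implementation-defined) shows the whole structure still occupying a full short:

    #include <stdio.h>

    /* Same structure as above; whether short bit-fields are allowed,
       and whether they are signed, is implementation-defined. */
    struct mystruct {
        short int twelveBitInt : 12;
        short int threeBitInt  : 3;
        short int bitFlag      : 1;
    };

    int main(void)
    {
        struct mystruct s = { 100, 3, 0 };
        /* The fields total 16 bits, so this typically prints 2. */
        printf("sizeof(struct mystruct) = %zu\n", sizeof(struct mystruct));
        printf("twelveBitInt = %d\n", s.twelveBitInt);
        return 0;
    }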

0

They don't have to be. On some machines and compilers, sizeof(long double) == 12 (96 bits).

0

It is not necessary for all data types to use a power of 2 as their number of bits. For example, a long double uses 80 bits (although how much storage it actually occupies is implementation-dependent).

One benefit you get from using powers of 2 is that larger data types can be composed of smaller ones. For example, 4 chars (8 bits each) can make up an int (32 bits). In fact, some compilers used to simulate 64-bit numbers using two 32-bit ones.
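
A small sketch of both tricks (the byte order here is chosen arbitrarily for the example):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* Four 8-bit bytes assembled into one 32-bit integer. */
        uint8_t b[4] = { 0x12, 0x34, 0x56, 0x78 };
        uint32_t packed = ((uint32_t)b[0] << 24) |
                          ((uint32_t)b[1] << 16) |
                          ((uint32_t)b[2] <<  8) |
                           (uint32_t)b[3];
        printf("packed = 0x%08X\n", (unsigned)packed); /* 0x12345678 */

        /* A 64-bit value simulated from two 32-bit halves, the way
           some older compilers implemented it. */
        uint32_t hi = 0x0000ABCD, lo = 0xEF012345;
        uint64_t wide = ((uint64_t)hi << 32) | lo;
        printf("wide   = 0x%016llX\n", (unsigned long long)wide);
        return 0;
    }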

0

Most of the time, your computer tries to keep every data format either a whole multiple (2, 3, 4, ...) or a whole fraction (1/2, 1/3, 1/4, ...) of its native data size. It does this so that every time it loads N machine words, it loads an integral number of pieces of your data; that way it never has to recombine pieces afterwards.

You can see this on x86, for example:

a char is 1/4 of 32 bits

a short is 1/2 of 32 bits

an int / long is exactly 32 bits

a long long is 2 × 32 bits

a float is exactly 32 bits

a double is 2 × 32 bits

a long double can be three or four times 32 bits, depending on your compiler settings. That is because on a 32-bit machine, loading 96 bits takes three whole machine words (so there is no overhead), while on a 64-bit machine 96 bits would be 1.5 native words, so 128 bits is more efficient (no recombination). The actual data content of a long double on x86 is 80 bits, so either way it is already padded.
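
You can observe that padding directly; the results in the comment assume x86 with GCC (other compilers and flags may differ):

    #include <stdio.h>

    int main(void)
    {
        /* The x86 long double payload is 80 bits, but sizeof reports
           the padded storage: typically 12 bytes with gcc -m32 and
           16 bytes with -m64. */
        printf("sizeof(long double) = %zu bytes (%zu bits)\n",
               sizeof(long double), sizeof(long double) * 8);
        return 0;
    }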

In the end, a computer doesn't always load in its native data size anyway. It first fetches a cache line and then reads from it in native machine words. The cache line is larger, usually around 64 or 128 bytes. It is very useful to have data that fits into it a whole number of times rather than straddling an edge, because straddling means loading two whole cache lines just to read one value. That is why most computer structures are powers of two in size: they fit into any power-of-two storage size at half of it, exactly, or some multiple of it, so you are guaranteed never to sit on a boundary.
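
A brief C11 sketch of that guarantee, assuming a 64-byte cache line and 4-byte ints (the structure name and contents are made up):

    #include <stdalign.h>
    #include <stdio.h>

    /* 64 bytes of data aligned to a 64-byte boundary: it occupies
       exactly one cache line and can never straddle two. */
    struct hot_data {
        alignas(64) int values[16]; /* 16 * 4 bytes = 64 bytes */
    };

    int main(void)
    {
        printf("sizeof  = %zu\n", sizeof(struct hot_data));  /* 64 */
        printf("alignof = %zu\n", alignof(struct hot_data)); /* 64 */
        return 0;
    }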

0
