What is the point of unions in C?

I'm going through O'Reilly's Practical C Programming book and have read the K&R book, The C Programming Language, and I am really having trouble grasping the concept of unions.

They take the size of the largest data type that makes them up... and whichever member is assigned last overwrites the rest... but why not just allocate and free memory as needed?

The book mentions that unions are used in communication, where you need to set flags of the same size; and a website I googled says they can eliminate odd-sized chunks of memory... but are they of any use in a modern, non-embedded memory space?

Is there something tricky you can do with them and CPU registers? Are they simply a holdover from an earlier era of programming? Or do they, like the infamous goto, still have some powerful use (possibly in tight memory spaces) that makes them worth keeping around?

5 answers

Well, you almost answered your own question: memory. Back in the day, memory was scarce, and even saving a few kilobytes was useful.

But even today there are scenarios where unions are useful. For example, if you'd like to implement some kind of variant data type, the simplest way to do it is with a union.

This might not sound like much, but suppose you want a variable that stores either a 4-character string (such as an ID) or a 4-byte number (which could be a hash or just a plain number).

If you use a classic struct, it will be at least 8 bytes long (more if you're unlucky and padding bytes are added). With a union it is only 4 bytes. So you save 50% of the memory, which is not much for one instance, but imagine having a million of them.

While you can achieve similar things by casting or subclassing, a union is still one of the easiest ways to do it.

+5
source

One use of unions is to have two variables occupy the same space, with a second variable in the struct determining which data type you want to read.

E.g. you could have a boolean "isDouble" and a union "doubleOrLong" that holds both a double and a long. If isDouble == true, interpret the union as a double; otherwise interpret it as a long.

Another use of unions is to access data types in different representations. For instance, if you know how a double is laid out in memory, you can put a double in a union, access it as a different data type such as a long (assuming long is as wide as double on your platform), and get direct access to its bits (its mantissa, its sign, its exponent, whatever) and manipulate them directly.

You don't really need this much nowadays, since memory is so cheap, but in embedded systems it has its uses.

+1
source

The Windows API uses unions a lot. LARGE_INTEGER is an example of such a use. Basically, if the compiler supports 64-bit integers, use the QuadPart member; otherwise, set the low DWORD and the high DWORD manually.

0
source

It's not really a holdover, since the C language was created in 1972, when memory was a real concern.

You could argue that in a modern, non-embedded space you might not want to use C as a programming language to begin with. If you have picked C as your implementation language, you are looking to harness its benefits: it is efficient and close to the metal, which results in tight, fast binaries.

So when you choose C, you still want to take advantage of its benefits, which include memory efficiency. Unions work very well for that: they give you some degree of type safety while keeping the memory footprint as small as possible.

0
source

One place where I have seen it used is in the Doom 3 / idTech 4 implementation of Fast Inverse Square Root.

For those unfamiliar with this algorithm, it essentially requires treating a floating-point number as an integer. The old Quake (and earlier) version of the code does this the following way:

    float y = 2.0f;

    // treat the bits of y as an integer
    long i = * ( long * ) &y;

    // do some stuff with i

    // treat the bits of i as a float
    y = * ( float * ) &i;

github source

This code takes the address of the floating-point number y, casts it to a pointer to a long (i.e., a 32-bit integer in the Quake days), and dereferences it into i. It then does some incredibly bizarre bit-twiddling stuff, and then does the reverse.

There are two drawbacks to this. One is that the convoluted address-of, cast, dereference process forces the value of y to be read from memory rather than from a register [1], and the same happens on the way back. However, on Quake-era computers, floating-point registers and integer registers were completely separate, so you pretty much had to push to memory and back to deal with this restriction.

The second is that, at least in C++, this kind of casting is deeply frowned upon, even when you are doing what amounts to voodoo, such as this function. I'm sure there are more compelling arguments, but I'm not sure what they are :)

So, in Doom 3, id included the following bit in their new implementation (which uses a different set of bit twiddling, but a similar idea):

    union _flint {
        dword i;
        float f;
    };

    ...

    union _flint seed;
    seed.i = /* look up some tables to get this */;
    double r = seed.f; // <- access the bits of seed.i as a floating point number

github source

Theoretically, on an SSE2 machine, this can be accessed through a single register; in practice, I'm not sure whether any compiler would do this. It is still somewhat cleaner code, in my opinion, than the casting games in the earlier Quake version.


[1] Ignoring "sufficiently advanced compiler" arguments.
0
source
