C++ 2.5-byte (20-bit) integer

I know this sounds odd, but I need it to optimize storage. Is there a good way to implement it in C++?

It should be flexible enough that I can use it as a regular data type, for example vector<int20>, operator overloading, etc.

+4
10 answers

If your concern is storage, I suspect you need quite a lot of 20-bit variables. How about storing them in pairs? You could create a class representing two such variables and store them in 2.5 + 2.5 = 5 bytes.

For convenient access to the variables, you can overload the [] operator so that you can write:

 int fst = pair[0]; int snd = pair[1]; 

Since you may want to allow manipulations such as

 pair[1] += 5; 

you will not want to return a copy of the backing bytes, but a reference. However, you cannot return a direct reference to the backing bytes (since that would clobber the neighboring value), so you actually need to return a proxy object for the backing bytes (which in turn holds a reference to them) and let the proxy overload the relevant operators.

In fact, as @Tony suggests, you can generalize this into a general container holding N such 20-bit variables.

(I have done this myself in a specialization of a vector for efficient storage of Boolean elements (as single bits).)
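The pair-plus-proxy idea above can be sketched roughly as follows. This is a minimal illustration, not the answerer's actual code: the names `Int20Pair` and `Ref` are hypothetical, the values are assumed to be unsigned, and out-of-range results are assumed to wrap modulo 2^20.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical name: two unsigned 20-bit values packed into 5 bytes.
// Value 0 occupies bits 0..19, value 1 occupies bits 20..39.
class Int20Pair {
    std::uint8_t bytes_[5] = {};

    std::uint32_t load(std::size_t i) const {
        assert(i < 2);
        const std::uint32_t b2 = bytes_[2], b3 = bytes_[3], b4 = bytes_[4];
        if (i == 0) {
            const std::uint32_t b0 = bytes_[0], b1 = bytes_[1];
            return b0 | (b1 << 8) | ((b2 & 0x0Fu) << 16);
        }
        return (b2 >> 4) | (b3 << 4) | (b4 << 12);
    }

    void store(std::size_t i, std::uint32_t v) {
        assert(i < 2);
        v &= 0xFFFFFu;  // keep 20 bits (assumption: wrap on overflow)
        if (i == 0) {
            bytes_[0] = static_cast<std::uint8_t>(v);
            bytes_[1] = static_cast<std::uint8_t>(v >> 8);
            // Preserve the high nibble, which belongs to value 1.
            bytes_[2] = static_cast<std::uint8_t>((bytes_[2] & 0xF0u) | (v >> 16));
        } else {
            // Preserve the low nibble, which belongs to value 0.
            bytes_[2] = static_cast<std::uint8_t>((bytes_[2] & 0x0Fu) | ((v & 0xFu) << 4));
            bytes_[3] = static_cast<std::uint8_t>(v >> 4);
            bytes_[4] = static_cast<std::uint8_t>(v >> 12);
        }
    }

public:
    // Proxy returned by operator[]; it reads and writes through the pair.
    class Ref {
        Int20Pair& pair_;
        std::size_t index_;
    public:
        Ref(Int20Pair& p, std::size_t i) : pair_(p), index_(i) {}
        operator std::uint32_t() const { return pair_.load(index_); }
        Ref& operator=(std::uint32_t v) { pair_.store(index_, v); return *this; }
        Ref& operator=(const Ref& o) { return *this = static_cast<std::uint32_t>(o); }
        Ref& operator+=(std::uint32_t v) { return *this = pair_.load(index_) + v; }
    };

    Ref operator[](std::size_t i) { return Ref(*this, i); }
    std::uint32_t operator[](std::size_t i) const { return load(i); }
};

static_assert(sizeof(Int20Pair) == 5, "expected a 5-byte pair");
```

With this, `int fst = pair[0];` and `pair[1] += 5;` both compile, and a write to one slot leaves the other untouched. Further operators (`-=`, `++`, and so on) would have to be added to the proxy in the same way.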

+9

No... you cannot do this as a single value-semantic type... any class data must be a multiple of the 8-bit character size (inviting all the usual quibbles about CHAR_BIT, etc.).

That said, let's clutch at straws...

Unfortunately, you are obviously handling a great many data items. If there are more than 64k of them, any proxy object into a custom container of packed values will probably need a >16-bit index/handle too, but this is still one of the few possibilities I can see that deserves further consideration. It might be suitable if you are only actively working with, and need value semantics for, a small subset of the values at any one time.

 struct Proxy {
     Int20_Container& container_;  // might not need if a singleton
     Int20_Container::size_type index_;
     ...
 };

Thus, the proxy might be 32, 64 or even more bits. The potential benefit only arises if you can create proxies on the fly from indices into the container, have them write directly back to the container, and keep them short-lived with few alive at the same time. (One simple way, not necessarily the fastest, to implement this model is to use an STL bitset or vector<bool> as the Int20_Container, and either store 20 times the logical index in index_ or multiply on the fly.)

It is also vaguely possible that, although your values range over a 20-bit space, you have fewer than 64k distinct values in actual use. If you have some such insight into your data set, you can create a lookup table in which 16-bit array indices map to 20-bit values.
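The lookup-table idea could look something like the sketch below. The class name `DictEncoded20` and its interface are hypothetical, and the whole approach rests on the assumption that the data contains at most 65536 distinct values; `push_back` reports failure once that assumption breaks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical name: each element is stored as a 16-bit code that indexes
// a table of the distinct 20-bit values seen so far.
class DictEncoded20 {
    std::vector<std::uint32_t> table_;                        // code -> value
    std::unordered_map<std::uint32_t, std::uint16_t> codes_;  // value -> code
    std::vector<std::uint16_t> data_;                         // one code per element
public:
    // Returns false if the value would need a 65537th table entry.
    bool push_back(std::uint32_t value) {
        assert(value < (1u << 20));
        auto it = codes_.find(value);
        if (it == codes_.end()) {
            if (table_.size() == 65536) return false;
            it = codes_.emplace(value, static_cast<std::uint16_t>(table_.size())).first;
            table_.push_back(value);
        }
        data_.push_back(it->second);
        return true;
    }
    std::uint32_t operator[](std::size_t i) const { return table_[data_[i]]; }
    std::size_t size() const { return data_.size(); }
    std::size_t distinct() const { return table_.size(); }
};
```

Each element then costs 2 bytes plus a share of the table, which only pays off when the number of elements is much larger than the number of distinct values.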

+6

Use a class. As long as you respect the copy/assign/clone/etc. semantics that the STL expects, you will have no problems.

But it will not optimize memory use on your computer. In particular, if you put the values in a flat array, each 20-bit value will most likely be aligned on a 32-bit boundary, so the benefit of a 20-bit type is lost.

In that case you will need to define your own optimized array type, which can be made STL-compatible. But do not expect it to be fast. It will not be.

+4

Use a bit field. (I am really surprised that no one has suggested this.)

 struct int20_and_something_else {
     int less_than_a_million : 20;
     int less_than_four_thousand : 12; // total 32 bits
 };

This only works as a packing optimization among the members of a structure, where you can fill the gaps with some other data. But it works very well!

If you really need to optimize a giant array of 20-bit numbers and nothing else, there is:

 struct int20_x3 {
     int one : 20;
     int two : 20;
     int three : 20; // 60 bits is almost 64
     void set( int index, int value );
     int get( int index );
 };

You can add getter/setter functions to make it prettier if you like, but you cannot take the address of a bit field, and bit fields cannot be array elements. (Of course, you can have an array of the struct.)

Use as:

 int20_x3 *big_array = new int20_x3[ array_size / 3 + 1 ];
 big_array[ index / 3 ].set( index % 3, value );
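
The answer declares set and get without defining them. One possible way to fill them in is sketched below, under some loudly stated assumptions: the struct is renamed `uint20_x3` (a hypothetical name), the fields are unsigned, and the bit fields use a 64-bit underlying type. Bit-field layout is implementation-defined, and with plain `int` fields many ABIs will not let a 20-bit field straddle a 32-bit unit, so it is worth checking `sizeof` on the target compiler rather than trusting either version.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical variant of the struct above with unsigned 64-bit bit fields.
struct uint20_x3 {
    std::uint64_t one : 20;
    std::uint64_t two : 20;
    std::uint64_t three : 20;  // 60 of 64 bits used

    void set(int index, std::uint32_t value) {
        assert(index >= 0 && index < 3);
        value &= 0xFFFFFu;  // keep 20 bits (assumption: wrap on overflow)
        switch (index) {
            case 0: one = value; break;
            case 1: two = value; break;
            default: three = value; break;
        }
    }

    std::uint32_t get(int index) const {
        assert(index >= 0 && index < 3);
        switch (index) {
            case 0: return static_cast<std::uint32_t>(one);
            case 1: return static_cast<std::uint32_t>(two);
            default: return static_cast<std::uint32_t>(three);
        }
    }
};
```

The usage stays as in the answer: element `index` lives in `big_array[index / 3]` at slot `index % 3`.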
+3

You can use C++ std::bitset. Store everything in a bitset and access your data using the correct index (x20).
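A rough sketch of the bitset approach, where element i occupies bits 20*i to 20*i + 19. The wrapper name `BitsetInt20Array` is hypothetical, values are assumed unsigned, and the bit-at-a-time loops favor clarity over speed.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical name: N unsigned 20-bit values stored in one std::bitset.
template <std::size_t N>
class BitsetInt20Array {
    std::bitset<N * 20> bits_;
public:
    void set(std::size_t i, std::uint32_t value) {
        assert(i < N);
        for (std::size_t b = 0; b < 20; ++b)
            bits_[i * 20 + b] = ((value >> b) & 1u) != 0;
    }

    std::uint32_t get(std::size_t i) const {
        assert(i < N);
        std::uint32_t value = 0;
        for (std::size_t b = 0; b < 20; ++b)
            if (bits_[i * 20 + b]) value |= (std::uint32_t{1} << b);
        return value;
    }
};
```

Note that std::bitset has a compile-time size, so this only suits arrays whose length is known in advance; for a run-time length the same indexing scheme would work over std::vector<bool>.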

+2

You can use a union containing a bit field. I used this back when bit fields were a must. Alternatively, you can create a class that holds 3 bytes but, through bitwise operations, exposes only the 20 most significant bits.

+1

You cannot get exactly 20 bits as a type (even with a bit-field struct), since it will always be aligned, at the finest granularity, to a byte. IMO the only way to go, if you must have 20 bits, is to create a bit stream to handle the data (which you can overload to accept indexing, etc.).

+1

As far as I know, this is not possible.

The easiest option is to define a custom type that uses an int32_t as backing storage and implements the appropriate arithmetic as overloaded operators.
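A minimal sketch of such a wrapper type follows. It departs from the answer in one labeled respect: it uses an unsigned 32-bit field so that wraparound is well defined, and it assumes results should wrap modulo 2^20. The name `UInt20` is hypothetical, and only a few operators are shown.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical name: unsigned 20-bit arithmetic on a 32-bit backing field.
class UInt20 {
    std::uint32_t v_;
    static constexpr std::uint32_t kMask = 0xFFFFFu;
public:
    UInt20(std::uint32_t v = 0) : v_(v & kMask) {}
    std::uint32_t value() const { return v_; }

    // Each operation masks back to 20 bits (assumption: wrap on overflow).
    UInt20& operator+=(UInt20 o) { v_ = (v_ + o.v_) & kMask; return *this; }
    UInt20& operator-=(UInt20 o) { v_ = (v_ - o.v_) & kMask; return *this; }
    UInt20& operator*=(UInt20 o) { v_ = (v_ * o.v_) & kMask; return *this; }

    friend UInt20 operator+(UInt20 a, UInt20 b) { return a += b; }
    friend UInt20 operator-(UInt20 a, UInt20 b) { return a -= b; }
    friend UInt20 operator*(UInt20 a, UInt20 b) { return a *= b; }
    friend bool operator==(UInt20 a, UInt20 b) { return a.v_ == b.v_; }
    friend bool operator!=(UInt20 a, UInt20 b) { return a.v_ != b.v_; }
};
```

This gives 20-bit semantics but still occupies 4 bytes per value, so on its own it saves no storage.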

For better storage density, you can store three int20 values in a single int64_t.
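Packing three values per 64-bit word can be done with shifts and masks, which avoids any dependence on bit-field layout. The sketch below is illustrative only: the name `PackedInt20Vector` is hypothetical, it uses an unsigned 64-bit word instead of int64_t to keep the shifts well defined, and values are assumed unsigned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical name: packs three unsigned 20-bit values per 64-bit word.
class PackedInt20Vector {
    std::vector<std::uint64_t> words_;
    std::size_t size_;
    static constexpr std::uint64_t kMask = 0xFFFFF;
public:
    // Allocates ceil(n / 3) zero-initialized words.
    explicit PackedInt20Vector(std::size_t n) : words_((n + 2) / 3, 0), size_(n) {}

    std::size_t size() const { return size_; }

    std::uint32_t get(std::size_t i) const {
        assert(i < size_);
        return static_cast<std::uint32_t>((words_[i / 3] >> (20 * (i % 3))) & kMask);
    }

    void set(std::size_t i, std::uint32_t value) {
        assert(i < size_);
        const unsigned shift = 20 * static_cast<unsigned>(i % 3);
        std::uint64_t& w = words_[i / 3];
        // Clear the 20-bit slot, then write the new value into it.
        w = (w & ~(kMask << shift)) | ((value & kMask) << shift);
    }
};
```

Four bits per word go unused, so the cost is 64/3 ≈ 21.3 bits per value rather than 20, in exchange for word-aligned accesses.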

0

Just an idea: use optimized storage (5 bytes for two instances), and for operations convert a value to a 32-bit int and then back again.

0

This can be done in several ways. One possibility is to use bit twiddling to store the values as the left and right halves of a 5-byte array, with a storage/retrieval class that maps the desired array index to an entry in the byte5[] array and extracts the left or right half as appropriate.

However, most hardware requires integers to be word-aligned, so in addition to the bit twiddling needed to extract the integer, you have to shift the bits to align it correctly.

I think it would be more effective to increase your swap space and let virtual memory take care of your large array (after all, 20 vs. 32 bits is not much of a saving!), always assuming you have a 64-bit OS.

0
