Boolean [] vs BitSet: which is more efficient?

What is more efficient in terms of memory and processor usage - boolean or BitSet? No specific BitSet methods are used, only get / set / clear (==, =, Arrays.fill respectively for the array).

+52
java performance arrays memory
Mar 03 '09 at 5:40
source share
8 answers

In some tests with SJ JDK 1.6 calculations with a sieve (the best of 10 iterations to warm up is to give the JIT compiler a chance and eliminate random scheduling delays of the Core 2 Duo T5600 1.83 GHz):

BitSet is more memory efficient than boolean [], except for very small sizes. Each Boolean value in the array takes a byte. The numbers from runtime.freeMemory () are a bit confusing for BitSet, but less.

boolean [] is more efficient in terms of processor efficiency, except for very large sizes, where they are approximately equal. For example, for a size of 1 million Booleans [], it is about four times faster (for example, 6 ms versus 27 ms), for ten and one hundred million they are approximately equal.

+35
Mar 03 '09 at 7:41
source share
  • Boolean[] uses approximately 4-20 bytes for each boolean.
  • Boolean[] uses approximately 1 byte for each boolean.
  • BitSet uses approximately 1 bit for each logical value.

Memory size may not be a problem for you, in which case boolean [] may be easier for the code.

+39
Mar 03 '09 at 21:04
source share

A bit of the left margin of your question, but if memory is a concern, you might want to look into Huffman's compression . For example, 00000001 can be compressed in frequency to a level equivalent to {(7)0, (1)1} . The more "randomized" line 00111010 requires a more complex representation, for example. {(2)0, (3)1, (1)0, (1)1, (1)0} and take up more space. Depending on the structure of your bit data, you may get some benefit from using it, except for BitSet .

+4
Mar 03 '09 at 21:42
source share

It depends as always. Yes, BitSet is more memory efficient, but as soon as you need multi-threaded access, a logical [] may be a better choice. For example, to calculate prime numbers, you set the boolean value to true and therefore you do not need synchronization. Hans Boehm wrote an article about this, and the same method can be used to mark nodes in a graph.

+3
Mar 03 '09 at 15:16
source share

In terms of memory, the documentation for BitSet has pretty obvious implications. In particular:

Each bit has a current size, which is the number of bits of space the bit is currently in use. Note that size is related to the implementation of the bit set, so it may change with the implementation. bit set length refers to the logical length of the bit set and is determined independently of the implementation.

The source code for the Java library classes is openly available, and you can easily verify this for yourself . In particular:

 The internal field corresponding to the serialField "bits". 89 90 private long[] words; 

As for speed; it depends on what he does. In general, do not think about speed ahead of time; use any tool that is most convenient for semantics and leads to the clearest code. Optimize only after you notice that performance requirements are not being met and bottlenecks are identified.

Coming to SO and asking if A is faster than B is stupid for many reasons, including but not limited to:

  • It depends on the application, which no one in general has access to. Analyze and profile it in the context in which it is used. Make sure this is a bottleneck that is really worth optimizing.
  • Questions like this, which ask speed, usually indicate that the OP thinks they care about efficiency, but do not want to profile and do not define performance requirements. Under the surface, usually a red flag, that the OP is moving in the wrong way.

I know this is an old question, but it has appeared recently; and I think it's worth adding.

+3
Nov 05 '13 at 12:14
source share

Switching from Java to CPU is completely VM dependent. For example, it used to be that a logical value was implemented as a 32-bit value (most likely, this is still true).

If you don’t know that this will be important, you better write code to be clear, profile it, and then fix parts that are slow or consume a lot of memory.

You can do it when you go. For example, I once decided not to call .intern () in Strings, because when I ran the code in the profiler, it slowed it down too much (despite less memory).

+1
Mar 03 '09 at 5:45
source share

Here you can see the memory / time test comparing the logical matrix [] [] of a triangle with the triangular matrix BitSet [].

I create, set and read the values ​​(size * (size-1) / 2) and compare the memory usage and time usage ...

enter image description here

enter image description here

Hope this helps ...

Here is the code ... (just a damn dirty test code, sorry;)

 import java.util.BitSet; import java.util.Date; public class BooleanBitSetProfiler { Runtime runtime; int sum = 0; public void doIt() { runtime = Runtime.getRuntime(); long[][] bitsetMatrix = new long[30][2]; long[][] booleanMatrix = new long[30][2]; int size = 1000; for (int i = 0; i < booleanMatrix.length; i++) { booleanMatrix[i] = testBooleanMatrix(size); bitsetMatrix[i] = testBitSet(size); size += 2000; } int debug = 1; for (int j = 0; j < booleanMatrix.length; j++){ System.out.print(booleanMatrix[j][0] + ";"); } System.out.println(); for (int j = 0; j < booleanMatrix.length; j++){ System.out.print(booleanMatrix[j][1] + ";"); } System.out.println(); for (int j = 0; j < bitsetMatrix.length; j++){ System.out.print(bitsetMatrix[j][0] + ";"); } System.out.println(); for (int j = 0; j < bitsetMatrix.length; j++){ System.out.print(bitsetMatrix[j][1] + ";"); } System.out.println(); } private long memory () { return runtime.totalMemory() - runtime.freeMemory(); } private long[] testBooleanMatrix(int size) { runtime.gc(); long startTime = new Date().getTime(); long startMemory = memory(); boolean[][] matrix = new boolean[size][]; for (int i = 0; i < size; i++) { matrix[i] = new boolean[size - i - 1]; } long creationMemory = memory(); long creationTime = new Date().getTime(); for (int i = 0; i < size; i++) { for (int j = 0; j < matrix[i].length; j++) { matrix[i][j] = i % 2 == 0; } } long setMemory = memory(); long setTime = new Date().getTime(); for (int i = 0; i < size; i++) { for (int j = 0; j < matrix[i].length; j++) { if (matrix[i][j]) sum++; } } long readTime = new Date().getTime(); System.out.println("Boolean[][] (size " + size + ")"); System.out.println("Creation memory " + printMem(creationMemory-startMemory) + ", set memory " + printMem(setMemory-startMemory)); System.out.println("Creation time " + printTime(creationTime-startTime) + ", set time " + printTime(setTime - creationTime) + " read time " + printTime(readTime - setTime) + "\n"); runtime.gc(); return new long[]{(setMemory-startMemory)/(1024*1024), (readTime-startTime)}; } private long[] testBitSet(int size) { runtime.gc(); long startTime = new Date().getTime(); long startMemory = memory(); BitSet[] matrix = new BitSet[size]; for (int i = 0; i < size; i++) { matrix[i] = new BitSet(size - i - 1); } long creationMemory = memory(); long creationTime = new Date().getTime(); for (int i = 0; i < size; i++) { for (int j = 0; j < matrix[i].size(); j++) { matrix[i].set(j, (i % 2 == 0)); } } long setMemory = memory(); long setTime = new Date().getTime(); for (int i = 0; i < size; i++) { for (int j = 0; j < matrix[i].size(); j++) { if (matrix[i].get(j)) sum++; } } long readTime = new Date().getTime(); System.out.println("BitSet[] (size " + size + ")"); System.out.println("Creation memory " + printMem(creationMemory-startMemory) + ", set memory " + printMem(setMemory-startMemory)); System.out.println("Creation time " + printTime(creationTime-startTime) + ", set time " + printTime(setTime - creationTime) + " read time " + printTime(readTime - setTime) + "\n"); runtime.gc(); return new long[]{(setMemory-startMemory)/(1024*1024), (readTime-startTime)}; } private String printMem(long mem) { mem = mem / (1024*1024); return mem + "MB"; } private String printTime(long milis) { int seconds = (int) (milis / 1000); milis = milis % 1000; return seconds > 0 ? seconds + "s " + milis + "ms" : milis + "ms"; } } 
+1
Jul 19 '18 at 11:38
source share

I believe that BitSet is more efficient in terms of memory and CPU, whether it can internally pack bits into int, longs, or native types, whereas boolean [] requires a byte for each data bit. In addition, if you used other methods (and, or, etc.), you will find that BitSet is more efficient, since there is no need to iterate over each element of the array; bitwise math is used instead.

-one
Mar 03 '09 at 5:45
source share



All Articles