What is the best way to look for binary data for variable-length bit strings?

Can someone tell me the best way to decode binary data with variable length bit string in java?

For instance:

Binary data: 10101000 11100010 01100001 01010111 01110001 01010110

I may need to find the first match of any of the following 01, 100, 110, 1110, 1010 ...

In this case, the match will be 1010. Then I need to do the same for the rest of the binary data. Bit strings can be up to 16 bits long and cross byte boundaries.

Basically, I'm trying to use jpeg to decode Huffman using bit strings that I created from the Huffman tables in the headers. I can do this, only it is very dirty, I turn everything, including binary data, into Stringbuffers, and I know that this is the wrong way.

Before I loaded everything in string buffers, I tried using only numbers in binary format, but of course I cannot ignore leading 0s in code like 00011. I am sure there must be some smart way to use bit operators and would like to do it, but I was looking at pages explaining bit masks and left shifts, etc., and I still don't know!

Thanks for the help!

EDIT:

Thanks for all the suggestions. I went with a binary tree approach as it seems to be the standard way with Huffman stuff. It really makes sense, because Huffman codes are created using trees. I will also store binary data that I need to search in large integers. I don’t know how to indicate several answers correctly, but thanks anyway.

+4
source share
5 answers

Since you are decoding Huffman encoded data, you must create a binary tree in which the decoded bit string is held as data, and the bits of each Huffman code are the path to the corresponding data. Huffman code bits are accessed by bit-shift and bit-mask operations. When you get to a leaf, you output data on that leaf and return to the root of the tree. It is very fast and efficient.

+3
source

You can use a state machine consuming zeros and ones. The state machine will have final states for all the patterns you want to detect. Whenever it enters one of the final states, a message is sent with a matching pattern and returns to its original state.

Finally, you will only have one state machine in the form of a DAG that contains all of your templates.

To implement it, use the state template ( http://en.wikipedia.org/wiki/State_pattern ) or any other state machine implementation.

+4
source

You can try filling it with BigInteger using change and test methods. Then use a loop to walk and take each additional pattern.

If the huffman code is in the tree, 1 == right node, 0 == left node.

for( int i =numbitsTotal; i > 0; --i ) { int bit = bigInt.testBit( i ); if( bit == 1 ) { // take right node -- if null accept code, apply from top } else { // take left node -- if null accept code, apply from top } } 
+1
source

I would suggest trie . It is explicitly intended to search for a prefix. In your case, it will be a binary version.

+1
source

You can use java.util.BitSet to store your binary data, and then you can implement some search functions to find the position of the smaller bit set inside the big ...

0
source

All Articles