Java - Gzip Concurrency

I was assigned to parallelize GZip in Java 7, and I'm not sure if this is possible.

Assignment:

  • Parallelize gzip using a given number of threads
  • Each thread receives a 1024 KiB block and uses the last 32 KiB of the previous block as a dictionary. There is an option to use no dictionary
  • Read from stdin and write to stdout

What I tried:

  • I tried using GZIPOutputStream, but there seems to be no way to isolate and parallelize deflate(), and I also cannot access the Deflater to set the dictionary. I tried extending GZIPOutputStream, but it did not behave the way I wanted, since I still could not isolate the compression / deflation step.
  • I tried using Deflater with the wrapper turned on and a FilterOutputStream to output the compressed bytes, but I could not get it to produce correct GZip output. I made sure that each thread has its own compressor that writes to a byte array, which is then written to the OutputStream.

I am not sure whether I made mistakes in my approaches or chose the wrong approaches entirely. Can someone point me in the right direction as to which classes to use for this project?

+5
4 answers

Yep, zipping , . , gzipping ? .

+4

, , . , .

( )

+1

, , . , , gzip, deflater reset, - . reset , , , ( ) (, , , ) , reset, deflater . . "!" (!)

I don’t know if this will work, and I suspect that the complexity of all this will make it an inappropriate choice unless you are compressing single, very large files. (If you had many files, it would be much easier to compress each of them in parallel.) However, this is what I would try first.
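
For what it's worth, here is a rough sketch of the block scheme from the question, not a finished solution for the assignment. It assumes the whole input is already in a byte array (the real task would read from stdin), produces only raw deflate data without the gzip header and trailer, and uses made-up names (`ParallelDeflateSketch`, `compressBlock`, `compress`). Each non-final block is ended with `Deflater.SYNC_FLUSH` (available since Java 7) so it stops on a byte boundary without closing the stream; only the last block calls `finish()`.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.Deflater;

// Hypothetical helper class, written in Java 7 style (no lambdas).
final class ParallelDeflateSketch {
    static final int BLOCK_SIZE = 1024 * 1024; // 1024 KiB per block, as in the question
    static final int DICT_SIZE = 32 * 1024;    // last 32 KiB of the previous block

    /** Compresses input[start, end) as raw deflate data, optionally primed with the preceding bytes. */
    static byte[] compressBlock(byte[] input, int start, int end, boolean last, boolean useDict) {
        // nowrap = true: raw deflate output, no zlib header or checksum
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            if (useDict && start > 0) {
                int dictStart = Math.max(0, start - DICT_SIZE);
                deflater.setDictionary(input, dictStart, start - dictStart);
            }
            deflater.setInput(input, start, end - start);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[64 * 1024];
            if (last) {
                deflater.finish(); // only the last block terminates the deflate stream
                while (!deflater.finished()) {
                    int n = deflater.deflate(buf, 0, buf.length);
                    out.write(buf, 0, n);
                }
            } else {
                int n;
                do { // a full buffer means the flush has more output pending
                    n = deflater.deflate(buf, 0, buf.length, Deflater.SYNC_FLUSH);
                    out.write(buf, 0, n);
                } while (n == buf.length);
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    /** Splits the input into blocks, compresses them on a thread pool, and joins the results in order. */
    static byte[] compress(final byte[] input, int threads, final boolean useDict) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<byte[]>> parts = new ArrayList<Future<byte[]>>();
            int blocks = Math.max(1, (input.length + BLOCK_SIZE - 1) / BLOCK_SIZE);
            for (int i = 0; i < blocks; i++) {
                final int start = i * BLOCK_SIZE;
                final int end = Math.min(input.length, start + BLOCK_SIZE);
                final boolean last = (i == blocks - 1);
                parts.add(pool.submit(new Callable<byte[]>() {
                    @Override
                    public byte[] call() {
                        return compressBlock(input, start, end, last, useDict);
                    }
                }));
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            for (Future<byte[]> part : parts) { // futures are read in submission order
                byte[] bytes = part.get();
                out.write(bytes, 0, bytes.length);
            }
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException("block compression failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling `ParallelDeflateSketch.compress(data, 4, true)` should give a single raw deflate stream that `new Inflater(true)` can decode. Each task creates its own `Deflater`, so the threads share no compressor state; the only shared object is the input array, which is read but never modified.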

(Also note that the gzip format is just a deflate stream with additional metadata: a header in front of it and a trailer with a CRC-32 and the uncompressed length after it.)
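
To illustrate that point, here is a minimal sketch that writes the gzip framing by hand around raw `Deflater` output. The class and method names (`GzipFramingSketch`, `gzip`, `writeIntLE`) are made up for this example, and the header uses the simplest values RFC 1952 allows (no flags, no timestamp, OS set to "unknown").

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.CRC32;
import java.util.zip.Deflater;

// Hypothetical helper class: gzip = 10-byte header + raw deflate data + 8-byte trailer.
final class GzipFramingSketch {

    static byte[] gzip(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // Header: magic 0x1f 0x8b, CM = 8 (deflate), FLG = 0, MTIME = 0 (4 bytes), XFL = 0, OS = 255 (unknown)
        byte[] header = {0x1f, (byte) 0x8b, 8, 0, 0, 0, 0, 0, 0, (byte) 0xff};
        out.write(header, 0, header.length);

        // Body: raw deflate data (nowrap = true, so no zlib header or Adler-32)
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try {
            deflater.setInput(input);
            deflater.finish();
            byte[] buf = new byte[8192];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf, 0, buf.length);
                out.write(buf, 0, n);
            }
        } finally {
            deflater.end();
        }

        // Trailer: CRC-32 of the uncompressed data, then its length modulo 2^32, both little-endian
        CRC32 crc = new CRC32();
        crc.update(input, 0, input.length);
        writeIntLE(out, (int) crc.getValue());
        writeIntLE(out, input.length);
        return out.toByteArray();
    }

    private static void writeIntLE(ByteArrayOutputStream out, int value) {
        out.write(value & 0xff);
        out.write((value >>> 8) & 0xff);
        out.write((value >>> 16) & 0xff);
        out.write((value >>> 24) & 0xff);
    }
}
```

The result of `GzipFramingSketch.gzip(data)` should be readable by `GZIPInputStream`. In a parallel version the body would be the concatenated per-block output instead, while the CRC-32 and length in the trailer still cover the entire uncompressed input.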

+1