How to use multiple threads for zlib compression (same input source)

My goal is to compress data from a single source in parallel threads. I have defined a list of jobs; each job describes a block of input to read (500 KB to 1 MB per job).

My compressor threads compress each block of data using zlib and store the result in the outbuf of the respective job.

Now I want to combine all of these and create one output file in the standard zlib format.

From the zlib RFC and after looking at the source of pigz, I understand that

the zlib stream is laid out as follows:

     +---+---+
     |CMF|FLG| (2 bytes)
     +---+---+
     +---+---+---+---+
     |     DICTID    | (4 bytes. Present only when FLG.FDICT is set)
     +---+---+---+---+
     +=====================+
     |...compressed data...| (variable size of data)
     +=====================+
     +---+---+---+---+
     |     ADLER32   |  (4 bytes, checksum of the uncompressed data)
     +---+---+---+---+

In my case there is no dictionary.
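The layout above can be checked against a stream produced by zlib itself. The following is a minimal sketch in Python (chosen here only for illustration; the helper name `describe_zlib_stream` is hypothetical) that reads the two header bytes and the trailing Adler-32, assuming a complete zlib stream with no dictionary:

```python
import zlib


def describe_zlib_stream(stream: bytes) -> dict:
    """Decode the 2-byte zlib header and the 4-byte Adler-32 trailer."""
    cmf, flg = stream[0], stream[1]
    return {
        "method": cmf & 0x0F,                        # 8 means deflate
        "fdict": bool(flg & 0x20),                   # True if a DICTID follows
        "header_ok": (cmf * 256 + flg) % 31 == 0,    # FCHECK rule from the RFC
        "adler32": int.from_bytes(stream[-4:], "big"),  # stored big-endian
    }


data = b"hello zlib " * 100
info = describe_zlib_stream(zlib.compress(data))
```

The trailer is the Adler-32 of the uncompressed input, which is why it has to be recomputed or combined when blocks are compressed separately.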

Therefore, when I combine two compressed blocks, the header of all blocks is the same.

So I do the following.

  • For the first block, I write the header + compressed data.

  • For the following blocks, I write only the compressed data (without the header).

  • I use adler32_combine() to combine the Adler-32 values of the blocks, and write the result at the end.
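To make the checksum step concrete, here is a sketch of what combining two Adler-32 values involves. Python's `zlib` module does not expose `adler32_combine()`, so the function below is an illustrative pure-Python version derived from the Adler-32 definition, not zlib's own implementation:

```python
import zlib

BASE = 65521  # largest prime below 2**16, the Adler-32 modulus


def adler32_combine(adler1: int, adler2: int, len2: int) -> int:
    """Adler-32 of A+B, given adler32(A), adler32(B) and len(B)."""
    a1, b1 = adler1 & 0xFFFF, adler1 >> 16
    a2, b2 = adler2 & 0xFFFF, adler2 >> 16
    # Both checksums start from a = 1, so one of the two 1s is dropped.
    a = (a1 + a2 - 1) % BASE
    # Every byte of B sees the running sum a1 instead of 1.
    b = (b1 + b2 + len2 * (a1 - 1)) % BASE
    return (b << 16) | a


first, second = b"block one " * 1000, b"block two " * 777
combined = adler32_combine(zlib.adler32(first), zlib.adler32(second), len(second))
```

Each thread can then compute the checksum of its own block, and the writer folds them together in block order.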




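pigz does exactly this, and its source shows how. End each block with Z_SYNC_FLUSH, so that it finishes on a byte boundary and the pieces can be concatenated. Only the last block is terminated normally: the first n-1 blocks are sync-flushed and the last 1 is finished.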
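A minimal sketch of the Z_SYNC_FLUSH approach, written in Python for brevity. It assumes each block is compressed as raw deflate (negative window bits) so that no per-block header or trailer is produced, and it writes a fixed `78 9C` header by hand. The names `compress_block` and `parallel_zlib_compress` are hypothetical, and the checksum is computed as a running Adler-32 in the writer thread rather than with `adler32_combine()`:

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

ZLIB_HEADER = b"\x78\x9c"  # CMF/FLG: deflate, 32 KiB window, no dictionary


def compress_block(job):
    """Compress one block as raw deflate (no zlib header or trailer)."""
    block, is_last = job
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)  # negative wbits = raw deflate
    body = comp.compress(block)
    # Z_SYNC_FLUSH ends the block on a byte boundary without marking the
    # stream as finished; only the last block gets Z_FINISH.
    body += comp.flush(zlib.Z_FINISH if is_last else zlib.Z_SYNC_FLUSH)
    return body


def parallel_zlib_compress(data, block_size=512 * 1024, workers=4):
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)] or [b""]
    jobs = [(block, i == len(blocks) - 1) for i, block in enumerate(blocks)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        bodies = list(pool.map(compress_block, jobs))  # map keeps input order

    checksum = 1  # Adler-32 of empty input
    for block in blocks:
        checksum = zlib.adler32(block, checksum)

    # One header, the concatenated deflate data, one big-endian trailer.
    return ZLIB_HEADER + b"".join(bodies) + checksum.to_bytes(4, "big")


data = bytes(range(256)) * 8192  # 2 MiB of sample input
packed = parallel_zlib_compress(data)
```

The result can be read back with an ordinary `zlib.decompress(packed)`. In this sketch every block starts with an empty history window, so the output is slightly larger than a single-threaded stream would be.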
