When compressing and encrypting, should you first compress or encrypt?

If I were to AES-encrypt a file and then zlib-compress it, would the compression be less efficient than if I compressed first and then encrypted?

In other words, should I compress first or encrypt first, or does it not matter?

+50
performance encryption aes compression zlib
Jan 13 '11 at 2:01
7 answers

Compress first. Once you encrypt the file, you get a stream of effectively random data that will not be compressible. Compression depends on finding compressible patterns in the data.
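A quick way to see this for yourself is sketched below. It is only an illustration: `os.urandom` output is used as a stand-in for AES ciphertext (an assumption, since good ciphertext should look like random bytes), and the sample text is made up.

```python
import os
import zlib

# Made-up, highly repetitive sample standing in for a typical text file.
plaintext = b"The quick brown fox jumps over the lazy dog. " * 200

# Assumption: random bytes of the same length stand in for AES ciphertext.
ciphertext_like = os.urandom(len(plaintext))

compressed_plain = zlib.compress(plaintext)
compressed_random = zlib.compress(ciphertext_like)

# Prints: original size, compressed text size, compressed "ciphertext" size.
print(len(plaintext), len(compressed_plain), len(compressed_random))
```

The text shrinks to a small fraction of its size, while the random-looking input comes out slightly larger than it went in because of the zlib framing overhead.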

+55
Jan 13

Compressing before encrypting is certainly more space-efficient, but it is also less secure. So I disagree with the other answers.

Most compression algorithms use "magic" file headers, and these can be used for statistical attacks.

For example, there is the CRIME attack on SSL/TLS.
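The idea behind this kind of attack can be sketched in a few lines. This is a simplified illustration, not the actual CRIME exploit: the cookie value, the request layout and the `observed_length` helper are all hypothetical, and the encryption step is left out on the assumption that a length-preserving cipher would expose the same length to an eavesdropper.

```python
import zlib

# Hypothetical secret that the attacker wants to learn.
SECRET = b"Cookie: sessionid=7f3a9c\r\n"

def observed_length(attacker_input: bytes) -> int:
    """Size of the compressed request containing attacker-chosen data.

    Assumption: the cipher applied afterwards preserves length, so an
    eavesdropper can observe this number.
    """
    request = SECRET + b"GET /?q=" + attacker_input
    return len(zlib.compress(request))

right = observed_length(b"sessionid=7f3a9c")  # guess matches the secret
wrong = observed_length(b"sessionid=QWXZYK")  # guess does not match
print(right, wrong)
```

A correct guess lets the compressor replace the whole guess with a back-reference to the secret, so the output is shorter. That size difference is the information leak.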

+30
Apr 24 '15 at 0:27

If your encryption algorithm is good (and AES with a proper chaining mode is), then no compressor will be able to shrink the encrypted text. Or, if you prefer it the other way around: if you manage to compress some encrypted text, then it is time to question the quality of the encryption algorithm...

This is because the output of an encryption system must be indistinguishable from purely random data, even to a determined attacker. A compressor is not a malicious attacker, but it works by trying to find non-random patterns that it can represent with fewer bits. It should not be able to find any such pattern in ciphertext.

So, you must first compress the data, and then encrypt the result, and not vice versa. This is what is done in the OpenPGP format.
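To compare the two orderings without pulling in a crypto library, here is a rough sketch. Note the big assumption: `toy_stream_cipher` is a hypothetical XOR keystream built from SHA-256, used only as a stand-in for AES in a stream mode. It is not a vetted cipher and must not be used for real data.

```python
import hashlib
import zlib

def toy_stream_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream (illustration only).

    Assumption: stand-in for AES in a stream mode; no nonce, no
    authentication. Applying it twice with the same key decrypts.
    """
    keystream = bytearray()
    counter = 0
    while len(keystream) < len(data):
        keystream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, keystream))

key = b"hypothetical demo key"
message = b"compress first, then encrypt. " * 300

compress_then_encrypt = toy_stream_cipher(key, zlib.compress(message))
encrypt_then_compress = zlib.compress(toy_stream_cipher(key, message))

# Undo the recommended order: decrypt (same XOR), then decompress.
recovered = zlib.decompress(toy_stream_cipher(key, compress_then_encrypt))

print(len(message), len(compress_then_encrypt), len(encrypt_then_compress))
```

With a real cipher the numbers should look similar: compressing first keeps the savings, while compressing the ciphertext gains nothing.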

+12
Jan 13 '11 at

Compression first. If you encrypt, your data will turn (essentially) into a stream of random bits. Random bits are incompressible because compression searches for patterns in the data, and random streams, by definition, have no patterns.

+7
Jan 13

Of course it matters. It is usually best to compress first and then encrypt.

zlib uses Huffman coding and LZ77 compression. Both work best on data with skewed symbol frequencies and repeated substrings, such as plain text, so the compression ratio will be better when compression runs on the plaintext.

Encryption can still be applied after compression. Even if the compressed output looks "encrypted", it is easy to detect as compressed, since the file usually starts with PK (the ZIP signature).
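As a rough illustration of how recognizable compressed data is, the sketch below guesses the container from its first bytes. The `guess_container` helper is hypothetical and only a heuristic: it checks the ZIP and gzip signatures and the two-byte zlib header defined in RFC 1950.

```python
import gzip
import zlib

def guess_container(data: bytes) -> str:
    """Heuristically guess the compression container from leading bytes."""
    if data[:2] == b"PK":
        return "zip"
    if data[:2] == b"\x1f\x8b":
        return "gzip"
    # zlib header: low nibble of the first byte is 8 (deflate), and the
    # two header bytes read as a big-endian number are a multiple of 31.
    if len(data) >= 2 and data[0] & 0x0F == 8 and ((data[0] << 8) | data[1]) % 31 == 0:
        return "zlib"
    return "unknown"

print(guess_container(zlib.compress(b"hello")))
print(guess_container(gzip.compress(b"hello")))
print(guess_container(b"plain text"))
```

Encrypting the compressed output hides these signatures, which is one more reason the encryption step comes last.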

zlib does not provide encryption natively. This is why I implemented ZeusProtection. The source code is also available on GitHub.

+1
Oct 16

It is true that a compressor only works on data sets that have well-defined patterns, but it is preferable to encrypt the data first, which gives well-designed non-random patterns that the compressor can process with lower time complexity.

0
Aug 20 '12 at 17:33

From a practical point of view, I think you should compress first simply because many files are already compressed. For example, video encoding usually involves heavy compression. If you encrypt such a video file and then compress it, it has now been compressed twice. Not only will the second compression achieve a poor compression ratio, but compressing again will also consume significant resources for large files or streams. As Thomas Pornin and Ferruccio pointed out, compressing encrypted files has little effect anyway because encrypted data looks random.
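The effect of compressing twice is easy to measure. In this sketch the first `zlib` pass plays the role of the video codec (an assumption made to keep the example small), and the input is made-up pseudo-random word text generated with a fixed seed.

```python
import random
import zlib

# Made-up compressible input: random words from a small vocabulary.
rng = random.Random(0)
words = [b"alpha", b"beta", b"gamma", b"delta", b"epsilon", b"zeta", b"eta", b"theta"]
original = b" ".join(rng.choice(words) for _ in range(5000))

once = zlib.compress(original)   # stands in for the already-compressed file
twice = zlib.compress(once)      # the redundant second pass

print(len(original), len(once), len(twice))
```

The first pass removes most of the redundancy. The second pass spends CPU time and leaves the size essentially unchanged.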

I think the best and simplest policy may be to compress files only when needed (using a whitelist or blacklist) and then encrypt them regardless.
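A minimal sketch of such a policy could look like this. The extension blacklist and the `maybe_compress` helper are hypothetical, and the encryption step is deliberately left out: whatever payload comes back would be handed to it either way.

```python
import zlib
from pathlib import PurePath

# Hypothetical blacklist of extensions that are usually compressed already.
ALREADY_COMPRESSED = {".mp4", ".mkv", ".jpg", ".png", ".zip", ".gz", ".7z"}

def maybe_compress(filename: str, data: bytes) -> tuple[bool, bytes]:
    """Return (was_compressed, payload) for the encryption step."""
    if PurePath(filename).suffix.lower() in ALREADY_COMPRESSED:
        return False, data
    return True, zlib.compress(data)

print(maybe_compress("movie.MP4", b"x" * 100)[0])
print(maybe_compress("notes.txt", b"x" * 100)[0])
```

The flag has to be stored alongside the encrypted payload so that the reader knows whether to decompress after decrypting.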

0
Aug 13 '14 at 21:11


