Mass Compression of (Zip) Files

Use case: our users have many objects in our AWS S3 account. We are adding a feature to download multiple projects at once, and we care more about efficiency than about storage.

After looking at the various options (ZipArchive, PclZip), I came across this guide recommending using Chilkat.

The approach makes a lot of sense; the guide summarizes it as follows:

  • Pre-zip each file at upload time and store it on S3 (see the sketch after this list)
  • On "project download", download each pre-zipped file, then QuickAppend (Chilkat terminology) adds each one "instantly" (roughly 200 ms per file) to the combined zip
  • Upload the new zip to S3 and give the user the link
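
For the first bullet, pre-zipping a single file before it is stored is straightforward with PHP's built-in ZipArchive. A minimal sketch (the file path is a placeholder, and the S3 upload itself is left out):

 <?php
 // Sketch of the "pre-zip each file at upload time" step.
 // $uploaded is a placeholder for the file the user just uploaded.
 $uploaded = '/tmp/report.pdf';
 $zipPath  = $uploaded . '.zip';

 $zip = new ZipArchive;
 if ($zip->open($zipPath, ZipArchive::CREATE | ZipArchive::OVERWRITE) === TRUE) {
     $zip->addFile($uploaded, basename($uploaded));
     $zip->close();
     // $zipPath is what actually gets stored on S3, e.g. via the SDK's putObject().
 }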

The problem is the $249 Chilkat license, so I'm looking for free alternatives.

A free alternative uses a similar concept (sketched after the list):

  • Pre-zip each file at upload time and store it on S3
  • On "project download", download each pre-zipped file and tar them together
  • Upload the new archive to S3 and give the user the link
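
Here is a rough sketch of that flow, assuming the AWS SDK for PHP (v3) is installed via Composer; the bucket name, object keys, and link expiry are placeholders I've made up. PharData (from the built-in Phar extension) handles the tar step:

 <?php
 // Sketch of the free alternative: pull the pre-zipped parts from S3,
 // tar them together with PharData, push the result back, return a link.
 require 'vendor/autoload.php';

 use Aws\S3\S3Client;

 $s3       = new S3Client(['version' => 'latest', 'region' => 'us-east-1']);
 $bucket   = 'my-bucket';                                  // placeholder
 $partKeys = ['projects/42/a.zip', 'projects/42/b.zip'];   // pre-zipped objects

 $tarPath = sys_get_temp_dir() . '/project_' . time() . '.tar';
 $tar     = new PharData($tarPath);

 foreach ($partKeys as $key) {
     $local = tempnam(sys_get_temp_dir(), 'part_');
     $s3->getObject(['Bucket' => $bucket, 'Key' => $key, 'SaveAs' => $local]);
     $tar->addFile($local, basename($key));   // tar stores the bytes as-is
 }

 // Upload the combined archive and return a time-limited download link.
 $resultKey = 'downloads/project_42.tar';
 $s3->putObject(['Bucket' => $bucket, 'Key' => $resultKey, 'SourceFile' => $tarPath]);

 $cmd = $s3->getCommand('GetObject', ['Bucket' => $bucket, 'Key' => $resultKey]);
 $url = (string) $s3->createPresignedRequest($cmd, '+15 minutes')->getUri();

Because the parts are already compressed zips, the tar step just concatenates bytes without re-compressing anything, which keeps the merge cheap; the trade-off is that the user ends up with a .tar of .zip files rather than a single flat zip.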

Is there a “standard” or “perfect” way to handle this?

+8
php compression tar zip chilkat
2 answers

On my local system, PHP's built-in zip library was able to merge a 10-file, 24 MB zip file into a 21-file, 51 MB zip file in 800 ms, which is comparable to the 200 ms per file you cited, though I'm not sure how big your files are or what hardware you're on.

Unlike the Java library originally used by the author of your guide, PHP's zip library is implemented in C, so you won't see the same Java-to-C performance boost the author saw. That said, I don't know how Chilkat's QuickAppend works internally or how it compares to PHP's zip library, but appending pre-zipped files, whether you do it with PHP or with Chilkat, looks like the fastest solution.

 $destination = new ZipArchive;
 $source = new ZipArchive;

 if ($source->open('a.zip') === TRUE && $destination->open('b.zip') === TRUE) {
     $time_start = microtime(true);

     // Unpack the source archive into a throwaway directory.
     $temp_dir = "/tmp/zip_" . time();
     mkdir($temp_dir, 0777, true);
     $source->extractTo($temp_dir);
     $source->close();

     // Append every extracted file to the destination archive,
     // keeping the original entry name rather than the temp path.
     $files = scandir($temp_dir);
     $file_count = 0;
     foreach ($files as $file) {
         if ($file == '.' || $file == '..') continue;
         $destination->addFile("$temp_dir/$file", $file);
         ++$file_count;
     }
     $destination->close();

     // Remove the temp directory in the background.
     exec("rm -rf $temp_dir &");

     $time_end = microtime(true);
     $time = $time_end - $time_start;
     print "Added $file_count files in " . ($time * 1000) . "ms \n";
 }

Output

 -rw-rw-r-- 1 fuzzytree fuzzytree 24020997 Jun 4 15:57 a.zip
 -rw-rw-r-- 1 fuzzytree fuzzytree 51418980 Jun 4 15:57 b.zip
 fuzzytree@atlas:~/testzip$ php zip.php
 Added 10 files in 872.43795394897ms
 fuzzytree@atlas:~/testzip$ ls -ltr *zip
 -rw-rw-r-- 1 fuzzytree fuzzytree 24020997 Jun 4 15:57 a.zip
 -rw-rw-r-- 1 fuzzytree fuzzytree 75443030 Jun 4 15:57 b.zip
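
If you want to avoid the temporary directory and the shell call, ZipArchive can also copy entries via memory. A minimal sketch under the same a.zip/b.zip assumption (note that addFromString re-deflates each entry, so it trades disk I/O for a bit more CPU):

 <?php
 // Merge a.zip into b.zip without extracting to disk.
 $source = new ZipArchive;
 $destination = new ZipArchive;

 if ($source->open('a.zip') === TRUE && $destination->open('b.zip') === TRUE) {
     for ($i = 0; $i < $source->numFiles; $i++) {
         $name = $source->getNameIndex($i);
         $data = $source->getFromIndex($i);   // uncompressed entry contents
         if ($data !== false) {
             $destination->addFromString($name, $data);
         }
     }
     $source->close();
     $destination->close();
 }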
+2

I have a website where people often download tens or even hundreds of files (up to 100 MB, if I had to guess) in one zip file. I'm using zipstream, which I think I found here. I'm not sure what its limits are, but it seems to work well, and there's no need to pre-zip the individual files in advance.
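
For reference, here is a minimal sketch of that streaming approach with the maennchen/ZipStream-PHP package. This shows the v2-era API (newer major versions use named constructor arguments, so check the version you install), and the file names are placeholders:

 <?php
 require 'vendor/autoload.php';

 use ZipStream\ZipStream;

 // Entries are written to php://output as they are added,
 // so nothing is pre-zipped or buffered on disk.
 $zip = new ZipStream('project.zip');

 $zip->addFile('hello.txt', 'Hello from ZipStream');          // from a string
 $zip->addFileFromPath('report.pdf', '/path/to/report.pdf');  // from a file on disk

 $zip->finish();   // writes the central directory and ends the archive

Because the archive is emitted as it is built, memory use stays roughly flat even with hundreds of files, which is why nothing needs to be pre-zipped.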

0
