Security of unpacking user files

This is not so much a coding issue as a general security issue. I am currently working on a project that lets users submit content. A key part of this content is a zip file that the user uploads. The zip file should contain only MP3 files.

I then unzip these files into a directory on the server so that we can stream the audio on the website for users to listen to.

My concern is that this exposes us to potentially malicious zip files. I have read about "zip bombs" in the past and obviously do not want a malicious zip file to cause damage.

So, is there a safe way to do this? Can I scan a zip file without unpacking it first and, if it contains anything other than MP3s, delete it or alert the administrator?

In case it matters, I am developing the site on WordPress. I am currently using the built-in WordPress upload functions to let the user upload a zip file to our server (I am not sure whether WordPress has any form of security for scanning a zip file).

+7
3 answers

This code extracts only the MP3 files from the zip and ignores everything else:

    $zip = new ZipArchive();
    $filename = 'newzip.zip';

    if ($zip->open($filename) !== true) {
        exit("cannot open <$filename>\n");
    }

    for ($i = 0; $i < $zip->numFiles; $i++) {
        $info = $zip->statIndex($i);
        // PATHINFO_EXTENSION yields '' when an entry has no extension
        $ext = pathinfo($info['name'], PATHINFO_EXTENSION);
        if (strtolower($ext) === 'mp3') {
            // basename() drops any directory part of the entry name
            file_put_contents(basename($info['name']), $zip->getFromIndex($i));
        }
    }

    $zip->close();

I would also use something like id3_get_version ( http://www.php.net/manual/en/function.id3-get-version.php ) to check that the contents of each file really are MP3 too.

+3

Is there a reason they need to zip the MP3s? Unless the ID3v2 tags in the MP3 files contain a lot of text frames, the file size will actually increase with ZIP because of the stored dictionary.

As far as I know, there is no way to scan a ZIP without parsing it. The data is opaque until you run every bit through the Huffman dictionary. And how would you determine which file is an MP3? By file extension? By frames? MP3 encoders follow a loose standard (decoders have a more stringent specification), which makes it difficult to scan the file structure without false negatives.

Here are some ZIP security threats:

  • Comment data that causes a buffer overflow. Solution: strip the comment data.
  • ZIP files that are small when compressed but inflate to fill the file system (the classic ZIP bomb). Solution: check the inflated size before inflating; check the dictionary to make sure that it has many entries and that the compressed data is not all 1s.
  • Nested ZIPs (related to #2). Solution: stop when an entry in the ZIP archive is itself ZIP data. You can detect this by checking for the central directory marker, the number 0x02014b50 (hex, always little-endian in ZIP: http://en.wikipedia.org/wiki/Zip_%28file_format%29#Structure ).
  • Nested directory structures designed to exceed the file system limit and hang the extraction process. Solution: do not unpack directories.
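The size, extension, and directory checks in the list above can be done from the archive's metadata before anything is written to disk. The following is a minimal sketch, assuming the PHP zip extension is available; the function name inspectZip and the three limit constants are hypothetical and only for illustration.

```php
<?php
// Hypothetical limits; tune them for your own site.
const MAX_ENTRIES     = 200;
const MAX_ENTRY_BYTES = 50 * 1024 * 1024;   // 50 MB per file
const MAX_TOTAL_BYTES = 500 * 1024 * 1024;  // 500 MB per archive

/**
 * Returns a list of problems found in the archive.
 * An empty list means the archive passed these particular checks.
 */
function inspectZip(string $path): array
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        return ['cannot open archive'];
    }
    if ($zip->numFiles > MAX_ENTRIES) {
        $zip->close();
        return ['too many entries'];
    }

    $problems = [];
    $total = 0;
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $info = $zip->statIndex($i);
        $name = $info['name'];

        // Reject directories and any entry stored under a path.
        if (strpbrk($name, '/\\') !== false) {
            $problems[] = "entry with a path: $name";
            continue;
        }
        if (strtolower(pathinfo($name, PATHINFO_EXTENSION)) !== 'mp3') {
            $problems[] = "not an .mp3: $name";
        }
        // 'size' is the uncompressed size declared in the archive.
        if ($info['size'] > MAX_ENTRY_BYTES) {
            $problems[] = "entry too large: $name";
        }
        $total += $info['size'];
    }
    if ($total > MAX_TOTAL_BYTES) {
        $problems[] = 'archive too large when inflated';
    }

    $zip->close();
    return $problems;
}
```

The sizes read here are only what the archive declares about itself, so it is still worth capping the number of bytes actually written during extraction.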

So, do a lot of sanitizing and integrity checking, or at the very least use PHP to scan the archive: check each file for MP3-ness (however you do that: extension and the presence of MP3 headers? You cannot rely on the headers being at byte 0, though: http://en.wikipedia.org/wiki/MP3#File_structure ) and for its uncompressed size ( http://www.php.net/manual/en/function.zip-entry-filesize.php ). Bail out if an inflated file is too large, or if there are any non-MP3 files.
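As a rough illustration of the header check, the sketch below looks for an ID3v2 tag at the start of the data or for an MPEG Layer III frame sync anywhere in the first few kilobytes, since the first frame is not guaranteed to be at byte 0. The function name looksLikeMp3 and the 4096-byte window are assumptions made for this example, and the test is only a heuristic: it can be fooled by crafted data and may miss unusual files.

```php
<?php
/**
 * Rough heuristic: does this byte string look like MP3 data?
 * Accepts an ID3v2 tag at the start, or an MPEG audio frame header
 * marked as Layer III within the first $window bytes.
 */
function looksLikeMp3(string $bytes, int $window = 4096): bool
{
    // ID3v2 tags begin with the ASCII characters "ID3".
    if (strncmp($bytes, 'ID3', 3) === 0) {
        return true;
    }

    $limit = min(strlen($bytes), $window) - 1;
    for ($i = 0; $i < $limit; $i++) {
        $b0 = ord($bytes[$i]);
        $b1 = ord($bytes[$i + 1]);
        // Frame sync: 0xFF followed by a byte whose top three bits are set.
        if ($b0 === 0xFF && ($b1 & 0xE0) === 0xE0) {
            // The two layer bits are 01 for Layer III.
            if ((($b1 >> 1) & 0x03) === 0x01) {
                return true;
            }
        }
    }
    return false;
}
```

With ZipArchive, the start of an entry can be read without extracting the whole file, for example looksLikeMp3($zip->getFromIndex($i, 4096)).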

+2

Use the following code to check the file names inside the .zip archive:

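    $zip = zip_open('test.zip');
    if (!is_resource($zip)) {
        // zip_open() returns an error code instead of a resource on failure
        exit("cannot open test.zip\n");
    }

    while ($entry = zip_read($zip)) {
        $file_name = zip_entry_name($entry);
        $ext = pathinfo($file_name, PATHINFO_EXTENSION);
        if (strtoupper($ext) !== 'MP3') {
            // notify_admin() stands in for your own alerting function
            notify_admin($file_name);
        }
    }

    zip_close($zip);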

Please note that the code above only looks at the extension. This means that the user can still upload anything that has the MP3 extension. To really check whether a file is an MP3, you have to unzip it. I would advise you to do this in a temporary directory.

After unpacking the file, you can analyze it, for example with ffmpeg or something similar. The detailed data on bitrate, track length, and so on is useful to have in any case.

If the analysis fails, you can flag the file.
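One possible way to do that analysis is with ffprobe, the inspection tool that ships with ffmpeg. This is a sketch only: it assumes ffprobe is installed and on the PATH, and that it reports the container as "mp3" for MP3 files. The helper names ffprobeCommand and parseProbe are made up for this example.

```php
<?php
/** Builds the ffprobe command line for one file (assumes ffprobe is on the PATH). */
function ffprobeCommand(string $file): string
{
    return 'ffprobe -v error -show_entries format=format_name,duration,bit_rate'
         . ' -of json ' . escapeshellarg($file);
}

/**
 * Interprets ffprobe's JSON output.
 * Returns duration and bit rate, or null when the output does not describe an MP3.
 */
function parseProbe(string $json): ?array
{
    $data = json_decode($json, true);
    $format = is_array($data) ? ($data['format'] ?? null) : null;
    if (!is_array($format) || ($format['format_name'] ?? '') !== 'mp3') {
        return null;
    }
    return [
        'duration' => (float) ($format['duration'] ?? 0),  // seconds
        'bit_rate' => (int) ($format['bit_rate'] ?? 0),    // bits per second
    ];
}
```

A call such as parseProbe((string) shell_exec(ffprobeCommand($path))) would then return null for any file that should be flagged, where $path is the file unpacked into the temporary directory.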

+1
