I wrote an unzipper in Javascript. It works.
It relies on Andy G.P. Na's binary file reader and some RFC1951 inflate logic from notmasteryet. I added the ZipFile class.
Working example:
http://cheeso.members.winisp.net/Unzip-Example.htm (dead link)
Source:
http://cheeso.members.winisp.net/srcview.aspx?dir=js-unzip (dead link)
NB: the links are dead; I'll find a new host soon.
The source includes a demo page, ZipFile.htm, and 3 separate scripts: one for the zipfile class, one for the inflate class, and one for the binary reader class. The demo also depends on jQuery and jQuery UI. If you just download the js-zip.zip file, all of the necessary source is there.
Here is what the application code looks like in Javascript:
    // In my demo, this gets attached to a click event.
    // It instantiates a ZipFile, and provides a callback that is
    // invoked when the zip is read. This can take a few seconds on a
    // large zip file, so it is asynchronous.
    var readFile = function(){
        $("#status").html("<br/>");
        var url= $("#urlToLoad").val();
        var doneReading = function(zip){
            extractEntries(zip);
        };

        var zipFile = new ZipFile(url, doneReading);
    };

    // this function extracts the entries from an instantiated zip
    function extractEntries(zip){
        $('#report').accordion('destroy');

        // clear
        $("#report").html('');

        var extractCb = function(id) {
            // this callback is invoked with the entry name, and entry text
            // in my demo, the text is just injected into an accordion panel.
            return (function(entryName, entryText){
                var content = entryText.replace(new RegExp( "\\n", "g" ), "<br/>");
                $("#"+id).html(content);
                $("#status").append("extract cb, entry(" + entryName + ") id(" + id + ")<br/>");
                $('#report').accordion('destroy');
                $('#report').accordion({collapsible:true, active:false});
            });
        };

        // for each entry in the zip, extract it.
        for (var i=0; i<zip.entries.length; i++) {
            var entry = zip.entries[i];

            var entryInfo = "<h4><a>" + entry.name + "</a></h4>\n<div>";

            // contrive an id for the entry, make it unique
            var randomId = "id-"+ Math.floor((Math.random() * 1000000000));

            entryInfo += "<span class='inputDiv'><h4>Content:</h4><span id='" + randomId + "'></span></span></div>\n";

            // insert the info for one entry as the last child within the report div
            $("#report").append(entryInfo);

            // extract asynchronously
            entry.extract(extractCb(randomId));
        }
    }
The demo works in two steps. The readFile fn is triggered by a click and instantiates a ZipFile object, which reads the zip file. There is an asynchronous callback for when the read completes (usually in less than a second for reasonably sized zips). In this demo, the callback is held in the local variable doneReading, which simply calls extractEntries, which in turn blindly unzips the entire contents of the provided zip file. In a real application you would probably choose only some of the entries to extract (let the user select them, or pick one or more entries programmatically, etc.).
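A minimal sketch of choosing entries programmatically instead of extracting everything. The helper name extractMatching is hypothetical and not part of the library; it assumes only what the demo code shows, namely that zip.entries is an array, that each entry has a name, and that entry.extract(cb) invokes cb(entryName, entryText).

```javascript
// Hypothetical helper (not part of the ZipFile library).
// Extracts only the entries whose names pass the caller-supplied test.
// Assumes zip.entries, entry.name and entry.extract(cb) behave as in the demo.
function extractMatching(zip, wanted, onExtracted) {
  var picked = [];
  for (var i = 0; i < zip.entries.length; i++) {
    var entry = zip.entries[i];
    if (wanted(entry.name)) {
      picked.push(entry.name);
      // same call the demo uses; the callback receives (entryName, entryText)
      entry.extract(onExtracted);
    }
  }
  // names that were queued for extraction, in zip order
  return picked;
}
```

In the demo, something like this could be called from doneReading in place of extractEntries(zip), for example with a test such as function(name){ return /\.txt$/i.test(name); } to pull out only the text files.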
The extractEntries fn iterates over all entries and calls extract() on each one, passing a callback. Decompressing an entry takes time, possibly 1 s or more for each entry in the zip file, which is why asynchrony is appropriate. The extract callback simply adds the extracted content to a jQuery accordion on the page. If the content is binary, it is formatted as such (not shown).
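Because each extract() call completes on its own schedule, a caller that needs all of the content has to count the callbacks. The following is a rough sketch of that pattern, not something the library provides: extractAll is a hypothetical name, and it assumes each extract callback fires exactly once per entry and that entry names are unique.

```javascript
// Hypothetical helper (not part of the ZipFile library).
// Extracts every entry and calls onAllDone(results) once the last
// extract callback has fired. results maps entry name -> extracted text.
// Assumes each callback fires exactly once and entry names are unique.
function extractAll(zip, onAllDone) {
  var results = {};
  var remaining = zip.entries.length;

  if (remaining === 0) {
    onAllDone(results);
    return;
  }

  for (var i = 0; i < zip.entries.length; i++) {
    zip.entries[i].extract(function (entryName, entryText) {
      results[entryName] = entryText;
      remaining--;
      if (remaining === 0) {
        onAllDone(results);
      }
    });
  }
}
```

The counter is set before the loop starts, so the pattern holds whether the callbacks arrive later or immediately.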
This works, but I think the utility is somewhat limited.
For one thing: it is very slow. It takes ~4 seconds to unzip the 140k AppNote.txt file from PKWare. The same decompression can be done in less than 0.5 s in a .NET program. EDIT: The Javascript ZipFile now decompresses considerably faster than this in IE9 and Chrome. It is still slower than a compiled program, but it is fast enough for normal browser usage.
For another: it does not do streaming. It basically slurps the entire contents of the zip file into memory. In a "real" programming environment, you could read only the metadata of a zip file (say, 64 bytes per entry), and then read and decompress the other data as desired. As far as I know, there is no way to do IO like that in Javascript, so the only option is to read the entire zip into memory and do random access in it. This means that large zip files place unreasonable demands on system memory. It is not much of a problem for a small zip file.
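To illustrate what "random access in memory" looks like, here is a sketch that locates the zip end-of-central-directory record in a buffer that already holds the whole file. It follows the record layout in the PKWare AppNote (signature 0x06054b50, 22 bytes minimum) and is not taken from the ZipFile class; the function name and the use of Uint8Array/DataView are assumptions for illustration.

```javascript
// Hypothetical sketch, not the ZipFile class's own code.
// Scans backward through an in-memory zip for the end-of-central-directory
// record and returns where the entry metadata (central directory) lives.
// Simplification: the first signature match from the end is trusted, even
// though the same bytes could in principle occur inside a trailing comment.
function findEndOfCentralDirectory(bytes) {
  var EOCD_SIGNATURE = 0x06054b50; // stored little-endian as "PK\x05\x06"
  var MIN_EOCD_SIZE = 22;          // record size when the comment is empty
  var view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);

  for (var pos = bytes.length - MIN_EOCD_SIZE; pos >= 0; pos--) {
    if (view.getUint32(pos, true) === EOCD_SIGNATURE) {
      return {
        entryCount: view.getUint16(pos + 10, true),      // total entries
        directorySize: view.getUint32(pos + 12, true),   // bytes of metadata
        directoryOffset: view.getUint32(pos + 16, true)  // where metadata starts
      };
    }
  }
  return null; // not a zip, or truncated
}
```

With streaming IO, only this record and the central directory it points to would need to be read up front; here the whole file has to be in the buffer before the scan can start.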
Also: it does not handle the "general case" zip file. There are many zip options that I did not bother to implement in the unzipper, like ZIP encryption, WinZip encryption, zip64, UTF-8 encoded file names, etc. (EDIT: it handles UTF-8 encoded file names now.) The ZipFile class handles the basics, though. Some of these things would not be hard to implement. I have an AES encryption class in Javascript, which could be integrated to support encryption. Zip64 support would probably be useless for most Javascript users, as it is intended to support >4GB zip files; there is no need to extract those in a browser.
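As a rough illustration of how two of these options show up in the file, each entry carries a 16-bit general purpose flag field: per the PKWare AppNote, bit 0 marks an encrypted entry and bit 11 marks a UTF-8 encoded file name. The sketch below is not the ZipFile class's own code; the function names are hypothetical, and the non-UTF-8 branch is only an approximation (the zip default is CP437, which matches this mapping for plain ASCII names only).

```javascript
// Hypothetical sketch, not taken from the ZipFile class.
// Interprets two bits of a zip entry's general purpose flag field.
function describeEntryFlags(flags) {
  return {
    encrypted: (flags & 0x0001) !== 0, // bit 0: entry is encrypted
    utf8Name: (flags & 0x0800) !== 0   // bit 11: name and comment are UTF-8
  };
}

// Decodes an entry's file name bytes according to the flag field.
function decodeEntryName(nameBytes, flags) {
  if (describeEntryFlags(flags).utf8Name) {
    return new TextDecoder("utf-8").decode(nameBytes);
  }
  // Approximation: treat each byte as one character. This is correct for
  // ASCII names only; a real CP437 table would be needed for other bytes.
  var name = "";
  for (var i = 0; i < nameBytes.length; i++) {
    name += String.fromCharCode(nameBytes[i]);
  }
  return name;
}
```

An unzipper that does not support encryption could check the encrypted bit and skip or report such entries rather than producing garbage output.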
I also have not tested the case of unzipping binary content. Right now it unzips text. If you have a zipped binary file, you would need to edit the ZipFile class to handle it properly. I did not figure out how to do that. It handles binary files now as well.
EDIT: I updated the JS unzip library and demo. It now handles binary files in addition to text. I made it more robust and more general: you can now specify the encoding to use when reading text files. The demo is also expanded: among other things, it shows unzipping an XLSX file in the browser.
So, although I think it has limited utility and interest, it works. I assume this will work in Node.js.