xlsx file zipped with rubyzip cannot be read by Excel

I am writing code that reads and writes Excel xlsx files. An xlsx file is just a zip archive of several XML files, so to test whether I can write one, I used a gem called rubyzip to unzip an xlsx file and then immediately zip it back into a new archive without changing the data. However, when I do this, I cannot open the new Excel file; Excel reports it as corrupted.

Alternatively, if I use Mac OS X's Archive Utility (the native application for handling zip files) to unzip and rezip the Excel file, the data is not corrupted and I can open the resulting file in Excel.

I have found that it is not rubyzip's unzip step that corrupts the data, but the zip step. (In fact, when I run Archive Utility on the new zip file created by rubyzip, the result is readable by Excel again.)

I would like to know why this happens, and what solutions exist for zipping the contents programmatically in a way that Excel can read.

My code for zipping:

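    require 'fileutils'
    require 'zip/zip' # rubyzip < 1.0, which provides Zip::ZipFile

    def compress(path)
      path.sub!(%r[/$], '')
      archive = File.join(path, File.basename(path)) + '.zip'
      FileUtils.rm archive, :force => true

      Zip::ZipFile.open(archive, 'w') do |zipfile|
        Dir["#{path}/**/**"].reject { |f| f == archive }.each do |file|
          # Store each entry under its path relative to the root folder
          zipfile.add(file.sub(path + '/', ''), file)
        end
      end
    end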
1 answer

The OOXML format imposes a number of restrictions on how Zip may be used in order for a package to be conformant. For example, the only compression method allowed in a package is DEFLATE.

You might want to check the specification for OPC packages (which is what .xlsx files are) in Appendix C of the OPC standard, available here (Zip), and then make sure that the rubyzip library is not doing anything that is not allowed (for example, using the IMPLODE compression method).
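One way to check this is to compare the compression method of each entry in the rubyzip output with the file produced by Archive Utility. The following is only a rough sketch that uses nothing beyond core Ruby: it reads the ZIP central directory directly and reports the method ID stored for every entry. The helper name `zip_compression_methods` and the `ZIP_METHODS` table are made up for this illustration and are not part of rubyzip; ZIP64 and multi-disk archives are not handled.

```ruby
# Names for a few ZIP compression method IDs (0, 6 and 8 in the ZIP format).
ZIP_METHODS = { 0 => 'STORED', 6 => 'IMPLODE', 8 => 'DEFLATE' }.freeze

# Hypothetical helper: returns [[entry_name, method_id], ...] read from the
# central directory of a ZIP archive passed in as a binary string.
# Assumes a single-disk archive without ZIP64 extensions.
def zip_compression_methods(bytes)
  bytes = bytes.b
  # The end-of-central-directory record sits near the end of the file.
  eocd = bytes.rindex("PK\x05\x06".b)
  raise ArgumentError, 'no end-of-central-directory record found' unless eocd

  # Total entry count, central directory size, central directory offset.
  count, _size, pos = bytes[eocd + 10, 10].unpack('vVV')

  Array.new(count) do
    unless bytes[pos, 4] == "PK\x01\x02".b
      raise ArgumentError, 'bad central directory entry'
    end

    method = bytes[pos + 10, 2].unpack1('v')
    name_len, extra_len, comment_len = bytes[pos + 28, 6].unpack('vvv')
    name = bytes[pos + 46, name_len].force_encoding('UTF-8')
    # Skip the fixed 46-byte header plus the variable-length fields.
    pos += 46 + name_len + extra_len + comment_len
    [name, method]
  end
end

# Usage: ruby this_script.rb some_file.xlsx
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  zip_compression_methods(File.binread(ARGV[0])).each do |name, method|
    puts format('%-40s %s', name, ZIP_METHODS.fetch(method, "method #{method}"))
  end
end
```

Running it against both files and comparing the listings should show whether the two archives differ in compression method. If they do not, the difference lies elsewhere in the archive structure.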
