I am writing code that reads and writes Excel xlsx files. An xlsx file is just a zip archive of several XML files, so to check whether I can write one, I used a gem called rubyzip to unzip an xlsx file and then immediately zip it back into a new archive without changing the data. However, when I do this, I cannot open the new Excel file; Excel reports it as corrupted.
Alternatively, if I use the Mac OS X Archive Utility (the built-in application for handling zip files) to unzip and rezip the Excel file, the data is not corrupted and I can open the resulting file in Excel.
I found that it is not rubyzip's unzip step that corrupts the data, but the zip step. (In fact, when I use Archive Utility on the new zip file created by rubyzip, Excel can read the resulting file again.)
I would like to know why this happens, and how I can zip the contents programmatically in a way that Excel can read.
My code for zipping:
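require 'fileutils'
require 'zip/zip' # rubyzip < 1.0, which provides Zip::ZipFile

def compress(path)
  # Drop a trailing slash so the relative entry names come out right.
  path = path.sub(%r{/$}, '')
  archive = File.join(path, File.basename(path)) + '.zip'
  FileUtils.rm archive, :force => true

  Zip::ZipFile.open(archive, Zip::ZipFile::CREATE) do |zipfile|
    # Add everything the glob matches, except the archive itself.
    Dir["#{path}/**/**"].reject { |f| f == archive }.each do |file|
      zipfile.add(file.sub(path + '/', ''), file)
    end
  end
end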
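To see exactly which names end up in the archive, a small stdlib-only sketch like the one below may help. `entry_names` is a hypothetical helper, not part of rubyzip. It differs from the glob in my code in two ways: it skips directories, and it passes `File::FNM_DOTMATCH` so that dotfiles are included. An xlsx package contains a `_rels/.rels` part, and `Dir[...]` does not match dotfiles by default. I have not confirmed that either difference is what Excel objects to.

```ruby
# Hypothetical helper: returns [entry_name, source_path] pairs for every
# regular file under `path`, including dotfiles such as _rels/.rels.
def entry_names(path, archive = nil)
  root = path.sub(%r{/\z}, '')
  Dir.glob(File.join(root, '**', '*'), File::FNM_DOTMATCH)
     .reject { |f| File.directory?(f) || f == archive } # files only, never the archive itself
     .sort
     .map { |f| [f.sub("#{root}/", ''), f] }            # entry name is relative to root
end
```

Comparing `entry_names(path).map(&:first)` with the entries listed in the original xlsx should show whether anything is missing or extra before the files are handed to `zipfile.add`.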