Question
I am parsing large compressed files in Python 2.7.6 and would like to know the size of the uncompressed file before processing it. I am trying to use the second technique presented in this SO answer. It works with bzip2 files, but not with gzip files. What is different between the two compression algorithms that causes this?
Code example
This code snippet demonstrates the behavior, assuming you have "test.bz2" and "test.gz" in the current working directory:
import os
import bz2
import gzip
bz = bz2.BZ2File('test.bz2', mode='r')
bz.seek(0, os.SEEK_END)
bz.close()
gz = gzip.GzipFile('test.gz', mode='r')
gz.seek(0, os.SEEK_END)
gz.close()
The following trace is displayed:
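Traceback (most recent call last):
  File "zip_test.py", line 10, in <module>
    gn.seek(0, os.SEEK_END)
  File "/usr/lib64/python2.6/gzip.py", line 420, in seek
    raise ValueError("Seek from end not supported")
ValueError: Seek from end not supported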
Why does this work for *.bz2 files, but not for *.gz files?
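For context, here is a minimal sketch of two ways to get the uncompressed size of a gzip file without seeking from the end of a GzipFile object. The function names are hypothetical and not part of the gzip module. The first streams through the whole file and counts bytes. The second reads the ISIZE field, which the gzip format stores in the last four bytes of a member as the uncompressed size modulo 2**32. It therefore assumes a single-member file smaller than 4 GiB, which may not hold for the large files mentioned above.

```python
import gzip
import os
import struct


def uncompressed_size_by_reading(path, chunk_size=1024 * 1024):
    """Hypothetical helper: count decompressed bytes by streaming the file.

    Exact for any valid gzip file, but it decompresses everything once.
    """
    total = 0
    f = gzip.open(path, 'rb')
    try:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    finally:
        f.close()
    return total


def uncompressed_size_from_trailer(path):
    """Hypothetical helper: read the ISIZE field from the gzip trailer.

    ISIZE is a little-endian unsigned 32-bit integer holding the
    uncompressed size modulo 2**32, so the result is only reliable
    for single-member files smaller than 4 GiB (an assumption here).
    """
    f = open(path, 'rb')
    try:
        f.seek(-4, os.SEEK_END)  # seeking on the raw file, not a GzipFile
        return struct.unpack('<I', f.read(4))[0]
    finally:
        f.close()
```

Calling either function with 'test.gz' returns the size in bytes. The trailer approach is constant-time, while the streaming approach costs a full decompression pass but makes no assumption about file size or member count.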