Combining various JSON data compression methods in Python 3

I want to compress JSON data with several different compressors. I used this to compress the JSON:

    import gzip
    import json

    with gzip.GzipFile('2.json', 'r') as isfile:
        for line in isfile:
            obj = json.loads(line)

which causes this error:

    raise OSError('Not a gzipped file (%r)' % magic)
    OSError: Not a gzipped file (b'[\n')

I also tried compressing the data directly with:

    zlib_data = zlib.compress(data)

which causes this error:

    return lz4.block.compress(*args, **kwargs)
    TypeError: a bytes-like object is required, not 'list'

So basically I want to compress the JSON with each of these methods and measure how long each one takes.

1 answer

On Python 2.7

The problem appears to be the type of your data.

Data passed to a compressor must be of type 'str'.

    import gzip
    import json
    import time

    import lz4.block

    with gzip.GzipFile('data.gz', 'w') as fid_gz:
        with open('data.json', 'r') as fid_json:
            # load the JSON document as Python objects
            json_dict = json.load(fid_json)
        # serialize back to a JSON string (str(json_dict) would write a Python repr, not JSON)
        json_str = json.dumps(json_dict)
        # write the string into the gzip file
        fid_gz.write(json_str)

    # check that the file was written correctly
    with gzip.GzipFile('data.gz', 'r') as fid_gz:
        print(fid_gz.read())

The same applies when compressing in memory with zlib (reached here through gzip.zlib):

    gzip.zlib.compress(json_str, 9)

and when compressing with lz4:

 lz4.block.compress(json_str) 

To check the time, record a timestamp before the call and print the difference afterwards:

    # set start time
    st = time.time()

    # ... the compression call goes here ...

    # calculate elapsed time
    print(time.time() - st)

On Python 3.5

The difference between Python 2.7 and Python 3 is the type of the data you pass to the compressor.

Data to be compressed must be of type 'bytes', which you can create with bytes().

When creating the .gz file:

    with gzip.GzipFile('data.gz', 'w') as fid_gz:
        with open('data.json', 'r') as fid_json:
            json_dict = json.load(fid_json)
        # serialize to a JSON string (str(json_dict) would write a Python repr, not JSON)
        json_str = json.dumps(json_dict)
        # bytes(string, encoding)
        json_bytes = bytes(json_str, 'utf8')
        fid_gz.write(json_bytes)
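As an alternative to the explicit bytes() step, the following is a minimal sketch (not part of the original answer) that opens the gzip file in text mode with gzip.open, so json.dump can write into it directly. The sample records and the temporary file path are made up for illustration.

```python
import gzip
import json
import os
import tempfile

# hypothetical sample data standing in for the contents of data.json
records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

# hypothetical output path inside a temporary directory
path = os.path.join(tempfile.mkdtemp(), "data.json.gz")

# "wt" opens the gzip stream in text mode, so json.dump can write str directly
with gzip.open(path, "wt", encoding="utf-8") as fid_gz:
    json.dump(records, fid_gz)

# "rt" decompresses and decodes in one step
with gzip.open(path, "rt", encoding="utf-8") as fid_gz:
    restored = json.load(fid_gz)

print(restored == records)
```

Reading the file back with json.load is a quick way to confirm that what was written is valid JSON and not just a string that looks like it.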

Or compress in memory with gzip.compress(data, compresslevel=9):

    # 'data' takes bytes
    gzip.compress(json_bytes)

Or compress with zlib.compress(data, level=-1):

    gzip.zlib.compress(json_bytes, 9)

Or compress with lz4.block.compress(source, compression=0):

    # 'source' takes both 'str' and 'bytes'
    lz4.block.compress(json_str)
    lz4.block.compress(json_bytes)
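To check that the bytes conversion loses nothing, here is a small round-trip sketch using only the standard library. The sample list is hypothetical; it is a list on purpose, because passing a list straight to a compressor is what produced the TypeError in the question.

```python
import gzip
import json
import zlib

# hypothetical sample data; a list, like the object named in the TypeError
data = [{"id": i, "value": "x" * 20} for i in range(100)]

# serialize to a JSON string, then encode to bytes for the compressors
json_bytes = json.dumps(data).encode("utf-8")

gz_blob = gzip.compress(json_bytes, compresslevel=9)
zlib_blob = zlib.compress(json_bytes, 9)

# decompress and parse again to confirm nothing was lost
from_gzip = json.loads(gzip.decompress(gz_blob).decode("utf-8"))
from_zlib = json.loads(zlib.decompress(zlib_blob).decode("utf-8"))

print(len(json_bytes), len(gz_blob), len(zlib_blob))
print(from_gzip == data, from_zlib == data)
```

The same pattern should carry over to lz4.block.compress and lz4.block.decompress if the third-party lz4 package is installed.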

How you measure the time depends on what you want to compare.
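One possible way to time each method, sketched under a few assumptions: the sample data is generated in place of your file, time.perf_counter is used instead of time.time because it is intended for short intervals, and bz2 and lzma are added only as further standard-library examples that the question did not mention. lz4 is included only if the third-party package is installed.

```python
import bz2
import gzip
import json
import lzma
import time
import zlib

# hypothetical sample data; replace with json.load(...) on your own file
data = [{"id": i, "text": "row %d" % i} for i in range(5000)]
json_bytes = json.dumps(data).encode("utf-8")

# standard-library compressors; each function maps bytes to bytes
compressors = {
    "gzip": gzip.compress,
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}

# lz4 is a third-party package, so only include it when it is available
try:
    import lz4.block
    compressors["lz4"] = lz4.block.compress
except ImportError:
    pass

results = {}
for name, compress in compressors.items():
    start = time.perf_counter()
    blob = compress(json_bytes)
    elapsed = time.perf_counter() - start
    # store elapsed seconds and compressed size for later comparison
    results[name] = (elapsed, len(blob))
    print("%-5s %8.4f s %8d bytes" % (name, elapsed, len(blob)))
```

This times only the compression call itself with each library's default settings. If reading the file or serializing the JSON should count too, move those steps inside the timed region, and repeat each measurement several times, since a single run is noisy.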

