Reading a gzipped CSV file in Python 3

I am having trouble reading a gzipped CSV file with the gzip and csv libraries. Here is what I have:

 import gzip
 import csv
 import json

 f = gzip.open(filename)
 csvobj = csv.reader(f, delimiter=',', quotechar="'")
 for line in csvobj:
     ts = line[0]
     data_json = json.loads(line[1])

but this throws an exception:

   File "C:\Users\yaronol\workspace\raw_data_from_s3\s3_data_parser.py", line 64, in download_from_S3
     self.parse_dump_file(filename)
   File "C:\Users\yaronol\workspace\raw_data_from_s3\s3_data_parser.py", line 30, in parse_dump_file
     for line in csvobj:
 _csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

Gunzipping the file first and then opening the resulting CSV works fine. I also tried decoding the file's contents to convert them from bytes to str ...

What am I missing here?

2 answers

The default mode for gzip.open is 'rb'. If you want to work with strings, you must request text mode explicitly:

 f = gzip.open(filename, mode="rt") 

OT: it is good practice to perform I/O operations inside a with block, so the file is closed automatically:

 with gzip.open(filename, mode="rt") as f:
     csvobj = csv.reader(f, delimiter=',', quotechar="'")
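Putting the pieces together, here is a minimal, self-contained sketch of the fix. The helper name read_gzipped_rows, the sample file, and its contents are made up for illustration; the delimiter and quote character follow the question. Passing newline="" is what the csv documentation recommends for file objects handed to csv.reader, and the utf-8 encoding is an assumption about the data.

```python
import csv
import gzip
import json
import os
import tempfile


def read_gzipped_rows(path):
    """Return (timestamp, parsed_json) pairs from a gzipped CSV file.

    Hypothetical helper: assumes two columns, the second holding JSON.
    """
    rows = []
    # "rt" opens the gzip stream as text, so csv.reader receives str, not bytes.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as f:
        for line in csv.reader(f, delimiter=",", quotechar="'"):
            rows.append((line[0], json.loads(line[1])))
    return rows


# Build a small sample file so the sketch runs on its own (made-up data).
sample_path = os.path.join(tempfile.mkdtemp(), "sample.csv.gz")
with gzip.open(sample_path, mode="wt", encoding="utf-8", newline="") as f:
    writer = csv.writer(f, delimiter=",", quotechar="'")
    writer.writerow(["1700000000", json.dumps({"a": 1, "b": "x,y"})])

rows = read_gzipped_rows(sample_path)
print(rows)
```

The with block closes the file even if json.loads raises on a malformed row.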

You are opening the file in binary mode (which is the default for gzip.open).

Try instead:

 import gzip
 import csv

 f = gzip.open(filename, mode='rt')
 csvobj = csv.reader(f, delimiter=',', quotechar="'")
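If the file has to stay open in binary mode for some reason, one alternative is to wrap the binary stream in io.TextIOWrapper, which performs the bytes-to-str decoding that csv.reader expects. This is only a sketch: the sample file and its contents are invented for the example, and the utf-8 encoding is an assumption.

```python
import csv
import gzip
import io
import os
import tempfile

# Create a small gzipped CSV to read back (made-up sample data).
sample_path = os.path.join(tempfile.mkdtemp(), "sample.csv.gz")
with gzip.open(sample_path, "wb") as f:
    f.write(b"1,'hello, world'\n2,'second'\n")

# gzip.open defaults to binary mode ("rb"); the wrapper decodes bytes to str.
with gzip.open(sample_path) as binary_f:
    with io.TextIOWrapper(binary_f, encoding="utf-8", newline="") as text_f:
        wrapped_rows = list(csv.reader(text_f, delimiter=",", quotechar="'"))

print(wrapped_rows)
```

For most cases mode='rt' is simpler and does the same thing in one call.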