I know that by passing the argument compression='gzip' to DataFrame.to_csv(), I can save a DataFrame to a compressed CSV file:
my_df.to_csv('my_file_name.csv', compression='gzip')
I also know that if I want to append a DataFrame to an existing CSV file, I can use mode='a', for example:
my_df.to_csv('my_file_name.csv', mode='a', index=False)
But what if I want to append a DataFrame to the end of an existing compressed CSV file? Is that possible? I tried to do this with:
my_df.to_csv('my_file_name.csv', mode='a', index=False, compression='gzip')
But the resulting CSV is not compressed, although it is otherwise intact.
This question is motivated by my processing of a large CSV file with Pandas. I need to produce compressed CSV output, and I process the input CSV file in chunks, one DataFrame at a time, so that I do not run into a MemoryError. The most obvious approach to me is therefore to append each output DataFrame chunk to a single compressed file.
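To make the intended workflow concrete, here is a minimal sketch of one possible workaround. It assumes that to_csv() accepts an already open file handle and that a gzip file opened in text append mode with the standard gzip module can be written to like an ordinary file. The helper name append_to_gzip_csv is hypothetical and not part of Pandas, and this has not been checked against Pandas 0.16.1.

```python
import gzip
import os

import pandas as pd


def append_to_gzip_csv(df, path):
    """Append df to a gzip-compressed CSV at path (hypothetical helper)."""
    # Write the header row only when the file is new or empty.
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    # "at" = append, text mode. Each call adds a new gzip member to the
    # file; the gzip format allows several members in one file.
    with gzip.open(path, "at", newline="") as handle:
        df.to_csv(handle, index=False, header=write_header)
```

With something like this, each chunk from pd.read_csv(..., chunksize=...) could be processed and passed to the helper in turn, so only one chunk is held in memory at a time.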
I'm using Python 3.4 and Pandas 0.16.1.