Writing to a StringIO object using Pandas ExcelWriter?

I can pass a StringIO object to DataFrame.to_csv() just fine:

    io = StringIO.StringIO()
    pd.DataFrame().to_csv(io)

But when using the Excel writer, I run into more trouble.

    io = StringIO.StringIO()
    writer = pd.ExcelWriter(io)
    pd.DataFrame().to_excel(writer, "sheet name")
    writer.save()

This raises:

 AttributeError: StringIO instance has no attribute 'rfind' 

I also tried to create an ExcelWriter object without calling pd.ExcelWriter(), but that fails in a different way. This is what I have tried so far:

    from xlsxwriter.workbook import Workbook
    writer = Workbook(io)
    pd.DataFrame().to_excel(writer, "sheet name")
    writer.save()

But now I get AttributeError: 'Workbook' object has no attribute 'write_cells'

How can I save a pandas DataFrame in Excel format to a StringIO object?

+8
python pandas excel stringio xlsxwriter
3 answers

Pandas expects a filename path to be passed to the ExcelWriter constructors, although each of the writer engines supports StringIO. Perhaps this should be raised as a bug / feature request in Pandas.
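To illustrate why a stream gets rejected: pandas of that era appears to have picked the engine from the file extension of the path argument, which is where the string method `rfind` in the traceback comes from. The sketch below is a simplified stand-in for that lookup, not pandas' actual code, and `guess_extension` is a hypothetical helper. It is written for Python 3, where the same mistake surfaces as a TypeError instead of the Python 2 AttributeError.

```python
import io
import os


def guess_extension(path):
    """Hypothetical helper: return the extension of `path` without the dot."""
    return os.path.splitext(path)[-1][1:]


# A real filename works as expected.
ext_from_name = guess_extension("temp.xlsx")

# A stream is not path-like, so the extension lookup fails.
try:
    guess_extension(io.BytesIO())
    stream_error = None
except TypeError as exc:
    stream_error = exc
```

This is why the workarounds in this thread hand pandas a dummy filename first and swap in the stream afterwards.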

In the meantime, here is a workaround example using the Pandas xlsxwriter engine:

    import pandas as pd
    import StringIO

    io = StringIO.StringIO()

    # Use a temp filename to keep pandas happy.
    writer = pd.ExcelWriter('temp.xlsx', engine='xlsxwriter')

    # Set the filename/file handle in the xlsxwriter.workbook object.
    writer.book.filename = io

    # Write the data frame to the StringIO object.
    pd.DataFrame().to_excel(writer, sheet_name='Sheet1')
    writer.save()

    xlsx_data = io.getvalue()

Update: As of Pandas 0.17, it is possible to do this more directly:

    # Note, Python 2 example. For Python 3 use: output = io.BytesIO().
    output = StringIO.StringIO()

    # Use the StringIO object as the filehandle.
    writer = pd.ExcelWriter(output, engine='xlsxwriter')

See also "Saving the Dataframe output to a string" in the XlsxWriter docs.
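For readers on Python 3, a minimal end-to-end sketch of the direct approach might look like the following. It assumes a recent pandas with either xlsxwriter or openpyxl installed; the sample DataFrame and the variable names are illustrative only. No engine is named, so pandas chooses whichever .xlsx writer it finds.

```python
import io
import zipfile

import pandas as pd

df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})  # illustrative data

# On Python 3 the workbook is binary, so use BytesIO rather than StringIO.
output = io.BytesIO()

# No engine is named here; pandas picks an installed .xlsx writer.
# The context manager saves and closes the writer on exit.
with pd.ExcelWriter(output) as writer:
    df.to_excel(writer, sheet_name="Sheet1", index=False)

xlsx_data = output.getvalue()

# An .xlsx file is a zip archive, which gives a cheap sanity check.
is_valid_archive = zipfile.is_zipfile(io.BytesIO(xlsx_data))
```

The context manager replaces the explicit writer.save() call used in the older snippets above.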

+16

Glancing at the pandas.io.excel source, it looks like this shouldn't be too much of a problem if you don't mind using xlwt as your writer. The other engines may not be all that difficult either, but xlwt jumps out as the easy one, since its save method accepts either a stream or a file path.

You need to pass in a filename initially just to make pandas happy, as it checks the filename extension against the engine to make sure it is a supported format. But in the case of the xlwt engine, it just stores the filename in the object's path attribute and then uses it in the save method. If you change the path attribute to your stream, it will happily save to that stream when you call the save method.

Here is an example:

    import pandas as pd
    import StringIO
    import base64

    df = pd.DataFrame.from_csv('http://moz.com/top500/domains/csv')
    xlwt_writer = pd.io.excel.get_writer('xlwt')
    my_writer = xlwt_writer('whatever.xls')  # make pandas happy
    xl_out = StringIO.StringIO()
    my_writer.path = xl_out
    df.to_excel(my_writer)
    my_writer.save()
    print base64.b64encode(xl_out.getvalue())

This is a quick, easy, and slightly dirty way to do it. BTW, a cleaner way would be to subclass ExcelWriter (or one of its existing subclasses, such as _XlwtWriter), but honestly there is so little involved in updating the path attribute that I opted to show you the easy way rather than go the slightly longer route.

+5

For those who are not using xlsxwriter as the engine= for to_excel, here is a solution that uses openpyxl in memory:

    in_memory_file = StringIO.StringIO()
    xlw = pd.ExcelWriter('temp.xlsx', engine='openpyxl')
    # ... do many .to_excel() thingies
    xlw.book.save(in_memory_file)
    # if you want to read it or stream to a client, don't forget this
    in_memory_file.seek(0)

Explanation: The ExcelWriter wrapper class exposes each engine's underlying workbook through the .book property. For openpyxl, you can therefore use the Workbook.save method as usual!
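The seek(0) reminder matters because writing leaves the stream position at the end of the buffer, so a later read() returns nothing. A small standard-library sketch of that behaviour, using Python 3's io.BytesIO and placeholder bytes in place of a real workbook:

```python
import io

in_memory_file = io.BytesIO()
in_memory_file.write(b"placeholder workbook bytes")  # stands in for book.save()

# The position is now at the end, so read() yields nothing.
read_before_rewind = in_memory_file.read()

# Rewind before handing the stream to a reader or an HTTP response.
in_memory_file.seek(0)
read_after_rewind = in_memory_file.read()

# getvalue() returns the whole buffer regardless of the position.
full_contents = in_memory_file.getvalue()
```

If only the raw bytes are needed, getvalue() avoids the rewind entirely; seek(0) is for code that consumes the object as a file.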

+2
