From password-protected Excel file to pandas DataFrame

I can open a password-protected Excel file with this:

    import sys
    import win32com.client

    xlApp = win32com.client.Dispatch("Excel.Application")
    print("Excel library version:", xlApp.Version)

    # Usage: script.py <filename> <password>
    filename, password = sys.argv[1:3]
    xlwb = xlApp.Workbooks.Open(filename, Password=password)
    # xlwb = xlApp.Workbooks.Open(filename)  # for a file without a password
    xlws = xlwb.Sheets(1)  # counts from 1, not from 0
    print(xlws.Name)
    print(xlws.Cells(1, 1))  # that's A1

I am not sure how to get the data into a pandas DataFrame. Do I need to read the cells one by one, or is there a more convenient way to do this?

2 answers

Assuming the start cell is (StartRow, StartCol) and the end cell is (EndRow, EndCol), the following worked for me:

    import pandas

    # Get the content of the rectangular selection region.
    # content is a tuple of tuples.
    content = xlws.Range(xlws.Cells(StartRow, StartCol), xlws.Cells(EndRow, EndCol)).Value

    # Transfer the content to a pandas DataFrame
    dataframe = pandas.DataFrame(list(content))

Note: Excel cell B5 is addressed as row 5, column 2 in win32com. In addition, we need list(...) to convert the tuple of tuples into a list of tuples, because the pandas.DataFrame constructor does not accept a tuple of tuples.
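If the first row of the selected range holds the column names, the same idea can be extended a little. The following is only a sketch: split_header and range_to_dataframe are hypothetical helper names, and sample stands in for the tuple of tuples that xlws.Range(...).Value returns for a multi-cell range (a single-cell range returns a bare value instead, which this sketch does not handle).

```python
def split_header(content):
    """Split a tuple of tuples into (header, rows), using the first row as the header."""
    rows = [list(row) for row in content]
    if not rows:
        return [], []
    return rows[0], rows[1:]


def range_to_dataframe(content):
    """Build a DataFrame whose column names come from the first row of the range."""
    import pandas  # imported here so split_header can be used without pandas

    header, rows = split_header(content)
    return pandas.DataFrame(rows, columns=header)


# Stand-in for xlws.Range(...).Value; empty Excel cells come back as None.
sample = (("name", "qty"), ("apple", 3.0), ("pear", None))
header, rows = split_header(sample)
```

With a real worksheet, the call would be range_to_dataframe(content), where content is the value read from the range as shown above.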


Assuming that you can save the encrypted file back to disk using the win32com API (which, I realize, might defeat the purpose), you can then immediately call the top-level pandas function read_excel. You will first need to install some combination of xlrd (for Excel 2003), xlwt (also for 2003) and openpyxl (for Excel 2007). See the pandas documentation on reading Excel files. pandas does not currently support using the win32com API to read Excel files. You are welcome to open a GitHub issue if you would like that.
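A rough sketch of that round trip might look like the following. It rests on several assumptions: it runs on Windows with Excel and pywin32 installed, passing an empty Password to SaveAs writes the copy without protection, and the helper names decrypted_copy_path and read_protected_excel are made up for illustration.

```python
import os
import tempfile


def decrypted_copy_path(filename):
    """Return a temp-directory path for the unprotected copy (hypothetical naming scheme)."""
    return os.path.join(tempfile.gettempdir(), "unprotected_" + os.path.basename(filename))


def read_protected_excel(filename, password):
    """Save an unprotected copy through Excel, read it with pandas, then delete the copy."""
    # Imported here because win32com is only available on Windows with pywin32.
    import pandas
    import win32com.client

    target = decrypted_copy_path(filename)
    xlApp = win32com.client.Dispatch("Excel.Application")
    xlApp.DisplayAlerts = False  # suppress the overwrite prompt
    try:
        xlwb = xlApp.Workbooks.Open(os.path.abspath(filename), Password=password)
        # Assumption: an empty password saves the copy without protection.
        xlwb.SaveAs(target, Password="")
        xlwb.Close(SaveChanges=False)
    finally:
        xlApp.Quit()

    try:
        return pandas.read_excel(target)
    finally:
        # Do not leave the unprotected copy on disk.
        if os.path.exists(target):
            os.remove(target)
```

The unprotected copy exists on disk only briefly, but it does exist, which is the trade-off mentioned above.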
