How to read (or parse) Excel comments using Python

I have several Excel files that use a lot of comments to store information. For example, one cell has a value of 2, and a comment attached to the cell says "2008: 2 # 2009: 4". The cell value, 2, is the value for the current year (2010), and the comment contains the values for the previous years, separated by the "#" symbol. I would like to build a dictionary holding all of this information, for example {2008: 2, 2009: 4, 2010: 2}, but I don't know how to read (or parse) the comment attached to a cell. Do any of the Python modules for reading Excel files support reading comments?

2 answers

Usually for reading from Excel I suggest using xlrd, but xlrd does not support comments. Use the Excel COM object instead:

    from win32com.client import Dispatch

    xl = Dispatch("Excel.Application")
    xl.Visible = True
    wb = xl.Workbooks.Open("Book1.xls")
    sh = wb.Sheets("Sheet1")
    comment = sh.Cells(1, 1).Comment.Text()

Then parse the comment:

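    comment = "2008:2#2009:4"
    d = {}
    for item in comment.split('#'):
        key, val = item.split(':')
        # int() gives numeric keys and values, as in {2008: 2, 2009: 4},
        # and tolerates spaces around the numbers
        d[int(key)] = int(val)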

Often Excel comments are in two lines, with the first line indicating who created the comment. If so, your code will look something like this:

    comment = "Steven:\n2008:2#2009:4"
    # Drop the first line (the author) and keep the data line
    _, comment = comment.split('\n')
    d = {}
    for item in comment.split('#'):
        key, val = item.split(':')
        d[int(key)] = int(val)
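Putting both cases together, here is a minimal sketch of a helper that builds the dictionary asked for in the question. The name `parse_comment` and its parameters are made up for illustration. It assumes the data is always on the last non-empty line of the comment and that all years and values are integers.

```python
def parse_comment(text, current_year=None, current_value=None):
    """Turn a comment such as '2008: 2 # 2009: 4' into {2008: 2, 2009: 4}."""
    # Assumption: if the comment has several lines, the last non-empty line
    # holds the data and anything before it is the author line.
    lines = [line for line in text.splitlines() if line.strip()]
    data = lines[-1] if lines else ""

    values = {}
    for item in data.split("#"):
        if ":" not in item:
            continue  # skip empty or malformed pieces
        year, value = item.split(":", 1)
        values[int(year)] = int(value)  # int() ignores surrounding spaces

    # Optionally add the cell's own value for the current year.
    if current_year is not None:
        values[current_year] = current_value
    return values


print(parse_comment("Steven:\n2008: 2 # 2009: 4", 2010, 2))
```

If your comments hold non-integer values, swap `int(value)` for `float(value)` or whatever conversion fits your data.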

You can do this without an Excel COM object using openpyxl:

    from openpyxl import load_workbook

    workbook = load_workbook('/tmp/data.xlsx')
    # workbook.worksheets[0] replaces the deprecated
    # get_sheet_names() / get_sheet_by_name() calls
    worksheet = workbook.worksheets[0]

    for row in worksheet.iter_rows():
        for cell in row:
            if cell.comment:
                print(cell.comment.text)

The comment text itself can then be parsed in the same way as in Steven Rumbalski's answer.

(example adapted from here)
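To combine the two steps, something along these lines might work. It is only a sketch: `collect_history` is a made-up name, and the function relies on nothing more than each cell having `coordinate`, `value` and `comment.text` attributes, which openpyxl cells provide. It assumes the data sits on the last line of each comment and that the years and values are integers.

```python
def collect_history(rows, current_year):
    """Map each commented cell's coordinate to its {year: value} dict."""
    history = {}
    for row in rows:
        for cell in row:
            # Skip cells with no comment or with an empty comment
            if cell.comment is None or not cell.comment.text.strip():
                continue
            # Assumption: the data sits on the last line of the comment text
            data = cell.comment.text.strip().splitlines()[-1]
            values = {}
            for item in data.split("#"):
                year, value = item.split(":", 1)
                values[int(year)] = int(value)
            # The cell's own value belongs to the current year
            values[current_year] = cell.value
            history[cell.coordinate] = values
    return history


# Usage with openpyxl (file name taken from the answer above):
# from openpyxl import load_workbook
# worksheet = load_workbook('/tmp/data.xlsx').worksheets[0]
# print(collect_history(worksheet.iter_rows(), 2010))
```

Keying the result by cell coordinate (for example "A1") keeps one dictionary per cell, so you can look up the history of any cell afterwards.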
