Error: Unsupported format, or corrupt file: Expected BOF record

I am trying to open an xlsx file and just print its contents. I keep running into this error:

    import xlrd

    book = xlrd.open_workbook("file.xlsx")
    print "The number of worksheets is", book.nsheets
    print "Worksheet name(s):", book.sheet_names()
    print
    sh = book.sheet_by_index(0)
    print sh.name, sh.nrows, sh.ncols
    print
    print "Cell D30 is", sh.cell_value(rowx=29, colx=3)
    print
    for rx in range(5):
        print sh.row(rx)
    print

It fails with this error:

    raise XLRDError('Unsupported format, or corrupt file: ' + msg)
    xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; found '\xff\xfeT\x00i\x00m\x00'

thanks

+8
6 answers

The error message refers to the BOF (Beginning of File) record of an XLS file. However, your example shows that you are trying to read an XLSX file.

There are two possible reasons for this:

  • Your version of xlrd is old and does not support reading xlsx files.
  • The XLSX file is encrypted and is therefore stored in the OLE Compound Document format rather than the zip format, which makes it look like an older XLS file to xlrd.

Double check that you are using the latest version of xlrd. Opening a new XLSX file with data in only one cell should verify this.

However, I suspect that you are hitting the second case and that the file is encrypted, since you state above that you are already using xlrd version 0.9.2.

XLSX files are encrypted if you explicitly set a workbook password, and also if you password-protect some elements of a worksheet. So it is possible to have an encrypted XLSX file even if you do not need a password to open it.
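A quick way to tell these cases apart is to look at the first few bytes of the file before handing it to xlrd. The snippet below is only a minimal sketch: `sniff_format`, `sniff_file` and the labels they return are hypothetical names made up for illustration, not part of xlrd. It relies on the zip signature (`PK\x03\x04`), the OLE Compound Document signature, and the UTF-16 byte order marks.

```python
# Hypothetical helper names; only the byte signatures are standard.
ZIP_MAGIC = b"PK\x03\x04"                          # zip container (normal .xlsx)
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"   # OLE compound document (.xls, encrypted .xlsx)
UTF16_BOMS = (b"\xff\xfe", b"\xfe\xff")            # UTF-16 byte order marks (plain text)
MARKUP_STARTS = (b"<html", b"<!doctype", b"<?xml", b"<table")


def sniff_format(head):
    """Guess the container type from the first bytes of a file."""
    if head.startswith(ZIP_MAGIC):
        return "zip"
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    if head.startswith(UTF16_BOMS):
        return "utf16-text"
    if head.lstrip().lower().startswith(MARKUP_STARTS):
        return "markup"
    return "unknown"


def sniff_file(path):
    """Read the first 64 bytes of the file at path and classify them."""
    with open(path, "rb") as f:
        return sniff_format(f.read(64))
```

For the bytes quoted in the error above (`'\xff\xfeT\x00i\x00m\x00'`), this sketch would report "utf16-text", which would suggest the file starts with UTF-16 text rather than a zip or OLE header. Anything other than "zip" for a file named .xlsx is worth investigating before blaming xlrd.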

Update: see @BStew's answer for a third, more likely, cause: the file is open in Excel.

+10

There is also a third reason: the file is already open in Excel. That produces the same error.

+15

And maybe a fourth reason: you used read_excel to read a CSV file. (It happened to me...)

+14

You may get this error when the xlsx file is actually HTML. You can open it in a text editor to check. When I got this error, I worked around it with pandas:

    import pandas as pd

    df_list = pd.read_html('filename.xlsx')
    df = pd.DataFrame(df_list[0])

+3

In my case, the problem was with the shared folder itself.

Use case: I have a shared folder on a WIN2012 server where the user drops the .xlsx file and then runs my Python script to load that xlsx file into a database table.

Even though the user had deleted the old file and put in the file that was supposed to be loaded, the BOF error kept showing a byte string with the user name in it, and the user name did not appear anywhere inside the xlsx file, on any sheet. Also, when I copied the .xlsx to a newly created folder and pointed the script at that new folder, it worked.

In the end I deleted the shared folder and noticed that 5 items were deleted, although only one item had been visible to me and the user. I put it down to my lack of Windows administration skills, but that was the culprit.
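One possible explanation, which I cannot confirm for this folder: Excel keeps a hidden owner file with a `~$` prefix next to a workbook that is open, and that file holds the user name rather than workbook data. A script that picks up every `*.xlsx` in a folder would try to parse it too. A minimal sketch that skips such files (the name `visible_workbooks` is hypothetical, chosen only for this example):

```python
from pathlib import Path


def visible_workbooks(folder):
    """Return the .xlsx paths in folder, skipping Office owner files (~$ prefix)."""
    return sorted(
        p for p in Path(folder).glob("*.xlsx")
        if not p.name.startswith("~$")  # assumption: ~$ marks an owner/lock file
    )
```

Passing only these paths to the loader avoids feeding a leftover owner file to xlrd, whether or not that was the actual cause here.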

0

I got the same error message. It seemed strange to me, because the script works for xlsx files in another folder and the files are almost identical.

I still don't know why this happened. But in the end I copied all the Excel files to another folder, and the script worked. Something to try if none of the suggestions above work for you...

0
