UnicodeDecodeError: "utf-8" codec cannot decode bytes

Here is my code

for line in open('u.item'): #read each line 

Whenever I run this code, it gives the following error:

 UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 2892: invalid continuation byte 

I tried to solve this by passing an additional parameter to open(), so the code looks like this:

 for line in open('u.item', encoding='utf-8'): #read each line 

But it gives the same error again. What should I do? Please help.

+151
python character-encoding
Oct 31 '13 at 5:55
10 answers

As suggested by Mark Ransom, I found the correct encoding for this problem. The encoding was "ISO-8859-1", so replacing open("u.item", encoding="utf-8") with open('u.item', encoding = "ISO-8859-1") will solve the problem.
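A minimal sketch of that fix, assuming the file really is ISO-8859-1. The file name `sample.item` and its contents are made up for illustration so that the snippet runs on its own; they stand in for u.item.

```python
import os
import tempfile

# Hypothetical stand-in for u.item: one line containing the byte 0xe9.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.item")
with open(sample_path, "wb") as f:
    f.write(b"1|Caf\xe9 Society (1995)\n")

# Decoding as UTF-8 fails, as in the question.
try:
    with open(sample_path, encoding="utf-8") as f:
        f.read()
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Decoding as ISO-8859-1 succeeds; 0xe9 becomes the character é.
with open(sample_path, encoding="ISO-8859-1") as f:
    lines = [line.rstrip("\n") for line in f]
```

Using `with` also closes the file when the block ends, which the bare `for line in open(...)` form leaves to the garbage collector.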

+312
Oct 31 '13 at 12:35

This also worked for me. ISO-8859-1 will save you a lot of trouble, especially if you use the speech recognition API.

Example:

 file = open('../Resources/' + filename, 'r', encoding="ISO-8859-1")
+35
Oct 26 '17 at 19:49

Your file does not actually contain UTF-8 encoded data; it uses some other encoding. Find out which encoding it is and pass that to the open call.

In the Windows-1252 encoding, for example, 0xe9 is the character é.
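As a rough illustration of that point, the byte string below is invented for the example, and `first_working_encoding` is a hypothetical helper, not a standard library function. A decode that succeeds does not prove the encoding is right (ISO-8859-1 accepts every byte), so treat the result as a hint only.

```python
# Invented sample bytes containing 0xe9.
raw = b"Caf\xe9 au lait"

# In both Windows-1252 and ISO-8859-1, byte 0xe9 maps to the character é.
as_cp1252 = raw.decode("cp1252")
as_latin1 = raw.decode("iso-8859-1")


def first_working_encoding(data, candidates):
    """Return the first candidate that decodes data without error, else None."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


# UTF-8 is tried first and rejected; cp1252 is the first that decodes cleanly.
guess = first_working_encoding(raw, ["utf-8", "cp1252"])
```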

+25
Oct 31 '13 at 5:58

Try reading it with pandas

 pd.read_csv('u.item', sep='|', names=m_cols , encoding='latin-1') 
+20
Jan 31 '17 at 20:35

If you are using Python 2, the solution is:

 import io

 for line in io.open("u.item", encoding="ISO-8859-1"):
     pass  # do something with line

This is needed because the built-in open() in Python 2 does not accept an encoding parameter. If you pass one, you will get the following error:

 TypeError: 'encoding' is an invalid keyword argument for this function
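A small sketch of the io.open approach, written so that it should run unchanged on Python 2.7 and Python 3 (where io.open is the same function as the built-in open). The sample file and its contents are assumptions made for the example.

```python
import io
import os
import tempfile

# Hypothetical sample file standing in for u.item.
path = os.path.join(tempfile.mkdtemp(), "sample.item")
with io.open(path, "wb") as f:
    f.write(b"2|Les Mis\xe9rables (1995)\n")

# io.open accepts the encoding argument on both Python 2.7 and Python 3.
with io.open(path, encoding="ISO-8859-1") as f:
    titles = [line.strip().split("|")[1] for line in f]
```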
+11
Mar 03 '17 at 17:32

In case anyone is looking for it, here is an example of reading a CSV file in Python 3:

 try:
     inputReader = csv.reader(open(argv[1], encoding='ISO-8859-1'), delimiter=',', quotechar='"')
 except IOError:
     pass
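A self-contained variant of the same idea, as a sketch only. The CSV contents and the `sample.csv` name are invented for the example; opening the file with newline="" follows the recommendation in the csv module documentation.

```python
import csv
import os
import tempfile

# Hypothetical ISO-8859-1 CSV, created only to make the sketch runnable.
csv_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(csv_path, "wb") as f:
    f.write(b'id,title\r\n1,"Caf\xe9, The"\r\n')

# The encoding is given to open(); csv.reader then receives decoded text.
with open(csv_path, encoding="ISO-8859-1", newline="") as f:
    rows = list(csv.reader(f, delimiter=",", quotechar='"'))
```

Reading the rows inside the `with` block matters here, because csv.reader pulls lines from the file lazily and the file is closed once the block ends.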
+2
Sep 14 '16 at 19:24

Sometimes the same error may occur when you call open(filepath) and filepath does not actually point to a file, so first make sure that the file you are trying to open exists:

 import os
 assert os.path.isfile(filepath)

Hope this helps.

+2
Aug 29 '18 at 3:58

The simplest of all solutions:

Use pandas to read the file; it's very simple:

 import pandas as pd
 data = pd.read_csv('file_name.csv', encoding='utf-8')
0
Dec 21 '17 at 10:02

Python 3: when reading this CSV file, decoding as 'utf-8' does not work. ZOMATO.csv is the name of my CSV file.

 ZOMATO_df=pd.read_csv(io.StringIO(uploaded['ZOMATO.csv'].decode('ISO-8859-1'))) 
0
May 01 '19 at 1:52

You can solve the problem with:

 for line in open(your_file_path, 'rb'): 

'rb' opens the file in binary mode, so each line is a bytes object and no decoding takes place. Hope this helps!
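Binary mode avoids the exception but leaves you with bytes, so the text still has to be decoded at some point. A minimal sketch of doing that by hand, using an invented sample file:

```python
import os
import tempfile

# Hypothetical sample file containing the byte 0xe9.
path = os.path.join(tempfile.mkdtemp(), "sample.item")
with open(path, "wb") as f:
    f.write(b"3|Caf\xe9\n")

# In binary mode the lines are bytes objects; nothing is decoded.
with open(path, "rb") as f:
    raw_lines = list(f)

# Decode explicitly once the encoding is known (assumed ISO-8859-1 here).
decoded = raw_lines[0].decode("iso-8859-1")

# errors="replace" never raises, but swaps each bad byte for U+FFFD.
lossy = raw_lines[0].decode("utf-8", errors="replace")
```

The `errors="replace"` form loses information, so it is better suited to quick inspection than to data you intend to keep.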

0
May 02 '19 at 2:15


