'UCS-2' codec can't encode characters in position 1050-1050

When I run my Python code, I get the following errors:

    File "E:\python343\crawler.py", line 31, in <module>
      print (x1)
    File "E:\python343\lib\idlelib\PyShell.py", line 1347, in write
      return self.shell.write(s, self.tags)
    UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 1050-1050: Non-BMP character not supported in Tk

Here is my code:

    x = g.request('search', {'q' : 'TaylorSwift', 'type' : 'page', 'limit' : 100})['data'][0]['id']

    # GET ALL STATUS POST ON PARTICULAR PAGE(X=PAGE ID)
    for x1 in g.get_connections(x, 'feed')['data']:
        print (x1)
        for x2 in x1:
            print (x2)
            if(x2[1]=='status'):
                x2['message']

How can I fix this?

3 answers

Your data contains characters outside the Basic Multilingual Plane (BMP). Emoji, for example, are outside the BMP, and Tk, the window system used by IDLE, cannot handle such characters.
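To see which characters are causing the error, you can scan the string for code points above 0xFFFF. This is only a small sketch; `find_non_bmp` and `sample` are names made up for illustration and are not part of the question's code.

```python
def find_non_bmp(text):
    """Return (index, code point) pairs for characters outside the BMP."""
    return [(i, hex(ord(ch))) for i, ch in enumerate(text) if ord(ch) > 0xFFFF]


# Hypothetical sample text containing one emoji (U+1F44D).
sample = 'Great show \U0001F44D tonight'
print(find_non_bmp(sample))  # [(11, '0x1f44d')]
```

The index reported here corresponds to the "position" that the UnicodeEncodeError message mentions.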

You can use a translation table to map everything outside the BMP to the replacement character:

    import sys
    non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)
    print(x.translate(non_bmp_map))

non_bmp_map maps all code points outside the BMP (any code point above 0xFFFF, up to the highest Unicode code point that your Python version can handle) to U+FFFD REPLACEMENT CHARACTER:

 >>> print('This works outside IDLE! \U0001F44D') This works outside IDLE! πŸ‘ >>> print('This works in IDLE too! \U0001F44D'.translate(non_bmp_map)) This works in IDLE too!   
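In the question, the value being printed is a dict rather than a string, so it has to be converted with `str()` before `translate` can be applied. A minimal sketch of that, assuming a helper named `bmp_safe` and a made-up `post` dict standing in for one feed entry:

```python
import sys

# Map every code point above the BMP to U+FFFD REPLACEMENT CHARACTER.
NON_BMP_MAP = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xFFFD)


def bmp_safe(obj):
    """Return str(obj) with all non-BMP characters replaced by U+FFFD."""
    return str(obj).translate(NON_BMP_MAP)


# Hypothetical stand-in for one item of the feed data.
post = {'message': 'Thank you \U0001F44D', 'type': 'status'}
print(bmp_safe(post))
```

With a helper like this, `print(x1)` in the question's loop would become `print(bmp_safe(x1))`.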

None of those worked for me, but the following did. This assumes that public_tweets was obtained from tweepy's api.search.

    for tweet in public_tweets:
        # Escape non-ASCII characters first, then print the escaped text
        u = tweet.text
        u = u.encode('unicode-escape').decode('utf-8')
        print(u)
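Note that the unicode-escape codec escapes every non-ASCII character, not only the ones Tk rejects, so accented letters also turn into escape sequences. If that is not wanted, one possible variation is to escape only the non-BMP characters with a regular expression. This is a sketch; `escape_non_bmp` is a hypothetical helper name.

```python
import re

# Matches any single character outside the Basic Multilingual Plane.
_NON_BMP = re.compile('[\U00010000-\U0010FFFF]')


def escape_non_bmp(text):
    """Replace each non-BMP character with its \\UXXXXXXXX escape sequence."""
    return _NON_BMP.sub(lambda m: '\\U{:08x}'.format(ord(m.group())), text)


print(escape_non_bmp('caf\u00e9 \U0001F44D'))  # café \U0001f44d
```

Here the é is left readable, while the emoji is shown as a literal backslash escape that Tk can display.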

This Unicode problem occurs in Python 3.6 and earlier. To solve it, upgrade to Python 3.8 and run your code again; the error will not occur.
