Is it possible to catch an exception that includes non-English characters in Python 2?

I am trying to raise an exception in Python 2.7.x whose message includes Unicode, but I cannot make it work.

Is it unsupported or not recommended to include Unicode in the exception message? Or do I need to look at sys.stderr?

    # -*- coding: utf-8 -*-
    class MyException(Exception):
        def __init__(self, value):
            self.value = value

        def __str__(self):
            return self.value

        def __repr__(self):
            return self.value

        def __unicode__(self):
            return self.value

    desc = u'something bad with field \u4443'

    try:
        raise MyException(desc)
    except MyException as e:
        print(u'Inside try block : ' + unicode(e))

    # here is what i wish to make work
    raise MyException(desc)

Running the script displays the result below. Inside my try / except, I can print the line without any problems.

My problem is outside of try / except.

    Inside try block : something bad with field 䑃
    Traceback (most recent call last):
      File "C:\Python27\lib\bdb.py", line 387, in run
        exec cmd in globals, locals
      File "C:\Users\ghis3080\r.py", line 25, in <module>
        raise MyException(desc)
    MyException: something bad with field \u4443

Thanks in advance.

+6
3 answers

The behavior depends on the Python version and the environment. On Python 3, the error handler for sys.stderr is always 'backslashreplace':

    from __future__ import unicode_literals, print_function
    import sys

    s = 'unicode "\u2323" smile'
    print(s)
    print(s, file=sys.stderr)
    try:
        raise RuntimeError(s)
    except Exception as e:
        print(e.args[0])
        print(e.args[0], file=sys.stderr)
        raise

python3:

    $ PYTHONIOENCODING=ascii:ignore python3 raise_unicode.py
    unicode "" smile
    unicode "\u2323" smile
    unicode "" smile
    unicode "\u2323" smile
    Traceback (most recent call last):
      File "raise_unicode.py", line 8, in <module>
        raise RuntimeError(s)
    RuntimeError: unicode "\u2323" smile

python2:

    $ PYTHONIOENCODING=ascii:ignore python2 raise_unicode.py
    unicode "" smile
    unicode "" smile
    unicode "" smile
    unicode "" smile
    Traceback (most recent call last):
      File "raise_unicode.py", line 8, in <module>
        raise RuntimeError(s)
    RuntimeError

That is, on Python 2 the exception message is missing from the traceback on my system.

Note: on Windows you can try:

    T:\> set PYTHONIOENCODING=ascii:ignore
    T:\> python raise_unicode.py

For comparison:

    $ python3 raise_unicode.py
    unicode "⌣" smile
    unicode "⌣" smile
    unicode "⌣" smile
    unicode "⌣" smile
    Traceback (most recent call last):
      File "raise_unicode.py", line 8, in <module>
        raise RuntimeError(s)
    RuntimeError: unicode "⌣" smile
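
As a rough illustration of why stdout and stderr differ in the PYTHONIOENCODING=ascii:ignore runs above, the two error handlers can be applied by hand with encode(). This is only a sketch: the names ignored and escaped are made up for the example, and it does not touch the real streams.

```python
# -*- coding: utf-8 -*-
# Sketch: apply the two error handlers by hand instead of through sys.stdout/sys.stderr.
s = u'unicode "\u2323" smile'

# 'ignore' silently drops characters that ASCII cannot represent
ignored = s.encode('ascii', 'ignore')

# 'backslashreplace' keeps them as escape sequences
escaped = s.encode('ascii', 'backslashreplace')

print(ignored.decode('ascii'))
print(escaped.decode('ascii'))
```

The first line printed has the character removed and the second shows it as \u2323, which matches the alternating lines in the runs with ascii:ignore.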
+1

This is how Python works. I believe what you are seeing comes from traceback._some_str() in the Python standard library. When the stack trace is formatted, that function first tries to convert the message with str(); if that raises an exception, it converts the message with unicode() and then encodes it to ASCII with encode("ascii", "backslashreplace"). The output is reliable and nothing is actually broken: as far as I can tell, Python is doing its best to down-convert the error message so that it displays without problems on any platform. The \u4443 you see is simply the Unicode escape for your character. This does not happen inside your try/except block because the conversion is specific to the mechanism that produces stack traces (for example, for uncaught exceptions).
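
The fallback described above can be imitated in a few lines. This is a hedged sketch, not the standard library code: format_message_sketch is a hypothetical name, and the explicit encode('ascii') stands in for what str() does to a unicode object on Python 2.

```python
# -*- coding: utf-8 -*-
def format_message_sketch(message):
    """Hypothetical stand-in for the fallback described above (not the stdlib code)."""
    try:
        # First attempt: plain ASCII, roughly what str() does to unicode on Python 2
        return message.encode('ascii')
    except UnicodeEncodeError:
        # Fallback: keep the text, escaping whatever ASCII cannot represent
        return message.encode('ascii', 'backslashreplace')


desc = u'something bad with field \u4443'
print(format_message_sketch(desc).decode('ascii'))
```

Run on the message from the question, this prints the same escaped form that appears in the traceback, while pure ASCII messages pass through unchanged.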

+2

In my case, your example worked as it should, printing beautiful Unicode.

But sometimes the traceback is printed without the Unicode characters, or with them escaped or mangled. You can work around this and get the normal message printed.

Example of the output problem (Python 2.7, Linux):

    # -*- coding: utf-8 -*-
    desc = u'something bad with field ¾'
    raise SyntaxError(desc)

It prints only a truncated or mangled message:

    ~/.../sources/C_patch$ python SO.py
    Traceback (most recent call last):
      File "SO.py", line 25, in <module>
        raise SyntaxError(desc)
    SyntaxError

To see the Unicode text unchanged, encode it to raw bytes and pass those bytes to the exception object:

    # -*- coding: utf-8 -*-
    desc = u'something bad with field ¾'
    raise SyntaxError(desc.encode('utf-8', 'replace'))

This time you will see the full message:

    ~/.../sources/C_patch$ python SO.py
    Traceback (most recent call last):
      File "SO.py", line 3, in <module>
        raise SyntaxError(desc.encode('utf-8', 'replace'))
    SyntaxError: something bad with field ¾

You can call value.encode('utf-8', 'replace') in your constructor if you want, but with a built-in exception you have to do it in the raise statement, as in the example.
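
A minimal sketch of the constructor variant mentioned above, assuming a custom class like the MyException from the question. The PY2 and text_type helpers are illustrative names added here so the snippet also runs unchanged on Python 3, where the encoding step is skipped.

```python
# -*- coding: utf-8 -*-
import sys

PY2 = sys.version_info[0] == 2
text_type = type(u'')  # unicode on Python 2, str on Python 3


class MyException(Exception):
    def __init__(self, value):
        self.value = value  # keep the original text for callers
        if PY2 and isinstance(value, text_type):
            # Python 2 only: give the base class UTF-8 bytes for the traceback
            value = value.encode('utf-8', 'replace')
        Exception.__init__(self, value)


try:
    raise MyException(u'something bad with field \u00be')
except MyException as e:
    caught = e
```

With this in place the encoding no longer has to be repeated at every raise site, and e.value still holds the original unicode text. Whether the uncaught traceback then renders correctly still depends on the terminal accepting UTF-8.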

A useful reference: "Overcoming frustration: Correctly using unicode in python2" (it comes with a large library of helpers, and all of them boil down to the example above).

+1