@nye17 Officially, calling setdefaultencoding() is never recommended (it is removed from sys after the first use for a reason). One of the main culprits is gtk, which causes all kinds of problems, so if IPython has imported gtk, sys.getdefaultencoding() will return utf8. IPython itself does not set the default encoding.
@wim may I ask which version of IPython you are using? The overhaul in 0.11 fixed many unicode bugs, but more keep turning up (mostly on Windows now).
I ran a test script in IPython 0.11, and the behavior of IPython and Python seems the same, so I think this bug is fixed.
Relevant Values:
- sys.stdin.encoding = utf8
- sys.getdefaultencoding() = ascii
- tested platforms: Ubuntu 10.04 + Python2.6.5, OSX 10.7 + Python2.7.1
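For anyone who wants to compare their own setup, here is a minimal sketch that prints the same two values. The helper name encoding_report is made up for illustration; it is not part of IPython, and this is not the test script mentioned above.

```python
import sys


def encoding_report():
    """Collect the two encodings discussed above (hypothetical helper)."""
    return {
        # Encoding Python associates with the terminal input stream.
        # getattr guards against setups where it is missing or None.
        "sys.stdin.encoding": getattr(sys.stdin, "encoding", None),
        # Encoding used for implicit str <-> unicode conversions.
        "sys.getdefaultencoding()": sys.getdefaultencoding(),
    }


for name, value in sorted(encoding_report().items()):
    print("%s = %s" % (name, value))
```

Run it once in plain python and once inside IPython; if the two outputs differ, something imported along the way has probably changed the default. Note that on Python 3 the default is always utf-8, so the ascii value listed above is specific to Python 2.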
As for the explanation: IPython previously did not recognize that the input could be unicode. In IPython 0.10, multibyte utf8 input was not respected, so each byte was treated as one character, which you can see with:
    In [1]: x = '$€%'
    In [2]: x
    Out[2]: '$\xe2\x82\xac%'
    In [3]: y = u'$€%'
    In [4]: y
    Out[4]: u'$\xe2\x82\xac%'
Whereas what should happen, and what does happen in 0.11, is that y == x.decode(sys.stdin.encoding), not repr(y) == 'u'+repr(x).
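To make the difference concrete, here is a small sketch outside IPython. It assumes the terminal sends utf8, and it uses a byte-string literal as a stand-in for the raw input; the variable names are only for illustration. Decoding as latin-1 maps every byte to one code point, which reproduces the byte-per-character result seen in 0.10.

```python
# Raw bytes a utf8 terminal sends for the three characters $, euro sign, %.
raw = b'$\xe2\x82\xac%'

# Correct handling (what 0.11 does): decode with the terminal encoding.
proper = raw.decode('utf-8')

# The 0.10 behaviour: every byte becomes its own character.
# latin-1 maps bytes 0-255 straight onto code points 0-255.
bytewise = raw.decode('latin-1')

print(len(proper))    # number of characters after proper decoding
print(len(bytewise))  # number of characters when each byte is one character
```

The first length is 3 and the second is 5: the three bytes of the euro sign collapse into the single code point U+20AC only when the input is decoded with the right encoding.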
minrk