How to format strings using unicode emdash?

I am trying to format a string using a unicode variable. For instance:

>>> x = u"Some text—with an emdash."
>>> x
u'Some text\u2014with an emdash.'
>>> print(x)
Some text—with an emdash.
>>> s = "{}".format(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2014' in position 9: ordinal not in range(128)

>>> t = "%s" %x
>>> t
u'Some text\u2014with an emdash.'
>>> print(t)
Some text—with an emdash.

You can see that I have a Unicode string and that it prints fine. The problem appears when I use the newer (and supposedly improved?) format() method. With the old style (%s) everything works, but with {} and format() it fails.

Any ideas why this is happening? I am using Python 2.7.2.

3 answers

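The new format() method is less forgiving than % when you mix ASCII byte strings and unicode strings. The template "{}" is a byte string, so Python 2 tries to encode x as ASCII to fit it in. Make the template unicode as well: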

s = u"{}".format(x)
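To illustrate why the byte-string template fails, here is a minimal sketch. It assumes the implicit conversion on Python 2 uses the default ASCII codec, which is what the traceback in the question reports, and it reproduces that step with an explicit encode so the snippet behaves the same on Python 2.7 and Python 3. The name ascii_ok is only for illustration.

```python
# -*- coding: utf-8 -*-
x = u"Some text\u2014with an emdash."

# On Python 2, a byte-string template must turn x into bytes first,
# and the default codec for that is ASCII. This explicit encode
# runs into the same UnicodeEncodeError as "{}".format(x) did.
try:
    x.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# With a unicode template there is no encoding step at all.
s = u"{}".format(x)

print(ascii_ok)
print(s == x)
```

The old-style "%s" % x avoids the error because, on Python 2, the % operator promotes the whole result to unicode when it meets a unicode argument.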


>>> s = u"{0}".format(x)
>>> s
u'Some text\u2014with an emdash.'

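The following worked well for me; it is a variation on the other answers. Note that this output is what Python 3 shows, where every str is already unicode. On Python 2.7 the template still needs the u prefix.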

>>> emDash = u'\u2014'
>>> "a{0}b".format(emDash)
'a—b'
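As a rough sketch of a version-independent variant of the snippet above: keeping the u prefix on the template is required on Python 2.7 and accepted (as a no-op) on Python 3.3 and later. The names em_dash, template and result are chosen here for illustration only.

```python
import sys

em_dash = u"\u2014"

# u prefix on the template: needed on Python 2.7, harmless on 3.3+.
template = u"a{0}b"
result = template.format(em_dash)

# The repr differs between major versions, so print it next to the
# interpreter's major version number rather than assuming one form.
print(sys.version_info[0], repr(result))
```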