import unicodedata as ud
astr = u"\N{LATIN SMALL LETTER E}" + u"\N{COMBINING ACUTE ACCENT}"
combined_astr = ud.normalize('NFC', astr)
'NFC' tells ud.normalize to apply canonical decomposition ('NFD') and then canonical composition, which recombines the sequence into a precomposed character where one exists:
print(ud.name(combined_astr))
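A minimal sketch of what that composition step does to the code points. The names `decomposed` and `composed` are mine, chosen for illustration, and the snippet assumes Python 3:

```python
import unicodedata as ud

# Base letter followed by a combining mark: two code points.
decomposed = "\N{LATIN SMALL LETTER E}\N{COMBINING ACUTE ACCENT}"
# NFC merges the pair into a single precomposed code point.
composed = ud.normalize("NFC", decomposed)

print(len(decomposed), len(composed))       # 2 1
print([ud.name(ch) for ch in decomposed])   # ['LATIN SMALL LETTER E', 'COMBINING ACUTE ACCENT']
print([ud.name(ch) for ch in composed])     # ['LATIN SMALL LETTER E WITH ACUTE']

# NFD goes the other way and splits the precomposed character again.
print(ud.normalize("NFD", composed) == decomposed)  # True
```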
Both of them print the same thing:
print(astr)
print(combined_astr)
But their reprs are different:
print(repr(astr))
print(repr(combined_astr))
And, not surprisingly, their encodings (say, utf_8) also differ:
print(repr(astr.encode('utf_8')))
print(repr(combined_astr.encode('utf_8')))
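Because the two forms differ at the code point and byte level, a plain `==` treats them as different strings even though they render identically. A rough sketch of comparing them after normalization follows. `nfc_equal` is a hypothetical helper I made up for illustration, not part of `unicodedata`, and the snippet assumes Python 3:

```python
import unicodedata as ud

decomposed = "e\u0301"   # 'e' + COMBINING ACUTE ACCENT
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE

# The UTF-8 byte sequences differ: three bytes versus two.
print(decomposed.encode("utf_8"))  # b'e\xcc\x81'
print(composed.encode("utf_8"))    # b'\xc3\xa9'

def nfc_equal(a, b):
    """Hypothetical helper: compare two strings after NFC normalization."""
    return ud.normalize("NFC", a) == ud.normalize("NFC", b)

print(decomposed == composed)           # False
print(nfc_equal(decomposed, composed))  # True
```

Normalizing both sides to the same form (NFC here, though NFD would work equally well for this comparison) before comparing or encoding is one way to avoid this kind of mismatch.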