Python: lower() and German umlauts

I have a problem with converting uppercase letters with umlauts to lower case.

print("ÄÖÜAOU".lower()) 

A, O and U are converted correctly, but Ä, Ö and Ü remain in uppercase. Any ideas?

The first problem is fixed with .decode('utf-8'), but I still have a second one:

    # -*- coding: utf-8 -*-
    original_message = "ÄÜ".decode('utf-8')
    original_message = original_message.lower()
    original_message = original_message.replace("ä", "x")
    print(original_message)

    Traceback (most recent call last):
      File "Untitled.py", line 4, in <module>
        original_message = original_message.replace("ä", "x")
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

3 answers

You will need to mark it as a unicode string if you are not working with plain ASCII:

    > print(u"ÄÖÜAOU".lower())
    äöüaou

It works the same way with variables; it all depends on the type of the value assigned to the variable in the first place.

    > olle = "ÅÄÖABC"
    > print(olle.lower())
    ÅÄÖabc
    > olle = u"ÅÄÖABC"
    > print(olle.lower())
    åäöabc
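To make the role of the type more visible, here is a small sketch that should behave the same on Python 2 and Python 3. It writes the UTF-8 bytes for "ÅÄÖABC" with escapes so that nothing depends on the source file encoding; the variable names are only illustrative.

```python
# -*- coding: utf-8 -*-

# UTF-8 bytes for "ÅÄÖABC", written with escapes so the type is unambiguous.
raw = b"\xc3\x85\xc3\x84\xc3\x96ABC"

# Lowercasing a byte string only touches the ASCII letters A-Z;
# the bytes that make up Å, Ä and Ö are left as they are.
lowered_bytes = raw.lower()

# Decoding first yields unicode text, and lowercasing text is Unicode-aware.
lowered_text = raw.decode("utf-8").lower()

if __name__ == "__main__":
    print(repr(lowered_bytes))
    print(repr(lowered_text))
```

The byte string keeps its uppercase umlauts, while the decoded text is fully lowercased.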

You are dealing with encoded strings, not unicode text.

The byte string method .lower() can only process ASCII values. Decode your string to Unicode, or use a unicode literal (u''), and then lowercase it:

    >>> print u"\xc4AOU".lower()
    äaou
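The same reasoning appears to explain the second error in the question: once the message has been decoded, calling .replace() with plain byte-string arguments makes Python 2 decode them implicitly as ASCII, which fails on "ä". A minimal sketch, assuming the input arrives as UTF-8 bytes (the function name lower_and_replace is made up for illustration):

```python
# -*- coding: utf-8 -*-

def lower_and_replace(raw_bytes):
    """Decode UTF-8 bytes, lowercase the text, and replace ä with x."""
    text = raw_bytes.decode("utf-8")   # byte string -> unicode text
    lowered = text.lower()             # Unicode-aware lowercasing
    # Both arguments are unicode literals, so no implicit ASCII
    # decoding of a byte string takes place here.
    return lowered.replace(u"ä", u"x")

if __name__ == "__main__":
    # b"\xc3\x84\xc3\x9c" is the UTF-8 encoding of "ÄÜ"
    print(lower_and_replace(b"\xc3\x84\xc3\x9c"))
```

The only change from the code in the question is that the arguments to .replace() are unicode literals, so every value involved is text rather than bytes.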

If you are using Python 2 but don't want the u"" prefix on all of your string literals, put this at the top of your program:

    from __future__ import unicode_literals
    olle = "ÅÄÖABC"
    print(olle.lower())

This will now print:

 åäöabc 

The encoding declaration determines how the characters read from disk into the program are interpreted, while the from __future__ import statement determines how string literals within the program itself are interpreted. You will probably need both.
