tolower() does not work for Ü, Ö in C++

When I tried tolower() with non-English characters in C++, it did not work correctly. I searched for this problem and came across something about locales, but I'm not sure what the best solution is.

My sample code is below:

printf("%c ",tolower('Ü')); 
2 answers

Unfortunately, the C++ standard library does not have sufficient support for changing the case of all possible non-English characters (that is, of those that have case variants at all). This limitation exists because the C++ standard assumes that a character and each of its case variants occupy exactly one char object (or one wchar_t object for wide characters). For non-English characters that is not guaranteed to be true, and it also depends on how the characters are encoded.

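If your environment uses a single-byte encoding for the characters in question, this may give you what you want. Note that a default-constructed std::locale() is a copy of the global locale, which is the "C" locale unless the program changes it, so the example asks for the user's environment locale with std::locale(""):

 std::cout << std::tolower('Ü', std::locale(""));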

With wide characters, you will probably have better luck:

 std::wcout << std::tolower(L'Ü', std::locale(""));

But even this will not give the correct result for toupper(L'ß'), which would have to be the two-character sequence L"SS".

If you need support for all characters, check out the ICU library, in particular its case mappings section.


As Bart has shown, C++ just doesn't like multibyte encodings. Fortunately, you can use Boost.Locale to solve this problem without much extra hassle. Here is a simple example:

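 #include <iostream>
 #include <locale>
 #include <string>
 #include <boost/locale.hpp>

 int main()
 {
     // Build a locale object for UTF-8 text using Boost.Locale's generator.
     boost::locale::generator gen;
     std::locale loc = gen("en_US.UTF-8");

     // Lowercase each input line as a whole string, not character by character.
     std::string line;
     while (std::getline(std::cin, line))
         std::cout << boost::locale::to_lower(line, loc) << '\n';
 }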

To compile, we need to link to the Boost.Locale library:

 g++ lower.cpp -o lower -lboost_locale

And when we execute it, we get the following:

 $ ./lower <<< 'ICH HÄTTE GERNE EINEN SÜßEN HASEN'
 ich hätte gerne einen süßen hasen