Best character set and collation for a European website

I am going to create an application that will be used by people across Europe. I need to know which collation and character set are best for user input. Or should I create a separate table for each language? A link to an article explaining this would be great.

Thanks:)

+4
2 answers

Unicode is a very large character set that includes almost all characters in almost all languages.

There are several ways to store Unicode text as a sequence of bytes; these methods are called encodings. Every full Unicode encoding can represent any Unicode text as a sequence of bytes, but the number of bytes a given piece of text takes depends on the encoding used.
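To make the size difference concrete, here is a minimal sketch in Python that encodes the same strings three ways and counts the bytes. The helper name `byte_lengths` and the sample strings are made up for illustration; the little-endian codec variants are used only so that no byte order mark is added to the count.

```python
# Encode the same text with three full Unicode encodings and compare sizes.
# The sample strings are illustrative only.
samples = [
    "hello",        # ASCII only
    "h\u00e9llo",   # "héllo" with a precomposed é (U+00E9)
    "Привет",       # Cyrillic
]

def byte_lengths(text):
    """Return the encoded size in bytes of `text` for each encoding."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # The "-le" variants skip the byte order mark, so only the
        # characters themselves are counted.
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }

for s in samples:
    print(s, byte_lengths(s))
```

For these samples UTF-8 takes 5, 6 and 12 bytes, UTF-16 takes 10, 10 and 12 bytes, and UTF-32 takes 20, 20 and 24 bytes.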

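UTF-8 is a Unicode encoding that is most compact for English and other languages that use few characters outside the basic Latin alphabet: ASCII characters take one byte, while accented Latin, Greek and Cyrillic letters take two. UTF-16 uses two bytes for each of those characters, so for typical text in European languages it is usually no smaller than UTF-8, although it is more compact for some Asian scripts. Java and .NET expose all in-memory text (the String class) as UTF-16.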
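One practical consequence of UTF-16 strings is that their length is counted in 16-bit code units, not in characters. The sketch below reproduces that count in Python; `utf16_code_units` is a hypothetical helper, not a library function.

```python
def utf16_code_units(text):
    """Number of 16-bit code units `text` occupies in UTF-16.

    This is the value a UTF-16 based string type reports as its length.
    """
    # Each UTF-16 code unit is exactly two bytes; "-le" avoids the BOM.
    return len(text.encode("utf-16-le")) // 2

word = "Stra\u00dfe"      # "Straße": every character is in the BMP
emoji = "\U0001F600"      # a single character outside the BMP

print(len(word), utf16_code_units(word))    # same count for BMP text
print(len(emoji), utf16_code_units(emoji))  # one code point, two code units
```

For ordinary European text the two counts agree. They only differ for characters outside the Basic Multilingual Plane, such as emoji, which need a surrogate pair.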

+2

The character set should, without a doubt, be UTF-8. As for collation, I'm not sure there is a single good answer to this question, but you can read this report.
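To illustrate why collation is a separate question from the character set, here is a small Python sketch comparing plain code point order with a crude accent- and case-insensitive sort key. The `fold_key` helper is an assumption made for this example and is not a real collation: proper collation rules differ per language (Swedish, for instance, sorts ä after z), which is what database collations or a library such as ICU are for.

```python
import unicodedata

def fold_key(word):
    """Crude sort key: drop combining marks, then case-fold.

    Only an approximation for illustration; it ignores all
    language-specific collation rules.
    """
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()

words = ["Zebra", "apple", "\u00c4rger", "Banane"]  # "\u00c4rger" is "Ärger"

by_code_point = sorted(words)                # uppercase first, Ä after all ASCII
by_folded_key = sorted(words, key=fold_key)  # closer to what users expect

print(by_code_point)
print(by_folded_key)
```

Sorting by code point puts "Zebra" before "apple" and "Ärger" last, while the folded key gives apple, Ärger, Banane, Zebra.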

+5

Source: https://habr.com/ru/post/1314065/

