Unicode is a very large character set that includes almost all characters in almost all languages.
There are several ways to store Unicode text as a sequence of bytes; these are called encodings. Every Unicode encoding (well, every full Unicode encoding) can represent any Unicode text as a sequence of bytes, but the number of bytes a given piece of text takes depends on the encoding used.
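To make the point about byte counts concrete, here is a minimal Python sketch that encodes one string three ways. The sample text and the helper name `encoded_sizes` are made up for illustration; the `-le` codec variants are used only so that no byte order mark is added to the counts.

```python
def encoded_sizes(text):
    """Return how many bytes `text` occupies in several Unicode encodings."""
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {name: len(text.encode(name)) for name in encodings}


sample = "naïve €5"  # 8 characters, some of them outside ASCII

for name, size in encoded_sizes(sample).items():
    print(f"{name}: {size} bytes")

# Each encoding is lossless: decoding gives back the original text.
for name in ("utf-8", "utf-16-le", "utf-32-le"):
    assert sample.encode(name).decode(name) == sample
```

The same eight characters come out at a different length in each encoding, yet all three decode back to identical text.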
UTF-8 is a Unicode encoding optimized for English and other languages that use very few characters outside the Latin alphabet. UTF-16 is a Unicode encoding that may be more suitable for text dominated by characters outside the Latin alphabet, such as many Asian scripts, where it often needs fewer bytes than UTF-8. Java and .NET represent all in-memory text (the String class) as Unicode encoded in UTF-16.
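One rough way to judge which encoding suits which kind of text is to compare encoded sizes for samples in different scripts. The sketch below does that in Python; the sample strings and the helper name `utf8_vs_utf16` are illustrative assumptions, and byte count is only one of several factors in choosing an encoding.

```python
def utf8_vs_utf16(text):
    """Return (UTF-8 byte count, UTF-16 byte count) for `text`, without a BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


samples = {
    "English": "Hello, world",
    "French": "déjà vu",
    "Japanese": "こんにちは",
}

for language, text in samples.items():
    utf8_len, utf16_len = utf8_vs_utf16(text)
    print(f"{language}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
```

For mostly-ASCII text UTF-8 is the smaller of the two, while for the Japanese sample UTF-16 is smaller, since each of those characters needs three bytes in UTF-8 but only two in UTF-16.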