Does x.ToCharArray().Length equal GetBytes(x).Length?

string s = "test";
int charCount = s.ToCharArray().Length;
int byteCount = System.Text.Encoding.Default.GetBytes(s).Length;

When can charCount != byteCount happen? I believe it can with non-ASCII Unicode characters, but not in the general case.

.NET strings support Unicode, but is Unicode the default encoding (System.Text.Encoding.Default) in .NET? Inspecting System.Text.Encoding.Default shows System.Text.SBCSCodePageEncoding, which is a single-byte encoding.
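
For example, a minimal sketch of the mismatch (this uses Encoding.UTF8 explicitly rather than Encoding.Default, as an assumption for illustration; "é" is one char but encodes to two UTF-8 bytes):

string s = "café";
int charCount = s.ToCharArray().Length;                       // 4
int byteCount = System.Text.Encoding.UTF8.GetBytes(s).Length; // 5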

2 answers

On .NET Core and .NET 5+, the default encoding (Encoding.Default) is UTF-8, which uses 1 to 4 bytes per character. (On .NET Framework it is instead the system's ANSI code page, which is why you see SBCSCodePageEncoding.)

charCount and byteCount will not be equal if any character in s encodes to more than one byte.

To see the counts diverge even for this ASCII-only string, encode it with UTF-16 (Encoding.Unicode), which uses 2 bytes per character; byteCount then becomes 8:

 int byteCount = System.Text.Encoding.Unicode.GetBytes(s).Length; 
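
As a quick check (a sketch; "test" is pure ASCII, so UTF-8 produces 1 byte per character while UTF-16 produces 2):

string s = "test";
int utf8Count = System.Text.Encoding.UTF8.GetBytes(s).Length;     // 4
int utf16Count = System.Text.Encoding.Unicode.GetBytes(s).Length; // 8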

The number of characters will differ from the number of bytes whenever the encoding uses more than one byte per character. That is true of several encodings, including UTF-16 (the internal representation of .NET strings) and UTF-32.
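
To illustrate, a sketch comparing the byte counts of the same single character (the euro sign, U+20AC, chosen here as an example) across encodings:

string s = "\u20AC"; // "€", one char
int utf8Bytes = System.Text.Encoding.UTF8.GetBytes(s).Length;     // 3
int utf16Bytes = System.Text.Encoding.Unicode.GetBytes(s).Length; // 2
int utf32Bytes = System.Text.Encoding.UTF32.GetBytes(s).Length;   // 4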

