Encoding.GetEncoding(437).GetString() bug?

I have the following test program:

    char c = '§';
    Debug.WriteLine("c: " + (int)c);

    byte b = Encoding.GetEncoding(437).GetBytes("§")[0];
    Debug.WriteLine("b: " + b);

    char c1 = Encoding.GetEncoding(437).GetString(new byte[] { 21 })[0];
    Debug.WriteLine("c1: " + (int)c1);

This leads to the following result:

    c: 167
    b: 21
    c1: 21

As far as I can see, GetBytes works correctly:
167 in Unicode => 21 in CP437
but GetString does not:
21 in CP437 => 21 in Unicode

Is this a bug, or am I doing something wrong?

c# character-encoding
2 answers

The CP437 mapping is not reversible ("two-way") for bytes in the range 0-31. As the Wikipedia page you linked puts it:

For many applications, codes ranging from 0 to 31 and code 127 will not produce these characters. Some (or all) of them will be interpreted as ASCII control characters.

Mapping a Unicode character to the corresponding CP437 byte in this range works, but not the other way around. For example, take the bytes 13 and 10: if you receive them inside a CP437 string, you almost certainly want them preserved as carriage return and line feed, not converted to a musical note (♪) and an inverse circle (◙). This is by design; it is not a bug.
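To see the decoding direction for yourself, here is a minimal sketch. It assumes .NET Core 3.0+ or .NET 5+, where legacy code pages have to be registered through CodePagesEncodingProvider before GetEncoding(437) works (on .NET Framework the registration line is not needed). The class and method names are hypothetical, made up for this example.

```csharp
using System;
using System.Text;

static class Cp437RoundTrip
{
    static Cp437RoundTrip()
    {
        // Assumption: running on .NET Core / .NET 5+, where code page 437
        // is only available after registering the code pages provider.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Decodes a single CP437 byte and returns the resulting UTF-16 code unit.
    public static int DecodeByte(byte b)
    {
        return Encoding.GetEncoding(437).GetString(new[] { b })[0];
    }

    // Lists how each byte in the range 0-31 is decoded.
    public static void PrintLowRange()
    {
        for (int b = 0; b < 32; b++)
        {
            Console.WriteLine("byte " + b + " => U+" + DecodeByte((byte)b).ToString("X4"));
        }
    }
}
```

Calling Cp437RoundTrip.DecodeByte(21) should reproduce the value 21 from the question, i.e. the control character rather than §.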


.NET has two different characters that both (usually) display as §:

    char c1 = (char)21;
    char c2 = (char)167;
    Console.WriteLine(c1 == c2); // prints false
    Console.WriteLine(c1);       // prints §
    Console.WriteLine(c2);       // prints §

Character 21 (U+0015) is a control character that is displayed as § when it is output in text mode.

CP437 allows byte 21 to be interpreted either as a control character or as a literal §. Apparently, GetString chooses to interpret it as a control character (which is a perfectly valid choice) and therefore maps it to the Unicode control character 21 (U+0015), not to the Unicode § (U+00A7).
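If you do want the graphic interpretation, one option is to post-process the decoded string yourself. The following is only a sketch: Cp437Graphics and ReplaceControlsWithGlyphs are hypothetical names, not part of .NET, and the table lists the glyphs CP437 conventionally assigns to bytes 1-31 and 127. By default it leaves tab, line feed and carriage return alone, on the assumption that those are meant as real whitespace.

```csharp
using System;
using System.Text;

static class Cp437Graphics
{
    // Glyphs CP437 shows for bytes 0x01..0x1F; index 0 (NUL) is a placeholder.
    private const string LowGlyphs =
        "\0\u263A\u263B\u2665\u2666\u2663\u2660\u2022\u25D8\u25CB\u25D9" +
        "\u2642\u2640\u266A\u266B\u263C\u25BA\u25C4\u2195\u203C\u00B6" +
        "\u00A7\u25AC\u21A8\u2191\u2193\u2192\u2190\u221F\u2194\u25B2\u25BC";

    // Replaces control characters left over from CP437 decoding with the
    // glyphs that CP437 displays for them.
    public static string ReplaceControlsWithGlyphs(string decoded, bool keepWhitespace = true)
    {
        var sb = new StringBuilder(decoded.Length);
        foreach (char ch in decoded)
        {
            bool isWhitespace = ch == '\t' || ch == '\n' || ch == '\r';
            if (ch >= 1 && ch <= 31 && !(keepWhitespace && isWhitespace))
                sb.Append(LowGlyphs[ch]);   // e.g. 21 becomes § (U+00A7)
            else if (ch == 127)
                sb.Append('\u2302');        // house glyph for byte 127
            else
                sb.Append(ch);
        }
        return sb.ToString();
    }
}
```

You would call it on the result of GetString, for example Cp437Graphics.ReplaceControlsWithGlyphs(Encoding.GetEncoding(437).GetString(bytes)). Whether line breaks should be kept or drawn as glyphs depends on where the data comes from, which is why that is a parameter here.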
