getBytes() returns a negative number

*"Hätten Hüte ein ä im Namen, wären sie möglicherweise keine Hüte mehr, sondern Häte." 72 -61 -92 116 116 101 ...* 

getBytes() returns negative numbers (-61, -92) for the character 'ä'.

How do I get the normal ASCII value?

2 answers

getBytes() returns negative numbers (-61, -92) for the character 'ä'.

Well, getBytes() uses the platform's default encoding unless you specify one, which you should. I would recommend UTF-8. For example, as of Java 7:

 byte[] data = text.getBytes(StandardCharsets.UTF_8); 

Java's byte type is unfortunately signed, but you can think of it as just 8 bits. If you want to see the effective unsigned value, simply use:

 int unsigned = someByte & 0xff; 
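Putting the two snippets together, here is a minimal sketch of the whole round trip. The class name UnsignedBytesDemo and the helper toUnsigned are made up for illustration and are not part of any library; the string is written with a Unicode escape so that the result does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class UnsignedBytesDemo {

    // Hypothetical helper: maps each signed Java byte (-128..127)
    // to its unsigned value (0..255).
    static int[] toUnsigned(byte[] data) {
        int[] result = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            result[i] = data[i] & 0xff; // the mask keeps only the low 8 bits
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "H\u00e4tten"; // "Hätten"
        byte[] data = text.getBytes(StandardCharsets.UTF_8);

        // Signed view: [72, -61, -92, 116, 116, 101, 110]
        System.out.println(Arrays.toString(data));

        // Unsigned view: [72, 195, 164, 116, 116, 101, 110]
        System.out.println(Arrays.toString(toUnsigned(data)));
    }
}
```

The bytes themselves are identical in both views; only the interpretation of the top bit changes.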

How do I get the normal ASCII value?

This character does not exist in ASCII. All ASCII characters are in the range U+0000 to U+007F.
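If you want to check this in code, one possible sketch is below. The class name AsciiCheckDemo and the helper isAscii are hypothetical names chosen for this example; the range test simply applies the U+0000 to U+007F limit stated above.

```java
import java.nio.charset.StandardCharsets;

public class AsciiCheckDemo {

    // Hypothetical helper: true if every char of the text lies in
    // the ASCII range U+0000..U+007F.
    static boolean isAscii(String text) {
        return text.chars().allMatch(c -> c <= 0x7F);
    }

    public static void main(String[] args) {
        System.out.println(isAscii("Hute"));      // true
        System.out.println(isAscii("H\u00e4te")); // false, 'ä' is U+00E4

        // Encoding to US-ASCII does not fail: String.getBytes(Charset)
        // substitutes the charset's replacement byte, which is '?' (63).
        byte[] lossy = "\u00e4".getBytes(StandardCharsets.US_ASCII);
        System.out.println(lossy[0]);             // 63
    }
}
```

Note that forcing the string into US-ASCII silently loses the character, so it is not a way to get an "ASCII value" for ä.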

  • Some bytes are negative because byte is signed in Java, as are int, short and long. The easiest way to undo this is to mask with & 255, for example: int fixed_byte = original_byte & 255;

  • There is no normal ASCII value for ä, because ä is not part of ASCII.

  • getBytes does not use ASCII.

  • On your system, getBytes seems to use UTF-8. getBytes does not use the same encoding on all systems. If you specifically want UTF-8, use getBytes(StandardCharsets.UTF_8).

  • If you look carefully, you will notice that ä is actually encoded as two bytes in UTF-8: -61 and -92. After fixing them so that they are not negative, these are 195 and 164.

  • Why use bytes at all? A char can hold any character from the Basic Multilingual Plane, including the character ä. (If not for historical mistakes, char could hold any character at all. It is too late to fix that now.)
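To illustrate the last point, here is a small sketch that works with char values instead of bytes. The class name CharValueDemo and the helper charsNeeded are hypothetical; Character.charCount and Character.toChars are standard JDK methods. The code point U+1F600 is just an arbitrary example of a character outside the Basic Multilingual Plane.

```java
public class CharValueDemo {

    // Hypothetical helper: how many Java chars (UTF-16 code units)
    // are needed to store the given Unicode code point.
    static int charsNeeded(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        char a = '\u00e4';                        // 'ä'
        System.out.println((int) a);              // 228, never negative

        System.out.println(charsNeeded(0x00E4));  // 1, inside the BMP
        System.out.println(charsNeeded(0x1F600)); // 2, outside the BMP

        // A character outside the BMP is stored as a surrogate pair.
        String outside = new String(Character.toChars(0x1F600));
        System.out.println(outside.length());                            // 2
        System.out.println(outside.codePointCount(0, outside.length())); // 1
    }
}
```

So if all you need is a number for ä, casting the char to int gives 228 (U+00E4) without any encoding step.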
