Is there any way to specify a character encoding in java.lang.StringBuilder?

Or am I stuck with:

    String s = new String(new byte[0], Charset.forName("ISO-8859-1"));
    // or ISO_8859_1, or LATIN-1 or ... still no constants for those
    for (String string : strings) { // those are ISO-8859-1 encoded
        s += string; // hopefully this preserves the encoding (?)
    }
java character-encoding
2 answers

Strings in Java are always encoded as UTF-16: a String is just a sequence of char values, which are UTF-16 code units. When you specify an encoding in the String(byte[], String) constructor, it only says how to decode the bytes into text. After that, the encoding is discarded.
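To illustrate the point about decoding, here is a minimal sketch (the class DecodeDemo and its helper method are made up for this example). It decodes the same character from ISO-8859-1 bytes and from UTF-8 bytes and compares the results; the two Strings are equal because the charset is only consulted while decoding. It uses the StandardCharsets constants, which have been available since Java 7.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class DecodeDemo {
    // Decodes both byte arrays and reports whether the resulting Strings are equal.
    static boolean sameText(byte[] latin1Bytes, byte[] utf8Bytes) {
        String fromLatin1 = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        String fromUtf8 = new String(utf8Bytes, StandardCharsets.UTF_8);
        return fromLatin1.equals(fromUtf8);
    }

    public static void main(String[] args) {
        byte[] latin1 = {(byte) 0xE9};             // U+00E9 encoded as ISO-8859-1
        byte[] utf8 = {(byte) 0xC3, (byte) 0xA9};  // U+00E9 encoded as UTF-8
        System.out.println(sameText(latin1, utf8)); // prints "true"
    }
}
```

Once decoded, neither String carries any trace of the byte encoding it came from.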

If you need to preserve the encoding, you will have to create your own class that keeps a Charset and a String together. I can't say I have ever wanted to do that, though. Are you sure you really need it?
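If you do decide you need it, such a class could look roughly like the sketch below. EncodedText is a hypothetical name, not a JDK class, and the design (immutable, re-encoding with the remembered charset) is only one of several reasonable choices.

```java
import java.nio.charset.Charset;

// Hypothetical wrapper that keeps a String together with the Charset it was decoded from.
final class EncodedText {
    private final String text;
    private final Charset charset;

    EncodedText(String text, Charset charset) {
        this.text = text;
        this.charset = charset;
    }

    // Decodes raw bytes and remembers which charset was used.
    static EncodedText decode(byte[] bytes, Charset charset) {
        return new EncodedText(new String(bytes, charset), charset);
    }

    String text() { return text; }

    Charset charset() { return charset; }

    // Re-encodes the text using the remembered charset.
    byte[] toBytes() { return text.getBytes(charset); }
}
```

For example, EncodedText.decode(bytes, StandardCharsets.ISO_8859_1).toBytes() gives back the original ISO-8859-1 bytes.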

(So your "stuck with" code would not work anyway, and it would also be inefficient.)
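As a sketch of the more efficient route (JoinDemo and its methods are hypothetical names): build the text with a StringBuilder, which has no encoding to configure, and choose a charset only at the moment you need bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class JoinDemo {
    // Concatenates the strings without the repeated copying of s += string.
    static String join(List<String> strings) {
        StringBuilder sb = new StringBuilder();
        for (String s : strings) {
            sb.append(s);
        }
        return sb.toString();
    }

    // The charset only matters here, when the text is turned into bytes.
    static byte[] toLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String joined = join(Arrays.asList("caf", "\u00e9"));
        System.out.println(toLatin1(joined).length); // prints 4
    }
}
```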


How about using a buffering converter such as OutputStreamWriter:

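    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.OutputStreamWriter;
    import java.nio.charset.StandardCharsets;

    public class EncodingDemo {
        public static void main(String[] args) throws IOException {
            ByteArrayOutputStream baos = new ByteArrayOutputStream(1 << 10);
            // The writer encodes chars to UTF-8 bytes as they are written.
            // Passing a Charset constant avoids UnsupportedEncodingException
            // and the null writer that swallowing it would leave behind.
            try (OutputStreamWriter osw = new OutputStreamWriter(baos, StandardCharsets.UTF_8)) {
                osw.write("!");
                osw.flush();
            }
            // Decode the accumulated bytes back into a String.
            System.out.println("Hello: " + baos.toString("UTF-8"));
        }
    }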